## September 26, 2016

### Chris Siebenmann

#### How I live without shell job control

In my comments on yesterday's entry, I mentioned that my shell doesn't support job control. At this point people who've only used modern Unix shells might wonder how you get along without such a core tool as job control. The answer, at least for me, is surprisingly easily (at least most of the time).

Job control is broadly useful for three things: forcing programs to pause (and then un-pausing them), pushing programs into the background to get your shell back, and calling backgrounded programs back into the foreground. In other words, job control is one part suspending and restarting programs and one part multiplexing a single session between multiple programs.

It's possible that I'm missing important uses of being able to easily pause and unpause programs. However, I'm not missing the ability in general, because you can usually use `SIGSTOP` and `SIGCONT` by hand. I sometimes wind up doing this, although it's not something I feel the need for very often.

(I do sometimes Ctrl-C large `make`s if I want to do something else with my machine; with job control it's possible that I'd suspend the `make` instead and then have it resume afterwards.)
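A minimal by-hand version of that suspend/resume cycle, with `sleep 30` standing in for whatever long-running command you'd otherwise suspend:

```shell
# Suspend and resume a process by hand, without shell job control.
sleep 30 &            # stands in for a long-running 'make' or similar
pid=$!
kill -STOP "$pid"     # pause it (what Ctrl-Z would send via the terminal)
kill -CONT "$pid"     # let it continue, still in the background
kill "$pid"           # clean up this example
```

The difference from real job control is only convenience: you need the PID and two `kill` invocations instead of a keystroke.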

My approach to the 'recover my shell' issue is to start another shell. That's what windows are for (and `screen`), and I have a pretty well developed set of tools to make new shells cheap and easy; in my opinion, multiple windows are the best and most flexible form of multiplexing. I do sometimes preemptively clone a new window before I run a command in the foreground, and I'll admit that there are occasions when I start something without backgrounding it when I really should have done otherwise. A classic case is running '`emacs file`' (or some other GUI program) for what I initially think is going to be a quick use and then realizing that I want to keep that `emacs` running while getting my shell back.

(This is where my habit of using vim in a terminal is relevant, since that takes over the terminal anyways. I can't gracefully multiplex such a terminal between, say, vim and `make`; I really want two terminals no matter what.)

So far I can't think of any occasions where I've stuck a command into the background and then wanted it to be in the foreground instead. I tend not to put things in the background very much to start with, and when I do they're things like GNU Emacs or GUI programs that I can already interact with in other ways. Perhaps I'm missing something, but in general I feel that my environment is pretty good at multiplexing things outside of job control.

(At the same time, if someone added job control to my shell of choice, I wouldn't turn my nose up at it. It just seems rather unlikely at this point, and I'm not interested in switching shells to get job control.)

### Sidebar: multiplexing and context

One of the things that I like about using separate windows instead of multiplexing several things through one shell is that separate windows clearly preserve and display the context for each separate thing I'm doing. I don't have to rebuild my memory of what a command is doing (and what I'm doing with it) when I foreground it again; that context is right there, and stays right there even if I wind up doing multiple commands instead of just one.

(Screen sessions are somewhat less good at this than terminal windows, because scrollback is generally more awkward. Context usually doesn't fit in a single screen.)

PS: the context is not necessarily just in what's displayed, it's also in things like my history of commands. With separate windows, each shell's command history is independent and so is for a single context; I don't have commands from multiple contexts mingled together. But I'm starting to get into waving my hands a lot, so I'll stop here.

## September 25, 2016

### Chris Siebenmann

#### A surprising benefit of command/program completion in my shell

I've recently been experimenting with a variant of my usual shell that extends its general (filename) completion to also specifically complete program names from your `$PATH`. Of course this is nothing new in general in shells; most shells that have readline style completion at all have added command completion as well. But it's new to me, so the experience has been interesting.

Of course the obvious benefit of command completion is that it makes it less of a pain to deal with long command names. In the old days this wasn't an issue because Unix didn't have very many long command names, but those days are long over by now. There are still a few big new things that have short names, such as `git` and `go`, but many other programs and systems give themselves increasingly long and annoying binary names. Of course you can give regularly used programs short aliases via symlinks or cover scripts, but that's only really worth it in some cases. Program completion covers everything.

(An obvious offender here is Google Chrome, which has the bland name of `google-chrome` or even `google-chrome-stable`. I have an alias or two for that.)

But command completion turned out to have a much more surprising benefit for me: it's removed a lot of guesswork about what exactly a program is called, especially for my own little scripts and programs. If I use a program regularly I remember its full name, but if I don't I used to have to play a little game of 'did I call it `decodehdr` or `decodehdrs` or `decode-hdr`?'. Provided that I can remember the start of the command, and I usually can, the shell will now at least guide me to the rest of it and maybe just fill it in directly (it depends on whether the starting bit uniquely identifies the command).
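Mechanically, command completion is just a prefix scan over the executables on `$PATH`; a rough shell sketch (with a hypothetical `dec` prefix standing in for the start of one of those script names):

```shell
# Roughly what command completion does with a partial name: scan every
# $PATH directory for executables that start with the prefix.
prefix=dec            # e.g. the start of 'decode-mail-header'
IFS=:
for dir in $PATH; do
    for f in "$dir/$prefix"*; do
        [ -f "$f" ] && [ -x "$f" ] && basename "$f"
    done
done | sort -u        # the unique candidate completions
```

When the list has exactly one entry, the shell can fill in the rest of the name directly; otherwise it can show the candidates.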

One of the interesting consequences of this is that I suspect I'm going to wind up changing how I name my own little scripts. I used to prioritize short names, because I had to type the whole thing and I don't like typing long names. But with command completion, it's probably better to prioritize a memorable, unique prefix that's not too long and then a tail that makes the command's purpose obvious. Calling something `dch` might have previously been a good name (although not for something I used infrequently), but now I suspect that names like '`decode-mail-header`' are going to be more appealing.

(I'll have to see, and the experiment is a little bit precarious anyways so it may not last forever. But I'll be sad to be without command completion if it goes.)

## September 24, 2016

### Chris Siebenmann

#### You probably want to start using the `-w` option with `iptables`

The other day, I got notified that my office workstation had an exposed portmapper service. That was frankly weird, because while I had `rpcbind` running for some NFS experiments, I'd carefully used `iptables` to block almost all access to it. Or at least I thought I had; when I looked at '`iptables -vnL INPUT`', my blocks on tcp:111 were conspicuously missing (although it did have the explicit allow rules for the good traffic). So I went through systemd's logs from when my own service for installing all of my IP security rules was starting up and, well:

```
Sep 22 10:14:30 <host> blocklist[1834]: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
```

I have mixed feelings about this message. On the one hand, it's convenient when programs tell you exactly how they've made your life harder. On the other hand, it's nicer if they don't make your life harder in the first place.

So, the short version of what went wrong is that (modern) Linux iptables only allows one process to be playing around with iptables at any given time. If this happens to you, by default `iptables` just errors out, printing a helpful message about how it knows what you probably want to do but it's not going to do it because of reasons (I'm sure they're good reasons, honest).

(It also applies to `ip6tables`, and it appears that `iptables` and `ip6tables` share the same lock. The lock is global, not per-chain or per-table or anything.)
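The failure mode is easy to reproduce with an ordinary advisory file lock. This `flock(1)` sketch (an analogy, not the actual xtables locking code) shows both behaviors: `-n` errors out immediately, like plain `iptables`, while `-w` waits, which is what the log message suggests:

```shell
# Two processes contending for one advisory lock, flock(1) style.
lock=/tmp/xtables-demo.lock
flock "$lock" sleep 2 &                          # first process holds the lock
sleep 1                                          # give it time to acquire it
flock -n "$lock" true || echo "lock busy"        # fail fast, like plain iptables
flock -w 5 "$lock" echo "waited and got it"      # wait politely, like 'iptables -w'
wait
```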

Now, you might think that I was foolishly running two sets of `iptables` commands at the same time. It turns out that I probably was, but it's not obvious, so let's follow along. According to the logs, the other thing happening at this point during boot was that my IKE daemon was starting. It was starting in parallel because this is a Fedora machine, which means systemd, and systemd likes to do things in parallel whenever it can (which in practice means whenever you don't prevent it from doing so). As part of starting up, the Fedora `ipsec.service` has:

```
# Check for nflog setup
ExecStartPre=/usr/sbin/ipsec --checknflog
```

This exists to either set up or disable 'iptables rules for the nflog devices', and it's implemented in the `ipsec` shell script by running various `iptables` commands. Even if you don't have any nflog settings in your `ipsec.conf` and there aren't any devices configured, `ipsec` runs at least one `iptables` command to verify this. This takes the lock, which collided with my own IP security setup scripts.

(If you guessed that the `ipsec` script does not use '`iptables -w`', you win a no-prize. From casual inspection, the script just assumes that all `iptables` commands work all the time, so it isn't at all prepared for them to fail due to locking problems.)

This particular iptables change seems to have been added in 2013, in this commit (via). Either many projects haven't noticed, or they need to stay portable to `iptables` versions that don't have a `-w` argument and so would fail completely if they used '`iptables -w`'. I suspect it's a bit of both, honestly.

(Of the supported Linux versions that we still use, Ubuntu 12.04 LTS and RHEL/CentOS 6 don't have '`iptables -w`'. Ubuntu 14.04 and RHEL 7 have it.)

PS: My solution was to serialize IKE IPSec startup so that it was forced to happen after my IP security stuff had finished; this was straightforward with a systemd override via '`systemctl edit ipsec.service`'. I also went through my own stuff to add '`-w`' to all of my `iptables` invocations, because it can't hurt and it somewhat protects me against any other instances of this.
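The drop-in that '`systemctl edit`' creates boils down to a single ordering directive. A sketch, where `blocklist.service` is a placeholder name for whatever unit installs the local IP security rules:

```ini
# Written by 'systemctl edit ipsec.service' to
# /etc/systemd/system/ipsec.service.d/override.conf
[Unit]
# Don't start IKE/IPSec until the firewall rules are in place.
After=blocklist.service
```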

## September 23, 2016

Update 2016-09-25: Someone pointed out that, considering what I'm saying, a better title would be "Google should save Twitter as an act of charity".

Twitter isn't a good "MBA runs the numbers" acquisition. However, it could be used as a showcase for GCE, and that would more than justify the purchase. In fact, the financial losses might be offset by the marketing value it provides to GCE.

As part of integrating it into the internal Google stack, they should require their engineers to rebuild it on the Google Cloud Engine platform. GCE scales crazy-good. Twitter has a history of scaling problems. If Google could run it on the Google Cloud Engine, and show that it scales, it would be great advertising.

Google needs GCE to succeed (but that's for another blog post... or you can read Appendix B of http://the-cloud-book.com, especially the last few paragraphs).

How difficult would it be to rebuild Twitter on GCE? I think it would be easier than you'd imagine. Every talk I've seen by Twitter engineers at conferences is about technology that (and I don't mean this with any disrespect) is reproducing something in Google's stack. Most of the technologies being re-invented are available in GCE now; the rest really should be, and the project of porting Twitter to GCE would generate a list of high-quality feature requests. Interestingly enough, the re-invented technologies don't seem to be as scalable as Google's originals. Oh, and it seems like a lot of the people re-implementing those technologies at Twitter are ex-Google employees, so... you have that.

Sadly, the few Google executives that I know think that Twitter is a joke, that nobody uses it, and that it isn't worth saving. I disagree. I think it is "the world's chatroom". If you think Twitter "doesn't matter" then why does every news program, TV show, and billboard list a Twitter handle? (Hint: they don't list G+ handles... does G+ even have handles?)

So, in summary:

• It would help save this important resource that the world finds very useful.
• It would be the best showcase of GCE evah.... which is something Google needs more than the revenue of Twitter.
• Sadly Google executives dis Twitter as a niche application that a very small number of people find compelling. (Spoiler alert: I think they're wrong)

I wonder what will happen.

Tom

NOTE: This article was written by Thomas Limoncelli and included no involvement by current or past co-authors or friends.

### Raymii.org

These small snippets create password strings you can put in /etc/shadow when you need to reset a password on a system.
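For example, a SHA-512 crypt hash (the `$6$...` format used in `/etc/shadow` on modern Linux) can be generated with OpenSSL; a sketch along those lines, not necessarily the exact snippet from the post (the `-6` option needs OpenSSL 1.1.1 or newer):

```shell
# Print a SHA-512 crypt(3) hash suitable for the password field of an
# /etc/shadow entry. A random salt is generated automatically.
openssl passwd -6 'new-password-here'
```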

## September 22, 2016

### Racker Hacker

#### Power 8 to the people

IBM Edge 2016 is almost over and I’ve learned a lot about Power 8 this week. I’ve talked about some of the learnings in my recaps of days one and two. The performance arguments sound really interesting and some of the choices in AIX’s design seem to make a lot of sense.

However, there’s one remaining barrier for me: Power 8 isn’t really accessible for a tinkerer.

## Tinkering?

> attempt to repair or improve something in a casual or desultory way, often to no useful effect.
> "he spent hours tinkering with the car"

When I come across a new piece of technology, I really enjoy learning how it works. I like to find its strengths and its limitations. I use that information to figure out how I might use the technology later and when I would recommend it to someone else.

To me, tinkering is simply messing around with something until I have a better understanding of how it works. Tinkering doesn’t have a finish line. Tinkering may not have a well-defined goal. However, it’s tinkering that leads to a more robust community around a particular technology.

For example, take a look at the Raspberry Pi. There were plenty of other ARM systems on the market before the Pi and there are still a lot of them now. What makes the Pi different is that it’s highly accessible. You can get the newest model for $35 and there are tons of guides for running various operating systems on it. There are even more guides for how to integrate it with other items, such as sprinkler systems, webcams, door locks, and automobiles.

Another example is the Intel NUC. Although the NUC isn’t the most cost-effective way to get an Intel chip on your desk, it’s powerful enough to be a small portable server that you can take with you. This opens up the door for software developers to test code wherever they are (we use them for OpenStack development), run demos at a customer location, or make multi-node clusters that fit in a laptop bag.

## What makes Power 8 inaccessible to tinkerers?

One of the first aspects that most people notice is the cost. The S821LC currently starts at around $6,000 on IBM’s site, which is a bit steep for someone who wants to learn a platform.

I’m not saying this server should cost less — the pricing seems quite reasonable when you consider that it comes with dual 8-core Power 8 processors in a 1U form factor. It also has plenty of high speed interconnects ready for GPUs and CAPI chips. With all of that considered, $6,000 for a server like this sounds very reasonable.

There are other considerations as well. A stripped down S821LC with two 8-core CPUs will consume about 406 Watts at 50% utilization. That’s a fair amount of power draw for a tinkerer and I’d definitely think twice about running something like that at home. When you consider the cooling that’s required, it’s even more difficult to justify.

AIX provides some nice benefits on Power 8 systems, but it’s difficult to access as well. Put “learning AIX” into a Google search and look at the results. The first link is a thread on LinuxQuestions.org where the original poster is given a few options:

• Get in some legal/EULA gray areas with VMware
• Find an old Power 5/6 server that is coming offline at a business that is doing a refresh

Having access to AIX is definitely useful for tinkering, but it could also be very useful for software developers. For example, if I write a script in Python and I want to add AIX support, I’ll need access to a system running AIX. It wouldn’t necessarily need to be a system with tons of performance, but it would need the functionality of a basic AIX environment.
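Even a small cross-platform script ends up with a branch like the following, and it's exactly that branch you can't verify without a real AIX environment (a hypothetical sketch, not from the post):

```shell
# The kind of platform dispatch an AIX port adds. The AIX arm is the one
# you cannot actually exercise without access to a running AIX system.
case "$(uname -s)" in
    AIX)   echo "use AIX-specific tools (lsdev, smitty, ...)" ;;
    Linux) echo "use Linux-specific tools" ;;
    *)     echo "unhandled platform" ;;
esac
```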

## Potential solutions

I’d suggest two solutions:

1. Get AIX into an accessible format, perhaps on a public cloud
2. Make a more tinker-friendly Power 8 hardware platform

Let’s start with AIX. I’d gladly work with AIX in a public cloud environment where I pay some amount for the virtual machine itself plus additional licensing for AIX. It would still be valuable even if the version of AIX had limiters so that it couldn’t be used for production workloads. I would be able to access the full functionality of a running AIX environment.

The hardware side is more challenging. However, if it’s possible to put a single Power 8 SMT2 CPU in a smaller form factor, this could work. Perhaps these could even be CPUs with some type of defect where one or more cores are disabled. That could reduce cost while still providing the full functionality to someone who wants to tinker with Power 8.

Some might argue that this defeats the point of Power 8 since it’s a high performance, purpose-built chip that crunches through some of the world’s biggest workloads. That’s a totally valid argument.

However, that’s not the point.

The point is to get a fully-functional Power 8 CPU — even if it has serious performance limitations — into the hands of developers who want to do amazing things with it. My hope would be that these small tests will later turn into new ways to utilize POWER systems.

It could also be a way for more system administrators and developers to get experience with AIX. Companies would be able to find more people with a base level of AIX knowledge as well.

## Final thoughts

IBM has something truly unique with Power 8. The raw performance of the chip itself is great and the door is open for even more performance through NVlink and CAPI accelerators. These features are game changers for businesses that are struggling to keep up with customer demands. A wider audience could learn about this game-changing technology if it becomes more accessible for tinkering.

Photo credit: Wikipedia

The post Power 8 to the people appeared first on major.io.

## September 21, 2016

### Racker Hacker

#### IBM Edge 2016: Day 2 Recap

Day two of IBM Edge 2016 is all done, and the focus has shifted to the individual. Let’s get right to the recap:

## General session

One of the more memorable talks during the general session was from Hortonworks. They’ve helped a transport company do more than simply track drivers: they assemble and analyze lots of information about each driver, the truck, the current road conditions, and other factors. From there, they apply a risk rating to that particular truck and provide updates to the driver about potential hazards. This reduced the company’s insurance costs by 10%.

Florida Blue shared some insights from their POWER deployments and how they were able to get customers serviced faster. One of the more memorable quotes was:

The best way to get a customer happy is to get them off the phone.

They were able to rework how the backend systems retrieved data for their customer service personnel and cut average phone call durations from 9 minutes to 6.

Jason Pontin came on stage with three technology innovators under 35. They shared some of their latest work with the audience and it was amazing to see the problems they’re trying to solve. Lisa DeLuca introduced her new children’s book that helps to explain technology in new ways.

## Breakouts

My first breakout session was Getting Started with Linux Performance on IBM POWER8 from Steve Nasypany. This was a highly informative session and you’ll definitely want to grab the slides from this talk whether you use POWER or not.

Steve dove into how to measure and adjust performance on POWER systems. He also gave some insight on how AIX and Linux differ when it comes to performance measurements. There are quite a few differences in how AIX and Linux refer to processors and how they measure memory usage. He took quite a bit of time to explain not only the what, but the why. It was a great session.

My second breakout was Bringing the Deep Learning Revolution into the Enterprise from Michael Gschwind. He kicked off with the basics of machine learning and how it matches up with the functions of a human brain. He provided some examples of objects that the human brain can quickly identify but a computer cannot.

The math is deep. Really deep. One of the interesting topics was stochastic gradient descent (warning: highly nerdy territory). It’s the optimization method that drives the training, iteratively reducing the computer’s errors on a particular machine learning task. The goal is to reduce errors and do less brute-force training with the computer so it can begin working independently. It’s oddly similar to raising children.
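The core update rule is small enough to sketch. Here is a toy gradient-descent loop in awk that learns the weight w in y = w*x from samples of y = 2x (a simplification: real stochastic gradient descent shuffles its samples and trains models vastly larger than one weight):

```shell
awk 'BEGIN {
    w = 0; lr = 0.01                  # initial weight and learning rate
    for (epoch = 0; epoch < 50; epoch++)
        for (x = 1; x <= 10; x++) {   # one update per training sample
            err = w * x - 2 * x       # prediction error on this sample
            w  -= lr * err * x        # step against the squared-error gradient
        }
    printf "%.2f\n", w                # the learned weight approaches 2.00
}'
```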

## theCUBE interview

My breakouts were cut a little short because I was invited to be on theCUBE! It was completely nerve-wracking, but I had a great time. The hosts were fun to work with and the conversation seemed to flow quite well.

We talked about OpenStack, OpenPOWER, and Rackspace. You can watch my interview below if you can put up with my Texas accent.

## Train concert

We headed outside in the evening for a poolside reception. The weather was in the 80s and it felt great outside!

Everyone made their way inside to see Train perform live!

The concert was great. They played plenty of their older hits and shared a new single that hasn’t been released yet. We even heard some covers of Led Zeppelin and Rolling Stones songs! Some attendees were dragged up on stage to help with the singing and they loved it.

The post IBM Edge 2016: Day 2 Recap appeared first on major.io.

## September 20, 2016

### Racker Hacker

#### IBM Edge 2016: Day 1 Recap

I am here in Las Vegas for IBM Edge 2016 to learn about the latest developments in POWER, machine learning, and OpenStack. It isn’t just about learning — I’m sharing some of our own use cases and challenges from my daily work at Rackspace.

I kicked off the day with a great run down the Las Vegas strip. There are many more staircases and escalators than I remember, but it was still a fun run. The sunrise was awesome to watch as all of the colors began to change.

Without further ado, let’s get to the recap.

## General Session

Two of the talks in the general session caught my attention: OpenPOWER and Red Bull Racing.

The OpenPOWER talk showcased the growth in the OpenPOWER ecosystem. It started with only five members, but now has over 250. IBM is currently offering OpenPOWER-based servers, and Rackspace’s Barreleye design is available for purchase from various vendors.

Red Bull Racing kicked off with an amazing video about the car itself, the sensors, and what’s involved with running in 21 races per year. The highlight of the video for me was seeing the F1 car round corners on a mountain while equipped with snow chains.

The car itself has 100,000 components, and it is disassembled and reassembled for each race based on the race conditions. Due to restrictions on how often they can practice, they run over 50,000 virtual simulations per year to test out different configurations and parts. Each race generates 8 petabytes of data, which is live-streamed to the engineers at the track as well as an engineering team in the UK. They can make split-second choices on what to do during the race based on this data.

They gave an example of a situation where something was wrong with the car and the driver needed to make a pit stop. The engineers looked over the data that was coming from the car and identified the problem. Luckily, the driver could fix the issue by flipping a switch on the steering wheel. The car won the race by less than a second.

## Breakouts

My first stop on breakouts was Trends and Directions in IBM Power Systems. We had a high-level look at some of the advancements in POWER8 and OpenPOWER. Two customers shared their stories around why POWER was a better choice for them than other platforms, and everyone made sure to beat up on Moore’s Law at every available opportunity. Rackspace was applauded for its leadership on Barreleye!

The most interesting session of the day was the IBM POWER9 Technology Advanced Deep Dive. Jeff covered the two chips in detail and talked about some of the new connections between the CPU and various components. I’m interested in the hardware GZIP acceleration, NVLINK, and CAPI advancements. The connections to CAPI will be faster, thanks to the Power Service Layer (PSL) moving from the CAPI chip to the CPU itself. This reduces latency when communicating with the accelerator chip.

POWER9 has 192GB/sec on the PCIe Gen4 bus (that’s 48 lanes) and there’s 300GB/sec (25Gbit/sec x 48 lanes) of duplex bandwidth available for what’s called Common Link. Common Link is used to communicate with accelerators or remote SMP and it will likely be called “Blue Link” at a later date. Very clever, IBM.

I wrapped the day with Calista Redmond’s OpenPower Revolution in the Datacenter. She talked about where the OpenPOWER foundation is today and where it’s going in the future.

## EXPO

As you might expect, IBM has most of the EXPO floor set aside for themselves and they’re showing off new advances in POWER, System z, and LinuxONE. I spent a while looking at some of the new POWER8 chassis offerings and had a good conversation with some LinuxONE experts about some blockchain use cases.

IBM hired DJ Andrew Hypes and DJ Tim Exile to make some unique music by sampling sounds in a datacenter. They sampled IBM servers and storage devices and turned the results into some genuinely novel tracks. It doesn’t sound anything like a datacenter, though (thank goodness for that).

The Red Bull Racing booth drew a fairly large crowd throughout the evening. They had one of their F1 cars on site with its 100+ sensors.

## Summary

The big emphasis for the first day was on using specialized hardware for specialized workloads. Moore’s Law took a beating throughout the day as each presenter assured the audience that 2x performance gains won’t come in the chip itself for much longer.

It won’t be possible to achieve the performance we want in the future on the backs of software projects alone. We will need to find ways to be smarter about how we run software on our servers. When something is ripe for acceleration, especially CPU-intensive, repetitive workloads, we should find a way to accelerate it in hardware. There are tons of examples of this already, like AES encryption acceleration, but we will need more acceleration capabilities soon.

The post IBM Edge 2016: Day 1 Recap appeared first on major.io.

## September 19, 2016

### Errata Security

#### Why Snowden won't be pardoned

Edward Snowden (NSA leaker/whistleblower) won’t be pardoned. I’m not arguing that he shouldn’t be pardoned, but that he won’t be pardoned. The chances are near zero, and the pro-pardon crowd doesn't seem to be doing anything to change this. This post lists a bunch of reasons why. If your goal is to get him pardoned, these are the sorts of things you’ll have to overcome.

The tl;dr list is this:
• Obama hates whistleblowers
• Obama loves the NSA
• A pardon would be betrayal
• Snowden leaked because he was disgruntled, not because he was a man of conscience (***)
• Snowden hasn’t yet been convicted
• Snowden leaked too much
• Snowden helped Russian intelligence
• Nothing was found to be illegal or unconstitutional

Obama hates whistleblowers

Obama campaigned promising to be the most transparent president in history. Among his campaign promises are:

> Protect Whistleblowers: Often the best source of information about waste, fraud, and abuse in government is an existing government employee committed to public integrity and willing to speak out. Such acts of courage and patriotism, which can sometimes save lives and often save taxpayer dollars, should be encouraged rather than stifled as they have been during the Bush administration. We need to empower federal employees as watchdogs of wrongdoing and partners in performance. Barack Obama will strengthen whistleblower laws to protect federal workers who expose waste, fraud, and abuse of authority in government. Obama will ensure that federal agencies expedite the process for reviewing whistleblower claims and whistleblowers have full access to courts and due process.

That sounds like it was tailor-made for Snowden, right? But Obama's actual actions as president have been the opposite, at least where national security is concerned. Obama has prosecuted more whistleblowers under the espionage act than any other president – indeed, more than all previous presidents combined [**]. Moreover, Obama's prosecutions [**] have clearly been politically motivated. Others, like Petraeus and Clinton, have not been prosecuted with the same fervor for mishandling classified information. Obviously, Obama's actions here have not been based on any principle.

If Obama was willing to prosecute people for minor leaks, he’s certainly motivated to prosecute Snowden for his huge leak. That politicians are never punished for failing to follow through on campaign promises means that Obama doesn’t have to care; he hasn’t closed down Gitmo after 8 years, despite promising that’d be his first task in office.

In order for the pro-pardon campaign to succeed, they are going to have to repeatedly hold Obama’s feet to the fire. They need to keep pointing out Obama’s many transparency promises. They’ll have to point out how Obama’s campaign promises inspired Snowden, and that it was Obama’s failure to uphold his campaign promises that led Snowden to his actions. Blame Obama.

Obama loves the NSA

I think it was Robert Gates in his book who noted that Presidents, even the left-wing ones, quickly get subverted by the military. The military is apolitical, and takes the concept of “Commander in Chief” seriously. When the President says “jump”, they say “how high”. Presidents love that. In contrast, the President struggles with civilian departments under his nominal control, who passively resist his orders.

The NSA is a military organization (as opposed to the CIA, which is civilian). Therefore, the President loves the NSA. It's one of the few organizations that does what he wants.

Possibly more important is the fact that Obama will go down in history as the first President where “cyberwar” became a reality. All that spying infrastructure revealed by Snowden feeds into an enormous, and effective, cyberwar machine. The events in the Snowden movie, where a drone strike takes out somebody identified by their cellphone, are real.

Bush started it, but Obama presided over the development of this capability. In 50 years when the documents become declassified, future historians will point to this as one of the most enduring parts of Obama’s legacy, more even than Obamacare. Snowden damaged this legacy. Thus, Obama is going to be very much on the NSA’s side on this.

I have no clue how the pro-pardon people are going to answer this point, but they need to address it somehow.

A pardon would be betrayal

I’ve talked to a bunch of people in intelligence. Some understand that it’s just politics, and wouldn’t take a pardon personally. Others, though, would see it as a betrayal of the principles they stand for. That a junior disgruntled employee created such harm to their work, and then was pardoned, would betray them.

As I pointed out above, Obama loves the NSA. That a pardon would offend and demotivate them will be an important part of his decision making.

The NSA is a military unit, and thus above politics. Pardons are a political matter. The pro-pardon crowd needs to stress this – that those offended by a pardon are probably those too involved in politics anyway. They shouldn’t be that involved in politics.

Snowden was disgruntled

This is by far the most important issue. Snowden leaked because he was a disgruntled employee, angry at the lack of recognition and career growth that his skills/accomplishments deserved.

Indeed, the NSA/government doesn’t really believe in the concept of “whistleblowers” driven by matters of conscience. They believe that whistleblowing always comes from angry employees who want to strike back at organizations in revenge.

Thomas Drake, for example, was the proponent of two competing projects. His side lost, so in anger, he leaked about the side who won the internal political battle.

Bradley/Chelsea Manning hated the military because she didn’t fit in. Her justification for leaking is a bit incoherent. She leaked because she was angry and wanted to strike back at the system.

The Watergate leaker “Deep Throat”, Mark Felt, was angry that he was passed over for promotion to succeed J Edgar Hoover. He wasn’t clean himself, being party to Hoover’s decades of dirty tricks, illegal wire taps, and violations of constitutional rights. His disclosure of the Watergate break-in was not based on "conscience".

After Snowden, the NSA created a profile to identify similar people who might leak. This profile doesn’t include those who have EFF stickers on their laptops. Instead, it identifies people who might be disgruntled in the same way Snowden was.

Snowden’s profile is common in the computer/cyber field. The field is full of people without high-school diplomas, without college degrees, or, if they have a college degree, one in a non-computer major. These people are smart and self-taught. They follow their interests, so they are extremely skilled at some narrow area that strikes their fancy -- although not necessarily in the areas of their job responsibilities.

It’s common in IT/software-development for your otherwise unremarkable coworker to actually be a rock-star in the community, doing minor system management during the day, but contributing Linux kernel patches at night. Or doing something else notable.

Those treated as junior employees at work, but who see themselves as rock-stars, are going to have an enormous chip on their shoulder, and will become extremely disgruntled at work. (Well, some rockstars understand they can’t get recognized at work for their skills, so mature rockstars aren’t a problem – just immature ones).

At some point I’m going to write up a larger post on this “Snowden profile”. The short point here is that the NSA overwhelmingly sees this as a problem of “disgruntlement” and not “conscience”. Thus, he won’t get pardoned for acting on his conscience, because that would be tantamount to pardoning the disgruntled.

For those of you arguing against this, it’d be useful to point out that Snowden’s own justifications are more coherent than the average leaker’s. He brings American founding principles and documents into the discussion. He’s obviously spent a lot more time thinking about the underlying principles than most leakers. Whether or not his disgruntlement played a part, conscience was clearly more a part of his reasons for leaking than the NSA would like to think.

Snowden hasn’t yet been convicted

This is a minor nit, but most pardons are for people who have already been convicted. In other words, justice has taken its course, and the president afterwards, through commutations or pardons, can adjust the result.

Even if Obama were willing to entertain the issue, what he’d be looking for as an ideal would be for Snowden to go through the court system, serve a couple years, then get his sentence merely commuted (leaving the technicality of a felony conviction intact). Whether or not you want to encourage the whistleblowers of the future by reducing Snowden’s sentence, you still want enough of a punishment to discourage future disgruntled employees from doing harm.

That Obama hasn’t negotiated with Snowden to come back and accept a plea deal is strong evidence that Obama has no intention of pardoning Snowden. Or, we might see a semi-pardon, something along the lines that would pardon him for any espionage charge that contains a death penalty, but which would leave Snowden open to lesser charges.

I suggest this because the pro-pardon crowd might think about a partial-pardon. They’d need lawyers experienced in the subject to analyze the possible crimes and come up with text for this. Such a pardon could allow Snowden to come home and be tried for lesser crimes that would only result in a few years jail time.

Snowden leaked too much

PRISM, phone metadata, smiley-face (data center unencrypted links), and that bulk-collection document (counting messages captured in the United States) all showed unacceptable domestic spying by the United States.

Yet, most of the Snowden revelations do not. They show expected spying on foreign countries. As I write this today, The Intercept has a lengthy article based on Snowden leaks, but far from expressing outrage or exposing abuse, the article documents how effective the spying has been at getting terrorists. This leak helps terrorists and harms our national interests.

Of all the documents I’ve seen, maybe five show something worth whistleblowing, the other 100 don’t. Maybe you can get the President to pardon Snowden for those 5 documents, but getting a pardon for the other 100 is going to be much more difficult.

Personally, I’m of the opinion “fuck them”. They (those in the intelligence community) were caught doing too much, surveilling innocent American citizens, so I really don’t care if Snowden goes too far exposing them. They deserve to be “punished” for their excesses.

For you pro-pardon folks, point out that they can’t criticize Snowden for going “too far” without tacitly admitting there’s a point where he went “far enough”. In other words, they can’t argue some of the disclosures were bad without agreeing that some disclosures were good.

Snowden helped Russian intelligence

Everyone I talk to in the intelligence community is absolutely convinced Snowden has helped the FSB (Russian intelligence). They claim there’s proof.

I remain unconvinced. Snowden gets unreasonable worship from one side, and unreasonable hate from the other. This makes me skeptical of both sides. Unless I see this “evidence” they are talking about, and can evaluate for myself precisely what it means, I’m not going to believe it.

But the fact remains that those talking to Obama are going to tell him that they believe Snowden helped the Russians. This is going to make a pardon essentially impossible. The pro-pardon folks are going to have to figure out an answer to this problem. If there’s concrete evidence, like a film of Snowden explicitly telling an FSB agent some important secret, then you are toast and no pardon will ever happen. So you have to assume any evidence would be inconclusive, like a picture of Snowden meeting with a top FSB agent, or an audio recording of Snowden talking casually with the FSB – but not revealing important secrets at that time. Make your case for a pardon assuming this sort of evidence.

The leaks resulted in no meaningful reform

Yes, the leak resulted in the USA FREEDOM Act, but that was just a white-wash. Instead of the NSA collecting all the phone metadata, a private consortium of phone companies does the collecting. Indeed, the situation is now worse. Previously, the NSA restricted searches of that data to national security (terrorism) reasons. Now every law enforcement agency, from the FBI, to the DEA, to the ATF, to the IRS, is querying that database. The number of phone records being searched by the government has exploded, for reasons unrelated to national security.

You'd think that around the world, countries would've gotten angry at the NSA, and kicked them out. The opposite has happened. After Snowden advertised our awesome capabilities, countries have lined up to establish partnerships with us, to get access to the NSA. And, many (especially despotic countries) have sought to build their own mass surveillance programs, based on the Snowden model.

The pro-Snowden crowd claims that "none of the reforms would have occurred without Snowden". Since the reforms were either meaningless or went in the wrong direction, Obama isn't going to respect this as a meaningful argument. Activists will have to argue that Snowden deserves a pardon despite the lack of significant public interest, and despite the lack of reforms.

Nothing was found to be illegal or unconstitutional

The Supreme Court never found anything Snowden revealed to be either unconstitutional or illegal. QED: Snowden is not a whistleblower. That’s how everyone in government sees him. (Yes, a district court ruled that Patriot Act Section 215 didn't cover the phone metadata program -- but the ruling ultimately had no effect.)

The pro-Snowden position is going to have to point out that while not technically illegal, there was malfeasance. The intelligence community was doing things that the American people deserve to know about. Moreover, in response to his revelations, Congress acted and changed the law. You keep saying "whistleblower" as if it's a term the other side accepts. They don't (that's why I used "leakblower" at the top of this document :). The obtuse continual use of this word in the face of such opposition just makes them not listen to you.

Conclusion

At the end of his presidency, Bill Clinton pardoned his brother for cocaine charges and his friend for tax charges. That means anything is possible, and maybe Obama will pardon Snowden.

But as I see it, the chances of this are essentially nil. I think you pro-Snowden people are way too optimistic. You spend all your time talking to other pro-Snowden people, and not enough time talking to the anti-Snowden crowd. You cherry-pick the stupidest bits of the anti-Snowden crowd (like that congressional report) to convince yourself of your superior position. You don't talk to the reasonable people who oppose Snowden. You don't believe reasonable opposing viewpoints exist.

You have no clue why Obama won’t pardon Snowden, and thus are doing nothing to change his mind. You think, instead, that getting celebrities like Susan Sarandon on your side is going to promote your cause. Obama isn’t seeking re-election. He therefore doesn’t care what they think. Your attempts at stirring up public support will have no effect.

This is the decision of one man, Obama. It’s a free decision, one that will have no consequences for him either way. It’s one of the few decisions in his career where he will decide what’s right, not what’s popular. You have to address what his concerns are. In this document, I’m only guessing as an outsider what some of those concerns might be. But it behooves you, the pro-pardon activist, to figure out what Obama’s real concerns are, and address them. Otherwise you don’t have a prayer of changing his mind.

#### The origins of on-call work

On September 6th, Susan Fowler posted an article titled, "Who's on-call?", talking about evolving on-call duties between development teams and SRE teams. She has this quote at the top:

I'm not sure when in the history of software engineering separate operations organizations were built and run to take on the so-called "operational" duties associated with running software applications and systems, but they've been around for quite some time now (by my research, at least the past twenty years - and that's a long time in the software world).

My first job was with a city government, and many of the people I was working with started at that city when they decided to computerize in 1978. Most of them have retired or died off by now. In 1996, when I started there, the original dot-com boom was very much on the upswing, and that city was still doing things the way they'd been done for years.

I got into the market in time to see the tail end of that era. One of the things I learned there was the origins of many of the patterns we see today. To understand the origins of on-call in IT systems, you have to go back to the era of serial networking, when 'minicomputer' and 'microcomputer' were still distinct marketing terms, coined to differentiate those machines from the 'mainframe'.

IT systems of the era employed people to do things we wouldn't even consider today, or would work our damnedest to automate out of existence. There were people who had, as their main job, duties such as:

• Entering data into the computer from paper forms.
• Really. All you did all day was punch in codes. Computer terminals were not on every desk, so specialists were hired to do it.
• The worst part is: there are people still doing this today.
• Kick off backups.
• Change backup tapes when the computer told them to.
• Load data-tapes when the computer told them to.
• Tape stored more than spinning rust, so it was used as a primary storage medium. Disk was for temp-space.
• I spent a summer being a Tape Librarian. My job was roboticized away.
• Kick off the overnight print-runs.
• Collate printer output into reports, for delivery to the mailroom.
• Execute the overnight batch processes.
• Your crontab was named 'Stephen,' and you saw him once a quarter at the office parties. Usually very tired-looking.
• Monitor system usage indicators by hand, and log them in a paper logbook.
• Keep an Operations Log of events that happened overnight, for review by the Systems Programmers in the morning.
• Follow runbooks given to them by Systems Programming for performing updates overnight.
• Be familiar with emergency procedures, and follow them when required.

Many of these things were only done by people working third shift. Which meant computer-rooms had a human on-staff 24/7. Sometimes many of them.

There was a side-effect to all of this, though. What if the overnight Operator had an emergency they couldn't handle? They had to call a Systems Programmer to advise a fix, or come in to fix it. In the 80's, when telephone modems came into their own, they might even be able to dial in and fix it from home.

On-Call was born.

There was another side-effect to all of this: it happened before the great CompSci shift in the colleges, so most Operators were women. And many Systems Programmers were too. This was why my first job was mostly women in IT management and senior technical roles. This was awesome.

A Systems Programmer, as they were called at the time, was less a Software Engineering role as we would define it today, and more DevOps, if not outright SysAdmin. They had coding chops, because much of systems management at the time required that. Their goal was more wiring together purchased software packages to work coherently, or modifying purchased software to work appropriately.

Time passed, and more and more of the overnight Operator's job was automated away. Eventually, the need for an overnight Operator no longer justified the cost. Or you simply couldn't hire a replacement for the Operator who just quit. However, the systems were still running 24/7, and you needed someone ready to respond to disasters. On-call got more intense, since you no longer had an experienced hand in the room at all times.

The Systems Programmers earned new job-titles. Software Engineering started to be a distinct skill-path and career, so it was firewalled off in a department called Development. In those days, Development and Systems people spoke often; this is why you'll hear old hands grumble that DevOps isn't anything actually new. Systems was on-call, and sometimes Development was too, if there was a big thing rolling out.

Time passed again. Management culture changed, realizing that development people needed to be treated and managed differently than systems people. Development became known as Software Engineering, and became its own career-track. The new kids getting into the game never knew the close coordination with Systems that the old hands had, and assumed this separation was the way it's always been. Systems became known as Operations; to the chagrin of the old Systems hands, who resented being called an 'Operator', a typically very junior title. Operations remained on-call, and kept informal lists of developers who could be relied on to answer the phone at o-dark-thirty in case things went deeply wrong.

More time, and the separation between Operations and Software Engineering became deeply entrenched. Some bright sparks realized that there were an awful lot of synergies to be had with close coordination between Ops and SE. And thus, DevOps was (re)born in the modern context.

Operations was still on-call, but now it was open for debate about how much of Software Engineering needed to be put on the Wake At 3AM In Case Of Emergency list.

And that is how on-call evolved from the minicomputer era, to the modern era of cloud computing.

You're welcome.

## September 17, 2016

### Errata Security

#### Review: "Snowden" (2016)

tldr:

• If you are partisan toward Snowden, you'll like the movie.
• If you know little about Snowden, it's probably too long/slow -- you'll be missing the subtext.
• If you are anti-Snowden, you'll hate it of course.

The movie wasn't bad. I was expecting some sort of over-dramatization, a sort of Bourne-style movie doing parkour through Hong Kong ghettos. Or, I expected a The Fifth Estate sort of movie that was based on the quirky character of Assange. But instead, the movie was just a slight dramatization of the events you (as a Snowden partisan) already know. Indeed, Snowden is a boring protagonist in the movie -- which makes the movie good. All the other characters in the movie are more interesting than the main character. Even the plot isn't all that interesting -- it's just a simple dramatization of what happens -- it's that slow build-up of tension toward the final reveal that keeps your attention.

In other words, it's clear that if you like Snowden and understand the subtext, you'll enjoy riding along on this slow buildup of tension.

Those opposed to Snowden, however, will of course gag on the one-sided nature of the story. There's always two sides to every story. While the film didn't go overboard hyping Snowden's side, it was still partisan, mostly ignoring the opposing side. I can imagine all my friends who work for government walking out in anger, not being able to tolerate this one-sided view of the events. I point this out because with the release of this movie, there's also been a surge in the "Pardon Snowden" movement. No, the chances of that are nil. Even though such a thing seems obvious to you, it's only because you haven't seen the other side.

So if you don't like Snowden, at best you'll be bored, at worst you'll be driven out of the theater in anger.

I don't think the movie stands alone, without all this subtext we already know. So if you haven't been following along with the whole story, I don't think you'll enjoy it.

Finally, there's watching everyone else in the audience. They seemed to like it, and they seemed to "get" the key points the director was trying to make. It was a rather slow Friday night for all the movies being shown, so the theater wasn't empty, but neither was it very full. I'd guess everyone there already had some interest in Snowden. Obviously, from the sign out front, they don't expect as much interest in this film as they do in Bridget Jones' Baby and Blair Witch 2.

Anyway, as I said, if you like Edward Snowden, you'll like Snowden. It's not over the top; it's a fair treatment of his story. I'm looking forward to the sequel.

### HolisticInfoSec.org

#### Toolsmith In-depth Analysis: motionEyeOS for Security Makers

It's rather hard to believe, unimaginable even, but here we are. This is the 120th consecutive edition of toolsmith; every month for the last ten years, I've been proud to bring you insights and analysis on free and open source security tools. I hope you've enjoyed the journey as much as I have; I've learned a ton and certainly hope you have too. If you want a journey through the past, October 2006 through August 2015 are available on my web site here, in PDF form, and many years' worth have been published here on the blog as well.
I labored a bit on what to write about for this 10th Anniversary Edition and settled on something I have yet to cover, a physical security topic. To that end I opted for a very slick maker project, using a Raspberry Pi 2, a USB web cam, and motionEyeOS. Per Calin Crisan, the project developer, motionEyeOS is a Linux distribution that turns a single-board computer into a video surveillance system. The OS is based on Buildroot and uses motion as a backend and motionEye for the frontend.
• Buildroot "is a simple, efficient and easy-to-use tool to generate embedded Linux systems through cross-compilation."
• Motion (wait for it) is a program that monitors the video signal from cameras and is able to detect if a significant part of the picture has changed; in other words, it can detect motion.
• motionEye is also Calin's project and is the web frontend for the motion daemon.

Installation was insanely easy, I followed Calin's installation guidelines and used Win32DiskImager to write the image to the SD card. Here's how straightforward it was in summary.
1) Download the latest motionEyeOS image. I used build 20160828 for Raspberry Pi 2.
2) Write the image to SD card, insert the SD into your Pi.
3) Plug a supported web camera into your Pi, power up the Pi. Give it a couple minutes after first boot per the guidelines: do not disconnect or reboot your board during these first two minutes. The initialization steps:
• prepare the data partition on the SD card
• configure SSH remote access
• auto-configure any detected camera devices
4) Determine the IP address assigned to the Pi; DHCP is the default. You can do this with a monitor plugged into the Pi's HDMI port, via your router's connected devices list, or with a network scan.
For detailed installation instructions, refer to PiMyLifeUp's Build a Raspberry Pi Security Camera Network. It refers to a dated, differently named (motionPie) version of motionEyeOS, but provides great detail if you need it. There are a number of YouTube videos too, just search motionEyeOS.

Configuration is also ridiculously simple. Point your browser to the IP address for the Pi, http://192.168.248.20 for me on my wired network, and http://192.168.248.64 once I configured motionEyeOS to use my WiFi dongle.
The first time you login, the password is blank so change that first. In the upper left corner of the UI you'll see a round icon with three lines, that's the setting menu. Click it, change your admin and user (viewer) passwords STAT. Then immediately enable Advanced Settings.
 Figure 1: Preferences

You'll definitely want to add a camera, and keep in mind, you can manage multiple cameras with one motionEyeOS device, and even multiple motionEyeOS systems with one master controller. Check out Usage Scenarios for more.

Once your camera is enabled, you'll see its feed in the UI. Note that there are unique URLs for snapshots, streaming and embedding.

 Figure 3: Active camera and URLs
When motion detection is triggered, the video frame in the UI will be wrapped in orange-red. You can also hover over the video frame for additional controls such as full screen and immediate access to stored video.

There is an absolute plethora of settings options, the most important of which, after camera configuration, is storage. You can write to local storage or a network share; this quickly matters if you choose an always-on scenario versus motion-triggered recording.
 Figure 4: Configure file storage
You can configure text overlay, video streaming, still images, schedules, and more.
 Figure 5: Options, options, options
The most important variable of all is how you want to be notified.
There are configuration options that allow you to run commands, so you can script up a preferred process or use one already devised.
 Figure 6: Run a command for notification
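For instance, the command motionEyeOS runs on motion detection could simply fire a webhook. This is an illustrative sketch, not a command from the article; the IFTTT event name `motion_detected` and the `IFTTT_KEY` value are placeholders you would replace with your own:

```shell
# Hypothetical notification hook: trigger an IFTTT Maker Webhooks event
# each time motionEyeOS detects motion on this camera.
curl -s -X POST \
  "https://maker.ifttt.com/trigger/motion_detected/with/key/${IFTTT_KEY}" \
  -d "value1=Camera1"
```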

Best of all, you can make use of a variety of notification services including email, as well as Pushover and IFTTT via Web Hooks.
There is an outstanding article on using Pushover and IFTTT on Pi Supply's Maker Zone. It makes it easy to leverage such services even if you haven't done so before.
The net result, after easy installation and a little bit of configuration, is your own motion-enabled CCTV system that costs very little compared to its commercial counterparts.
 Figure 8: Your author entering his office under the watchful eye of Camera1
Purists will find image quality a bit lacking perhaps, but with the right camera you can use Fast Network Camera. Do be aware of the drawbacks though (lost functionality).

In closing, I love this project. Kudos to Calin Crisan for this project. Makers and absolute beginners alike can easily create a great motion enabled video/still camera setup, or a network of managed cameras with always on video. The hardware is inexpensive and readily available. If you've not explored Raspberry Pi this is a great way to get started. If you're looking for a totally viable security video monitoring implementation, motionEyeOS and your favorite IoT hardware (the project supports other boards too) are a perfect combo. Remember too that there are Raspberry Pi board-specific camera modules available.

Ping me via email or Twitter if you have questions (russ at holisticinfosec dot org or @holisticinfosec).
Cheers…until next time.

## September 15, 2016

#### Discounts on The Practice books until Sept 19!

Pearson is doing their annual "Back to Business" sale until Monday, September 19. You can save 35-45%, which is a big deal IMHO.

The Practice of Cloud System Administration is 35% off, or 45% off if you buy 2 copies. Buy one for yourself and get a copy for a friend for their birthday. Just use this link to receive the discount.

You can also get a copy of the Cloud Administration book plus the new 3rd Edition of The Practice of System and Network Administration (when it ships in November) and save 45% if you use this link and enter offer code "B2B".

These offers end on Monday. Act now!

(NOTE: I'm not saying our books are great birthday gifts. However think about it this way. What's the worst that could happen? If your friend is offended and never talks to you again at least you'll have more time to read your book. Isn't that what you wanted in the first place?)

### ma.ttias.be

#### Varnish Cache 5.0 Released

The post Varnish Cache 5.0 Released appeared first on ma.ttias.be.

This just got posted to the varnish-announce mailing list.

We have just released Varnish Cache 5.0:

http://varnish-cache.org/docs/5.0/whats-new/relnote-5.0.html

This release comes almost exactly 10 years after Varnish Cache 1.0,
and also marks 10 years without any significant security incidents.

Next major release (either 5.1 or 6.0) will happen on March 15 2017.

Enjoy!

---
Poul-Henning Kamp

Lots of changes (HTTP/2, Shard director, ban lurker improvements, ...) and info on upgrading to Varnish 5!

I'll be working on updated configs for Varnish 5 (as I did for Varnish 4) as soon as I find some time for it.


## September 13, 2016

### Raymii.org

#### Mouse movement via the keyboard with xdotool and xbindkeys

I had a request from a friend to figure out how she could use her mouse via the keyboard. Normally you would use Mouse Keys, but she uses a Kinesis Freestyle2 keyboard which has no numpad. By using xbindkeys together with xdotool we can use our own key combinations to move the mouse cursor, in any window manager.
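As a sketch of what such a setup can look like (the exact key combinations and the 15-pixel step are my assumptions, not necessarily what she settled on), a `~/.xbindkeysrc` can map keys to xdotool commands:

```
# ~/.xbindkeysrc -- illustrative bindings; run `xbindkeys` after editing.
# Ctrl+Alt+arrows nudge the pointer, Ctrl+Alt+space left-clicks.
"xdotool mousemove_relative -- -15 0"
    control+alt + Left
"xdotool mousemove_relative -- 15 0"
    control+alt + Right
"xdotool mousemove_relative -- 0 -15"
    control+alt + Up
"xdotool mousemove_relative -- 0 15"
    control+alt + Down
"xdotool click 1"
    control+alt + space
```

Arrow keys are already claimed by many applications, so part of the per-keyboard tuning is picking combinations that don't collide.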

## September 12, 2016

### Steve Kemp's Blog

#### If your code accepts URIs as input..

There are many online sites that accept reading input from remote locations. For example a site might try to extract all the text from a webpage, or show you the HTTP-headers a given server sends back in response to a request.

If you run such a site you must make sure you validate the scheme of the URI you're given - also remembering to do that if you're sent any HTTP-redirects.

Really the issue here is a confusion between `URL` & `URI`.

The only time I ever communicated with Aaron Swartz was unfortunately after his death, because I didn't make the connection. I randomly stumbled upon the `html2text` software he put together, which had an online demo containing a form for entering a location. I tried the obvious input:

```
file:///etc/passwd
```

The software was vulnerable, read the file, and showed it to me.

The site gives errors on all inputs now, so it cannot be used to demonstrate the problem, but on Friday I saw another site on Hacker News with the very same input-issue, and it reminded me that there's a very real class of security problems here.

The site in question was http://fuckyeahmarkdown.com/ and allows you to enter a URL to convert to markdown - I found this via the hacker news submission.

The following link shows the contents of `/etc/hosts`, and demonstrates the problem:

`http://fuckyeahmarkdown.example.com/go/?u=file:///etc/hosts&read=1&preview=1&showframe=0&submit=go`

The output looked like this:

```
..
127.0.0.1 localhost
::1 localhost
fe80::1%lo0 localhost
127.0.0.1 stage
127.0.0.1 files
127.0.0.1 brettt..
..
```

In the actual output of '/etc/passwd' all newlines had been stripped. (Which I now recognize as being an artifact of the markdown processing.)

UPDATE: The problem is fixed now.
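The scheme check itself is tiny. A minimal sketch in Python (my choice of language; the `is_safe_url` helper and its whitelist are illustrative, not from any of the sites discussed):

```python
from urllib.parse import urlparse

# Whitelist only what the fetcher genuinely needs; everything else
# (file://, ftp://, gopher://, ...) is rejected.
ALLOWED_SCHEMES = {"http", "https"}

def is_safe_url(url):
    """Accept only http(s) URLs that name a host."""
    parsed = urlparse(url)
    return parsed.scheme.lower() in ALLOWED_SCHEMES and bool(parsed.netloc)

# The same test must be re-applied to every Location: header before
# following it, otherwise a redirect re-opens the hole.
```

Note that most HTTP libraries follow redirects automatically, which silently bypasses a check like this; redirects have to be handled manually for the re-check to happen.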

### Errata Security

#### What's the testimonial of passwords?

In this case described by Orin Kerr, the judge asks whether entering a password has any testimonial value other than "I know the password". Well, rather a lot. A password is content. While it's a foregone conclusion that this encrypted drive here in this case belongs to the suspect, the password may unlock other things that currently cannot be tied to the suspect. Maybe the courts have an answer to this problem, but in case they haven't, I thought I'd address this from a computer-science point of view.

Firstly, we have to address the phrasing of entering a password, rather than disclosing the password. Clearly, the court is interested in only the content of the disk drive the password decrypts, and uninterested in the password itself. Yet, entering a password is the same as disclosing it. Technically, there's no way to enter a password in such a way that it can't be recorded. I don't know the law here, and whether courts would protect this disclosure, but for the purposes of this blog post, "entering" is treated the same as "disclosing".

Passwords have content. This post focuses on one real, concrete example, but let's consider some hypothetical cases first.

As is well-known, people often choose the birth dates of their children as the basis for passwords. Imagine a man has a password "emily97513" -- and that he has an illegitimate child named "Emily" who was born on May 13, 1997. Such a password would be strong evidence in a paternity suit.

As is well-known, people base passwords on sports teams. Imagine a password is "GoBears2017", strong evidence the person is a fan of the Chicago Bears, despite testimony in some case that he's never been to Chicago.

Lastly, consider a password "JimmyHoffaDieDieDie" in a court case where somebody is suspected of having killed Jimmy Hoffa.

But these are hypotheticals; now let's consider a real situation with passwords. Namely, good passwords are unique. By unique we mean that good passwords are chosen such that they are so strange that nobody else would ever have chosen that password.

For example, Wikileaks published many "insurance" files -- encrypted files containing leaks that nobody could decrypt. This allowed many people to mirror leak data without actually knowing the contents of the leaks. In a book on Wikileaks, the Guardian inadvertently disclosed that the password to the Manning leaks was ACollectionOfDiplomaticHistorySince_1966_ToThe_PresentDay#. It was then a simple matter of attempting to decrypt the many Wikileaks insurance files until the right one was found.

In other words, the content of the password was used to discover the files it applied to.

Another example is password leaks. Major sites like LinkedIn regularly get hacked and have account details dumped on the Internet. Sites like HaveIBeenPwned.com track such leaks. Given a password, it's possible to search these dumps for corresponding email addresses. Thus, hypothetically, once law enforcement knows a person's password, they can then search for email accounts the user might hold that they might not previously have known about.
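Such a search is mechanically trivial. A minimal Python sketch -- the dump format (one email:password pair per line) and the sample data are invented for illustration:

```python
# Hypothetical sketch: given a known password, find which accounts in a
# leaked "email:password" credential dump used it. The file format and
# sample data below are assumptions, not from any real dump.
def accounts_using(password, dump_lines):
    hits = []
    for line in dump_lines:
        email, sep, leaked = line.strip().partition(":")
        if sep and leaked == password:
            hits.append(email)
    return hits

dump = [
    "alice@example.com:kaJVD7VqcR",
    "bob@example.com:hunter2",
]
print(accounts_using("kaJVD7VqcR", dump))  # ['alice@example.com']
```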

Statistically, passwords are even more unique (sic) than fingerprints, DNA testing, and other things police regularly rely upon (though often erroneously) as being "unique". Consider the password kaJVD7VqcR. While it's only 10 characters long, it's completely unique. I just googled it to make sure -- and got zero hits. The chances of another random 10-character password matching this one are about one in 10^18. In other words, if a billion people each chose a billion random passwords, only then would you have a chance that somebody would pick this same random password.
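The arithmetic behind that claim is easy to check. A back-of-the-envelope sketch in Python, assuming (as an illustration) a 62-symbol alphanumeric alphabet:

```python
# Rough keyspace estimate for a random 10-character password drawn from
# upper/lower-case letters and digits (62 symbols). The alphabet is an
# illustrative assumption, not a claim about the quoted password.
ALPHABET = 26 + 26 + 10   # 62 symbols
LENGTH = 10

keyspace = ALPHABET ** LENGTH
print(f"{keyspace:.2e}")  # ~8.39e17, i.e. on the order of 10^18

# Probability that one random pick matches a specific password:
match_probability = 1 / keyspace
```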

Thus consider the case where the court forces a child porn suspect to enter the password in order to decrypt a TrueCrypt encrypted drive found in his house. The court may consider it a foregone conclusion that the drive is his, and thus Fifth Amendment protections may not apply. However, the content of the password is itself testimonial about all sorts of other things. For example, maybe child pornographers swap drives, so law enforcement tests this password against all other encrypted drives in their possession. They then test this password against all user account information in their possession, such as hidden Tor forums or public LinkedIn-style account dumps. The suspect's unique password is testimonial about all these other things which, before the disclosure of the password, could not be tied to the suspect.

### Raymii.org

#### IPSEC VPN on Ubuntu 16.04 with StrongSwan

This is a guide on setting up an IPSEC VPN server on Ubuntu 16.04 using StrongSwan as the IPsec server and for authentication. It has a detailed explanation of every step. We choose the IPSEC protocol stack because of vulnerabilities found in pptpd VPNs and because it is supported on all recent operating systems by default. More than ever, your freedom and privacy when online is under threat. Governments and ISPs want to control what you can and can't see while keeping a record of everything you do, and even the shady-looking guy lurking around your coffee shop or the airport gate can grab your bank details more easily than you may think. A self-hosted VPN lets you surf the web the way it was intended: anonymously and without oversight.

## September 10, 2016

### HolisticInfoSec.org

#### Best toolsmith tool of the last ten years

As we celebrate Ten Years of Toolsmith and 120 individual tools covered in detail with the attention they deserve, I thought it'd be revealing to see who comes to the very top of the list for readers/voters.
I've built a poll from the last eight Toolsmith Tools of the Year to help you decide, and it's a hell of a list.
Amazing, right? The best of the best.

You can vote in the poll to your right; it'll be open for two weeks.

## September 09, 2016

### Sarah Allen

#### five dysfunctions of a team

The Five Dysfunctions of a Team by Patrick Lencioni tells the story of a fictional team leading a believable tech company. It illustrates how we bring our humanity and our flaws to work; even good leaders who contribute exceptionally well can still fail if they can’t collaborate effectively on a shared goal.

The storytelling was particularly good at showing how it can be so hard to facilitate change. Everyone won’t be in the same place in the same moment. Someone gets it, while others are struggling. The best teams have a culture of helping each other, and we need to create that culture intentionally.

I tend to prefer non-fiction that conveys research when I read books about work, but decided to read this one after it was recommended twice in one week (thx Luna Comerford and Adria Richards). I found this corporate fable to be compelling and thought-provoking. The audio book was particularly entertaining, and I was surprised to finish it in just a day, mostly during dog walking, muni rides, and evening errands.

I enjoyed the book and agree with the five dysfunctions presented as a pyramid that illustrates how each weakness leads to the next. In thinking through my reflections on the book, I found it helpful to review Abi Noda’s book notes. I liked his re-creation of the pyramid of dysfunctions, along with annotations, which inspired me to create my own alternate annotated pyramid:

At the top of the pyramid, the book talks about “inattention to results,” though the discussion is really about focusing on the right results, which are based on shared priorities and shared decisions. Individuals on the executive team might inadvertently jeopardize the company’s success because of their own status and ego. No one wants the company to fail because of their own functional area, and thus, well-meaning people can become defensive and fail to deliver on shared goals because they are too focused on their specific deliverable. Certainly we all need to deliver great results within our area of expertise, but that must be in support of a primary shared goal.

At the base of the pyramid is absence of trust. The book had an interesting anecdote on how “politics” is common on teams when people don’t trust each other. People use that word to describe different behaviors, but the CEO’s definition works well in this setting: “Politics is when people choose their words and actions based on how they want others to react rather than based on what they really think.” Working in an environment where we can just say what we think, trusting that people will believe we have good intentions and are working toward our shared goals, makes everything go faster and leads to better solutions.

I felt like I wanted to remember the functioning team and the positive state we’re striving for when we work well together. As the story progressed, I imagined the opposite pyramid. At its base is psychological safety, which allows people to be vulnerable; this creates an environment where people can speak up, allowing for and supporting healthy conflict. If people can have productive discussions and respectful arguments, the team can make decisions that each individual buys into. People hold each other accountable, but also support each other and collaborate, consistently seeking alignment on delivering on shared priorities. Working as a team requires repeated course corrections. We need to make time to help each other, and if we’re all focused on different things, we just won’t have time to do it all. If we’re all focused on the same goal, supporting parts of the same deliverable, then collaboration helps each team member deliver on their individual work and the shared goal.

One key element to making good teams work well is what I call a shared understanding of reality. In the book, the new CEO starts each day of their executive offsite by saying: “We have more money, better technology, more talented and experienced executives, and yet we’re behind our competitors.” She was creating a shared understanding of reality. She was making it clear to the team that she believed in them and in the business opportunity. A lot of what we do when we create alignment on a team is to build that shared reality. Shared goals are built on shared information and shared conclusions, which are based on shared values.

There are lots of ways to think about how to create a healthy team. I like how Abi Noda reflects on the pyramid of dysfunctions presented in this book: “Teamwork deteriorates if even a single dysfunction is allowed to flourish, like a chain with just one link broken.”

The post five dysfunctions of a team appeared first on the evolving ultrasaurus.

### Sean's IT Blog

#### Horizon 7.0 Part 10–Building Your Desktop Golden Images

A virtual desktop environment is nothing without virtual desktops.  Poorly performing virtual desktops and applications, or virtual desktops and remote desktop session hosts that aren’t configured properly for the applications that are being deployed, can turn users off to modern end user computing solutions and sink the project.

How you configure your desktop base image can depend on the type of desktop pools that you plan to deploy.  The type of desktop pools that you deploy can depend on the applications and how you intend to deploy them.  This part will cover how to configure a desktop base image for non-persistent desktop pools, and the next part in this series will cover how to set up both linked and instant clone desktop pools.

###### Before You Begin, Understand Your Applications

Before we begin talking about how to configure the desktop base image and setting up the desktop pools, it’s very important to understand the applications that you will be deploying to your virtual desktops.  The types of applications and how they can be deployed will determine the types of desktop pools that can be used.

A few factors to keep in mind are:

• Application Delivery – How are the applications going to be delivered to the desktop or RDSH host?
• User Installed Applications – Will users be able to install their own applications?  If so, how are applications managed on the desktop?
• User Profiles – How are the user profiles and settings being managed?  Is there any application data or setting that you want to manage or make portable across platforms?
• Licensing – How are the applications licensed?  Are the licenses locked to the computer in some way, such as by computer name or MAC address?  Is a hardware key required?
• Hardware – Does the application require specific hardware in order to function, or does it have high resource requirements?  This is usually a consideration for high-end CAD or engineering applications that require a 3D card, but it could also apply to applications that need older hardware or access to a serial port.

Application management and delivery has changed significantly since I wrote the Horizon 6.0 series.  When that series was written, VMware had just purchased Cloud Volumes, and it hadn’t been added into the product suite.  Today, App Volumes is available in the Horizon Suite Enterprise SKU, and it provides application layering capabilities in Horizon.  Application layering allows administrators to place applications into virtual disk files that get attached at logon, and this allows you to create a single master golden image that has applications added when the user logs in.  If you don’t have the Horizon Suite Enterprise SKU, there are a few other players in the application layering space such as Liquidware Labs FlexApp and Unidesk, and these tools also provide the ability to abstract your applications from the underlying operating system.

Application layering isn’t the only delivery mechanism.  App Virtualization, using tools like ThinApp, Microsoft AppV, or Turbo, is one option for providing isolated applications.  Reverse layering has all applications installed into the golden template, and applications are exposed on a per-user basis. This is the concept behind tools like FSLogix.  Publishing applications to virtual desktops using XenApp or Horizon Published Applications is an option that places the applications on a centralized server, or you could just install some or all of your applications into the golden image and manage them with tools like Altiris or SCCM.

All of these options are valid ways to deliver applications to virtual desktops, and you need to decide on which methods you will use when designing your desktop golden images and desktop pools.  There may not be a single solution for delivering all of your applications, and you may need to rely on multiple methods to meet the needs of your users.

###### Supported Desktop Operating Systems

Horizon 7.0 supports desktops running Windows and Linux.  The versions of Windows that are supported for full clone and linked clone desktops are:

• Windows 10 Enterprise (including the Long Term Servicing Branch and Anniversary Update in Horizon 7.0.2)
• Windows 8.1 Enterprise or Professional
• Windows 8 Enterprise or Professional
• Windows 7 SP1 Enterprise or Professional
• Windows Server 2008 R2 (RDSH and Server-based Desktop)
• Windows Server 2012 R2 (RDSH and Server-based Desktop)

In order to run desktops on Windows Server-based OSes, you need to enable the “Enable Windows Server desktops” setting under View Configuration –> Global Settings and install the Desktop Experience feature after installing the OS.  There are some benefits to using Windows Server for your desktop OS including avoiding the Microsoft VDA tax on desktop VDI.  The choice to use a server OS vs. a desktop OS must be weighed carefully, however, as this can impact management and application compatibility.

Instant clone desktops are supported on the following operating systems:

• Windows 10 Enterprise
• Windows 7 SP1 Enterprise or Professional

The Horizon Linux agent is supported on the following 64-bit versions:

• Ubuntu 14.04 (note: VMware recommends disabling Compiz due to performance issues)
• Ubuntu 12.04
• RHEL and CentOS 6.6
• RHEL and CentOS 7.2
• NeoKylin 6 Update 1
• SLES 11 SP3/SP4
• SLES 12 SP1

The Linux component supports both full clone and linked clone desktops in Horizon 7.0.1.  However, there are a number of additional requirements for Linux desktops, so I would recommend reading the Setting Up Horizon 7 Version 7.0.1 for Linux Desktops guide.

For this part, we’re going to assume that we’re building a template running a desktop version of Windows.  This will be more of a high-level overview of creating a desktop template for Horizon, and I won’t be doing a step-by-step walkthrough of any of the steps for this section.  Once the desktop image is set up, I’ll cover some of the ways to optimize the desktop templates.

###### Configure the VM

Building a desktop VM isn’t much different than building a server VM.  The basic process is create the VM, configure the hardware, install the operating system, and then install your applications.  Although there are a few additional steps, building a desktop VM doesn’t deviate from this.

You should base the number of vCPUs and the amount of RAM assigned to your virtual desktops on the requirements of the applications that you plan to run, and fine-tune based on user performance and resource utilization.   Horizon doesn’t allow you to set the CPU and RAM allocation when deploying desktop pools, so these need to be set on the template itself.

The recommended hardware for a virtual desktop is:

• SCSI Controller – LSI SAS
• Hard Disk – At least 40GB Thin Provisioned
• NIC – VMXNET3
• Remove Floppy Drive, and disable parallel and serial ports in BIOS
• Remove the CD-ROM drive if you do not have an alternative method for installing Windows.

Note: You cannot remove the CD-ROM drive until after Windows has been installed if you are installing from an ISO.

BIOS screen for disabling Serial and Parallel ports and floppy controller

You’ll notice that I didn’t put minimums for vCPUs and RAM.  Sizing these really depends on the requirements of your users’ applications.  I’ve had Windows 7 64-bit desktops deployed with as little as 1GB of RAM for general office workers and up to 4GB of RAM for users running the Adobe Suite.  Generally speaking, customers are deploying knowledge or task worker desktops with at least 2 vCPUs and between 2 and 4 GB of RAM; however, the actual sizing depends on your applications.

###### Install Windows

After you have created a VM and configured the VM’s settings, you need to install Windows.  Again, it’s not much different than installing Windows Server into a VM or installing a fresh copy of Windows onto physical hardware.  You can install Windows using the ISO of the disk or by using the Microsoft Deployment Toolkit and PXE boot to push down an image that you’ve already created.

When installing Windows for your desktop template, you’ll want to make sure that the default 100 MB system partition is not created.  This partition is used by Windows to store the files used for Bitlocker.  Since Bitlocker is not supported on virtual machines by either Microsoft or VMware, there is no reason to create this partition.  This will require bypassing the installer and manually partitioning the boot drive.  The steps for doing this when installing from the DVD/ISO are:

1. Boot the computer to the installer
2. Press Shift-F10 to bring up the command prompt
3. Type DiskPart
4. Type Select Disk 0
5. Type Create Partition Primary
6. Type Exit twice.

Once you’ve set up the partition, you can install Windows normally.  If you’re using something like the Microsoft Deployment Toolkit, you will need to configure your answer file to set up the proper hard drive partition configuration.

###### Install VMware Tools and Join the Template to a Domain

After you have installed Windows, you will need to install the VMware tools package.  The tools package is required to install the View Agent.  VMware Tools also includes the VMXNET3 driver, and your template will not have network access until this is installed.   The typical installation is generally all that you will need unless you’re using Guest Introspection as part of  NSX or your antivirus solution.

After you have installed VMware Tools and rebooted the template, you should join it to your Active Directory domain.  The template doesn’t need to be joined to a domain, but doing so makes it easier to manage the template and to install software from network shares.  I’ve also heard of recommendations to remove the computer from the domain before deploying desktop pools, but this is optional.  I’ve never removed the machines from the domain before provisioning, and I haven’t experienced any issues.

###### Install The Horizon Agent

After you have installed the VMware tools package and joined your computer to the domain, you will need to install the VMware Horizon Agent.  There are a number of new features in the Horizon 7 Agent install, and not all features are enabled by default.  Be careful when enabling or disabling features as this can have security implications.

One thing to note about the Horizon 7 agent is that there is a Composer component and an Instant Clones component.  These items cannot be installed together.  A desktop template can only be used for Linked Clones or Instant Clones.

###### Installing Applications on the Template

After you install the Horizon Agent, you can begin to install the applications that your users will need when they log into Horizon View.

With tools like Thinapp available to virtualize Windows applications or layering software like FlexApp, Unidesk and App Volumes, it is not necessary to install all of your applications in your template or to create templates for all of the different application combinations.  You can create a base template with your common applications that all users receive and then either virtualize or layer your other applications so they can be delivered on demand.

###### “Finalizing” the Image

Once you have the applications installed, it is time to finalize the image to prepare it for Horizon.  This step involves disabling unneeded services and making configuration settings changes to ensure a good user experience.   This may also involve running antivirus or other malware scans to ensure that only new or changed files are scanned after the image is deployed (Symantec…I’m looking at you for this one).

VMware has released a white paper that covers how to optimize a Windows-based virtual desktop or RDSH server.  Previous versions of this white paper have focused on making changes using a PowerShell or batch script.   VMware has also released a fling, the OS Optimization Tool, with a graphical interface that can simplify the optimization process.  Whenever possible, I recommend using the fling to optimize virtual desktop templates.  It not only provides an easy way to select which settings to apply, but it contains templates for different operating systems.  It also provides a way to log which changes are made and to roll back unwanted changes.

Prior to optimizing your desktops, I recommend taking a snapshot of the virtual machine.  This provides a quick way to roll back to a clean slate.  I recommend applying most of the defaults, but I also recommend reading through each change to understand what changes are being made.  I do not recommend disabling the Windows Firewall at all, and I don’t recommend disabling Windows Update as this can be controlled by Group Policy.

Before you shut the virtual machine down to snapshot it, verify that any services required for applications are enabled.  This includes the Windows Firewall service which is required for the Horizon Agent to function properly.

###### Shutdown and Snapshot

After you have your applications installed, you need to shut down your desktop template and take a snapshot of it.  If you are using linked clones, the linked clone replica will be based on the snapshot you select.

That’s a quick rundown of setting up a desktop template to be used with Horizon desktops.

In the next part of this series, I’ll cover how to create desktop pools.

## September 07, 2016

### Sean's IT Blog

#### Horizon 7.0 Part 9–Configuring Horizon for the First Time

Now that the Connection Server and Composer are installed, it’s time to configure the components to actually work together with vCenter to provision and manage desktop pools.

###### Logging into the Horizon Administrator

Before anything can be configured, though, we need to first log into the Horizon Administrator management interface.  This management interface is based on the Adobe Flex platform, so Flash will need to be installed on any endpoints you use to administer the environment.

The web browsers that VMware currently supports, with Adobe Flash 10.1 or later installed, are:

• Internet Explorer 9-11
• Firefox
• Chrome
• Safari 6
• Microsoft Edge

2. Navigate to https://<FQDN of connection server>/admin

3. Log in with the Administrator Account you designated (or with an account that is a member of the administrator group you selected) when you installed the Connection Server.

Note:  The license keys are retrieved from your MyVMware site.  If you do not input a license key, you will not be able to connect to desktops or published applications after they are provisioned.  You can add or change a license key later under View Configuration –> Product Licensing and Usage.

###### Configuring Horizon for the First Time

Once you’ve logged in and configured your license, you can start setting up the Horizon environment.  In this step, the Connection Server will be configured to talk to vCenter and Composer.

1.   Expand View Configuration and select Servers.

2.  Select the vCenter Servers tab and select Add…

3. Enter your vCenter server information.  The service account that you use in this section should be the vCenter Service Account that you created in Part 6.

Note: If you are using vCenter 5.5 or later, the username should be entered in User Principal Name format – username@fqdn.

4. If you have not updated the certificates on your vCenter Server, you will receive an Invalid Certificate Warning.  Click View Certificate to view and accept the certificate.

5.  Select the View Composer option that you plan to use with this vCenter.  The options are:

A. Do not use View Composer – View Composer and Linked Clones will not be available for desktop pools that use this vCenter.

B. View Composer is co-installed with vCenter Server – View Composer is installed on the vCenter Server, and the vCenter Server credentials entered on the previous screen will be used for connecting.  This option is only available with the Windows vCenter Server.

C. Standalone View Composer Server – View Composer is installed on a standalone Windows Server, and credentials will be required to connect to the Composer instance.  This option will work with both the Windows vCenter Server and the vCenter Server virtual appliance.

Note: The account credentials used to connect to the View Composer server must have local administrator rights on the machine where Composer is installed.  If the account does not have local administrator rights, you will get an error that you cannot connect.

6. If Composer is using an untrusted SSL certificate, you will receive a prompt that the certificate is invalid.  Click View Certificate and then accept.

7. The next step is to set up the Active Directory domains that Composer will connect to when provisioning desktops.  Click Add to add a new domain.

8. Enter the domain name, user account with rights to Active Directory, and the password and click OK.  The user account used for this step should be the account that was set up in Part 6.

Once all the domains have been added, click Next to continue.

9. The next step is to configure the advanced storage settings used by Horizon.  The two options to select on this screen are:

• Reclaim VM Disk Space – Allows Horizon to reclaim disk space allocated to linked-clone virtual machines.
• Enable View Storage Accelerator – View Storage Accelerator is a RAM-based read cache that can be used to offload some storage requests to the local host.  Regenerating the cache can impact IO operations on the storage array, so maintenance blackout windows can be configured to keep cache regeneration out of busy periods.  The max cache size is 2GB.

10. Review the settings and click finish.

###### Configuring the Horizon Events Database

The last thing that we need to configure is the Horizon Events Database.  As the name implies, the Events Database is a repository for events that happen with the View environment.  Some examples of events that are recorded include logon and logoff activity and Composer errors.

Part 6 described the steps for creating the database and the database user account.

1. In the View Configuration section, select Event Configuration.

2. In the Event Database section, click Edit.

3. Enter the following information to set up the connection:

• Database Server (if not installed to the default instance, enter as servername\instance)
• Database Type
• Port
• Database name
• Table Prefix (not needed unless you have multiple Connection Server environments that use the same events database – i.e. large “pod” environments)

Note: The only SQL Server instance that uses port 1433 is the default instance.  Named instances use dynamic port assignment that assigns a random port number to the service upon startup.  If the Events database is installed to a named instance, it will need to have a static port number.  You can set up SQL Server to listen on a static port by using this TechNet article.  For the above example, I assigned the port 1433 to the Composer instance since I will not have a named instance on that server.

If you do not configure a static port assignment and try to connect to a named instance on port 1433, you may receive the error below.
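Before filling in the form, one quick sanity check is a plain TCP probe against the port you plan to enter. This sketch assumes nothing about Horizon or SQL Server beyond TCP reachability, and the hostname below is a placeholder:

```python
# Probe whether the database server hosting the events database is
# actually listening on the port you entered. A refused or timed-out
# connection usually means the named instance has no static port set.
import socket

def port_open(host, port, timeout=3):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# "sqlserver.domain.local" is a placeholder for your database server.
print(port_open("sqlserver.domain.local", 1433))
```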

5. If setup is successful, you should see a screen similar to the one below.  At this point, you can change your event retention settings by editing the event settings.

## September 06, 2016

### Sean's IT Blog

#### Horizon 7.0 Part 8 – Installing The First Connection Server

Connection Servers are one of the most important components in any Horizon environment, and they come in three flavors – the standard Connection Server, the Replica Connection Server, and the Security Server.

You may have noticed that I listed two types of connection servers.  The Standard and Replica Connection Servers have the same feature set, and the only difference between the two is that the standard connection server is the first server installed in the pod.  Both connection server types handle multiple roles in the Horizon infrastructure: they handle primary user authentication against Active Directory, manage desktop pools, provide a portal to access desktop pools and published applications, and broker connections to desktops, terminal servers, and applications.  The connection server’s analog in Citrix environments would be a combination of Storefront and the Delivery Controller.

The Security Server is a stripped down version of the regular Connection Server designed to provide secure remote access.  It is designed to operate in a DMZ network and tunnel connections back to the Connection server, and it must be paired with a specific Connection Server in order for the installation to complete successfully.  Unlike previous versions of this walkthrough, I won’t be focusing on the Security Server in the remote access section as VMware now provides better tools.

###### Installing the First Connection Server

Before you can begin installing Horizon View, you will need to have a server prepared that meets the minimum requirements for the Horizon View Connection Server instance.  The basic requirements, which are described in Part 2, are a server running Windows Server 2008 R2 or Server 2012 R2 with 2 CPUs and at least 4GB of RAM.

Note:  If you are going to have more than 50 virtual desktop sessions on a Connection Server, it should be provisioned with at least 10GB of RAM.

Once the server is provisioned, and the Connection Server installer has been copied over, the steps for configuring the first Connection Server are:

1. Launch the Connection Server installation wizard by double-clicking on VMware-viewconnectionserver-x86_64-7.x.x-xxxxxxx.exe.

2. Click Next on the first screen to continue.

3.  Accept the license agreement and click Next to continue.

4.  If required, change the location where the Connection Server files will be installed and click Next.

5. Select the type of Connection Server that you’ll be installing.  For this section, we’ll select the Horizon 7 Standard Server.  If you plan on allowing access to desktops through an HTML5 compatible web browser, select “Install HTML Access.”  Select the IP protocol that will be used to configure the Horizon environment.  Click Next to continue.

6. Enter a strong password for data recovery.  This will be used if you need to restore the Connection Server’s LDAP database from backup.  Make sure you store this password in a secure place.  You can also enter a password reminder or hint, but this is not required.

7. Horizon View requires a number of ports to be opened on the local Windows Server firewall, and the installer will prompt you to configure these ports as part of the installation.  Select the “Configure Windows Firewall Automatically” option to have this done as part of the installation.

Note: Disabling the Windows Firewall is not recommended.  If you plan to use Security Servers to provide remote access, the Windows Firewall must be enabled on the Connection Servers to use IPSEC to secure communications between the Connection Server and the Security Server.  The Windows Firewall should not be disabled even if Security Servers and IPSEC are not required.

8. The installer will prompt you to select the default Horizon environment administrator.  The options that can be selected are the local server Administrator group, which will grant administrator privileges to all local admins on the server, or to select a specific domain user or group.  The option you select will depend on your environment, your security policies, and/or other requirements.

If you plan to use a specific domain user or group, select the “Authorize a specific domain user or domain group” option and enter the user or group name in the “domainname\usergroupname” format.

Note: If you plan to use a custom domain group as the default Horizon View administrator group, make sure you create it and allow it to replicate before you start the installation.

9. Choose whether you want to participate in the User Experience Improvement program.  If you do not wish to participate, just click Next to continue.

10. Click Install to begin the installation.

11. The installer will install and configure the application and any additional Windows roles or features that are needed to support Horizon View.

12. Once the install completes, click Finish.  You may be prompted to reboot the server after the installation completes.

Now that the Connection Server and Composer are installed, it’s time to begin configuring the Horizon application so the Connection Server can talk to both vCenter and Composer, as well as to set up any required license keys and the events database.  Those steps will be covered in the next part.

## September 05, 2016

### R.I.Pienaar

#### Puppet 4 Sensitive Data Types

You often need to handle sensitive data in manifests when using Puppet. Private keys, passwords, etc. There has not been a native way to deal with these, and so a cottage industry of community tools has sprung up.

To deal with data at rest, various Hiera backends like the popular hiera-eyaml exist; to deal with data on nodes, a rather interesting solution called binford2k-node_encrypt exists. There are many more, but less is more: these are good and widely used.

The problem is that data leaks all over the show in Puppet – diffs, logs, reports, catalogs, PuppetDB – it’s not uncommon for this trusted data to show up all over the place. Dealing with this problem is a huge scope issue that will require adjustments to every component – Puppet, Hiera / Lookup, PuppetDB, etc.

But you have to start somewhere, and Puppet is the right place. Let’s look at the first step.

## Sensitive[T]

Puppet 4.6.0 introduced – and 4.6.1 fixed – a new data type that decorates other data, telling the system it’s sensitive. This data cannot accidentally be logged or leaked, since the type will only return a string indicating it’s redacted.

It’s important to note this is step one of a many-step process towards having a unified, blessed way of dealing with Sensitive data everywhere. But let’s take a quick look. The official specification for this feature lives here.

In the most basic case we can see how to create sensitive data and how it looks when logged or leaked by accident:

```puppet
$secret = Sensitive("My Socrates Note")
notice($secret)
```

This prints out the following:

`Notice: Scope(Class[main]): Sensitive [value redacted]`

```puppet
$secret = Sensitive(hiera("secret"))

$unwrapped = $secret.unwrap |$sensitive| { $sensitive }
notice("Unwrapped: ${unwrapped}")

$secret.unwrap |$sensitive| { notice("Lambda: ${sensitive}") }
```

Here you can see how to assign the unwrapped value to a new variable or just use it in a block. It’s important to note that you should never print these values like this, and ideally you’d only ever use them inside a lambda if you have to use them in .pp code. Puppet has no concept of private variables, so this $unwrapped variable could be accessed from outside of your classes. A lambda scope is temporary and private.

The output of above is:

```
Notice: Scope(Class[main]): Unwrapped: Too Many Secrets
Notice: Scope(Class[main]): Lambda: Too Many Secrets
```

So these are the basic operations, you can now of course pass the data around classes.

```puppet
class mysql (
  Sensitive[String] $root_pass
) {
  # somehow set the password
}

class { "mysql":
  root_pass => Sensitive(hiera("mysql_root")),
}
```

Note here that the class specifically wants a String that is sensitive – and not, say, a Number – using the Sensitive[String] markup. If you attempted to pass Sensitive(1) into it, you’d get a type mismatch error.
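To make the example concrete, here is a sketch (mine, not from the post) of what the class body might do with the wrapped value, unwrapping only inside a lambda as recommended above. The resource and command are hypothetical, and note that interpolating the unwrapped value into a resource parameter puts the cleartext right back into the catalog – exactly the leakage problem this post discusses:

```puppet
# Hypothetical sketch: unwrap the Sensitive value only inside a lambda.
class mysql (
  Sensitive[String] $root_pass,
) {
  $root_pass.unwrap |$plain| {
    # Caution: interpolating $plain into a resource parameter places the
    # cleartext in the catalog, so this only limits exposure in .pp scope.
    exec { 'set-mysql-root-password':
      command => "mysqladmin -u root password '${plain}'",
      path    => ['/usr/bin', '/bin'],
    }
  }
}
```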

## Conclusion

So this appears to be quite handy. You can see that down the line lookup() might have an eyaml-like system and emit Sensitive data directly, and perhaps some providers and types will support this. But as I said, it’s early days, so I doubt this is actually useful yet.

I mentioned how other systems like PuppetDB also need updates before this is useful, and indeed today PuppetDB is oblivious to these types and stores the real values:

```shell
$ puppet query 'resources[parameters] { type = "Class" and title = "Test" }'
...
  {
    "parameters": {
      "string": "My Socrates Note"
    }
  },
...
```

So this really does not serve any purpose yet, but as step one it’s an interesting look at what is to come.

## September 04, 2016

### Carl Chenet

#### Retweet 0.9: Automatically retweet & like

Retweet 0.9, a self-hosted Python app to automatically retweet and like tweets from another user-defined Twitter account, was released on September 2nd.

Retweet 0.9 is already in production for Le Journal du hacker, a French-speaking Hacker News-like website; LinuxJobs.fr, a French-speaking job board; and this very blog.

## What’s the purpose of Retweet?

Let’s face it, it’s more and more difficult to communicate about our projects. Even writing an awesome app is not enough any more. If you don’t appear on a regular basis on social networks, everybody thinks you quit or that the project is stalled.

But what if you have already built an audience on Twitter for, let’s say, your personal account, and now want to automatically retweet and like all tweets from the account of your new project, to push it forward?

Sure, you can do it manually, like in the good old 90’s… or you can use Retweet!

## Twitter Out Of The Browser

Have a look at my Github account for my other Twitter automation tools:

• db2twitter, which pulls data from a SQL database (several are supported), builds tweets and sends them to Twitter
• Twitterwatch, which monitors the activity of your Twitter timeline and warns you if no new tweet appears

What about you? Do you use tools to automate the management of your Twitter account? Feel free to give me feedback in the comments below.

### HolisticInfoSec.org

#### Toolsmith Release Advisory: Kali Linux 2016.2 Release

On the heels of Black Hat and DEF CON, 31 AUG 2016 brought us the second Kali Rolling ISO release, aka Kali 2016.2. This release provides a number of updates for Kali, including:
• New KDE, MATE, LXDE, e17, and Xfce builds for folks who want a desktop environment other than Gnome.
• Kali Linux Weekly ISOs, updated weekly builds of Kali that will be available to download via their mirrors.
• Bug Fixes and OS Improvements such as HTTPS support in busybox now allowing the preseed of Kali installations securely over SSL.
All details available here: https://www.kali.org/news/kali-linux-20162-release/
Thanks to Rob Vandenbrink for calling out this release.

## September 01, 2016

#### TPOSANA launched 15 years ago today!

15 years ago today (or August 24, 2001 depending on who you talk to) the first edition of The Practice of System and Network Administration reached bookstores.

We had been working on the book for 2+ years, having first met during a Usenix conference in 1999. Writing it was quite an experience, especially since this was before voice-chat on the internet was common, and we were on different continents (Christine in London and Tom in New Jersey). We collaborated via email, used CVS for our source code repository, and we had monthly phone calls (which Tom dialed from work... Thanks, Lucent!). At the time collaborating this way was considered quite radical. Most authors emailed chapters back and forth, and had a hell of a time with merge conflicts. Our publisher was amazed at our ability to collaborate so seamlessly. This kind of collaboration is now commonplace.

The book did quite well. We've sold more than 38,000 copies, in 3 editions (2001, 2007, and 2016), and many printings. It is available in softcover, as an ebook, and as a web page. It has been translated into Chinese and Russian. In 2005 we received the Usenix SAGE/LISA Outstanding Achievement Award. The 2nd and 3rd editions added an additional co-author, Strata R. Chalup. Strata's experience and project management skills have been a real asset. We've worked with many editors and other production people at Pearson / Addison-Wesley, starting with Karen Gettman who originally recruited us. Thanks Karen, Catherine, Mark, Debra, Kim, Michael, Julie and many others!

We've had a number of surprises along the way. Our favorite was visiting Google (before Tom worked there) and being shown a supply closet full of copies of the book. It turned out all new Sysops members were issued a copy. Wow!

Most importantly, we've received a lot of fan mail. Hearing how the book helped people is the biggest joy of all.

In November, the 3rd edition will reach bookstores. We're very excited about the new edition. It has over 300 pages of new material. Dozens of new chapters. It is more modern, better organized, and has a lot of great new stories. You can pre-order the book today. You can read drafts online at SBO. Visit the-sysadmin-book.com for more info. (This is not to be confused with the sequel book, The Practice of Cloud System Administration.)

Thank you to everyone who has purchased a copy of The Practice of System and Network Administration. We really appreciate it!

### Carl Chenet

#### In Free Software: Eating What You Cook

In the previous article in this series, Dans le Libre : gratter ses propres démangeaisons (scratching your own itch), we saw why, when a Free Software enthusiast sets out to solve a problem by using or writing a program, it is important to identify the tasks that can be automated, draw up a plan of action, and then look for existing solutions before possibly sitting down to write a program that solves the problem yourself.

Today we will see why it is important to use your own software yourself.

## Usage creates new needs

In doing so, it turned out that using our own program gradually surfaced new needs. The first version of our program solved our main problem, but we quickly wanted to add features to meet our needs even more precisely, in a fully automated way.

Eating what you cook (Eat Your Own Dog Food in English, hence the image below) – in our case, using our own program – helped us better map out what was possible and which future improvements to make to the program.

Eating what you cook (loosely translated from Eat Your Own Dog Food) means making sure your own problem gets solved, and perhaps other users' problems too

One example of this process is db2twitter, an application that pulls data from a SQL database, builds tweets from it, and posts them to the social network Twitter.

Having reached version 0.1 fairly quickly, the need to spot new rows in the SQL database and tweet accordingly soon pushed me to write 0.2. Then the need to change the SQL query behind the database lookups led to a new feature in 0.3. Version 0.4 brought the ability to periodically re-post tweets that had already been sent, and so on up to the current version, 0.6.

db2twitter supports MySQL, but also PostgreSQL, SQLite, and several proprietary databases

With a systematic description of the changes in each new version, the history of the project's needs is easy to follow. Using your own program thus leads to new needs.

## Everything is simple in the early days

In the early days, everything is simple. You identify a new need, you update the sources in your repository, you clone the code from your Gitlab.com account onto your server, one little line in the crontab and presto, it's in production. This is indeed the time to iterate very quickly on your development; you are not yet slowed down by what already exists. But sometimes moving too fast can play tricks on you. A few rules to keep in mind:

• you did not create a source repository at version 0.1 of your project for nothing. Force yourself to publish complete new releases, not just to push new commits to your repository
• systematically open bug reports as soon as you have an idea, even if you close them later because they turn out to be useless. This lets you track the development of your new features and potentially communicate about them, especially to your users or potential users
• when you want to publish a new release, create a tag with the version number, optionally sign it with GPG, then push it to your repository
• on the repository side, use the tag you just created to generate an archive your users can download, which makes life easier for the packagers of the various distributions interested in your program
• don't forget to write down the list of changes the new release brings. Gathered in a file called CHANGELOG, the highlights of your new release can take the form of a file in your sources, or you can fill in the release description on your source-hosting service. Be careful if you use an online service for this, though: if the service shuts down one day, you risk losing the change history going all the way back to your software's creation

The site keepachangelog.com gives very good guidelines for writing your CHANGELOG file

• communicate about your new releases! These announcements are the lifeline of your public existence. That includes a news item on one or more community websites, a post on your blog, messages on social networks, and contacting your packagers in the various Linux and/or BSD distributions if you have any.

While these good practices may seem a bit heavy for a project whose only user is you at first, they give you a release framework that you will absolutely have to stick to in the future, or risk missing releases and being misunderstood or ignored by your prospective users.

An excellent example of this evolution is Retweet, a tool to retweet automatically according to many criteria. From the very first version, a tag was used to mark each release, with the archive and CHANGELOG generated through the web interface of Retweet's Github repository. This way you can provide users with as much information as possible over time, and possibly give chronological landmarks for a bug hunt or to help new contributors.

Github offers tools to generate release archives and track the associated changes, but those tools are unfortunately proprietary. Prefer Gitlab.
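The tag-and-archive workflow described above can be sketched with plain git commands (the project name and version number here are illustrative, not from the post):

```shell
# Tag the release, optionally signing it with GPG, then publish the tag.
git tag -a v0.2 -m "Release 0.2"   # use -s instead of -a for a GPG-signed tag
git push origin v0.2

# Build a downloadable archive from the tag for users and packagers.
git archive --format=tar.gz --prefix=retweet-0.2/ -o retweet-0.2.tar.gz v0.2
```

The archive step is what makes life easy for distribution packagers, since it produces a stable tarball tied to an exact tagged revision.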

## … but with several instances installed in production, things get complicated

You now have several releases under your belt and a dozen instances installed on two or three servers. The time between developing a new feature and getting it into production has grown: you have to code the new feature, publish it, and roll it out on different servers, and therefore in different contexts serving different purposes.

Following the main idea of this article, and still with our example, using your own program created a need for flexibility. Most likely the need for a configuration file made itself felt, to adapt to different setups. A need for storage, whether managing files or a database, was added to your project, to keep information from one run to the next.

Take an example: Feed2tweet, a small project that fetches an RSS feed and sends it to Twitter, is in production on the Journal du hacker but also on LinuxJobs.fr, this blog, and many others today. After a while it therefore became essential to automate the deployment of new releases. It is important to understand that using your own product creates new needs, but once those are met, the new features answering them increase flexibility and ease of use, and so reach new use cases and a wider audience.

LinuxJobs.fr, the job board of the Free Software community, contributes to Free Software and to Feed2tweet

## Eating what you cook leads to automating everything

As we have seen, our little project has grown and become more and more complex. It now covers different use cases. It is getting hard to test it by hand and to cover every use case, and also to guarantee a reliable process for publishing new releases, along with a deployment system that has minimal impact on the various systems now in production.

To protect ourselves from all these risks and keep providing the best service, first to ourselves but now also to our program's various contributors and users, we need to automate the different points above. We will look at various ways of getting there in the next article in this series.

And on your side? Have you improved your own program by spotting new needs while using it regularly? Feel free to talk about it in the comments!

### Anton Chuvakin - Security Warrior

#### Monthly Blog Round-Up – August 2016

Here is my next monthly "Security Warrior" blog round-up of top 5 popular posts/topics this month:
1. “Why No Open Source SIEM, EVER?” contains some of my SIEM thinking from 2009. Is it relevant now? You be the judge.  Succeeding with SIEM requires a lot of work, whether you paid for the software or not. BTW, this post has amazing “staying power” that is hard to explain – I suspect it has to do with people wanting “free stuff” and googling for “open source SIEM” …
2. “New SIEM Whitepaper on Use Cases In-Depth OUT!” (dated 2010) presents a whitepaper on select SIEM use cases described in depth with rules and reports [using now-defunct SIEM product]; also see this SIEM use case in depth and this for a more current list of popular SIEM use cases. Finally, see our 2016 research on security monitoring use cases here!
3. “Simple Log Review Checklist Released!” is often at the top of this list – this aging checklist is still a very useful tool for many people. “On Free Log Management Tools” is a companion to the checklist (updated version).
4. My classic PCI DSS Log Review series is always popular! The series of 18 posts covers a comprehensive log review approach (valid for PCI DSS 3+ as well), useful for building log review processes and procedures, whether regulatory or not. It is also described in more detail in our Log Management book and mentioned in our PCI book (now in its 4th edition!)
5. “SIEM Resourcing or How Much the Friggin’ Thing Would REALLY Cost Me?” is a quick framework for assessing the costs of a SIEM project (well, a program, really) at an organization (much more detail in this paper).
In addition, I’d like to draw your attention to a few recent posts from my Gartner blog [which, BTW, now has about 5X of the traffic of this blog]:

Current research on SOC:
Current research on threat intelligence:
Miscellaneous fun posts:

(see all my published Gartner research here)
Also see my past monthly and annual “Top Popular Blog Posts” – 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015.
Disclaimer: most content at SecurityWarrior blog was written before I joined Gartner on Aug 1, 2011 and is solely my personal view at the time of writing. For my current security blogging, go here.

Previous post in this endless series:

## August 29, 2016

### That grumpy BSD guy

#### The Voicemail Scammers Never Got Past Our OpenBSD Greylisting

We usually don't see much of the scammy spam and malware. But that one time we went looking for them, we found a campaign where our OpenBSD greylisting setup was 100% effective in stopping the miscreants' messages.

On August 23rd and 24th, 2016, a spam campaign was executed with what appears to have been a ransomware payload. I had not noticed anything particularly unusual about the bsdly.net and friends setup that morning, but then Xavier Mertens' post at isc.sans.edu, Voice Message Notifications Deliver Ransomware, caught my attention in the tweetstream, and I decided to have a look.

The first step was, as always, to grep the spamd logs, and sure enough, there were entries with from: addresses of voicemail@ in several of the domains my rigs are somehow involved in handling mail for.

But no message from voicemail@bsdly.net had yet reached any mailbox within my reach at that point. However, a colleague checked the quarantine at one of his private mail servers, and found several messages from voicemail@ aimed at users in his domains.

Dissecting a random sample confirmed that the message came with an attachment with a .wav.zip filename that was actually a somewhat obfuscated bit of javascript, and I take others at their word that this code, if executed on your Microsoft system, would wreak havoc of some sort.

At this point, before I start presenting actual log file evidence, it is probably useful to sketch how the systems here work and interact. The three machines skapet, deliah and portal are all OpenBSD systems that run spamd in greylisting mode, and they sync their spamd data with each other via spamd's own synchronization mechanism.

All of those machines do greytrapping based on the bsdly.net list of spamtraps, and skapet has the additional duty of dumping the contents of its greytrapping-generated blacklist to a downloadable text file once per hour. Any message that makes it past spamd is then fed to a real mail server that performs content filtering before handing the messages over to a user's mailbox or, in the case of domains we only do the filtering for, forwarding the message to the target domain's mail server.

The results of several rounds of 'grep voicemail $logfile' over the three spamd machines are collected here, or, with the relatively uninteresting "queueing deletion of ..." messages removed, here.

From those sources we can see that there were a total of 386 hosts that attempted delivery, to a total of 396 host and target email pairs (annotated here in a .csv file with geographic origin according to whois).
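As a rough illustration of the counting involved, a pipeline like this tallies unique client addresses from a spamd-style log (the field position is my assumption for a generic syslog layout – real spamd log lines vary, so adjust the awk field to match your logs):

```shell
# Unique client addresses that tried to deliver as voicemail@; field 6 is
# assumed to hold the client address in this syslog-style layout.
grep 'voicemail@' spamd.log | awk '{print $6}' | sort -u | wc -l
```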

The interesting part came when I started looking at the mail server logs to see how many had reached the content filtering or had even been passed on in the direction of users' mailboxes.

There were none.

The number of messages purportedly from voicemail@ in any of the domains we handle that made it even to the content filtering stage was 0.

Zero. Not a single one made it through even to content filtering.

That shouldn't have been a surprise.

After all I've spent significant time over the years telling people how effective greylisting is, and that the OpenBSD spamd version is the best of the breed.

You could take this episode as a recent data point that you are free to refer to in your own marketing pushes if you're doing serious business involving OpenBSD.

And if you're into those things, you will probably be delighted to learn, if you hadn't figured that out already, that a largish subset of the attempted deliveries were to addresses that were already in our published list of spamtrap addresses.

That means our miscreants automatically had themselves added to the list of trapped spammer IP addresses as intended.

If you're interested in how this works and why, I would suggest taking a peek at the OpenBSD web site, and of course I have a book out (available at that link and via better bookstores everywhere) that explains those things as well.

Relevant blog posts of mine include Keep smiling, waste spammers' time, Maintaining A Publicly Available Blacklist - Mechanisms And Principles, In The Name Of Sane Email: Setting Up OpenBSD's spamd(8) With Secondary MXes In Play - A Full Recipe and a few others, including the somewhat lengthy Effective Spam and Malware Countermeasures - Network Noise Reduction Using Free Tools. To fully enjoy the experience of what these articles describe, you may want to get hold of your own CD set from the OpenBSD store.

And again, if you're doing business involving OpenBSD, please head over to the project's donations page and use one or more of the methods there to send the developers some much needed cash.

In addition to the files directly referenced in this article, some related files are  available from this directory. I'll be happy to answer any reasonable queries related to this material.

Good night and good luck.

Update 2016-08-30: I've been getting questions about the currently active campaign that has document@ as its sender. The same story there: I see them in the greylist and spamd logs, no trace whatsoever in later steps. Which means they're not getting anywhere.

Update 2016-09-13: A quick glance at a tail -f'ed spamd log file reveals that today's fake sender of choice is CreditControl@. Otherwise same story as before, no variations. And of course, there may have been dozens I haven't noticed in the meantime.

## August 26, 2016

### ma.ttias.be

#### Podcast: Application Security, Cryptography & PHP

The post Podcast: Application Security, Cryptography & PHP appeared first on ma.ttias.be.

In the latest episode I talk to Scott Arciszewski to discuss all things security: from the OWASP top 10 to cache timing attacks, SQL injection and local/remote file inclusion. We also talk about his secure CMS called Airship, which takes a different approach to over-the-air updates.


## August 24, 2016

### OpenSSL

#### The SWEET32 Issue, CVE-2016-2183

Today, Karthik Bhargavan and Gaetan Leurent from Inria have unveiled a new attack on Triple-DES, SWEET32, Birthday attacks on 64-bit block ciphers in TLS and OpenVPN. It has been assigned CVE-2016-2183.

This post gives a bit of background and describes what OpenSSL is doing. For more details, see their website.

Because DES (and triple-DES) has only a 64-bit block size, birthday attacks are a real concern. With the ability to run Javascript in a browser, it is possible to send enough traffic to cause a collision, and then use that information to recover something like a session Cookie. Their experiments have been able to recover a cookie in under two days. More details are available at their website. But the take-away is this: triple-DES should now be considered as “bad” as RC4.

Triple-DES, which shows up as “DES-CBC3” in an OpenSSL cipher string, is still used on the Web, and major browsers are not yet willing to completely disable it.

If you run a server, you should disable triple-DES. This is generally a configuration issue. If you run an old server that doesn’t support any better ciphers than DES or RC4, you should upgrade.
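For OpenSSL-based servers, that configuration change usually amounts to a one-line cipher-string tweak. As an illustration (the exact string and where it goes depend on your server software, and this is my example rather than an official recommendation), you can preview what a string enables with the `openssl ciphers` utility:

```shell
# Show the suites enabled by a cipher string that excludes triple-DES;
# no DES-CBC3 suite should appear in the output.
openssl ciphers -v 'DEFAULT:!3DES'
```

In, say, an nginx configuration the equivalent would be an `ssl_ciphers 'DEFAULT:!3DES';` directive; the principle is the same for other servers.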

Within the OpenSSL team, we discussed how to classify this, using our security policy, and we decided to rate it LOW. This means that we just pushed the fix into our repositories. Here is what we did:

• For 1.0.2 and 1.0.1, we removed the triple-DES ciphers from the “HIGH” keyword and put them into “MEDIUM.” Note that we did not remove them from the “DEFAULT” keyword.

• For the 1.1.0 release, which we expect to release tomorrow, we will treat triple-DES just like we are treating RC4. It is not compiled by default; you have to use “enable-weak-ssl-ciphers” as a config option. Even when those ciphers are compiled, triple-DES is only in the “MEDIUM” keyword. In addition, because this is a new release, we also removed it from the “DEFAULT” keyword.

When you have a large installed base, it is hard to move forward in a way that will please everyone. Leaving triple-DES in “DEFAULT” for 1.0.x and removing it from 1.1.0 is admittedly a compromise. We hope the changes above make sense, and even if you disagree and you run a server, you can explicitly protect your users through configuration.

Finally, we would like to thank Karthik and Gaetan for reaching out to us and working closely with us to coordinate our releases with their disclosure.

### Geek and Artist - Tech

This is the product of only about five minutes’ worth of thought, so take it with a grain of salt. When it comes to writing maintainable, understandable code, there are as many opinions out there as there are developers. Personally I favour simple, understandable, even “boring” method bodies that don’t try to be flashy or use fancy language features. Method and class names should clearly signal intent and what the thing is or does. And code should (IMHO) include good comments.

This last part is probably the area where I’ve seen the most dissent. For some reason people hate writing comments, and think the code should be “self-documenting”. I’ve rarely, perhaps never, seen this in practice: the intent may have been for the code to be self-documenting, but it never turned out that way.

Recently (and this is related, I promise), I watched a lot of talks (one, in person) and read a lot about the Zalando engineering principles. They base their engineering organisation around three pillars of How, What and Why. I think the same thing can be said for how you should write code and document it:

```ruby
class Widget
  def initialize
    @expires_at = Time.now + 86400
  end

  # Customer X was asking for the ability to expire   #  <--- Why
  # widgets, but some may not have an expiry date or
  # do not expire at all. This method handles these
  # edge cases safely.
  def is_expired?                                     #  <--- What
    !!@expires_at && Time.now > @expires_at           #  <--- How
  end
end
```

This very simple example shows what I mean (in Ruby, since it's flexible and lends itself well to artificial examples like this). The method body itself conveys the How. The method name conveys the intent – What does this do? Ultimately, the How and What can probably never fully explain the history and reasoning behind their own existence, so I find it helpful to accompany them with the Why in a method comment (which could just as well sit inside the method, or be distributed across it – the placement isn't really important).

You could argue that history and reasoning for having the method can be determined from version control history. This turns coding from what should be a straightforward exercise into some bizarre trip through the Wheel of Time novels, cross-referencing back to earlier volumes in order to try to find some obscure fact that may or may not actually exist, so that you can figure out the reference you are currently reading. Why make the future maintainer of your code go through that? Once again, it relies entirely on the original committer having left a comprehensive and thoughtful message that is also easy to find.

The other counter argument is that no comments are better than out of date or incorrect comments. Again, personally I haven't run into this (or at least, not nearly as frequently as comments missing completely). Usually it will be pretty obvious where the comment does not match up with the code, and in this (hopefully outlier) case you can then go version control diving to find out when they diverged. Assessing contents of the code itself is usually far easier than searching for an original comment on the first commit of that method, so it seems like this should be an easier exercise.

Writing understandable code (and let's face it, most of the code written in the world is probably doing menial things like checking if statements, manipulating strings or adding/removing items from arrays) and comments is less fun than hacking out stuff that just works when you are feeling inspired, so no wonder we've invented an assortment of excuses to avoid doing it. So if you are one of the few actually doing this, thank you.

### Cryptography Engineering

#### Attack of the week: 64-bit ciphers in TLS

A few months ago it was starting to seem like you couldn’t go a week without a new attack on TLS. In that context, this summer has been a blessed relief. Sadly, it looks like our vacation is over, and it’s time to go back to school.

Today brings the news that Karthikeyan Bhargavan and Gaëtan Leurent out of INRIA have a new paper that demonstrates a practical attack on legacy ciphersuites in TLS (it’s called “Sweet32”, website here). What they show is that ciphersuites using ciphers with a 64-bit block length — notably 3DES — are vulnerable to plaintext recovery attacks that work even if the attacker cannot recover the encryption key.

While the principles behind this attack are well known, there’s always a difference between attacks in principle and attacks in practice. What this paper shows is that we really need to start paying attention to the practice.

## So what’s the matter with 64-bit block ciphers?

Block ciphers are one of the most widely-used cryptographic primitives. As the name implies, these are schemes designed to encipher data in blocks, rather than a single bit at a time.

The two main parameters that define a block cipher are its block size (the number of bits it processes in one go), and its key size. The two parameters need not be related. So for example, DES has a 56-bit key and a 64-bit block. Whereas 3DES (which is built from DES) can use up to a 168-bit key and yet still has the same 64-bit block. More recent ciphers have opted for both larger blocks and larger keys.

When it comes to the security provided by a block cipher, the most important parameter is generally the key size. A cipher like DES, with its tiny 56-bit key, is trivially vulnerable to brute force attacks that attempt decryption with every possible key (often using specialized hardware). A cipher like AES or 3DES is generally not vulnerable to this sort of attack, since the keys are much longer.
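To put rough numbers on that claim, here is some back-of-the-envelope arithmetic (a sketch, not a benchmark):

```python
# An average-case exhaustive key search tries half the keyspace.
des_work = 2**56 // 2      # ~3.6e16 trial decryptions: within reach of
                           # specialized hardware (e.g. the 1998 EFF DES cracker)
aes128_work = 2**128 // 2  # astronomically out of reach today

# The gap between the two is a factor of 2^72:
assert aes128_work // des_work == 2**72
```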

However, as they say: key size is not everything. Sometimes the block size matters too.

You see, in practice, we often need to encrypt messages that are longer than a single block. We also tend to want our encryption to be randomized. To accomplish this, most protocols use a block cipher in a scheme called a mode of operation. The most popular mode used in TLS is CBC mode. Encryption in CBC looks like this:
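(The CBC diagram is an image in the original post. As a stand-in, here is a minimal Python sketch of the mode; the XOR-with-key "cipher" is a placeholder purely to keep the sketch runnable, not a real block cipher.)

```python
import os

BLOCK = 8  # bytes: a 64-bit block, as with DES/3DES

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_encrypt(encipher, key, plaintext):
    """CBC mode: XOR each plaintext block with the previous ciphertext
    block (a random IV for the first block), then encipher the result."""
    assert len(plaintext) % BLOCK == 0
    iv = os.urandom(BLOCK)
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        block = encipher(key, xor(plaintext[i:i + BLOCK], prev))
        out.append(block)
        prev = block
    return iv, b"".join(out)

# Toy "cipher" (XOR with the key) so the example is self-contained;
# TLS would use a real block cipher such as 3DES or AES here.
toy_cipher = lambda key, block: xor(key, block)
iv, ct = cbc_encrypt(toy_cipher, os.urandom(BLOCK), b"sixteen byte msg")
```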

The nice thing about CBC is that (leaving aside authentication issues) it can be proven (semantically) secure if we make various assumptions about the security of the underlying block cipher. Yet these security proofs have one important requirement. Namely, the attacker must not receive too much data encrypted with a single key.

The reason for this can be illustrated via the following simple attack.

Imagine that an honest encryptor is encrypting a bunch of messages using CBC mode. Following the diagram above, this involves selecting a random Initialization Vector ($IV$) of size equal to the block size of the cipher, then XORing $IV$ with the first plaintext block ($P$), and enciphering the result ($P \oplus IV$). The $IV$ is sent (in the clear) along with the ciphertext.

Most of the time, the resulting ciphertext block will be unique — that is, it won’t match any previous ciphertext block that an attacker may have seen. However, if the encryptor processes enough messages, sooner or later the attacker will see a collision. That is, it will see a ciphertext block that is the same as some previous ciphertext block. Since the cipher is deterministic, this means the cipher’s input ($P \oplus IV$) must be identical to the cipher’s previous input $(P' \oplus IV')$ that created the previous block.

In other words, we have $(P \oplus IV) = (P' \oplus IV')$, which can be rearranged as $(P \oplus P') = (IV \oplus IV')$. Since the IVs are random and known to the attacker, the attacker has (with high probability) learned the XOR of two (unknown) plaintexts!

What can you do with the XOR of two unknown plaintexts? Well, if you happen to know one of those two plaintext blocks — as you might if you were able to choose some of the plaintexts the encryptor was processing — then you can easily recover the other plaintext. Alternatively, there are known techniques that can sometimes recover useful data even when you don’t know both blocks.
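The recovery step is just XOR algebra, which a few lines of Python make concrete (the block values here are made up for illustration):

```python
import os

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

P  = b"known-pt"   # a first block the attacker knows (e.g. chosen traffic)
P2 = b"secret!!"   # the unknown block the attacker wants
IV = os.urandom(8)
# A collision means E(P xor IV) == E(P2 xor IV2), so the IVs must satisfy:
IV2 = xor(IV, xor(P, P2))

# The attacker sees only IV and IV2 (sent in the clear) and knows P;
# XORing them together cancels everything except P2:
recovered = xor(P, xor(IV, IV2))
assert recovered == P2
```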

The main lesson here is that this entire mess only occurs if the attacker sees a collision. And the probability of such a collision is entirely dependent on the size of the cipher block. Worse, thanks to the (non-intuitive) nature of the birthday bound, this happens much more quickly than you might think it would. Roughly speaking, if the cipher block is b bits long, then we should expect a collision after roughly $2^{b/2}$ encrypted blocks.

In the case of a 64-bit blocksize cipher like 3DES, this is somewhere in the vicinity of $2^{32}$, or around 4 billion enciphered blocks.

(As a note, the collision does not really need to occur in the first block. Since all blocks in CBC are calculated in the same way, it could be a collision anywhere within the messages.)
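The birthday bound is easy to check empirically with a scaled-down block size (a simulation sketch; 32-bit "blocks" stand in for a 64-bit cipher so it finishes quickly):

```python
import random

def blocks_until_collision(block_bits, rng):
    """Draw random blocks until one repeats; return how many were drawn."""
    seen, n = set(), 0
    while True:
        n += 1
        block = rng.getrandbits(block_bits)
        if block in seen:
            return n
        seen.add(block)

rng = random.Random(1)
avg = sum(blocks_until_collision(32, rng) for _ in range(20)) / 20
# avg lands near sqrt(pi/2) * 2**16 ≈ 82,000 blocks: about 2^(b/2),
# a tiny fraction of the 2^32 possible distinct blocks.

# And at 8 bytes per 64-bit block, 2^32 blocks is exactly 32 GiB:
assert 2**32 * 8 == 32 * 2**30
```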

## Whew. I thought this was a practical attack. 4 billion is a big number!

It’s true that 4 billion blocks seems like an awfully large number. In a practical attack, the requirements would be even larger — since the most efficient attack is for the attacker to know a lot of the plaintexts, in the hope that she will be able to recover one unknown plaintext when she learns the value $(P \oplus P')$.

However, it’s worth keeping in mind that these traffic numbers aren’t absurd for TLS. In practice, 4 billion 3DES blocks works out to 32GB of raw ciphertext. A lot to be sure, but not impossible. If, as the Sweet32 authors do, we assume that half of the plaintext blocks are known to the attacker, we’d need to increase the amount of ciphertext to about 64GB. Still a lot, but not out of reach.

The Sweet32 authors take this one step further. They imagine that the ciphertext consists of many HTTPS connections, consisting of 512 bytes of plaintext, in each of which is embedded the same secret 8-byte cookie — and the rest of the session plaintext is known. Calculating from these values, they obtain a requirement of approximately 256GB of ciphertext needed to recover the cookie with high probability.

That is really a lot.

But keep in mind that TLS connections are being used to encipher increasingly more data. Moreover, a single open browser frame running attacker-controlled Javascript can produce many gigabytes of ciphertext in a single hour. So these attacks are not outside of the realm of what we can run today, and presumably will be very feasible in the future.

## How does the TLS attack work?

While the cryptographic community has been largely pushing TLS away from ciphersuites like CBC, in favor of modern authenticated modes of operation, these modes still exist in TLS. And they exist not only for use with modern ciphers like AES, but are often also available for older ciphers like 3DES. For example, here’s a connection I just made to Google:

Of course, just because a server supports 3DES does not mean that it’s vulnerable to this attack. In order for a particular connection to be vulnerable, both the client and server must satisfy three main requirements:

1. The client and server must negotiate a 64-bit cipher. This is a relatively rare occurrence, but can happen in cases where one of the two sides is using an out-of-date client. For example, stock Windows XP does not support any of the AES-based ciphersuites. Similarly, SSL3 connections may negotiate 3DES ciphersuites.
2. The server and client must support long-lived TLS sessions, i.e., encrypting a great deal of data with the same key. Unfortunately, most web browsers place no limit on the length of an HTTPS session if Keep-Alive is used, provided that the server allows the session. The Sweet32 authors scanned and discovered that many servers (including IIS) will allow sessions long enough to run their attack. Across the Internet, the percentage of vulnerable servers is small (less than 1%), but includes some important sites.
3. The client must encipher a great deal of known data, including a secret session cookie. This is generally achieved by running adversarial Javascript code in the browser, although it could be done using standard HTML as well.

These caveats aside, the authors were able to run their attack using Firefox, sending at a rate of about 1500 connections per second. With a few optimizations, they were able to recover a 16-byte secret cookie in about 30 hours (a lucky result, given an expected 38 hour run time).

## So what do we do now?

While this is not an earthshaking result, it’s roughly comparable to previous results we’ve seen with legacy ciphers like RC4.

In short, while these are not the easiest attacks to run, it’s a big problem that there even exist semi-practical attacks that undo the encryption used in standard encryption protocols. This is a problem that we should address, and these attack papers help to make those problems more clear.

## August 22, 2016

I really like this comic.
I try to read/learn something every day.

Sometimes, when I find an interesting article, I like to mark it for reading it later.

I use many forms of marking: pinning tabs, bookmarking, emailing myself the URL, saving the HTML page to a folder, saving it to my wallabag instance, leaving my browser open at the tab, sending the URL as a QR code to my phone, etc etc etc.

Are all of the above ways productive?

None of them … the time to read something is now!
I mean the first time you lay your eyes on the article.

Not later, not when you have free time, now.

That’s the way it works with me. Perhaps with you something else is more productive.

I have a short attention span, and it is better for me to drop everything and read something carefully than to save it for later or some other time.

When I really do have to save something for later, my preferred way is to save it to my wallabag instance. It’s perfect and you will love it.

I also have a Kobo (e-ink) ebook reader. Not the Android-based one.
From wallabag I can export articles as EPUB and copy them to my Kobo.

But I am lazy and I never do it.

My Kobo reader is linked to a Pocket (getpocket) account.

So I’ve tried saving some articles there, but Pocket can’t always parse an article’s content properly. Not even wallabag works 100% of the time.

The superiority of wallabag (and of self-hosted applications in general) is that when a parsing problem occurs I can fix it! I can open a git pull request and then EVERYBODY in the community will be able to read-this-article-from-this-content-provider-later. I can’t do anything like that with Pocket or Readability.

There is a correct way to do ads, and that is when you are not covering the article you want people to read!
There are a lot of wrong ways to do ads: inlined in the text, placed above the article, hiding some of the content, charging a fee, splitting an article across many tiny pages (you know that page height in HTML is not a problem, right?), and then there are the bandwidth issues.

When I am on my mobile, I DON’T want to pay extra for bandwidth I DIDN’T ask for and certainly do not care about!!!
If I read an article on my tiny mobile display, DO NOT COVER the article with huge ads whose X-close button I can not find because it doesn’t fit on my display!!!

So yes, there is a correct way to do ads and that is by respecting the reader and there is a wrong way to do ads.

Getting back to the article’s subject, below you will see six (6) ways to read an article on my desktop. Of course there are hundreds of ways, but these are the most common ones:


Extra info:
window width: 852px
zoomed out twice to view more text

01. Pocket
02. Original Post in Firefox 48.0.1
03. Wallabag
05. Chromium 52.0.2743.116
06. Midori 0.5.11 - WebKitGTK+ 2.4.11


I believe that Reader View in Firefox is the winner of this test. It is clean and it focuses on the actual article.
Impressive!

Tag(s): wallabag

### syslog.me

#### cfengine-tap now on GitHub

Back from the holiday season, I have finally found the time to publish a small library on GitHub. It’s called cfengine-tap and can help you write TAP-compatible tests for your CFEngine policies.

TAP is the Test Anything Protocol, a simple text format that test scripts can use to print out their results and that test harnesses can consume. Originally born in the Perl world, it is now supported in many other languages.

Using this library it’s easier to write test suites for your CFEngine policies. Since it’s publicly available on GitHub and published under a GPL license, you are free to use it and welcome to contribute and make it better (please do).

Enjoy!

Tagged: cfengine, Configuration management, DevOps, Github, TAP, testing

## August 20, 2016

### Feeding the Cloud

#### Replacing a failed RAID drive

Translation of the original English article at https://feeding.cloud.geek.nz/posts/replacing-a-failed-raid-drive/.

Here is the procedure I followed to replace a failed RAID drive on a Debian machine.

# Replacing the drive

After noticing that `/dev/sdb` had been kicked out of my RAID array, I used smartmontools to identify the serial number of the drive to remove:

``````smartctl -a /dev/sdb
``````

With that information in hand, I shut down the computer, removed the failed drive and put a blank new drive in its place.

# Initializing the new drive

After booting with the new blank drive, I copied over the partition table using parted.

First, I examined the partition table on the healthy drive:

``````\$ parted /dev/sda
unit s
print
``````

and created a new partition table on the replacement drive:

``````\$ parted /dev/sdb
unit s
mktable gpt
``````

Then I used the `mkpart` command for each of my four partitions, giving each one the same size as the matching partition on `/dev/sda`.

Finally, I used the `toggle 1 bios_grub` command (boot partition) and `toggle X raid` (where X is the partition number) on all the RAID partitions, before checking with the `print` command that the two partition tables were now identical.

# Resyncing/re-creating the RAID arrays

To sync the data from the healthy drive (`/dev/sda`) to the replacement one (`/dev/sdb`), I ran the following commands on my RAID1 partitions:

``````mdadm /dev/md0 -a /dev/sdb2
``````

and kept an eye on the sync status with:

``````watch -n 2 cat /proc/mdstat
``````

To speed up the process, I used the following trick:

``````blockdev --setra 65536 "/dev/md0"
blockdev --setra 65536 "/dev/md2"
echo 300000 > /proc/sys/dev/raid/speed_limit_min
echo 1000000 > /proc/sys/dev/raid/speed_limit_max
``````

Then I re-created my RAID0 swap partition as follows:

``````mdadm /dev/md1 --create --level=0 --raid-devices=2 /dev/sda3 /dev/sdb3
mkswap /dev/md1
``````

Because the swap partition is brand new (it is not possible to restore a RAID0 partition, it has to be re-created from scratch), I had to do two things:

• replace the swap UUID in `/etc/fstab` with the one printed by the `mkswap` command (or found by running `blkid` and taking the UUID for `/dev/md1`)
• replace the UUID of `/dev/md1` in `/etc/mdadm/mdadm.conf` with the one returned for `/dev/md1` by the `mdadm --detail --scan` command

# Making sure I can boot off the replacement drive

To make sure I could boot the machine off either drive, I reinstalled the grub boot loader onto the new drive:

``````grub-install /dev/sdb
``````

before rebooting with both drives connected. This confirmed that my configuration works properly.

Then I booted without the `/dev/sda` drive to make sure everything would still work if that drive decided to die and leave me with only the new drive (`/dev/sdb`).

This test obviously breaks the sync between the two drives, so I had to reboot with both drives connected and then re-add `/dev/sda` to all of the RAID1 arrays:

``````mdadm /dev/md0 -a /dev/sda2
``````

Once that was all done, I rebooted again with both drives connected to confirm that everything works properly:

``````cat /proc/mdstat
``````

and then ran a full SMART test on the new drive:

``````smartctl -t long /dev/sdb
``````

### pagetable

#### Reverse-Engineered GEOS 2.0 for C64 Source Code

The GEOS operating system managed to clone the Macintosh GUI on the Commodore 64, a computer with an 8 bit CPU and 64 KB of RAM. Based on Maciej Witkowiak's work, I created a reverse-engineered source version of the C64 GEOS 2.0 KERNAL for the cc65 compiler suite:

https://github.com/mist64/geos

• The source compiles into the exact same binary as shipped with GEOS 2.0.
• The source is well-structured and split up into 31 source files.
• Machine-specific code is marked up.
• Copy protection/trap mechanisms can be disabled.
• The build system makes sure binary layout requirements are met.

This makes the source a great starting point for

• adding (optional) optimized code paths or features
• integrating existing patches from various sources
• integrating versions for other computers
• porting it to different 6502-based computers

Just fork the project and send pull requests!

## August 18, 2016

### Michael Biven

#### The Loss of a Sense Doesn’t Always Heighten the Others

Over a two-week break, my wife and I were talking about a statement she read where someone was called the “long time embedded photojournalist for Burning Man” and how she disagreed with this. This person wasn’t shooting for any news organization. They’re known to be one of the Burning Man faithful, which removes whatever impartiality they may have. In essence they’re a PR photographer for Burning Man.

This made me consider that most of the output from social media falls into one of two camps. The first is “Free PR” and the second is “Critics”. You’re either giving away free material to promote an idea, organization, product, or person, or you’re criticizing them.

There is a third camp (Journalism), but so few people have the patience to provide an accurate and objective comment. And so few organizations have the integrity to support those ideals. It’s like the goals of our past that helped drive society to do better have been toppled by a rush of individuals trying to create a better self-image.

After mentioning this idea she told me about a piece she heard recently on the radio where the guest was referencing an article from the New York Times about Trump. The host interrupted them to mention that it was an Op-Ed and not an article. This baffled the guest who didn’t understand that it was an opinion piece and not journalism.

The flood of sources of information, both accurate and inaccurate, provided on the internet hasn’t led people to judge the validity of things any better. Instead we have seen a rise in lies, ignorance, and the absurd being reported as fact. This phenomenon even has a name… post-fact. As in we now live in a post-fact world. Think birthers, anti-vaxxers, or any other conspiracy theory movement.

Progress has been delayed or reversed by having to debunk ignorance being held up as fact. The time and energy being wasted on these distractions makes me wonder if this period of time will be known as the Age of Misfeasance.

P.S. After reading this post over and fixing one too many mistakes before hitting publish, I wondered: are people just writing quickly and then putting their faith in a technology to fix their mistakes? Is autocomplete damaging our ability to think clearly, because we’re not reading and editing what we write as much as in the past?

## August 16, 2016

### ma.ttias.be

#### TCP vulnerability in Linux kernels pre 4.7: CVE-2016-5696

The post TCP vulnerability in Linux kernels pre 4.7: CVE-2016-5696 appeared first on ma.ttias.be.

This is a very interesting vulnerability in the TCP stack of Linux kernels before 4.7. The bad news: there are a lot of systems online running those kernel versions. The bug/vulnerability is as follows.

Red Hat Product Security has been made aware of an important issue in
the Linux kernel's implementation of challenge ACKS as specified in
RFC 5961. An attacker which knows a connections client IP, server IP
and server port can abuse the challenge ACK mechanism
to determine the accuracy of a normally 'blind' attack on the client or server.

Successful exploitation of this flaw could allow a remote attacker to
inject or control a TCP stream contents in a connection between a
Linux device and its connected client/server.

* This does NOT mean that cryptographic information is exposed.
* This is not a Man in the Middle (MITM) attack.
[oss-security] CVE-2016-5389: linux kernel -- challange ack information leak

In short: a successful attack could hijack a TCP session, facilitate a man-in-the-middle attack and allow the attacker to inject data, i.e. altering the content on websites, modifying responses from webservers, ...

This Stack Overflow post explains it very well.

The hard part of taking over a TCP connection is to guess the source port of the client and the current sequence number.

The global rate limit for sending Challenge ACKs (100/s in Linux), introduced together with Challenge ACKs (RFC 5961), makes it possible in a first step to guess a source port used by the client’s connection, and in a next step to guess the sequence number. The main idea is to open a connection to the server and send, from the attacker’s own source address, as many RST packets with the wrong sequence number as possible, mixed with a few spoofed packets.

By counting how many Challenge ACKs get returned to the attacker, and knowing the rate limit, one can infer how many of the spoofed packets resulted in a Challenge ACK to the spoofed client, and thus how many of the guesses were correct. This way one can quickly narrow down which values of port and sequence number are correct. The attack can be done within a few seconds.

And of course the attacker needs to be able to spoof the IP address of the client, which is not possible in all environments. It might be possible from local networks (depending on the security measures), but ISPs will often block IP spoofing from the usual DSL/cable/mobile accounts.
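The counting trick can be sketched as a toy model in a few lines (made-up numbers; the real attack also has to contend with timing jitter and packet loss):

```python
# Linux pre-4.7 enforced a single global budget of challenge ACKs per
# second; 100/s was the default mentioned above.
LIMIT = 100

def acks_returned_to_attacker(spoofed_probes_answered):
    """The attacker sends LIMIT in-window RSTs from her own address.
    Every challenge ACK the kernel spent answering a spoofed probe
    comes out of the same global per-second budget, so the attacker
    gets back fewer than LIMIT."""
    return LIMIT - min(spoofed_probes_answered, LIMIT)

# Wrong guess: no spoofed probe triggered a challenge ACK.
assert acks_returned_to_attacker(0) == 100
# Correct guess: the spoofed probe consumed one ACK from the budget,
# which the attacker observes as a missing reply.
assert acks_returned_to_attacker(1) == 99
```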

For RHEL (and CentOS derivatives), the following OS's are affected.

While it's no permanent fix, the following config will make it a lot harder to abuse this vulnerability.

`\$ sysctl -w net.ipv4.tcp_challenge_ack_limit=999999999`

And make it permanent so it persists on reboot:

`\$ echo "net.ipv4.tcp_challenge_ack_limit=999999999" >> /etc/sysctl.d/net.ipv4.tcp_challenge_ack_limit.conf`

While the attack isn't actually prevented, it is damn hard to reach the ACK limits.

The post TCP vulnerability in Linux kernels pre 4.7: CVE-2016-5696 appeared first on ma.ttias.be.

## August 15, 2016

### Electricmonk.nl

#### Disable "New release available" emails on Ubuntu

We have our Ubuntu machines set up to mail us the output of cron jobs like so:

``````\$ cat /etc/crontab
SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin

# m h dom mon dow user    command
``````

This is incredibly useful, since cron jobs should never output anything unless something is wrong.

Unfortunately, this means we also get emails like this:

``````/etc/cron.weekly/update-notifier-common:
New release '16.04.1 LTS' available.
``````

You can fairly easily disable these by modifying the corresponding cronjob `/etc/cron.weekly/update-notifier-common`, for example by adding an early `exit 0` before the check:

``````#!/bin/sh

set -e

# Added: bail out before the check below ever runs.
exit 0

# Check to see whether there is a new version of Ubuntu available
``````
Now you'll no longer receive these emails. It's also possible to remove the cronjob entirely, but then an upgrade is likely to put it back, and I have no idea if the cronjob has any other side effects besides emailing.

## August 14, 2016

### Failing forwards

Earlier this week Gmail's servers decided that any email sent from Gmail and then forwarded from pantz.org back to Gmail was now, as their servers put it "likely unsolicited mail". People sending mail from Gmail to pantz.org were getting bounce messages, which looks bad. All other email from any other domain was coming in without issue. I have been forwarding email from Gmail accounts for many years now without issue.

### Time to recheck settings

1. DNS A, AAAA and PTR records for IPv4/IPv6 are set up and working correctly
2. SPF record set up correctly, but this is a forward so it always shows fail. The bounce message passes SPF, so that's nice.
3. SMTP with TLS working and available
4. pantz.org does not modify, remove or shuffle message headers, or modify the body of the message in any way.
5. No increase in spam getting through. pantz.org filters most mail before it is forwarded.
6. Not using the Sender Rewriting Scheme (SRS).
7. No DKIM set up

### The fix

After seeing that everything checked out, I hit up Google to see if anyone else was having this issue. From the results it seems that many people have had this same issue. Some people just started using SRS to fix it. Others had to fix their PTR records in DNS. The last group of people had to stop using IPv6 for mail delivery. Since all of the other pantz.org mail server settings were correct, the only things I could try were implementing SRS or turning off IPv6. Turning off IPv6 delivery was the easiest test. After turning off IPv6 mail delivery, and just leaving IPv4, all mail from Gmail being forwarded through pantz.org was being accepted again. How dumb is that?

### What is up with IPV6?

It seems Gmail has changed a setting (or I hit some new threshold) on their side dealing only with IPv6. Since Google will not tell you why certain mail is considered "unsolicited mail", we can not figure out what was done to try to fix the issue. If I had to speculate on what is happening, my guess is they turned up the sensitivity on email coming in over IPv6, as the IPv4 filter is obviously not as sensitive. It is not just my server; it is happening to many other people as well.

I had also noticed that mail coming in from a friend whose server delivers mail to my server via IPv6, and which was then forwarded to Gmail via IPv6, was being marked as spam every time. According to Google, if the user is in your contacts list (and his email address is) the email is not supposed to be marked as spam. That is straight broken. Now that I have switched back to IPv4-only delivery, none of his mail is being marked as spam anymore. I believe Google has an issue with IPv6 mail delivery and spam classification.

### What now?

I hate that I had to turn off IPv6 for mail forwarding to Gmail. My next step is likely to implement SRS for forwarding and see if I can turn IPv6 back on. The best article I found on setting this up with Postfix is here. It also shows how to set up DKIM, which might be fun to do as well.

## August 13, 2016

### Cryptography Engineering

#### Is Apple’s Cloud Key Vault a crypto backdoor?

TL;DR: No, it isn’t. If that’s all you wanted to know, you can stop reading.

Still, there’s been some talk on Twitter about the subject, and I’m afraid it could lead to a misunderstanding. That would be too bad, since Apple’s new technology is kind of a neat experiment.

So while I promise that this blog is not going to become all-Apple-all-the-time, I figured I’d take a minute to explain what I’m talking about. This post is loosely based on an explanation of Apple’s new escrow technology that Ivan Krstic gave at BlackHat. You should read the original for the fascinating details.

### What is Cloud Key Vault (and what is iCloud Keychain)?

A few years ago Apple quietly introduced a new service called iCloud Keychain. This service is designed to allow you to back up your passwords and secret keys to the cloud. Now, if backing up your sensitive passwords gives you the willies, you aren’t crazy. Since these probably include things like bank and email passwords, you really want these to be kept extremely secure.

And — at least going by past experience — security is not where iCloud shines.

The problem here is that passwords need to be secured at a much higher assurance level than most types of data backup. But how can Apple ensure this? We can’t simply upload our secret passwords the way we upload photos of our kids. That would create a number of risks, including:

1. The risk that someone will guess, reset or brute-force your iCloud password. Password resets are a particular problem. Unfortunately these seem necessary for normal iCloud usage, since people do forget their passwords. But that’s a huge risk when you’re talking about someone’s entire password collection.
2. The risk that someone will break into Apple’s infrastructure. Even if Apple gets their front-end brute-forcing protections right (and removes password resets), the password vaults themselves are a huge target. You want to make sure that even someone who hacks Apple can’t get them out of the system.
3. The risk that a government will compel Apple to produce data. Maybe you’re thinking of the U.S. government here. But that’s myopic: Apple stores iCloud data all over the world.

So clearly Apple needs a better way to protect these passwords. How to do it?

### Why not just encrypt the passwords?

It is certainly possible for an Apple device to encrypt your password vault before sending it to iCloud. The problem here is that Apple doesn’t necessarily have a strong encryption key to do this with. Remember that the point of a backup is to survive the loss of your device, and thus we can’t assume the existence of a strong recovery key stored on your phone.

This leaves us with basically one option: a user password. This could be either the user’s iCloud password or their device passcode. Unfortunately for the typical user, these tend to be lousy. They may be strong enough to use as a login password — in a system that allows only a very limited number of login attempts. But the kinds of passwords typical users choose to enter on mobile devices are rarely strong enough to stand up to an offline dictionary attack, which is the real threat when using passwords as encryption keys.

(Even using a strong memory-hard password hash like scrypt — with crazy huge parameters — probably won’t save a user who chooses a crappy password. Blame phone manufacturers for making it painful to type in complicated passwords by forcing you to type them so often.)
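Some illustrative arithmetic makes the point (the guessing rate here is an assumption for the sake of the example, not a measurement):

```python
keyspace = 10**6      # a 6-digit numeric passcode
scrypt_rate = 100     # assumed guesses/sec against a memory-hard hash

# Offline, even a deliberately slow hash only stretches this so far:
hours_offline = keyspace / scrypt_rate / 3600   # under 3 hours to exhaust

# With an HSM allowing only, say, 10 attempts before the account locks,
# a guessing attacker's success probability is capped at:
p_success = 10 / keyspace
assert p_success == 1e-05
assert hours_offline < 3
```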

### So what’s Apple to do?

So Apple finds itself in a situation where they can’t trust the user to pick a strong password. They can’t trust their own infrastructure. And they can’t trust themselves. That’s a problem. Fundamentally, computer security requires some degree of trust — someone has to be reliable somewhere.

Apple’s solution is clever: they decided to make something more trustworthy than themselves. To create a new trust anchor, Apple purchased a bunch of fancy devices called Hardware Security Modules, or HSMs. These are sophisticated, tamper-resistant specialized computers that store and operate with cryptographic keys, while preventing even malicious users from extracting them. The high-end HSMs Apple uses also allow the owner to include custom programming.

Rather than trusting Apple, your phone encrypts its secrets under a hardcoded 2048-bit RSA public key that belongs to Apple’s HSM. It also encrypts a function of your device passcode, and sends the resulting encrypted blob to iCloud. Critically, only the HSM has a copy of the corresponding RSA decryption key, thus only the HSM can actually view any of this information. Apple’s network sees only an encrypted blob of data, which is essentially useless.

When a user wishes to recover their secrets, they authenticate themselves directly to the HSM. This is done using a user’s “iCloud Security Code” (iCSC), which is almost always your device passcode — something most people remember after typing it every day. This authentication is done using the Secure Remote Password protocol, ensuring that Apple (outside of the HSM) never sees any function of your password.

Now, I said that device passcodes are lousy secrets. That’s true when we’re talking about using them as encryption keys — since offline decryption attacks allow the attacker to make an unlimited number of attempts. However, with the assistance of an HSM, Apple can implement a common-sense countermeasure to such attacks: they limit you to a fixed number of login attempts. This is roughly the same protection that Apple implements on the devices themselves.
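The difference between the two attack models can be sketched in a toy escrow record (illustrative only: the real HSM enforces the limit in tamper-resistant hardware and authenticates via SRP, so it never sees the passcode itself):

```python
import hashlib
import hmac
import os

class EscrowRecord:
    """Toy model of HSM-style attempt limiting."""
    MAX_ATTEMPTS = 10

    def __init__(self, passcode: bytes, secret: bytes):
        self._salt = os.urandom(16)
        self._tag = hmac.new(self._salt, passcode, hashlib.sha256).digest()
        self._secret = secret
        self._failures = 0

    def recover(self, passcode: bytes) -> bytes:
        if self._secret is None:
            raise PermissionError("record destroyed after too many attempts")
        guess = hmac.new(self._salt, passcode, hashlib.sha256).digest()
        if hmac.compare_digest(guess, self._tag):
            self._failures = 0
            return self._secret
        self._failures += 1
        if self._failures >= self.MAX_ATTEMPTS:
            self._secret = None  # guessing the rest of the space buys nothing
        raise ValueError("wrong passcode")
```

Even though only 10,000 four-digit passcodes exist, an attacker gets at most ten online guesses before the record is gone — versus unlimited guesses in the offline setting.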

 The encrypted contents of the data sent to the HSM (source).

The upshot of all these ideas is that — provided that the HSM works as designed, and that it can’t be reprogrammed — even Apple can’t access your stored data except by logging in with a correct passcode. And they only get a limited number of attempts to guess correctly, after which the account locks.

This rules out both malicious insiders and government access, with one big caveat.

### What stops Apple from just reprogramming its HSM?

This is probably the biggest weakness of the system, and the part that’s driving the “backdoor” concerns above. You see, the HSMs Apple uses are programmable. This means that — as long as Apple still has the code signing keys — the company can potentially update the custom code it includes onto the HSM to do all sorts of things.

These things might include: programming the HSM to output decrypted escrow keys. Or disabling the maximum login attempt counting mechanism. Or even inserting a program that runs a brute-force dictionary attack on the HSM itself. This would allow Apple to brute-force your passcode and/or recover your passwords.

Fortunately Apple has thought about this problem and taken steps to deal with it. Note that on HSMs like the one Apple is using, the code signing keys live on a special set of admin smartcards. To remove these keys as a concern, once Apple is done programming the HSM, they run these cards through a process that they call a “physical one-way hash function”.

If that sounds complicated, here’s Ivan’s slightly simpler explanation.

So, with the code signing keys destroyed, updating the HSM to allow nefarious actions should not be possible. Pretty much the only action Apple can take is to wipe the HSM, which would destroy the HSM’s RSA secret keys and thus all of the encrypted records it’s responsible for. To make sure all admin cards are destroyed, the company has developed a complex ceremony for controlling the cards prior to their destruction. This mostly involves people making assertions that they haven’t made copies of the code signing key — which isn’t quite foolproof. But overall it’s pretty impressive.

The downside for Apple, of course, is that there had better not be a bug in any of their programming. Because right now there’s nothing they can do to fix it — except to wipe all of their HSMs and start over.

### Couldn’t we use this idea to implement real crypto backdoors?

A key assertion I’ve heard is that if Apple can do this, then surely they can do something similar to escrow your keys for law enforcement. But looking at the system shows that this isn’t true at all.

To be sure, Apple’s reliance on a Hardware Security Module indicates a great deal of faith in a single hardware/software solution for storing many keys. Only time will tell if that faith is really justified. To be honest, I think it’s an overly-strong assumption. But iCloud Keychain is opt-in, so individuals can decide for themselves whether or not to take the risk. That wouldn’t be true of a mandatory law enforcement backdoor.

But the argument that Apple has enabled a law enforcement backdoor seems to miss what Apple has actually done. Instead of building a system that allows the company to recover your secret information, Apple has devoted enormous resources to locking themselves out. Only customers can access their own information. In other words, Apple has decided that the only way they can hold this information is if they don’t even trust themselves with it.

That’s radically different from what would be required to build a mandatory key escrow system for law enforcement. In fact, one of the big objections to such a backdoor — which my co-authors and I recently outlined in a report — is the danger that any of the numerous actors in such a system could misuse it. By eliminating themselves from the equation, Apple has effectively neutralized that concern.

### If Apple can secure your passwords this way, then why don’t they do the same for your backed up photos, videos, and documents?

That’s a good question. Maybe you should ask them?

## August 12, 2016

### Geek and Artist - Tech

#### Thoughts on creating an engineering Tech Radar

Perhaps you are familiar with the ThoughtWorks Tech Radar – I really like it as a useful summary of global technology trends and what I should be looking at familiarising myself with, even the stuff on the “hold” list (such as Scaled Agile Framework – sometimes anti-patterns are equally useful to understand and appreciate). There’s a degree of satisfaction in seeing your favourite technology rise through the ranks to become something recommended to everyone, but in my current (new) role it also has a different purpose.

Since I started a new job just over a month ago, I’ve come into an organisation with a far simpler tech stack and in some regards, less well-defined technology strategy. I like to put in place measures to help engineers be as autonomous in their decision-making process as possible, so a Tech Radar can help frame which technologies they can or should consider when going about their jobs. This ranges from techniques they should strongly consider adopting (which can be much more of a tactical decision) to databases they could select from when building a new service that doesn’t fit the existing databases already in use. The Tech Radar forms something like a “garden fence” – you don’t necessarily need to implement everything within it, but it shows you where the limits are in case you need something new.

So basically, I wanted to use the Tech Radar as a way to avoid needing to continually make top-down decisions when stepping into unknown territory, and help the organisation and decision-making scale as we add more engineers. The process I followed to generate it was very open and democratic – each development team was gathered together for an hour, and I drew the radar format on the whiteboard. Then engineers contributed post-it notes with names of technologies and placed them on the board. After about 10 minutes of this, I read through all of the notes and got everyone to describe for the room the “what” and the “why” of their note. Duplicates were removed and misplaced notes moved to their correct place.

Afterwards, I transcribed everything into a Google Doc and asked everyone to again add the “what” and “why” of each contributed note to the document. What resulted was an 11-page gargantuan collection of technologies and techniques that seemed to cover everything that everyone could think of in the moment, and didn’t quite match up with my expectations. I’ll describe my observations about the process and outcomes.

### Strategy vs Tactics, and Quadrants

The purpose of the overall radar is to be somewhat strategic. ThoughtWorks prepares their radar twice a year, so it is expected to cover at least the next 6 months. Smaller companies might only prepare it once a year. However, amongst the different quadrants there is a reasonable amount of room for tactics as well. In particular I would say that the Techniques and Tools quadrants are much more tactical, whereas the Platforms and Languages & Frameworks quadrants are much more strategic.

For example, let’s say you have Pair Programming in the Techniques quadrant. Of course, you might strategically adopt this across the whole company, but a single team (in fact, just two developers) can try instituting it this very day, with no impact on anyone in other teams and probably not even others in the same team. It comes with virtually no cost to just try out, and you start gaining benefit from it immediately, even if nobody else is using it. Similarly, on the Tools side, you might decide to add a code test coverage reporting tool to your build pipeline. It’s purely informational, you benefit from it immediately and it doesn’t require anyone else’s help or participation, nor does it impact anyone else. For that reason it’s arguable whether these things are so urgent to place on the radar – developers can largely make the decisions themselves to adopt such techniques or tools.

On the other hand, the adoption of a new Language or Framework, or building on top of a new Platform (let’s say you want to start deploying your containers to Kubernetes) will come with a large time investment both immediately and ongoing, as well as needing wide-scale adoption across teams to benefit from that investment. Of course there is room for disagreement here – e.g. is a service like New Relic a tool or a platform? Adoption of a new monitoring tool definitely comes with a large cost (you don’t want every team using a different SaaS monitoring suite). But the Tech Radar is just a tool itself and shouldn’t be considered the final definition of anything – just a guide for making better decisions.

### Strategic Impact

As touched on above, adopting a Platform or new Language/Framework has significant costs. When putting together a radar like this with input from people who may have different levels of experience, you might find that not all of the strategic impacts have been considered when an item is added to the list. An incomplete list of things I believe need to be examined when selecting a Language or Framework could be:

• What are the hiring opportunities around this technology? Is it easier or harder to hire people with this skillset?
• Is it a growing community, and are we likely to find engineers at all maturity levels (junior/intermediate/senior) with experience in the technology?
• For people already in the company, is it easy and desirable to learn? How long does it take to become proficient?
• Similarly, how many people at the company already know the technology well enough to be considered proficient for daily work?
• Does the technology actually solve a problem we have? Are there any things our current technologies do very well that would suffer from the new technology’s introduction?
• What other parts of our tech stack would need to change as a result of adopting it? Testing? Build tooling? Deployments? Libraries and Dependencies?
• Do we understand not only the language but also the runtime?
• Would it help us deliver more value to the customer, or deliver value faster?
• By taking on the adoption costs, would we be sacrificing time spent on maximising some other current opportunity?
• Is there a strong ecosystem of libraries and code around the technology? Is there a reliable, well-supported, stable equivalent to all of the libraries we use with our current technologies? If not, is it easy and fast to write our own replacements?
• How well does adoption of the technology align with our current product and technology roadmaps?

By no means is this list exhaustive, but I think all points need some thought, rather than just “is it nicer to program in than my current language”.

### Filtering the list and assembling the radar

As mentioned, I ended up with a fairly huge list of items which now needs to be filtered. This is a task for a CTO or VP of Engineering depending on your organisation size. Ultimately people accountable for the technology strategy need to set the bounds of the radar. For my list, I will attempt to pre-filter the items that have little strategic importance – like tools or techniques (unless we determine it’s something that could/should have widespread adoption and benefit).

Ultimately we’ll have to see what the output looks like and whether engineers feel it answers questions for them – that will determine whether we try to build a follow-up radar in the next quarter or year. If I end up running the process again, I suspect I’ll make use of a smaller group of people to add inputs – who have already collected and moderated inputs from their respective teams. The other benefit of the moderation/filtering process is that the document that is later produced is a way of expressing to engineers (perhaps with less experience) the inherent strategic importance of the original suggestions. There are no wrong suggestions, but we should aim to help people learn and think more about the application of strategy and business importance in their day to day work.

## August 10, 2016

### The Geekess

#### “I was only joking”

There was a very interesting set of tweets yesterday that dissected the social implications of saying, “I was only joking.” To paraphrase:

I’ve been mulling over the application of this analysis of humor with respect to the infamous “Donglegate” incident. Many men in tech responded with anger and fear over a conference attendee getting fired over a sexist joke. “It was only a joke!” they cried.

However, the justification falls flat if we assume that you’re never “just joking” and that jokes define in-groups and out-groups. The sexist joke shared between two white males (who were part of the dominant culture of conferences in 2013) defined them as part of the “in-group” and pushed the African American woman who overheard the “joke” into the “out-group”.

When the woman pushed back against the joke by tweeting about it with a picture of the joker, the people who were part of the in-group who found that joke “funny” were angry. When the joker was fired, it was a sign that they were no longer the favored, dominant group. Fear of loss of social status is a powerful motivator, which is what caused people from the joke’s “in-group” to call for the woman to be fired as well.

Of course, it wasn’t all men who blasted the woman for reacting to a “joke”. There were many women who blasted the reporter for “public shaming”, or who thought the woman was being “too sensitive”, or rushed to reassure men that they had never experienced sexist jokes at conferences. Which brings us to the topic of “chill girls”:

The need for women to fit into a male-dominated tech world means that “chill girls” have to laugh at sexist jokes in order to be part of the “in-group”. To not laugh, or to call out the joker, would be to resign themselves to the “out-group”.

Humans have a fierce need to be socially accepted, and defining in-groups and out-groups is one way to secure that acceptance. This is exemplified in many people’s push back against what they see as too much “political correctness”.

For example, try getting your friends to stop using casually ableist terms like “lame”, “retarded”, “dumb”, or “stupid”. Bonus points if you can get them to remove classist terms like “ghetto” or homophobic statements like “that’s so gay”. What you’ll face are nonsense arguments like, “It’s just a word.” People who call out these terms are berated and no longer “cool”. Unconsciously or consciously, the person will try to preserve the in-groups and out-groups, and the power they derive from being part of the in-group.

Stop laughing awkwardly. Your silence is only lending power to oppression. Start calling out people for alienating jokes. Stop preserving the hierarchy of classism, ablism, homophobia, transphobia, and sexism.

## August 09, 2016

### pagetable

#### Copy Protection Traps in GEOS for C64

Major GEOS applications on the Commodore 64 protect themselves from unauthorized duplication by keying themselves to the operating system's serial number. To avoid tampering with this mechanism, the system contains some elaborate traps, which will be discussed in this article.

## GEOS Copy Protection

The GEOS boot disk protects itself with a complex copy protection scheme, which uses code uploaded to the disk drive to verify the authenticity of the boot disk. Berkeley Softworks, the creators of GEOS, found it necessary to also protect their applications like geoCalc and geoPublish from unauthorized duplication. Since these applications were running inside the GEOS "KERNAL" environment, which abstracted most hardware details away, these applications could not use the same kind of low-level tricks that the system was using to protect itself.

## Serial Numbers for Protection

The solution was to use serial numbers. On the very first boot, the GEOS system created a random 16-bit number, the "serial number", and stored it in the KERNAL binary. (Since the system came with a "backup" boot disk, the system asked for that disk to be inserted, and stored the same serial in the backup's KERNAL.) Now whenever an application was run for the first time, it read the system's serial number and stored it in the application's binary. On subsequent runs, it read the system's serial number and compared it with the stored version. If the serial numbers didn't match, the application knew it was running on a different GEOS system than the first time – presumably as a copy on someone else's system: Since the boot disk could not be copied, two different people had to buy their own copies of GEOS, and different copies of GEOS had different serial numbers.
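The first-run keying scheme can be modeled in a few lines (a sketch with descriptive names, not GEOS APIs):

```python
import random

class GeosSystem:
    """Stands in for an installed GEOS KERNAL with its 16-bit serial."""
    def __init__(self):
        self.serial = random.randrange(0x10000)  # chosen on first boot

class Application:
    """Stands in for an application binary such as geoCalc."""
    def __init__(self):
        self.stored_serial = None  # fresh from the shop, not yet keyed

    def serial_check(self, system: GeosSystem) -> bool:
        if self.stored_serial is None:
            self.stored_serial = system.serial  # first run: key to this system
            return True
        return self.stored_serial == system.serial  # later runs: must match
```

An application keyed on one system passes the check there forever, but a copy carried to a friend's GEOS install fails it and, in the real code, quietly sabotages the system instead of refusing to run.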

## Serial Numbers in Practice

The code to verify the serial number usually looked something like this:

``````.,D5EF  20 D8 C1    JSR \$C1D8 ; GetSerialNumber
.,D5F2  A5 03       LDA \$03   ; read the hi byte
.,D5F4  CD 2F D8    CMP \$D82F ; compare with stored version
.,D5F7  F0 03       BEQ \$D5FC ; branch if equal
.,D5F9  EE 18 C2    INC \$C218 ; sabotage LdDeskAcc syscall: increment vector
.,D5FC  A0 00       LDY #\$00  ; ...``````

If the highest 8 bits of the serial don't match the value stored in the application's binary, it increments the pointer of the LdDeskAcc vector. This code was taken from the "DeskTop" file manager, which uses this subtle sabotage to make loading a "desk accessory" (a small helper program that can be run from within an application) unstable. Every time DeskTop gets loaded, the pointer gets incremented, and while LdDeskAcc might still work by coincidence the first few times (because it only skips a few instructions), it will break eventually. Other applications used different checks and sabotaged the system in different ways, but they all had in common that they called GetSerialNumber.

(DeskTop came with every GEOS system and didn't need any extra copy protection, but it checked the serial anyway to prevent users from permanently changing their GEOS serial to match one specific pirated application.)

## A Potential Generic Hack

The downside of this scheme is that all applications are protected the same way, and a single hack could potentially circumvent the protection of all applications.

A generic hack would change the system's GetSerialNumber implementation to return exactly the serial number expected by the application by reading the saved value from the application's binary. The address where the saved value is stored is different for every application, so the hack could either analyze the instructions after the GetSerialNumber call to detect the address, or come with a small table that knows these addresses for all major applications.

GEOS supports auto-execute applications (file type \$0E) that will be executed right after boot – this would be the perfect way to make this hack available at startup without patching the (encrypted) system files.

## Trap 1: Preventing Changing the Vector

Such a hack would change the GetSerialNumber vector in the system call jump table to point to new code in some previously unused memory. But the GEOS KERNAL has some code to counter this:

``````                                ; (Y = \$FF from the code before)
.,EE59  B9 98 C0    LDA \$C098,Y ; read lo byte of GetSerialNumber vector
.,EE5C  18          CLC
.,EE5D  69 5A       ADC #\$5A    ; add offset
.,EE5F  99 38 C0    STA \$C038,Y ; overwrite lo byte of GraphicsString vector``````

In the middle of code that deals with the menu bar and menus, it uses this obfuscated code to sabotage the GraphicsString system call if the GetSerialNumber vector was changed. If the GetSerialNumber vector is unchanged, these instructions are effectively a no-op: The lo byte of the system's GetSerialNumber vector (\$F3) plus \$5A equals the lo byte of the GraphicsString vector (\$4D). But if the GetSerialNumber vector was changed, then GraphicsString will point to a random location and probably crash.
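The arithmetic behind the "effective no-op" is easy to verify, using 8-bit wraparound as on the 6502:

```python
GET_SERIAL_LO = 0xF3       # lo byte of the stock GetSerialNumber vector
GRAPHICS_STRING_LO = 0x4D  # lo byte of the stock GraphicsString vector
OFFSET = 0x5A              # the ADC constant used by the trap

# Unchanged vector: the STA rewrites GraphicsString's lo byte with the
# value it already had (0xF3 + 0x5A = 0x14D -> 0x4D after 8-bit wrap).
assert (GET_SERIAL_LO + OFFSET) & 0xFF == GRAPHICS_STRING_LO

# Any repointed vector with a different lo byte sends GraphicsString
# somewhere else entirely; modular addition maps each input uniquely.
for hacked_lo in range(0x100):
    if hacked_lo != GET_SERIAL_LO:
        assert (hacked_lo + OFFSET) & 0xFF != GRAPHICS_STRING_LO
```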

Berkeley Softworks was cross-developing GEOS on UNIX machines with a toolchain that supported complex expressions, so they probably expressed this with code like:

``````    ; Y = \$FF
lda GetSerialNumber + 1 - \$FF,y
clc
sta GraphicsString + 1 - \$FF,y``````

In fact, different variations of GEOS (like the GeoRAM version) were separate builds with different build time arguments, so because of different memory layouts, they were using different ADC values here.

Note that the pointers to GetSerialNumber and GraphicsString have been obfuscated, so an attacker who has detected the trashed GraphicsString vector won't be able to find the sabotage code by searching for either address.

## Trap 2: Preventing Changing the Implementation

If the hack can't change the GetSerialNumber vector, it could put a JMP instruction at the beginning of the implementation to the new code. But the GEOS KERNAL counters this as well. The GetSerialNumber implementation looks like this:

``````.,CFF3  AD A7 9E    LDA \$9EA7 ; load lo byte of serial
.,CFF6  85 02       STA \$02   ; into return value (lo)
.,CFF8  AD A8 9E    LDA \$9EA8 ; load hi byte of serial
.,CFFB  85 03       STA \$03   ; into return value (hi)
.,CFFD  60          RTS       ; return``````

At the end of the system call function UseSystemFont, it does this:

``````.,E6C9  AD 2F D8    LDA \$D82F ; read copy of hi byte of serial
.,E6CC  D0 06       BNE \$E6D4 ; non-zero? done this before already
.,E6CE  20 F8 CF    JSR \$CFF8 ; call second half of GetSerialNumber
.,E6D1  8D 2F D8    STA \$D82F ; and store the hi byte in our copy
.,E6D4  60          RTS       ; ...``````

And in the middle of the system call function FindFTypes, it does this:

``````.,D5EB  A2 C1       LDX #\$C1
.,D5ED  A9 96       LDA #\$96  ; public GetSerialNumber vector (\$C196)
.,D5EF  20 D8 C1    JSR \$C1D8 ; "CallRoutine": call indirectly (obfuscation)
.,D5F2  A5 03       LDA \$03   ; read hi byte of serial
.,D5F4  CD 2F D8    CMP \$D82F ; compare with copy
.,D5F7  F0 03       BEQ \$D5FC ; if identical, skip next instruction
.,D5F9  EE 18 C2    INC \$C218 ; sabotage LdDeskAcc by incrementing its vector
.,D5FC  A0 00       LDY #\$00  ; ...``````

So UseSystemFont makes a copy of the hi byte of the serial, and FindFTypes compares the copy with the serial – so what's the protection? The trick is that one path goes through the proper GetSerialNumber vector, while the other one calls into the bottom half of the original implementation. If the hack overwrites the first instruction of the implementation (or managed to disable the first trap and changed the system call vector directly), calling through the vector will reach the hack, while calling into the middle of the original implementation will still reach the original code. If the hack returns a different value than the original code, this will sabotage the system in a subtle way, by incrementing the LdDeskAcc system call vector.

Note that this code calls a KERNAL system function that will call GetSerialNumber indirectly, so the function pointer is split into two 8 bit loads and can't be found by just searching for the constant. Since the code in UseSystemFont doesn't call GetSerialNumber either, an attacker won't find a call to that function anywhere inside KERNAL.

## Summary

I don't know whether anyone has ever created a generic serial number hack for GEOS, but it would have been a major effort – in a time before emulators allowed for memory watch points. An attacker would have needed to suspect the memory corruption, compare memory dumps before and after, and then find the two places that change code. The LdDeskAcc sabotage would have been easy to find, because it encodes the address as a constant, but the GraphicsString sabotage would have been nearly impossible to find, because the trap uses neither the verbatim GraphicsString address nor the GetSerialNumber function.

The usual effective hacks were much more low-tech and instead "un-keyed" applications, i.e. removed the cached serial number from their binaries to revert them to their original, out-of-the-box state.

## August 08, 2016

### That grumpy BSD guy

#### Chinese Hunting Chinese Over POP3 In Fjord Country

Yes, you read that right: There is a coordinated effort in progress to steal Chinese-sounding users' mail, targeting machines at the opposite end of the Eurasian landmass (and probably elsewhere).

More specifically, here at bsdly.net we've been seeing attempts at logging in to the pop3 mail retrieval service using usernames that sound distinctively like Chinese names, and the attempts originate almost exclusively from Chinese networks.

This table lists the user names and corresponding real life names attempted so far:

 Name Username Chen Qiang chenqiang Fa Dum fadum Gao Dang gaodang Gao Di gaodi Gao Guan gaoguan Gao Hei gaohei Gao Hua gaohua Gao Liu gaoliu Gao Yang gaoyang Gao Zhang gaozhang He An hean He Biao hebiao He Bing hebing He Chang hechuang He Chao hechao He Chen hechen He Cheng hecheng He Chun hechun He Cong hecong He Da heda He Di hedi He Die hedie He Ding heding He Dong hedong He Duo heduo He Fa hefa He Ging heqing He Guo heguo He Han hehan He Hao hehao He Heng heheng He Hui hehui He Jia hejia He Jian hejian He Jiang hejiang He Jie hejie He Jin hejin He Juan hejuan He Kai hekai He Kan hekan He Kong hekong He La hela He Le hele He Leng heleng He Li heli He Lian helian He Lie helie He Mu hemu He Niang heniang He Quan hequan He Ran heran He Sha hesha He Shan heshan He Shi heshi He Si hesi He Song hesong He Xiao hexiao He Yao heyao He Yi heyi He Yin heyin He Yu heyu He Yun heyun He Zeng hezeng He Zeng hezhan He Zhang hezhangxxxx He Zhe hezhe He Zheng hezheng He Zhi hezhi He Zhong hezhong He Zhuang hezhuang Li An lian Li Biao libiao Li Bin libin Li Bo libo Li Cheng licheng Li Chi lichi Li Chong lichong Li Chuang lichuang Li Chun lichun Li Da lida Li Deng lideng Li Di lidi Li Die lidie Li Ding liding Li Dong lidong Li Duo liduo Li Fa lifa Li Fang lifang Li Fen lifen Li Feng lifeng Li Gang ligang Li Gao ligao Li Guan liguan Li Guang liguang Li Hai lihai Li Ka lika Li Kai likai Li La lila Li Le lile Li Lei lilei Li Lin lilin Li Ling liling Li Liu liliu Li Long lilong Li Man liman Li Mei limei Li Mu limu Li Neng lineng Li Niang liniang Li Peng lipeng Li Pian lipian Li Qian liqian Li Qu liqu Li Rang lirang Li Ren liren Li Ru liru Li Sha lisha Li Shi lishi Li Shuai lishuai Li Shun lishun Li Si lisi Li Song lisong Li Tao litao Li Teng liteng Li Tian litian Li Ting liting Li Wang liwang Li Wei liwei Li Wen liwen Li Xiang lixiang Li Xing lixing Li Xiu lixiu Li Ying liying Li You liyou Li Ze lize Li Zeng lizeng Li Zheng lizheng Li Zhong lizhong Li Zhu lizhu Li Zhuang lizhuang Li 
Zhuo lizhuo Liang Min liangmin Liang Ming liangming Liang Qiang liangqiang Liang Rui liangrui Lin Chen linchen Lin Cheng lincheng Lin He linhe Lin Hua linhua Lin Huang linhuang Lin Neng linneng Lin Pian linpian Lin Qu linqu Lin Ru linru Lin Zhang linzhang Liu Bin liubin Liu Duo liuduo Liu Fang liufang Liu Han liuhan Liu Hao liuhao Liu Heng liuheng Liu Hong liuhong Liu Hui liuhui Liu Jia liujia Liu Jiang liujiang Liu Jiao liujiao Liu Ju liuju Liu Juan liujuan Liu Kai liukai Liu Kan liukan Liu Kang liukang Liu Ke liuke Liu Kong liukong Liu Lang liulang Liu Long liulong Liu Mu liumu Liu Nuo liunuo Liu Qin liuqin Liu Qing liuqing Liu Qiong liuqiong Liu Rong liurong Liu Sen liusen Liu Sha liusha Liu Shun liushun Liu Si liusi Liu Tian liutian Liu Wang liuwang Liu Wei liuwei Liu Xia liuxia Liu Xiu liuxiu Liu Yao liuyao Liu Yi liuyi Liu Ying liuying Liu Yu liuyu Liu Yuan liuyuan Liu Yun liuyun Liu Zhen liuzhen Liu Zheng liuzheng Liu Zhi liuzhi Liu Zun liuzun Lou Liu luoliu Lu Huang lihuang Luo Chang luochuang Luo Chen luochen Luo Cheng luocheng Luo Deng luochi Luo Deng luodeng Luo Di luodi Luo Dian luodian Luo Gao luogao Luo Guai luoguai Luo Hang luohuang Luo Hua luohua Luo Lie luolie Luo Neng luoneng Luo Pian luopian Luo Qi luoqi Luo Qin luoqin Luo Qing luoqing Luo Qu luoqu Luo Rong luorong Luo Ru luoru Luo Rui luorui Luo Shuang luoshuang Luo Ting luoting Luo Tong luotong Luo Wang luowang Luo Wei luowei Luo Yang luoyang Luo Ze luoze Song Chen songchen Song Cheng songcheng Song Chuang songchuang Song Da songda Song Deng songdeng Song Dian songdian Song Die songdie Song Fei songfei Song Fen songfen Song Gang songgang Song Gao songgao Song Guai songguai Song Guan songguan Song Guo songguo Song Hai songhai Song Han songhan Song Hang songhang Song He songhe Song Hei songhei Song Heng songheng Song Hu songhu Song Hua songhua Song Jia songjia Song Jiao songjiao Song Jie songjie Song Jin songjin Song Jing songjing Song Ka songka Song Kan songkan Song Kang songkang Song Kong 
songkong Song Lan songlan Song Le songle Song Lei songlei Song Lian songlian Song Liang songliang Song Liang songliao Song Liang songliang Song Liao songliao Song Lin songlin Song Liu songliu Song Meng songmeng Song Ming songming Song Mu songmu Song Nan songnan Song Neng songneng Song Ning songning Song Pian songpian Song Pin songpin Song Qi songqi Song Qiang songqiang Song Qing songqing Song Qiu songqiu Song Ran songran Song Rong songrong Song Rui songrui Song Sha songsha Song Shuai songshuai Song Shuang songshuang Song Song songsong Song Song Jun songsongjun Song Tao songtao Song Teng songteng Song Wang songwang Song Wei songwei Song Xi songxi Song Xia songxia Song Xiu songxiu Song Ya songya Song Yang songyang Song Yong songyong Song You songyou Song Yuan songyuan Song Yue songyue Song Yun songyun Song Zhe songzhe Song Zhen songzhen Song Zheng songzheng Song Zhuang songzhuang Tan Qian tangqian Tang Bing tangbing Tang Chi tangchi Tang Chong tangchong Tang Chuang tangchuang Tang Cong tangcong Tang Di tangdi Tang Dian tangdian Tang Duo tangduo Tang Fa tangfa Tang Fan tangfan Tang Fang tangfang Tang Fei tangfei Tang Fen tangfen Tang Feng tangfeng Tang Gang tanggang Tang Guai tangguai Tang Guan tangguan Tang Guang tangguang Tang Guo tangguo Tang Han tanghan Tang Hao tanghao Tang Hei tanghei Tang Heng tangheng Tang Hong tanghong Tang Hu tanghu Tang Hui tanghui Tang Jie tangjie Tang Jin tangjin Tang Jing tangjing Tang Ju tangju Tang Ka tangka Tang Kai tangkai Tang Kan tangkan Tang Kang tangkang Tang Ke tangke Tang Kong tangkong Tang La tangla Tang Lang tanglang Tang Le tangle Tang Leng tangleng Tang Li tangli Tang Lian tanglian Tang Lie tanglie Tang Lin tanglin Tang Ling tangling Tang Liu tangliu Tang Long tanglong Tang Mei tangmei Tang Mo tangmo Tang Mu tangmu Tang Neng tangneng Tang Niang tangniang Tang Nuo tangnuo Tang Peng tangpeng Tang Pian tangpian Tang Ping tangping Tang Qian tangqian Tang Qin tangqin Tang Qu tangqu Tang Quan tangquan Tang Quing tangqing Tang 
Rang tangrang Tang Ren tangren Tang Ru tangru Tang Ruan tangruan Tang Rui tangrui Tang Sen tangsen Tang Sha tangsha Tang Shan tangshan Tang Shi tangshi Tang Shun tangshun Tang Song tangsong Tang Tang Jun tangtangjun Tang Tao tangtao Tang Tian tangtian Tang Tian tangyan Tang Wei tangwei Tang Xi tangxi Tang Xia tangxia Tang Xing tangxing Tang Xiong tangxiong Tang Yan tangyan Tang Yang tangyang Tang Yao tangyao Tang Yi tangyi Tang Ying tangying Tang Yong tangyong Tang You tangyou Tang Yue tangyue Tang Yun tangyun Tang Ze tangze Tang Zeng tangzeng Tang Zhang tangzhang Tang Zhe tangzhe Tang Zhen tangzhen Tang Zun tangzun Xie An xiean Xie Bin xiebin Xie Bo xiebo Xie Chao xiechao Xie Cong xiecong Xie Da xieda Xie Di xiedi Xie Dian xiedian Xie Die xiedie Xie Ding xieding Xie Dong xiedong Xie Duo xieduo Xie Fang xiefang Xie Fei xiefei Xie Feng xiefeng Xie Gang xiegang Xie Gao xiegao Xie Guai xieguai Xie Guan xieguan Xie Hai xiehai Xie Hang xiehang Xie Heng xieheng Xie Heng xieneng Xie Heng xieheng Xie Heng xieneng Xie Hong xiehong Xie Hu xiehu Xie Hui xiehui Xie Jia xiejia Xie Jian xiejian Xie Jiang xiejiang Xie Jiao xiejiao Xie Jie xiejie Xie Jing xiejing Xie Ju xieju Xie Kai xiekai Xie La xiela Xie Leng xieleng Xie Liang xieliang Xie Lie xielie Xie Lin xielin Xie Ling xieling Xie Long xielong Xie Man xieman Xie Meng xiemeng Xie Min xiemin Xie Ming xieming Xie Na xiena Xie Niang xieniang Xie Peng xiepeng Xie Pian xiepian Xie Pin xiepin Xie Qi xieqi Xie Qing xieqing Xie Qiong xieqiong Xie Qiu xieqiu Xie Qu xiequ Xie Quan xiequan Xie Ran xieran Xie Ruan xieruan Xie Rui xierui Xie Sha xiesha Xie Shuang xieshuang Xie Si xiesi Xie Tao xietao Xie Ting xieting Xie Tong xietong Xie Wei xiewei Xie Wen xiewen Xie Xi xiexi Xie Xiang xiexiang Xie Xin xiexin Xie Xing xiexing Xie Xiu xiexiu Xie Ya xieya Xie Yi xieyi Xie Yin xieyin Xie Ying xieying Xie Yong xieyong Xie Yu xieyu Xie Yue xieyue Xie Zeng xiezeng Xie Zhan xiezhan Xie Zhang xiezhang Xie Zhe xiezhe Xie Zhuo xiezhuo Zheng Nan 
zhengnan

That list of some 493 names is up to date as of this writing, 2016-08-23 early evening CEST. A few more turn up with the bursts of activity we have seen every day since June 19th, 2016.

A possibly more up to date list is available here. That's a .csv file; if that sounds unfamiliar, think of it as a platform-neutral text representation (to wit, "Comma Separated Values") of a spreadsheet or database -- take a peek with Notepad.exe or similar if you're not sure. I'll be updating that second list, along with other related data, at quasi-random intervals as time allows and as long as interesting entries keep turning up in my logs.

If your name or username is on either of those lists, you would be well advised to change your passwords right now and to check breach notification sites such as Troy Hunt's haveibeenpwned.com or breachalarm.com for clues to where your accounts could have been compromised.

That's your scoop for now. If you're interested in some more background and data, keep reading.

If you are a regular or returning reader of this column, you are most likely aware that I am a Unix sysadmin. In addition to operating and maintaining various systems in my employers' care, I run a small set of servers of my own that run a few Internet-facing services for myself and a small circle of friends and family.

For the most part those systems are roundly ignored by the world at large, but when they are not, funny, bizarre or interesting things happen. And mundane activities like these sometimes have interesting byproducts. When you run a mail service, you are bound to find a way to handle the spam people will try to send, and about ten years ago I started publishing a blacklist of known spamming hosts, generated from attempts to deliver mail to a slowly expanding list of known bad, invalid, never to be deliverable addresses in the domains we handle mail for.

After a while, I discovered that the list of spamtrap addresses (once again, invalid and destined never to be deliverable, ever) had been hilariously repurposed: The local parts (the string before the @ or 'at sign') started turning up as usernames in failed attempts to log on to our pop3 mail retrieval service. That was enough fun to watch that I wrote that article, and for reasons known only to the operators of the machines at the other end, those attempts have never stopped entirely.

These attempts to log in as our imaginary friends are a strong contender for the most bizarre and useless activity ever, but once those attempts were no longer news, there was nothing to write about. The spamtrap login attempts make up a sort of background noise in the authentication logs, and whenever there is an attempt to log in as a valid user from somewhere that user clearly is not, the result is usually that an entire network (whatever I can figure out from whois output) is blocked from any communication with our site for 24 hours.

There are of course also attempts to log in as postmaster, webmaster and other IDs, some of them RFC-mandated, that most sites, including this one, handle as aliases; these make up the rest of the background noise.

Then recently, something new happened. The first burst looked like this in my logs (times given in local timezone, CEST at the time):

Jun 19 06:14:58 skapet spop3d[37601]: authentication failed: no such user: lilei - 59.54.197.34
Jun 19 06:15:01 skapet spop3d[46539]: authentication failed: no such user: lilei - 59.54.197.34
Jun 19 06:15:03 skapet spop3d[8180]: authentication failed: no such user: lilei - 59.54.197.34

-- and so on, for a total of 78 attempts to log in as the non-existing user lilei, in the space of about five minutes. A little later, a similar burst of activity came for the user name lika:

Jun 19 14:11:30 skapet spop3d[68573]: authentication failed: no such user: lika - 182.87.253.48
Jun 19 14:12:22 skapet spop3d[22421]: authentication failed: no such user: lika - 182.87.253.28
Jun 19 14:12:26 skapet spop3d[7587]: authentication failed: no such user: lika - 182.87.253.28
Jun 19 14:12:30 skapet spop3d[16753]: authentication failed: no such user: lika - 182.87.253.28

and so on, for a total of 76 attempts. Over the next few days I noticed an uptick in failed pop3 access attempts that were not for valid users and did not match any entry on our spamtraps list. Still, those attempts were for users that do not exist and would produce no useful result, so I did not do much about them.

It was only during the early weeks of July that it struck me that the user name attempted here

Jul  8 12:19:08 skapet spop3d[54818]: authentication failed: no such user: lixing - 49.87.78.12
Jul  8 12:19:28 skapet spop3d[1987]: authentication failed: no such user: lixing - 49.87.78.12
Jul  8 12:19:37 skapet spop3d[70622]: authentication failed: no such user: lixing - 49.87.78.12
Jul  8 12:19:49 skapet spop3d[31208]: authentication failed: no such user: lixing - 49.87.78.12

(a total of 54 attempts for that user name) might actually be based on the name of a Chinese person. "Li Xing" sounded plausible enough as a real person's name. It's perhaps worth noting that at the time I had just finished reading the first two volumes of Cixin Liu's The Three-Body Problem trilogy, so I was a bit more in tune than usual with what could be plausible Chinese names. (And yes, the books are very much to my taste, and I have the as yet unpublished translation of the third volume on pre-order.)

Unsurprisingly, a quick whois lookup revealed that the machines that tried reading the hypothetical person Li Xing's mail all had IP addresses that belonged to Chinese networks.

Once I realized I might be on to a new pattern, I went back over a few days' worth of failed pop3 login attempts and found more than a handful of usernames that looked like they could be based on Chinese names. Checking the whois data for the IP addresses in those attempts, all turned out to be from Chinese networks.

That was in itself an interesting realization, but a small, random sample does not make for proof. In order to establish an actual data set, it was back to collecting data and analysing the content.

First, collect all log data on failed pop3 attempts for a long enough period that we have a reasonable baseline and can distinguish between the background noise and new, exciting developments.

The file bigauthlog is that collection of data. Digging through my archives going back in time, I stopped at January 16, 2016 for no other reason than this would be roughly six months' worth of data, probably enough to give a reasonable baseline and to spot anomalies.

If you've read the previous columns, you will be familiar with the scripts that produce various text and CSV reports from log data input: a text report of user names by number of access attempts, a CSV dump of the same with first and last spotted dates, a text report of hosts attempting access sorted by number of attempts, and a CSV dump of the same with first and last seen dates as for the user names.

But what I wanted to see was where the login attempts were coming from for which usernames, so I started extracting the unique host to username mappings. For each entry in this CSV file, there is a host and a user name it has tried at least once (if you import that somewhere, make sure you mark the Username column as text -- LibreOffice Calc at least becomes confused when trying to parse some of those strings). The data also records whether that particular username was part of the spamtrap database at the time. If you want to do that particular check on your own greytrapping database, any matching output from

\$ doas spamdb | grep -i username@

on your greytrapper box will mean it is in your list. And then finally for each entry there is the expected extract from available whois info: network address range, the network name and the country.
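For the curious, the first extraction step can be sketched in a few lines of Python. The log format is the one shown in the bursts earlier in this article; the regular expression and field layout are assumptions based on those samples:

```python
import re

# Matches lines like:
#   Jun 19 06:14:58 skapet spop3d[37601]: authentication failed: no such user: lilei - 59.54.197.34
FAILED = re.compile(r'no such user: (\S+) - (\S+)')

def host_user_pairs(lines):
    """Return the sorted unique (host, username) pairs from failed pop3 logins."""
    pairs = set()
    for line in lines:
        m = FAILED.search(line)
        if m:
            user, host = m.group(1), m.group(2)
            pairs.add((host, user))
    return sorted(pairs)

sample = [
    'Jun 19 06:14:58 skapet spop3d[37601]: authentication failed: no such user: lilei - 59.54.197.34',
    'Jun 19 06:15:01 skapet spop3d[46539]: authentication failed: no such user: lilei - 59.54.197.34',
    'Jun 19 14:11:30 skapet spop3d[68573]: authentication failed: no such user: lika - 182.87.253.48',
]
for host, user in host_user_pairs(sample):
    print('%s,%s' % (host, user))
```

Joining each host against whois output is a separate step; the point here is only getting the unique pairs.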

The most useful thing to do with that little database is to play with sorting on various fields and field combinations. If you sort on the "In spamtraps" field, the supposed Chinese names turn up with "No"s, along with a few more random-seeming combinations.

While I was building the data set I decided to add those new usernames with @bsdly.net appended to the spamtraps, and this is what finally pushed the number of spamtraps past the 30,000 mark.

Just browsing the data, or perhaps sorting by IP address, will show you that the pop3 gropers are spread across a large number of networks in a number of countries and territories, with numbers roughly in proportion to the size of each country or territory's economy. Some, such as a particular Mexican ISP and cable TV operator, stand out as slightly over-represented, and as expected, networks in the US and China account for a large share of the total.

If you sort on the In spamtraps field, you will see that a large number of the entries that were not in the spamtraps are the ones identified as Chinese personal names, but not all. Some of the No entries are the RFC-mandated mailboxes, some are aliases that are in use here for other reasons, and finally there are more than a handful that would fit the general description of the rest of the spamtraps: strings superficially resembling personal names, or simply random strings. These may be potential spamtraps I missed while fishing spamtrap candidates out of logfiles at some point during the decade of weirdness that has gone into maintaining the spamtraps list.

But if you sort the data primarily on the fields Name, Country and, if you like, IP address and User name, you will see that, as anticipated, the attempts on Chinese-sounding user names come exclusively from Chinese networks, with the sole exception of the "Fa Dum" (fadum) user, which appears to have been attempted only twice (on June 6th) from an IP address registered in the USA and may very well be a misclassification on my part. That particular sorting, with duplicates removed, is the origin of the list of names and usernames given earlier in this article and of this CSV file.

Now that we have established that the attempts at Chinese user names come exclusively from Chinese networks, the next questions become: Who are the cyber criminals behind this activity, and what are their motivations? And why are they bothering with hosts in faraway Europe to begin with?

For the first question, it is hard to tell from this perch, but whoever runs those attempts apparently has the run of large swathes of network real estate and seems to take no special care to avoid detection, other than of course distributing the attempts widely across the network ranges and coming in only in short bursts.

So are those attempts by, let us say the public sector, to steal political dissidents' email? Or perhaps, still with a public sector slant, simply hunting for any and all overseas assets belonging to Chinese nationals? Or are we simply seeing the activities of Chinese private sector cyber criminals who are trying out likely user names wherever they can find a service that listens?

Any or all of these things could be true, but in any case it's not unlikely that what we are seeing is somebody trying to find new places where username and password combinations from a recent breach might work. After all, username and password combinations that have been verified to work somewhere are likely worth more on the market than unverified ones.

Looking at the log entries, there are sequences there that could plausibly have been produced by humans typing at keyboards. Imagine if you please vast, badly lit and insufficiently ventilated Asian cyber-sweatshops, but I would not be too surprised to find that this is actually a highly automated operation, with timing tuned to avoid detection.

Security professionals have been recommending that people stop using the pop3 protocol for as long as I can remember, but typing "pop3" into shodan.io still produces a whopping 684,291 results, meaning that the pop3 service is nowhere near as extinct as some would have preferred.

The large number of possible targets is a likely explanation for the burstiness of the activity we are seeing: with that many hosts to cover, the groping hosts will need to set up some sort of rotation, and in addition there is the need to stay below some volume of traffic per host in order to avoid detection. This means that what any one site sees is only a very small part of the total activity. The pop3 hunt for Chinese users is most likely not exclusive to the fjord country.

If you run a pop3 service, please do yourself a favor and check your setup for any weaknesses including any not yet applied updates, as you were about to do anyway. Once you've done that, take some moments to browse your logs for strange looking login attempts.

If you find something similar to what I've reported here, I would like to hear from you. Please note that at least one of the pop3 daemons out there by default does not report the username for failed authentication attempts, but notes that the username was unknown instead. Anyway, your war stories will be appreciated in email or comments.

And finally, if you have information on one or more breaches that may have been the source of this list of likely Chinese user names, I'd like to hear from you too.

Good night and good luck.

I would like to thank Tore Nordstrand and Øystein Alsaker for valuable input on various aspects of this article.

The data referenced in this article will likely be updated on a roughly daily basis while the Chinese episode lasts. You can fetch them from the links in the article or from this directory, which also contains some trivial data extraction and data massaging scripts I use. If you find any errors or have any concerns, please let me know.

## August 06, 2016

### pagetable

#### How Amica Paint protected tampering with its credits

In mid-1990, the floppy disk of special issue 55 of the German Commodore 64 magazine "64'er" contained the "Amica Paint" graphics program – which was broken beyond usefulness. I'll describe what went wrong.

"Amica Paint" was developed by Oliver Stiller and first published in 64'er special issue 27 in 1988, as a type-in program that filled 25 pages of the magazine.

Two years later, Amica Paint was published again in special issue 55, which this time came with a floppy disk. But this version was completely broken: Just drawing a simple line would cause severe glitches.

64'er issue 9/1990 published an erratum to fix Amica Paint, which described three ways (BASIC script, asm monitor and disk monitor) to patch 7 bytes in one of the executable files:

``````--- a/a.paint c000.txt
+++ b/a.paint c000.txt
@@ -67,8 +67,8 @@
00000420  a5 19 85 ef a5 1a 85 f0  4c 29 c4 20 11 c8 20 6c  |........L). .. l|
00000430  c8 08 20 ed c7 28 90 f6  60 01 38 00 20 20 41 4d  |.. ..(..`.8.  AM|
00000440  49 43 41 20 50 41 49 4e  54 20 56 31 2e 34 20 20  |ICA PAINT V1.4  |
-00000450  01 38 03 20 20 20 4f 2e  53 54 49 4c 4c 45 52 20  |.8.   O.STILLER |
-00000460  31 39 39 30 20 20 20 01  31 00 58 3d 30 30 30 20  |1990   .1.X=000 |
+00000450  01 38 03 42 59 20 4f 2e  53 54 49 4c 4c 45 52 20  |.8.BY O.STILLER |
+00000460  31 39 38 36 2f 38 37 01  31 00 58 3d 30 30 30 20  |1986/87.1.X=000 |
00000470  59 3d 30 30 30 20 20 20  20 20 20 20 20 20 00 01  |Y=000         ..|
00000480  31 00 42 49 54 54 45 20  57 41 52 54 45 4e 20 2e  |1.BITTE WARTEN .|
00000490  2e 2e 20 20 20 20 00 64  0a 01 00 53 43 48 57 41  |..    .d...SCHWA|``````

This changes the credits message from "O.STILLER 1990" to "BY O.STILLER 1986/87" – which is the original message from the previous publication.

64'er magazine had published the exact same application without any updates, but binary-patched the credits message from "1986/87" to "1990", and unfortunately for them, Amica Paint contained code to detect exactly this kind of tampering:

``````.,C5F5  A0 14       LDY #\$14     ; check 20 bytes
.,C5F7  A9 00       LDA #\$00     ; init checksum with 0
.,C5F9  18          CLC
.,C5FA  88          DEY
.,C5FB  79 ?? ??    ADC \$????,Y  ; add message byte (operand bytes lost from this listing)
.,C5FE  88          DEY
.,C5FF  18          CLC
.,C600  10 F9       BPL \$C5FB    ; loop
.,C602  EE FD C5    INC \$C5FD
.,C605  C9 ED       CMP #\$ED     ; checksum should be \$ED
.,C607  F0 05       BEQ \$C60E
.,C609  A9 A9       LDA #\$A9
.,C60B  8D E4 C7    STA \$C7E4    ; otherwise sabotage line drawing
.,C60E  60          RTS``````

The code checksums the message "BY O.STILLER 1986/87". If the checksum does not match, the code will overwrite an instruction in the following code:

``````.,C7DC  65 EC       ADC \$EC
.,C7DE  85 EC       STA \$EC
.,C7E0  90 02       BCC \$C7E4
.,C7E2  E6 ED       INC \$ED
.,C7E4  A4 DD       LDY \$DD
.,C7E6  60          RTS``````

The "LDY \$DD" instruction at \$C7E4 will be overwritten with "LDA #\$DD", which will cause the glitches in line drawing.

The proper fix would have been to change the comparison with \$ED into a comparison with \$4F, the checksum of the updated message – a single byte fix. But instead of properly debugging the issue, 64'er magazine published a patch to restore the original message, practically admitting that they had cheated by implying the re-release was not the exact same software.
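Both constants quoted above are consistent with a plain sum of the 20 ASCII bytes of each credits string mod 256. A quick Python check (the space padding around the 1990 message is taken from the hex dump above):

```python
def credits_checksum(msg):
    # Sum the 20 credit bytes; carries are dropped, so this is mod 256.
    assert len(msg) == 20
    return sum(msg.encode('ascii')) % 256

print(hex(credits_checksum('BY O.STILLER 1986/87')))  # 0xed: the original constant
print(hex(credits_checksum('   O.STILLER 1990   ')))  # 0x4f: what the fix would compare against
```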

## August 05, 2016

### Steve Kemp's Blog

Recently somebody reported that my console-based mail-client was segfaulting when opening an IMAP folder, and then when they tried with a local Maildir-hierarchy the same fault was observed.

I couldn't reproduce the problem at all: neither my development host (read "my personal desktop") nor my mail-host had been crashing, and both have been in use to read my email for several months.

Debugging crashes with no backtrace, or real hint of where to start, is a challenge. Even when downloading the same Maildir samples I couldn't see a problem. It was only when I decided to see if I could add some more diagnostics to my code that I came across a solution.

My intention was to make it easier to receive a backtrace, by adding more compiler options:

``````  -fsanitize=address -fno-omit-frame-pointer
``````

I added those options and my mail-client immediately started to segfault on my own machine(s), almost as soon as it started. Ultimately I found three pieces of code where I was allocating C++ objects and passing them to the Lua stack, a pretty fundamental part of the code, which were buggy. Once I'd tracked down the areas of code that were broken and fixed them the user was happy, and I was happy too.

It's interesting that I've been running for over a year with these bogus things in place, which "just happened" not to crash for me or anybody else. In the future I'll be adding these options to more of my C-based projects, as there seems to be virtually no downside.

In related news my console editor has now achieved almost everything I want it to, having gained:

• Syntax highlighting via Lua + LPEG
• Support for TAB completion of Lua-code and filenames.
• Bookmark support.
• Support for setting the mark and copying/cutting regions.

The only outstanding feature, which is a biggie, is support for Undo, which I need to add.

Happily no segfaults here, so far..

## August 04, 2016

#### Open compressed file with gzip zcat perl php lua python

I have a compressed file of:

``````
250.000.000 lines
Compressed the file size is: 671M
Uncompressed, it's: 6,5G
``````

Need to extract a plethora of things and verify some others.

I don't want to use bash but something more elegant, like python or lua.

Looking through "The-Internet", I've created some examples for the single purpose of educating myself.

So here are my results.
BE AWARE they are far-far-far away from perfect in code or execution.

Sorted by execution time (fastest first):

## pigz

pigz - Parallel gzip - Zlib

``````

# time pigz  -p4 -cd  2016-08-04-06.ldif.gz &> /dev/null

real	0m9.980s
user	0m16.570s
sys	0m0.980s

``````

## gzip

gzip 1.8

``````

# time /bin/gzip -cd 2016-08-04-06.ldif.gz &> /dev/null

real	0m23.951s
user	0m23.790s
sys	0m0.150s
``````

## zcat

zcat (gzip) 1.8

``````

# time zcat 2016-08-04-06.ldif.gz &> /dev/null

real	0m24.202s
user	0m24.100s
sys	0m0.090s

``````

## Perl

Perl v5.24.0

code:

``````

#!/usr/bin/perl

open (FILE, '/bin/gzip -cd 2016-08-04-06.ldif.gz |');

while (my \$line = <FILE>) {
    print \$line;
}

close FILE;
``````

time:

``````
# time ./dump.pl &> /dev/null

real	0m49.942s
user	1m14.260s
sys	0m2.350s

``````

## PHP

PHP 7.0.9 (cli)

code:

``````
#!/usr/bin/php

<?php

\$fp = gzopen("2016-08-04-06.ldif.gz", "r");

while ((\$buffer = fgets(\$fp, 4096)) !== false) {
    echo \$buffer;
}

gzclose(\$fp);

?>

``````

time:

``````
# time php -f dump.php &> /dev/null

real	1m19.407s
user	1m4.840s
sys	0m14.340s

``````

## PHP - Iteration #2

PHP 7.0.9 (cli)

Impressed with php results, I took the perl-approach on code:

``````

<?php

\$fp = popen("/bin/gzip -cd 2016-08-04-06.ldif.gz", "r");

while ((\$buffer = fgets(\$fp, 4096)) !== false) {
    echo \$buffer;
}

pclose(\$fp);

?>
``````

time:

``````
# time php -f dump2.php &> /dev/null

real	1m6.845s
user	1m15.590s
sys	0m19.940s

``````

## Lua

Lua 5.3.3

code:

``````
#!/usr/bin/lua

local gzip = require 'gzip'

local filename = "2016-08-04-06.ldif.gz"

for l in gzip.lines(filename) do
    print(l)
end
``````

time:

``````
# time ./dump.lua &> /dev/null

real	3m50.899s
user	3m35.080s
sys	0m15.780s

``````

## Lua - Iteration #2

Lua 5.3.3

I was depressed to see that php is faster than lua!!
Depressed I say !

So here is my next iteration on lua:

code:

``````
#!/usr/bin/lua

local file = assert(io.popen('/bin/gzip -cd 2016-08-04-06.ldif.gz', 'r'))

while true do
    local line = file:read("*l")
    if line == nil then break end
    print (line)
end
file:close()

``````

time:

``````
# time ./dump2.lua &> /dev/null

real	2m45.908s
user	2m54.470s
sys	0m21.360s

``````

One minute faster than before, but still too slow !!

## Lua - Zlib

Lua 5.3.3

My next iteration with lua is using zlib :

code:

``````

#!/usr/bin/lua

local zlib = require 'zlib'
local filename = "2016-08-04-06.ldif.gz"

local block = 64
local d = zlib.inflate()

local file = assert(io.open(filename, "rb"))
while true do
    local bytes = file:read(block)
    if not bytes then break end
    print (d(bytes))
end

file:close()

``````

time:

``````

# time ./dump.lua  &> /dev/null

real	0m41.546s
user	0m40.460s
sys	0m1.080s

``````

Now, that's what I am talking about !!!

Playing with window_size (block) can make your code faster or slower.

## Python v3

Python 3.5.2

code:

``````
#!/usr/bin/python

import gzip

filename='2016-08-04-06.ldif.gz'
with gzip.open(filename, 'r') as f:
    for line in f:
        print(line,)

``````

time:

``````
# time ./dump.py &> /dev/null

real	13m14.460s
user	13m13.440s
sys	0m0.670s

``````

Not enough tissues in the whole damn world!

## Python v3 - Iteration #2

Python 3.5.2

But wait ... a moment ... The default mode for gzip.open is 'rb'.

Let's try this once more with 'rt' (read-text) mode:

code:

``````
#!/usr/bin/python

import gzip

filename='2016-08-04-06.ldif.gz'
with gzip.open(filename, 'rt') as f:
    for line in f:
        print(line, end="")

``````

time:

``````
# time ./dump.py &> /dev/null

real	5m33.098s
user	5m32.610s
sys	0m0.410s

``````

With only one super tiny change, the run time was cut in half!!!
But still too slow.

## Python v3 - Iteration #3

Python 3.5.2

Let's try a third iteration with popen this time.

code:

``````
#!/usr/bin/python

import os

cmd = "/bin/gzip -cd 2016-08-04-06.ldif.gz"
f = os.popen(cmd)
for line in f:
    print(line, end="")
f.close()
``````

time:

``````
# time ./dump2.py &> /dev/null

real	6m45.646s
user	7m13.280s
sys	0m6.470s

``````

## Python v3 - zlib Iteration #1

Python 3.5.2

Let's try a zlib iteration this time.

code:

``````

#!/usr/bin/python

import zlib

d = zlib.decompressobj(zlib.MAX_WBITS | 16)
filename='2016-08-04-06.ldif.gz'

with open(filename, 'rb') as f:
    for line in f:
        print(d.decompress(line))

``````

time:

``````
# time ./dump.zlib.py &> /dev/null

real	1m4.389s
user	1m3.440s
sys	0m0.410s

``````

finally some proper values with python !!!
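For comparison, the same zlib approach can read fixed-size chunks instead of iterating the compressed file line by line, which avoids handing the decompressor arbitrary slices of binary data; a sketch, with the block size as an assumption to tune:

```python
import zlib

def zcat_chunks(filename, block=64 * 1024):
    """Decompress a gzip file in fixed-size chunks, yielding raw bytes."""
    d = zlib.decompressobj(zlib.MAX_WBITS | 16)  # | 16: expect a gzip header
    with open(filename, 'rb') as f:
        while True:
            chunk = f.read(block)
            if not chunk:
                break
            yield d.decompress(chunk)
    yield d.flush()  # emit anything still buffered

# Example: stream the decompressed bytes to stdout.
# import sys
# for data in zcat_chunks('2016-08-04-06.ldif.gz'):
#     sys.stdout.buffer.write(data)
```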

## Specs

All tests were run on this machine:

``````
4 x Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz
8G RAM
``````

## Conclusions

Ok, I know!

The shell-pipe approach of using gzip to open the compressed file is not fair to all the above code snippets.
But ... who cares?

I need something that runs fast as hell and does smart things with those data.

## Get in touch

As I am not a developer, I know that you people know how to do these things even better!

So I would love to hear any suggestions or even criticism of the above examples.

I will update/report everything that passes the "I think I know what this code does" rule and ... be gentle with me ;)

PLZ use my email address: evaggelos [ _at_ ] balaskas [ _dot_ ] gr

to send me any suggestions

Thanks !

Tag(s): php, perl, python, lua, pigz

## August 03, 2016

#### How to dockerize a live system

I need to run some ansible playbooks on a running (live) machine.
But, of course, I can't use a production server for testing purposes!!

So here comes docker!
I have ssh access from my docker-server to this production server:

``````

[docker-server] ssh livebox tar -c / | docker import - centos6:livebox

``````

Then run the new docker image:

``````

[docker-server]  docker run -t -i --rm -p 2222:22 centos6:livebox bash

[root@40b2bab2f306 /]# /usr/sbin/sshd -D

``````

Create a new entry in your hosts inventory file that uses ssh port 2222,
or create a separate inventory file,

and test it with ansible ping module:

``````
# ansible -m ping -i hosts.docker dockerlivebox

dockerlivebox | success >> {
"changed": false,
"ping": "pong"
}

``````
Tag(s): docker

## August 01, 2016

### Simon Lyall

#### Putting Prometheus node_exporter behind apache proxy

I’ve been playing with Prometheus monitoring lately. It is fairly new software that is getting popular. Prometheus works using a pull architecture. A central server connects to each thing you want to monitor every few seconds and grabs stats from it.

In the simplest case you run the node_exporter on each machine, which gathers about 600-800 (!) metrics such as load, disk space and interface stats. This exporter listens on port 9100 and effectively works as an http server that responds to "GET /metrics HTTP/1.1" and spits out several hundred lines of:

```node_forks 7916
node_intr 3.8090539e+07
node_memory_Active 6.23935488e+08```

Other exporters listen on different ports and export stats for apache or mysql while more complicated ones will act as proxies for outgoing tests (via snmp, icmp, http). The full list of them is on the Prometheus website.
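As a side note, the exposition format above is easy to consume from a script; a simplified Python parser (it skips comment lines and assumes no spaces inside label values):

```python
def parse_metrics(text):
    """Parse Prometheus text exposition output into {metric: value}.

    Simplified: skips blank lines and '#' comment lines, and assumes
    no spaces inside label values.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        name, _, value = line.rpartition(' ')
        metrics[name] = float(value)
    return metrics

sample = """\
node_forks 7916
node_intr 3.8090539e+07
node_memory_Active 6.23935488e+08
"""
print(parse_metrics(sample)['node_forks'])  # 7916.0
```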

So my problem was that I wanted to check my virtual machine that is on Linode. The machine only has a public IP and I didn’t want to:

1. Allow random people to check my servers stats
2. Have to setup some sort of VPN.

So I decided that the best way was to just put a user/password on the exporter.

However the node_exporter does not implement authentication itself, since the authors wanted to avoid maintaining lots of security code. So I decided to put it behind a reverse proxy using apache mod_proxy.

#### Step 1 – Install node_exporter

Node_exporter is a single binary that I started via an upstart script. As part of the upstart script I told it to listen on localhost port 19100 instead of port 9100 on all interfaces.

```# cat /etc/init/prometheus_node_exporter.conf
description "Prometheus Node Exporter"

start on startup

chdir /home/prometheus/

script
  # listen only on localhost:19100 (node_exporter 0.x flag spelling)
  exec /home/prometheus/node_exporter -web.listen-address=127.0.0.1:19100
end script

```

Once I start the exporter a simple “curl 127.0.0.1:19100/metrics” makes sure it is working and returning data.

#### Step 2 – Add Apache proxy entry

First make sure apache is listening on port 9100. On Ubuntu edit the /etc/apache2/ports.conf file and add the line:

`Listen 9100`

Next create a simple apache proxy without authentication (don’t forget to enable mod_proxy too):

```# more /etc/apache2/sites-available/prometheus.conf
<VirtualHost *:9100>
ServerName prometheus

CustomLog /var/log/apache2/prometheus_access.log combined
ErrorLog /var/log/apache2/prometheus_error.log

ProxyRequests Off
<Proxy *>
Allow from all
</Proxy>

ProxyErrorOverride On
ProxyPass / http://127.0.0.1:19100/
ProxyPassReverse / http://127.0.0.1:19100/

</VirtualHost>```

This simply takes requests on port 9100 and forwards them to localhost port 19100. Now reload apache and test via curl to port 9100. You can also use netstat to see what is listening on which ports:

```Proto Recv-Q Send-Q Local Address   Foreign Address State  PID/Program name
tcp   0      0      127.0.0.1:19100 0.0.0.0:*       LISTEN 8416/node_exporter
tcp6  0      0      :::9100         :::*            LISTEN 8725/apache2```

#### Step 3 – Get Prometheus working

I’ll assume at this point you have other servers working. What you need to do now is add the following entries for you server in you prometheus.yml file.

```- job_name: 'node'

  scrape_interval: 15s

  basic_auth:
    username: prom
    password: mypassword

  static_configs:
    - targets: ['myserver.example.com:9100']
      labels:
        group: 'servers'
        alias: 'myserver'```

Now restart Prometheus and make sure it is working. You should see the following lines in your apache logs plus stats for the server should start appearing:

```10.212.62.207 - - [31/Jul/2016:11:31:38 +0000] "GET /metrics HTTP/1.1" 200 11377 "-" "Go-http-client/1.1"
10.212.62.207 - - [31/Jul/2016:11:31:53 +0000] "GET /metrics HTTP/1.1" 200 11398 "-" "Go-http-client/1.1"
10.212.62.207 - - [31/Jul/2016:11:32:08 +0000] "GET /metrics HTTP/1.1" 200 11377 "-" "Go-http-client/1.1"```

Notice that connections are 15 seconds apart, get http code 200 and are 11k in size. The Prometheus server is sending authentication, but apache doesn't require it yet.

#### Step 4 – Enable Authentication.

Now create an apache password file:

`htpasswd -cb /home/prometheus/passwd prom mypassword`

and update your apache entry to the following to enable authentication:

```# more /etc/apache2/sites-available/prometheus.conf
<VirtualHost *:9100>
ServerName prometheus

CustomLog /var/log/apache2/prometheus_access.log combined
ErrorLog /var/log/apache2/prometheus_error.log

ProxyRequests Off
<Proxy *>
Order deny,allow
Allow from all
#
AuthType Basic
AuthBasicProvider file
AuthUserFile "/home/prometheus/passwd"
Require valid-user
</Proxy>

ProxyErrorOverride On
ProxyPass / http://127.0.0.1:19100/
ProxyPassReverse / http://127.0.0.1:19100/
</VirtualHost>```

After you reload apache you should see the following:

```10.212.56.135 - prom [01/Aug/2016:04:42:08 +0000] "GET /metrics HTTP/1.1" 200 11394 "-" "Go-http-client/1.1"
10.212.56.135 - prom [01/Aug/2016:04:42:23 +0000] "GET /metrics HTTP/1.1" 200 11392 "-" "Go-http-client/1.1"
10.212.56.135 - prom [01/Aug/2016:04:42:38 +0000] "GET /metrics HTTP/1.1" 200 11391 "-" "Go-http-client/1.1"```

Note that the “prom” in field 3 indicates that we are logging in for each connection. If you try to connect to the port without authentication you will get:

```Unauthorized
This server could not verify that you
are authorized to access the document
requested. Either you supplied the wrong
credentials (e.g., bad password), or your
browser doesn't understand how to supply
the credentials required.```

That is pretty much it. Note that you will need to add additional VirtualHost entries for more ports if you run other exporters on the server.
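To test the protected endpoint from a script rather than curl, the scrape can be reproduced with Python's urllib; the hostname and credentials below are the example values used in this walkthrough:

```python
import base64
import urllib.request

def metrics_request(url, user, password):
    """Build a GET request for the proxied exporter with HTTP Basic auth."""
    req = urllib.request.Request(url)
    token = base64.b64encode(('%s:%s' % (user, password)).encode()).decode()
    req.add_header('Authorization', 'Basic ' + token)
    return req

req = metrics_request('http://myserver.example.com:9100/metrics',
                      'prom', 'mypassword')
# urllib.request.urlopen(req) would fetch the metrics page; without the
# Authorization header apache answers 401 Unauthorized.
```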

## July 29, 2016

### Slaptijack

#### TACACS Detected 'Invalid Argument'

As always, I've changed pertinent details for reasons.

I was working on an ASR the other day and received the following error:

```RP/0/RSP0/CPU0:ASR9K(config-tacacs-host)# commit
Fri Jul 29 12:55:46.243 PDT

% Failed to commit one or more configuration items during a pseudo-atomic
operation. Please issue 'show configuration failed [inheritance]' from this
session to view the errors
RP/0/RSP0/CPU0:ASR9K(config-tacacs-host)# show configuration failed
Fri Jul 29 12:55:55.421 PDT
!! SEMANTIC ERRORS: This configuration was rejected by
!! the system due to semantic errors. The individual
!! errors with each failed configuration command can be
!! found below.

tacacs-server host 10.0.0.2 port 49
!!% 'TACACS' detected the 'fatal' condition 'Invalid Argument'
!
end
```

The problem here is that the tacacs daemon thinks the configuration contains an invalid argument. It doesn't. So, restart tacacs:

```RP/0/RSP0/CPU0:ASR9K# show proc | inc tacacs
Fri Jul 29 12:56:32.376 PDT
1142   1    2  108K  16 Sigwaitinfo 7399:06:34:0893    0:00:00:0109 tacacsd
1142   2    0  108K  10 Receive     7399:06:35:0099    0:00:00:0000 tacacsd
1142   3    2  108K  10 Nanosleep      0:00:05:0940    0:00:00:0057 tacacsd
1142   4    1  108K  10 Receive     7399:06:34:0957    0:00:00:0000 tacacsd
1142   5    1  108K  10 Receive        0:00:00:0664    0:00:41:0447 tacacsd
1142   6    1  108K  10 Receive     2057:20:44:0638    0:00:44:0805 tacacsd
1142   7    2  108K  10 Receive     1167:26:53:0781    0:01:02:0991 tacacsd
1142   8    3  108K  10 Receive     1167:26:51:0567    0:01:29:0541 tacacsd
1142   9    2  108K  10 Receive      403:35:55:0206    0:01:09:0700 tacacsd
RP/0/RSP0/CPU0:ASR9K# process restart tacacsd
Fri Jul 29 12:56:54.768 PDT
RP/0/RSP0/CPU0:ASR9K# show proc | inc tacacs
Fri Jul 29 12:56:58.455 PDT
1142   1    3   64K  16 Sigwaitinfo    0:00:03:0806    0:00:00:0069 tacacsd
1142   2    1   64K  10 Receive        0:00:03:0998    0:00:00:0000 tacacsd
1142   3    3   64K  10 Nanosleep      0:00:03:0977    0:00:00:0000 tacacsd
1142   4    1   64K  10 Receive        0:00:03:0867    0:00:00:0002 tacacsd
1142   5    3   64K  10 Receive        0:00:03:0818    0:00:00:0000 tacacsd
1142   6    2   64K  16 Receive        0:00:03:0818    0:00:00:0000 tacacsd
1142   7    1   64K  16 Receive        0:00:03:0818    0:00:00:0000 tacacsd
1142   8    3   64K  16 Receive        0:00:03:0818    0:00:00:0000 tacacsd
1142   9    3   64K  10 Receive        0:00:00:0673    0:00:00:0003 tacacsd
```

And try again:

```RP/0/RSP0/CPU0:ASR9K# config t
Fri Jul 29 12:57:04.787 PDT
RP/0/RSP0/CPU0:ASR9K(config)# tacacs-server host 10.0.0.2 port 49
RP/0/RSP0/CPU0:ASR9K(config-tacacs-host)# commit
Fri Jul 29 12:57:20.627 PDT
RP/0/RSP0/CPU0:ASR9K(config-tacacs-host)#
```

## July 28, 2016

### Electricmonk.nl

#### Ansible-cmdb v1.15: Generate a host overview of Ansible facts.

I've just released ansible-cmdb v1.15. Ansible-cmdb takes the output of Ansible's fact gathering and converts it into a static HTML overview page containing system configuration information. It supports multiple templates and extending information gathered by Ansible with custom data.

This release includes the following bugfixes and feature improvements:

• Improvements to the resilience against wrong, unsupported and missing data.
• SQL template. Generates SQL for use with SQLite and MySQL.
• Minor bugfixes.

As always, packages are available for Debian, Ubuntu, Red Hat, CentOS and other systems. Get the new release from the GitHub releases page.
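If you haven't used ansible-cmdb before, the basic flow is roughly as follows; the `out/` directory name is arbitrary, and the template name passed to `-t` selects the output format:

```shell
# gather facts from all hosts into per-host JSON files
ansible -m setup --tree out/ all

# render the default HTML overview from those facts
ansible-cmdb out/ > overview.html

# or, with this release, emit SQL for SQLite/MySQL instead
ansible-cmdb -t sql out/ > cmdb.sql
```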

### R.I.Pienaar

#### A look at the Puppet 4 Application Orchestration feature

Puppet 4 got some new language constructs that let you model multi node applications, and it assists with passing information between nodes for you. I recently wrote an open source orchestrator for this stuff as part of my Choria suite, so I figured I'd write up a bit about these multi node applications since they are now usable in open source.

The basic problem this feature solves is passing details between modules. Let's say you have a LAMP stack: you're going to have web apps that need access to a DB, and that DB will have an IP, user, password etc. Exported resources never worked for this because it's just stupid to have your DB exporting web resources; there's an unlimited number of web apps and configs, and no DB module supports them all. So something new is needed.

The problem is made worse by the fact that Puppet does not really have something like a Java interface. You can't say that a foo-mysql module implements a standard interface called database so that you can swap out one mysql module for another; they're all snowflakes. So an intermediate translation layer is needed.

In a way you can say this new feature brings a way to create an interface – let's say SQL – and allows you to hook arbitrary modules into both sides of the interface: on one end a database, and on the other a webapp. Puppet will then transfer the information across the interface for you, feeding the web app with knowledge of the port, user, hosts etc.

### LAMP Stack Walkthrough

Let's walk through creating a standard definition for a multi node LAMP stack, and we'll create 2 instances of the entire stack. It will involve 4 machines sharing data and duties.

These interfaces are called capabilities; here's an example of a SQL one:

```
Puppet::Type.newtype :sql, :is_capability => true do
  newparam :name, :is_namevar => true
  newparam :user
  newparam :password
  newparam :host
  newparam :database
end
```

This is a generic interface to a database. You can imagine Postgres or MySQL etc. could all satisfy this interface; perhaps you could add a field here to convey the type of database, but let's keep it simple. The capability provides a translation layer between 2 unrelated modules.

It's a pretty big deal conceptually. I can see down the line there being some blessed official capabilities, and we'll see forge modules starting to declare their compatibility. And finally we can get to a world of interchangeable infrastructure modules.

Now I'll create a defined type to make the database for my LAMP stack app. I'm just going to stick a notify in instead of actually creating a database, to keep it easy to demo:

```
define lamp::mysql (
  $db_user,
  $db_password,
  $host     = $::hostname,
  $database = $name,
) {
  notify{"creating mysql db ${database} on ${host} for user ${db_user}": }
}
```

I need to tell Puppet this defined type exists to satisfy the producing side of the interface. There's some new language syntax to do this; it feels kind of out of place not having a logical file to stick this in, so I just put it in my lamp/manifests/mysql.pp:

```
Lamp::Mysql produces Sql {
  user     => $db_user,
  password => $db_password,
  host     => $host,
  database => $database,
}
```

Here you can see the mapping from the variables in the defined type to those in the capability above: $db_user feeds into the capability property $user, and so on.

With this in place, whether you have this lamp::mysql or one based on some other database, you can always query its properties via the standard user etc.; more on that below.
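This is where the swappability pays off: a hypothetical Postgres-backed producer could map its own parameter names onto the same capability, and no consumer would notice the difference. A sketch, with made-up parameter names:

```
# hypothetical alternative producer for the same Sql capability
define lamp::postgres (
  $pg_user,
  $pg_password,
  $host     = $::hostname,
  $database = $name,
) {
  notify{"creating postgres db ${database} on ${host} for user ${pg_user}": }
}

Lamp::Postgres produces Sql {
  user     => $pg_user,
  password => $pg_password,
  host     => $host,
  database => $database,
}
```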

So we have a database, and we want to hook a web app onto it. Again we use a defined type for this, and again it just uses notifies to show the data flow:

```
define lamp::webapp (
  $db_user,
  $db_password,
  $db_host,
  $db_name,
  $docroot = '/var/www/html'
) {
  notify{"creating web app ${name} with db ${db_user}@${db_host}/${db_name}": }
}
```

As this is the other end of the translation layer enabled by the capability, we tell Puppet that this defined type consumes a Sql capability:

```
Lamp::Webapp consumes Sql {
  db_user     => $user,
  db_password => $password,
  db_host     => $host,
  db_name     => $database,
}
```

This tells Puppet to read the value of user from the capability and stick it into db_user of the defined type. Thus we can plumb arbitrary modules found on the forge together with a translation layer between their properties!

So you have a data producer and a data consumer that communicates across a translation layer called a capability.

The final piece of the puzzle that defines our LAMP application stack is again some new language features:

```
application lamp (
  String $db_user,
  String $db_password,
  Integer $web_instances = 1
) {
  lamp::mysql { $name:
    db_user     => $db_user,
    db_password => $db_password,
    export      => Sql[$name],
  }

  range(1, $web_instances).each |$instance| {
    lamp::webapp {"${name}-${instance}":
      consume => Sql[$name],
    }
  }
}
```

Pay particular attention to the application keyword and the export and consume metaparameters here. These tell the system to feed data through the translation layer created above between these defined types.

You should kind of think of lamp::mysql and lamp::webapp as node roles; they define what an individual node will do in this stack. If I create this application and set $web_instances = 10, I will need 1 x database machine and 10 x web machines. You can cohabit some of these roles on one node, but I think that's an anti-pattern. And since these are different nodes – as in entirely different machines – the magic here is that the capability based data system will feed these variables from one node to the next without you having to create any specific data on your web instances.

Finally, analogous to a traditional node block, we now have a site block which defines a bunch of nodes and allocates resources to them.

```
site {
  lamp{'app2':
    db_user       => 'user2',
    db_password   => 'secr3t',
    web_instances => 3,
    nodes         => {
      Node['dev1.example.net'] => Lamp::Mysql['app2'],
      Node['dev2.example.net'] => Lamp::Webapp['app2-1'],
      Node['dev3.example.net'] => Lamp::Webapp['app2-2'],
      Node['dev4.example.net'] => Lamp::Webapp['app2-3']
    }
  }

  lamp{'app1':
    db_user       => 'user1',
    db_password   => 's3cret',
    web_instances => 3,
    nodes         => {
      Node['dev1.example.net'] => Lamp::Mysql['app1'],
      Node['dev2.example.net'] => Lamp::Webapp['app1-1'],
      Node['dev3.example.net'] => Lamp::Webapp['app1-2'],
      Node['dev4.example.net'] => Lamp::Webapp['app1-3']
    }
  }
}
```

Here we are creating two instances of the LAMP application stack, each with its own database and with 3 web servers assigned to the cluster.

You have to be super careful about this stuff: if I tried to put the Mysql for app1 on dev1 and the Mysql for app2 on dev2, this would basically just blow up – it would be a cyclic dependency across the nodes. You're generally best off avoiding sharing nodes across many app stacks, or if you do, you need to really think this stuff through. It's a pain.
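To make the failure mode concrete, a node allocation along these hypothetical lines gets you the cycle: dev1 needs dev2's database before it can build app2's web app, while dev2 needs dev1's database for app1's.

```
# BROKEN sketch: the two stacks depend on each other's nodes
site {
  lamp{'app1':
    db_user       => 'user1',
    db_password   => 's3cret',
    web_instances => 1,
    nodes         => {
      Node['dev1.example.net'] => Lamp::Mysql['app1'],
      Node['dev2.example.net'] => Lamp::Webapp['app1-1'],  # dev2 waits on dev1
    }
  }

  lamp{'app2':
    db_user       => 'user2',
    db_password   => 'secr3t',
    web_instances => 1,
    nodes         => {
      Node['dev2.example.net'] => Lamp::Mysql['app2'],
      Node['dev1.example.net'] => Lamp::Webapp['app2-1'],  # dev1 waits on dev2: cycle
    }
  }
}
```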

You now have this giant multi node monolith with ordering problems, not just inter-resource but inter-node too.

### Deployment

Deploying these stacks with the abilities the system provides is pretty low tech. If you take a look at the site above you can infer dependencies: first we have to run dev1.example.net, which will both produce the data needed and install the needed infrastructure, and then we can run all the web nodes in any order or even at the same time.

There's a problem though: traditionally Puppet runs every 30 minutes and gets a new catalog every 30 minutes. We can't have these nodes randomly getting catalogs in random order, since there's no giant network aware lock/ordering system. So Puppet now has a new model: nodes are supposed to run cached catalogs forever and only get a new catalog when specifically told to. You tell it to deploy this stack, and once deployed Puppet goes into a remediation cycle, fixing the stack as it is with an unchanging catalog. If you want to change code, you again have to run this entire stack in this specific order.

This is a nice improvement for release management and knowing your state, but without tooling to manage this process it’s a fail, and today that tooling is embryonic and PE only.

So Choria, which I released in Beta yesterday, provides at least some relief: it brings a manual orchestrator for these things so you can kick off an app deploy on demand. Later maybe some daemon will do this regularly; I don't know yet.

Let's take a look at Choria interacting with the above manifests; let's just show the plan:

This shows all the defined stacks in your site and groups them in terms of what can run in parallel and in what order.

Let's deploy the stack. Choria is used again; it uses MCollective to do the runs using the normal Puppet agent, and it tries to avoid humans interfering with a stack deploy by disabling and enabling Puppet at the various stages:

It has options to limit the runs to a certain node batch size so you don’t nuke your entire site at once etc.

Let's look at some of the logs and notifies:

```
07:46:53 dev1.example.net> puppet-agent[27756]: creating mysql db app2 on dev1 for user user2
07:46:53 dev1.example.net> puppet-agent[27756]: creating mysql db app1 on dev1 for user user1

07:47:57 dev4.example.net> puppet-agent[27607]: creating web app app2-3 with db user2@dev1/app2
07:47:57 dev4.example.net> puppet-agent[27607]: creating web app app1-3 with db user1@dev1/app1

07:47:58 dev2.example.net> puppet-agent[23728]: creating web app app2-1 with db user2@dev1/app2
07:47:58 dev2.example.net> puppet-agent[23728]: creating web app app1-1 with db user1@dev1/app1

07:47:58 dev3.example.net> puppet-agent[23728]: creating web app app2-2 with db user2@dev1/app2
07:47:58 dev3.example.net> puppet-agent[23728]: creating web app app1-2 with db user1@dev1/app1
```

All our data flowed nicely through the capabilities and the stack was built with the right usernames, passwords etc. The timestamps reveal that dev{2,3,4} ran concurrently thanks to MCollective.

### Conclusion

To be honest, this whole feature feels like an early tech preview and not something that should be shipped. This is basically the plumbing a user friendly feature should be written on top of, and that hasn't happened yet. You can see from the above that it's super complex – and you even have to write some pure Ruby stuff, wow.

If you wanted to figure this out from the docs, forget about it; the docs are a complete mess. I found a guide in the Learning VM which turned out to be the best resource, showing an actual complete walkthrough. This is sadly par for the course with Puppet docs these days 🙁 UPDATE: There is an official sample module here.

There are some more features here – you can make cross node monitor checks to confirm the DB is actually up before attempting to start the web server, for example, which is interesting. But implementing new checks is just such a chore. I can do it; I doubt your average user will be bothered. Just make it so we can run Nagios checks – there are thousands of these already written and we all have them and trust them. To be fair, I could probably write a generic Nagios checker myself for this; I doubt the average user can.

The way nodes depend on each other and are ordered is of course obvious; it should be this way, and these are actual dependencies. But at the same time, this is stages done large. Stages failed because they layer a giant meta dependency over your catalog, and a failure in any one stage results in skipping entire other, later, stages. They're a pain in the arse, hard to debug and hard to reason about. This feature implements the exact same model, but across nodes. Worse, there does not seem to be a way to do cross node notifies of resources. It's just as horrible.

That said, given how this works as a graph across nodes, it's the only actual option. That outcome should have been enough to dissuade the graph approach from even being taken, and something new should have been done instead – alas. It's a very constrained system; it demos well, but building infrastructure with this is going to be a losing battle.

The site manifest has no data capabilities: you can't really use hiera/lookup there in any sane way. This is unbelievable. I know there was a general lack of caring for external data at Puppet, but this is like being back in the Puppet 0.22 days before even extlookup existed, and about as usable. It's also unbelievable that there are no features for programmatic node assignment to roles, for example – though given how easy it is to make cycles and impossible scenarios, I can see why. I know this is something being worked on though. External data is first class. External data modelling has to inform everything you do. No serious user uses Puppet without external data. It has to be a day 1 concern.

The underlying Puppet Server infrastructure that builds these catalogs is OK, I guess, but the catalog is very hard to consume, and everyone who wants to write a thing to interact with it will have to write some terrible sorting/ordering logic themselves – and probably all have their own set of interesting bugs. Hopefully one day there will be a gem or something, or just a better catalog format. Worse, it seems to happily compile and issue cyclic graphs without error; I filed a ticket for that.

The biggest problem for me is that this sits in the worst place of intersection between PE and OSS Puppet: it is hard or impossible to find out the roadmap or the activity on this feature set, since it's all behind private Jira tickets. Sometimes some bubble up and become public, but generally it's a black hole.

Long story short, I think it's best avoided in general until it becomes more mature and it's more visible what is happening. The technical issues are fine – it's a new feature, basically new R&D, and this stuff happens. The softer issues make it a tough one to consider using.

### Cryptography Engineering

#### Statement on DMCA lawsuit

My name is Matthew Green. I am a professor of computer science and a researcher at Johns Hopkins University in Baltimore. I focus on computer security and applied cryptography.

Today I filed a lawsuit against the U.S. government, to strike down Section 1201 of the Digital Millennium Copyright Act. This law violates my First Amendment right to gather information and speak about an urgent matter of public concern: computer security. I am asking a federal judge to strike down key parts of this law so they cannot be enforced against me or anyone else.

A large portion of my work involves building and analyzing the digital security systems that make our modern technological world possible. These include security systems like the ones that protect your phone calls, instant messages, and financial transactions – as well as more important security mechanisms that safeguard property and even human life.

I focus a significant portion of my time on understanding the security systems that have been deployed by industry. In 2005, my team found serious flaws in the automotive anti-theft systems used in millions of Ford, Toyota and Nissan vehicles. More recently, my co-authors and I uncovered flaws in the encryption that powers nearly one third of the world’s websites, including Facebook and the National Security Agency. Along with my students, I’ve identified flaws in Apple’s iMessage text messaging system that could have allowed an eavesdropper to intercept your communications. And these are just a sampling of the public research projects I’ve been involved with.

I don’t do this work because I want to be difficult. Like most security researchers, the research I do is undertaken in good faith. When I find a flaw in a security system, my first step is to call the organization responsible. Then I help to get the flaw fixed. Such independent security research is an increasingly precious commodity. For every security researcher who investigates systems in order to fix them, there are several who do the opposite – and seek to profit from the insecurity of the computer systems our society depends on.

There's a saying that no good deed goes unpunished. The person who said this must have been a security researcher. Instead of welcoming vulnerability reports, companies routinely threaten good-faith security researchers with civil action, or even criminal prosecution. Companies use the courts to silence researchers who have embarrassing things to say about their products, or who uncover too many of those products' internal details. These attempts are all too often successful, in part because very few security researchers can afford a prolonged legal battle with a well-funded corporate legal team.

This might just be a sad story about security researchers, except for the fact that these vulnerabilities affect everyone. When security researchers are intimidated, it’s the public that pays the price. This is because real criminals don’t care about lawsuits and intimidation – and they certainly won’t bother to notify the manufacturer. If good-faith researchers aren’t allowed to find and close these holes, then someone else will find them, walk through them, and abuse them.

In the United States, one of the most significant laws that blocks security researchers is Section 1201 of the Digital Millennium Copyright Act (DMCA). This 1998 copyright law instituted a raft of restrictions aimed at preventing the “circumvention of copyright protection systems.” Section 1201 provides both criminal and civil penalties for people who bypass technological measures protecting a copyrighted work. While that description might bring to mind the copy protection systems that protect a DVD or an iTunes song, the law has also been applied to prevent users from reverse-engineering software to figure out how it works. Such reverse-engineering is a necessary part of effective security research.

Section 1201 poses a major challenge for me as a security researcher. Nearly every attempt to analyze a software-based system presents a danger of running afoul of the law. As a result, the first step in any research project that involves a commercial system is never science – it’s to call a lawyer; to ask my graduate students to sign a legal retainer; and to inform them that even with the best legal advice, they still face the possibility of being sued and losing everything they have. This fear chills critical security research.

Section 1201 also affects the way that my research is conducted. In a recent project – conducted in Fall 2015 – we were forced to avoid reverse-engineering a piece of software when it would have been the fastest and most accurate way to answer a research question. Instead, we decided to treat the system as a black box, recovering its operation only by observing inputs and outputs. This approach often leads to a less perfect understanding of the system, which can greatly diminish the quality of security research. It also substantially increases the time and effort required to finish a project, which reduces the quantity of security research.

Finally, I have been luckier than most security researchers in that I have access to legal assistance from organizations such as the Electronic Frontier Foundation. Not every security researcher can benefit from this.

The risk imposed by Section 1201 and the heavy cost of steering clear of it discourage me – and other researchers – from pursuing any project that does not appear to have an overwhelming probability of success. This means many projects that would yield important research and protect the public simply do not happen.

In 2015, I filed a request with the Library of Congress for a special exemption that would have exempted good faith security researchers from the limitations of Section 1201. Representatives of the major automobile manufacturers and the Business Software Alliance (a software industry trade group) vigorously opposed the request. This indicates to me that even reasonable good faith security testing is still a risky proposition.

This risk is particularly acute given that the exemption we eventually won was much more limited than what we asked for, and leaves out many of the technologies with the greatest impact on public health, privacy, and the security of financial transactions.

Section 1201 has prevented crucial security research for far too long. That’s why I’m seeking a court order that would strike Section 1201 from the books as a violation of the First Amendment.

## July 27, 2016

### R.I.Pienaar

#### Fixing the mcollective deployment story

Getting started with MCollective has always been an adventure: you have to learn a ton of new stuff like middleware, and once you get that going the docs present you with a vast array of options and choices, including arcane topics like which security plugin to use – while the security model chosen is entirely unique to mcollective. To get a true feeling for the horror, see the official deployment guide.

This is not really a pleasant experience and probably results in many insecure or half-built deployments out there – and in most people just not bothering. This is of course entirely my fault; too many options with bad defaults chosen are to blame.

I saw the graph of the learning curve of Eve Online and immediately thought of mcollective 🙂 Hint: mcollective is not the WoW of orchestration tools.

I am in the process of moving my machines to Puppet 4 and the old deployment methods for MCollective just did not work; everything is falling apart under the neglect the project has been experiencing. You can't even install any plugin packages on Debian, as they will nuke your entire Puppet install.

So I figured why not take a stab at rethinking this whole thing and see what I can do. Today I'll present the outcome of that: a new Beta distribution of MCollective, tailored to the Puppet 4 AIO packaging, that's very easy to get going securely.

### Overview

My main goal with these plugins was that they share as much security infrastructure with Puppet as possible. This means we get an understandable model and do not need to mess around with custom CAs and certs and so forth. Focussing on AIO Puppet means I can have sane defaults that work for everyone out of the box with very limited config. The deployment guide should be a single short page.

For a new user who has never used MCollective and now needs certificates, there should be no need to write a crazy ~/.mcollective file and configure a ton of SSL stuff; they should only need to do:

`$ mco choria request_cert`

This will make a CSR, submit it to the PuppetCA and wait for it to be signed like Puppet Agent. Once signed they can immediately start using MCollective. No config needed. No certs to distribute. Secure by default. Works with the full AAA stack by default.
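On the CA side this looks just like signing any other agent's request. A sketch, assuming the Puppet 4 era `puppet cert` CLI and an illustrative certname (the `<username>.mcollective` naming shown here is an assumption for the example):

```shell
# on the Puppet CA: list pending CSRs, then sign the new client's cert
puppet cert list
puppet cert sign alice.mcollective
```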

Sites may wish to have tighter than default security around what actions can be made, and deploying these policies should be trivial.

### Introducing Choria

Choria is a suite of plugins developed specifically with the Puppet AIO user in mind. It rewards using Puppet as designed with defaults and can yield a near zero configuration setup. It combines with a new mcollective module used to configure AIO based MCollective.

The deployment guide for a Choria based MCollective is a single short page. The result is:

• A Security Plugin that uses the Puppet CA
• A connector for NATS
• A discovery cache that queries PuppetDB using the new PQL language
• An open source Application Orchestrator for the new Puppet Multi Node Application stuff (naming is apparently still hard)
• Puppet Agent, Package Agent, Service Agent, File Manager Agent all setup and ready to use
• SSL and TLS used everywhere, any packet that leaves a node is secure. This cannot be turned off
• A new packager that produces Puppet Modules for your agents etc. and supports every OS AIO Puppet does
• The full Authentication, Authorization and Auditing stack set up out of the box, with default secure settings
• Deployment scenarios that work by default, with extensive support for SRV records and lightweight manual configuration for those with custom needs

It's easy to configure using the new lookup system and gives you a full, secure, usable mcollective out of the box with minimal choices to make.

You can read how to deploy it in its deployment guide.

### Status

This is really a Beta release at the moment; I'm looking for testers and feedback. I am particularly interested in feedback on NATS and the basic deployment model. In future I might give the current connectors the same treatment with chosen defaults etc.

The internals of the security plugin are quite interesting: it proposes a new internal message structure for MCollective which should be much easier to support in other languages and is more formalised. To be clear, these messages always existed; they were just a bit ad hoc.

Additionally, it's the first quality security plugin with specific support for building a web stack compatible MCollective REST server that's AAA compatible and would even allow centralised RBAC and signature authority.