Planet SysAdmin


July 20, 2017

Chris Siebenmann

HTTPS is a legacy protocol

Ted Unangst recently wrote moving to https (via), in which he gave us the following, in his usual inimitable way:

On the security front, however, there may be a few things to mention. Curiously, some browsers react to the addition of encryption to a website by issuing a security warning. Yesterday, reading this page in plaintext was perfectly fine, but today, add some AES to the mix, and it’s a terrible menace, unfit for even casual viewing. But fill out the right forms and ask the right people and we can fix that, right?

(You may have trouble reading Ted's post, especially on mobile devices.)

One way to look at this situation is to observe that HTTPS today is a legacy protocol, much like every other widely deployed Internet protocol. Every widely deployed protocol, HTTPS included, is at least somewhat old (because it takes time to get widely deployed), and that means that they're all encrusted with at least some old decisions that we're now stuck with in the name of backwards compatibility. What we end up with is almost never what people would design if they were to develop these protocols from scratch today.

A modern version of HTTP(S) would probably be encrypted from the start regardless of whether the web site had a certificate, as encryption has become much more important today. This isn't just because we're in a post-Snowden world; it's also because today's Internet has become a place full of creepy ad-driven surveillance and privacy invasion, where ISPs are one of your enemies. When semi-passive eavesdroppers are demonstrably more or less everywhere, pervasive encryption is a priority for a whole bunch of people for all sorts of reasons, both for privacy and for integrity of what gets transferred.

But here on the legacy web, our only path to encryption is with HTTPS, and HTTPS comes with encryption tightly coupled to web site authentication. In theory you could split them apart by fiat with browser and web server cooperation (eg); in practice there's a chicken and egg problem with how old and new browsers interact with various sorts of sites, and how users and various sorts of middleware software expect HTTP and HTTPS links and traffic to behave. At this point there may not be any way out of the tangle of history and people's expectations. That HTTPS is a legacy protocol means that we're kind of stuck with some things that are less than ideal, including this one.

(I don't know what the initial HTTPS and SSL threat model was, but I suspect that the Netscape people involved didn't foresee anything close to the modern web environment we've wound up with.)

So in short, we're stuck with a situation where adding some AES to your website does indeed involve making it into a horrible menace unless you ask the right people. This isn't because it's particularly sensible; it's because that's how things evolved, for better or worse. We can mock the silliness of the result if we want to (although every legacy protocol has issues like this), but the only real way to do better is to destroy backwards compatibility. Some people are broadly fine with this sort of move, but a lot of people aren't, and it's very hard to pull off successfully in a diverse ecology where no single party has strong control.

(It's not useless to point out the absurdity yet again, not exactly, but it is shooting well-known fish in a barrel. This is not a new issue and, as mentioned, it's not likely that it's ever going to be fixed. But Ted Unangst does write entertaining rants.)

by cks at July 20, 2017 04:31 AM

July 19, 2017

ma.ttias.be

Podcast: The Ceremony, the birth of Zcash


This was an interesting -- and unexpected -- podcast from the folks at Radiolab.

Last November, journalist Morgen Peck showed up at her friend Molly Webster's apartment in Brooklyn, told her to take her battery out of her phone, and began to tell her about The Ceremony, a moment last fall when a group of, well, let's just call them wizards, came together in an undisclosed location to launch a new currency.

It's an undertaking that involves some of the most elaborate security and cryptography ever done (so we've been told). And math. Lots of math. It was all going great until, in the middle of it, something started to behave a little...strangely.

Source: The Ceremony -- Radiolab

The story is about the birth of Zcash & the paranoia involved in creating the initial keys to launch a new cryptocurrency.

After this episode I'm still stuck with lots of questions:

  • Who chose the location of the compute node purchase?
  • Who vetted the recording equipment?
  • If a phone got compromised during key initialization, why didn't they abort?

For a team this paranoid, it sounds like they made a couple of strange choices along the way. Or at least, it was portrayed as such in the story.

Either way, it was a fun listen and interesting to hear terms like RSA 4096, Diffie-Hellman and cryptocurrency get mentioned in an otherwise rather science-y podcast!

The post Podcast: The Ceremony, the birth of Zcash appeared first on ma.ttias.be.

by Mattias Geniar at July 19, 2017 07:37 AM

Chris Siebenmann

I've become resigned to Firefox slowly leaking memory

Over the years I've written a number of things here about how my Firefox setup seems to be fairly fragile as far as memory usage goes, in that any number of addons or other changes seem to cause it to leak memory, often rapidly. Sometimes apparently innocuous changes in addons I use, like NoScript, have been enough for a new version of the addon to make my Firefox sessions explode.

(I've actually bisected some of those changes down to relatively innocent changes and found at least one pattern in addon and even core Firefox JavaScript that seems to cause memory leaks, but I'm not sure I believe my results.)

For a long time I held out hope that if I only found the right combination of addons and options and so on, I could get my Firefox to have stable memory usage over a period of a week or more (with a fixed set of long-term browser windows open, none of which run JavaScript). But by now I've slowly and reluctantly come around to the idea that that's not going to happen. Instead, even with my best efforts I can expect Firefox's Resident Set Size to slowly grow over a few days from a starting point of around 600 to 700 MBytes, eventually crossing over the 1 GB boundary, and then I'll wind up wanting to restart it once I notice.

The good news is that Firefox performance doesn't seem to degrade drastically at this sort of memory usage. I haven't kept close track of how it feels, but it's certainly not the glaringly obvious responsiveness issues that used to happen to me. Instead I wind up restarting Firefox because it's just using too much of my RAM and I want it to use less.

(It's possible that Firefox's performance would degrade noticeably if I let it keep growing its memory usage, which of course is one reason not to.)

Restarting Firefox is not too much of a pain (I built a tool to help a while back), but it makes me vaguely unhappy despite my resignation. Software should be better than this, but apparently it isn't and I just get to live with it. Restarting Firefox feels like giving in, but not restarting Firefox is clearly just tilting at windmills.

Sidebar: The JavaScript pattern that seemed to leak

The short version is 'calling console.log() with an exception object'. The general pattern seemed to be:

try {
  [... whatever ...]
} catch (e) {
  console.log(e);
}

My theory is that this causes the browser-wide Firefox developer console to capture the exception object, which in turn presumably captures a bunch of JavaScript state, variables, and code, and means that none of them can be garbage collected the way they normally would be. Trigger such exceptions very often and there you go.

Replacing the console.log(e) with 'console.log("some-message")' seemed to usually make the prominent leaks go away. The loss of information was irrelevant; it's not as if I'm going to debug addons (or core Firefox code written in JavaScript). I never even look at the browser console.

It's possible that opening the browser console every so often and explicitly clearing it would make my memory usage drop. I may try that the next time I have a bloated-up Firefox, just to see. It's also possible that there's a magic option that causes Firefox to just throw away everything sent to console.log(), which would be fine by me.

by cks at July 19, 2017 04:21 AM

July 18, 2017

ma.ttias.be

Choose source IP with ping to force ARP refreshes


This is a little trick you can use to force outgoing traffic via one particular IP address on a server. By default, if your server has multiple IPs, it'll use the default IP address for any outgoing traffic.

However, if you're changing IP addresses or testing failovers, you might want to force traffic to leave your server as if it's coming from one of the other IP addresses. That way, upstream routers and switches learn that your server holds this IP address and can update their ARP caches.

ping allows you to do that very easily (sending ICMP traffic).

$ ping 8.8.8.8 -I 10.0.1.4

The -I parameter sets the interface or address from which packets should be sent; it accepts either an interface name (like eth1) or an IP address. This way, traffic leaves your server with source IP 10.0.1.4.

The docs describe -I like this:

-I interface address
 Set  source address to specified interface address. Argument may be numeric IP
 address or name of device. When pinging IPv6 link-local address this option is
 required.

So it works both for individual IP addresses and for interface names.
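
If you want to double-check which source address is actually being used, one quick way (my addition, not from the original post) is to watch the outgoing ICMP packets with tcpdump in another terminal; eth1 here is just an example interface name:

$ tcpdump -n -i eth1 icmp

The echo requests should then show up with 10.0.1.4 as their source address.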

The post Choose source IP with ping to force ARP refreshes appeared first on ma.ttias.be.

by Mattias Geniar at July 18, 2017 07:30 PM

Apache httpd 2.2.15-60: underscores in hostnames are now blocked


A minor update to the Apache httpd package on CentOS 6 had an unexpected consequence. The update from 2.2.15-59 to 2.2.15-60, advised because of a small security issue, started enforcing RFC 1123 and, as a result, stopped allowing underscores in hostnames.

I spent a while debugging a similar problem with IE dropping cookies on hostnames with underscores, because it turns out that's not a valid "hostname" as per the definition in RFC 1123.

Long story short, the minor update to Apache broke these kinds of URLs:

  • http://client_one.site.tld/
  • http://client_two.site.tld/page/three

These worked fine before, but now started throwing this error:

Bad Request

Your browser sent a request that this server could not understand.
Additionally, a 400 Bad Request error was encountered while trying to use an ErrorDocument to handle the request.

The error logs showed the following message:

[error] client sent HTTP/1.1 request without hostname (see RFC2616 section 14.23)

The short-term workaround was to downgrade Apache again.

$ yum downgrade httpd-tools httpd mod_ssl

And that allowed the underscores again; the long-term plan is to migrate those (sub)domains to names without underscores.

The post Apache httpd 2.2.15-60: underscores in hostnames are now blocked appeared first on ma.ttias.be.

by Mattias Geniar at July 18, 2017 07:30 PM

Chris Siebenmann

Python's complicating concept of a callable

Through Planet Python I recently wound up reading this article on Python decorators. It has the commendable and generally achieved goal of a clear, easily followed explanation of decorators, starting out by talking about how functions can return other functions and then defining decorators as:

A decorator is a function (such as foobar in the above example) that takes a function object as an argument, and returns a function object as a return value.

Some experienced Python people are now jumping up and down to say that this definition is not complete and thus not correct. To be complete and correct, you have to change function to callable. These people are correct, but at the same time this correctness creates a hassle in this sort of explanation.

In Python, it's pretty easy to understand what a function is. We have an intuitive view that pretty much matches reality; if you write 'def foobar(...):' (in module scope), you have a function. It's not so easy to inventory and understand all of the things in Python that can be callables. Can you do it? I'm not sure that I can:

  • functions, including lambdas
  • classes
  • objects if they're instances of a class with a __call__ special method
  • bound methods of an object (eg anobj.afunc)
  • methods of classes in general, especially class methods and static methods

(I don't think you can make modules callable, and in any case it would be perverse.)
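
To make that list concrete, here's a small Python sketch (mine, not the linked article's) that runs through most of the kinds of callables above:

def func(x):                   # an ordinary function
    return x + 1

lam = lambda x: x + 1          # a lambda is a function too

class Adder:
    def __init__(self, n):     # calling the class Adder itself is a call
        self.n = n
    def __call__(self, x):     # instances become callable via __call__
        return x + self.n
    def method(self, x):       # anobj.method is a callable bound method
        return x + self.n
    @classmethod
    def clsmeth(cls, x):       # class methods and static methods are
        return x + 1           # callable too
    @staticmethod
    def staticmeth(x):
        return x + 1

adder = Adder(1)               # calling a class returns an instance
bound = adder.method           # a bound method object

for c in (func, lam, Adder, adder, bound, Adder.clsmeth, Adder.staticmeth):
    print(c, callable(c))      # callable() is True for every one of these

print(Adder(41).n)             # calling the class is itself a call; prints 41
print(func(1), lam(1), adder(1), bound(1), Adder.clsmeth(1), Adder.staticmeth(1))
                               # all of these calls print 2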

Being technically correct in this sort of explanation exacts an annoying price. Rather than say 'function', you must say 'callable' and then probably try to give some sort of brief and useful explanation of just what a callable is beyond a function (which should cover at least classes and callable objects, which are the two common cases for things like decorators). This is inevitably going to complicate your writing and put at least a speed bump in the way for readers.

The generality of Python accepting callables for many things is important, but it does drag a relatively complicated concept into any number of otherwise relatively straightforward explanations of things. I don't have any particular way to square the circle here; even web-specific hacks like writing function as an <ABBR> element with its own pseudo-footnote seem kind of ugly.

(You could try to separate the idea of 'many things can be callable' from the specifics of just what they are beyond functions. I'm not sure that would work well, but I'm probably too close to the issue; it's quite possible that people who don't already know about callables would be happy with that.)

by cks at July 18, 2017 04:59 AM

July 17, 2017

Everything Sysadmin

Four Ways to Make CS and IT Curricula More Immersive

My new column in ACM Queue is entitled "Four Ways to Make CS and IT Curricula More Immersive". I rant and rail against the way that CS and IT are taught today and propose four ways CS educators can improve the situation.

The article is free to ACM members. Non-members can purchase an annual subscription for $19.99 or a single issue for $6.99 online or through the Apple or Google stores.

by Tom Limoncelli at July 17, 2017 03:00 PM

July 16, 2017

Everything Sysadmin

Still time to RSVP to NYCDevOps...

This month's NYCDevOps meetup speaker will be Martín Beauchamp talking about "Clos Networks for Datacenters". You don't want to miss this!

  • Date: Tuesday, July 18, 2017
  • Time: 6:30 PM
  • Location: Stack Overflow HQ, 110 William St, 28th floor, NY, NY

Space is limited! RSVP soon!

https://www.meetup.com/nycdevops/events/240295361/

by Tom Limoncelli at July 16, 2017 09:50 PM

July 14, 2017

Sean's IT Blog

Top 10 EUC Sessions at #VMworld 2017 Las Vegas

VMworld 2017 is just around the corner.  The premier virtualization conference will be returning to the Mandalay Bay convention center in Las Vegas at the end of August. 

There is one major addition to the EUC content at VMworld this year.  VMware has decided to move the Airwatch Connect conference, which covers VMware’s offerings in the mobility management space, from Atlanta and colocate it with VMworld.  So not only do attendees interested in EUC get great expert content on VMware’s Horizon solutions, they’ll get more content on Airwatch, mobility management, identity management, and IoT as well.

My top 10 EUC sessions for 2017 are:

  1. ADV1594BU – Beyond the Marketing: VMware Horizon 7.1 Instant Clones Deep Dive – This session, by Jim Yanik and Oswald Chen, is a technical deep dive into how Instant Clone desktops work.  This updated session will cover new features that have been added to Instant Clones since they were released in Horizon 7.  I’m often wary of “deep dive sessions,” but I’ve seen Jim give a similar presentation at various events and he does a great job talking through the Instant Clone technology in a way that all skill levels can understand it.  If you’re interested in VMware EUC, this is the one session you must attend as this technology will be relevant for years to come. 
  2. ADV1609BU – Deliver Any App, Any Desktop, Anywhere in the World Using VMware Blast Extreme – Blast Extreme is VMware’s new protocol that was officially introduced in Horizon 7.  Pat Lee and Ramu Panayappan will provide a deep dive into Blast Extreme.  Pat does a good job talking about Blast Extreme and how it works, and attendees will definitely walk away having learned something.
  3. ADV1681GU/ADV1607BU – Delivering 3D graphics desktops and applications in the real world with VMware Horizon, BEAT and NVIDIA GRID – VMware’s Kiran Rao and NVIDIA’s Luke Wignall talk about how Blast Extreme utilizes NVIDIA GPUs to provide a better user experience in End-User Computing environments.  This session was actually listed twice in the Content Catalog, so don’t worry if you miss one.
  4. ADV1583BU – Delivering Skype for Business with VMware Horizon: All You Need to Know – Official support for Skype for Business GA’d with Horizon 7.2.  This session will dive into how that the new Skype for Business plugin works to provide a better telephony experience in EUC environments.
  5. ADV3370BUS – DeX Solutions: How Samsung and VMware are Pioneering Digital Transformation – Samsung DeX is a new dock-based desktop experience for Samsung’s flagship phones: place the phone in its dock and it can use a keyboard, mouse, and monitor to act as a virtual thin client endpoint while still having all the capabilities of a phone.  DeX has the potential to revolutionize how businesses provide endpoints and cellular phones to users.
  6. ADV1655GU – CTOs perspective on the Workspace 2020 and beyond: time to act now! – End-User Computing expert and technology evangelist Ruben Spruijt talks about the future of the end-user workspace and strategies on how to implement next-generation workspace technology.
  7. UEM1359BU – Best Practices in Migrating Windows 7 to Windows 10 – Windows 10 migrations are a hot topic, and almost every business will need a Windows 10 strategy.  This session will explore the best practices for migrating to Windows 10 in any type of organization.
  8. SAAM1684GU – Ask the Experts: How to Enable Secure Access from Personal/BYO Devices and All Types of Users with Workspace ONE – How do you enable secure remote access to company resources while allowing employees, contractors, and other types of workers to use their personal devices?  This group discussion will cover best practices for using VMware Workspace ONE to provide various levels of secure access to company resources from personal devices based on various context settings.  Unlike most sessions, this is a group discussion.  There are very few slides, and most of the session time will be devoted to allowing attendees to ask questions to the discussion leaders.
  9. ADV1588BU – Architecting Horizon 7 and Horizon Apps – A successful EUC environment starts with a solid architecture.  This session covers how to architect an integrated Horizon environment consisting of all components of the Horizon Suite. 
  10. vBrownbag TechTalks on EUC – There are three community driven vBrownbag Tech Talks focusing on EUC led by EUC Champions.  These talks are:
    1. GPU-Enabled Linux VDI by Tony Foster – Tony will cover how to build GPU-enabled Linux virtual desktops in Horizon and some of the pain points he encountered while implementing this solution at a customer.
    2. Windows 10 and VDI – Better Come Prepared – Rob Beekmans and Sven Huisman will cover lessons they’ve learned while implementing Windows 10 in VDI environments.
    3. Leveraging User Environment Manager to Remove GPOs – Nigel Hickey will cover how to use VMware UEM as a Group Policy replacement tool.
  11. ADV1605PU – Ask the Experts: Practical Tips and Tricks to Help You Succeed in EUC – So this top 10 list will actually have 11 entries, and this one is a bit of shameless self-promotion.  This session is a repeat of last year’s EUC champions session featuring Earl Gay, VCDX Johan van Amersfoot, moderator Matt Heldstab, and me.  We’re answering your questions about EUC based on our experiences in the trenches.  Last year, we also had some prizes.

Bonus Session

There is one bonus session that you must put on your schedule.  It’s not EUC-related, but it is put on by two of the smartest people in the business today.  They were also two of my VCDX mentors.  The session is Upgrading to vSphere 6.5 the VCDX Way [SER2318BU] by Rebecca Fitzhugh and Melissa Palmer.  You should seriously check this session out as they’ll provide a roadmap to take your environment up to vSphere 6.5. 


by seanpmassey at July 14, 2017 05:00 PM

Evaggelos Balaskas

Install Slack Desktop to Archlinux

Slack

How to install the Slack desktop client on Arch Linux

Download Slack Desktop

e.g. the latest version:

https://downloads.slack-edge.com/linux_releases/slack-2.6.3-0.1.fc21.x86_64.rpm

Extract under root filesystem

# cd /

# rpmextract.sh slack-2.6.3-0.1.fc21.x86_64.rpm

Done

Actually, that’s it!

Run

Run slack-desktop as a regular user:

$ /usr/lib/slack/slack

Slack Desktop

slackdesktop.jpg

Proxy

Define your proxy settings in your environment:

declare -x ftp_proxy="proxy.example.org:8080"
declare -x http_proxy="proxy.example.org:8080"
declare -x https_proxy="proxy.example.org:8080"

Slack

slackdesktop2.jpg

Tag(s): slack

July 14, 2017 10:28 AM

July 13, 2017

Everything Sysadmin

Free book: "Expanding Pockets of Greatness"

Companies don't make their "DevOps transformation" overnight. Usually there is a small team that adopts devops practices and then, after proving their success, the practices spread horizontally throughout the company. However, sometimes their success becomes an island: there is no momentum, and the better practices fail to spread through the rest of the company.

Growing devops practices within a company is not easy. It is especially difficult when the effort does not have management support, or the advocate does not have executive authority. Some techniques for building momentum work, others do not.

Earlier this year Josh Atwell, Carmen DeArdo, Jeff Gallimore, and I sat down to write a list of techniques we've seen succeed. No theory. No hyperbole. No fluff. We wanted to write down what we've seen work so that other people can copy these simple but effective techniques. This is a book for people in the trenches, not executives.

We realized that the list didn't need to be long, nor did it need to be exhaustive. There are 2-3 techniques that are simple, powerful, and almost always work. This didn't need to be an encyclopedia!

The result of this list is a new 14-page free book from IT Revolution called Expanding Pockets of Greatness: Spreading DevOps Horizontally in Your Organization.

The book is now available online for free. It's only 14 pages (10 if you skip the cover and front-matter). We wrote it in a day. You can read it in an hour:

Get it! https://itrevolution.com/book/expanding-pockets-greatness



Title:

Expanding Pockets of Greatness: Spreading DevOps Horizontally in Your Organization

Description:

Here you are: There are a few pockets of DevOps in your organization, but you are a long way from achieving a total DevOps transformation.

How do you build momentum and go from a few islands of DevOps goodness to a tipping point where the entire organization embraces common DevOps methods?

This paper is about the techniques others have used to build momentum to spread DevOps horizontally across an organization. The techniques fall in four categories: sharing, communicating, standardizing, and empowering new leaders.

You're not alone. DevOps is out there in your organization. We want to help you find it and scale it.


by Tom Limoncelli at July 13, 2017 03:00 PM

July 12, 2017

Debian Administration

Implementing two factor authentication with perl

Two-factor authentication is a system designed to improve security by requiring a second factor, in addition to a username/password pair, to access a particular resource. Here we'll look briefly at how you add two-factor support to your applications with Perl.
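
As a flavour of what is involved, here is a minimal, illustrative TOTP (RFC 6238) sketch in Perl using the Digest::HMAC_SHA1 module. The linked article covers its own approach, so treat this only as a rough outline; the secret handling in particular is simplified:

#!/usr/bin/perl
use strict;
use warnings;
use Digest::HMAC_SHA1 qw(hmac_sha1);

# Shared secret as raw bytes; real applications usually store and exchange
# this base32-encoded (the format authenticator apps expect).
my $secret = 'supersecretkey';

# 30-second time step, packed as an 8-byte big-endian counter.
my $counter = int(time() / 30);
my $msg     = pack('N2', 0, $counter);

# HMAC-SHA1, then the "dynamic truncation" step from RFC 4226.
my $hmac   = hmac_sha1($msg, $secret);
my $offset = ord(substr($hmac, -1)) & 0x0f;
my $code   = (unpack('N', substr($hmac, $offset, 4)) & 0x7fffffff) % 1_000_000;

printf "%06d\n", $code;

Verifying a user's code is then just a matter of computing the same value server-side (usually for the current time step and one step either side) and comparing.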

by Steve at July 12, 2017 05:23 PM

July 11, 2017

The Lone Sysadmin

The Dangers of Experts Writing Documentation: A Real Life Example

There are some real, tangible dangers to having experts write documentation. Experts have the perfect tools, skip steps, know where things are based on experience, use jargon, have spare parts so mistakes aren’t a big deal, and as a result make terrible time & work estimates. This leads to confused, and subsequently angry, people, which […]

The post The Dangers of Experts Writing Documentation: A Real Life Example appeared first on The Lone Sysadmin. Head over to the source to read the full post!

by Bob Plankers at July 11, 2017 05:58 AM

July 10, 2017

pagetable

80 Columns Text on the Commodore 64

The text screen of the Commodore 64 has a resolution of 40 by 25 characters, based on the hardware text mode of the VIC-II video chip. This is a step up from the VIC-20's 22 characters per line, but since computers in the professional segment (Commodore PET 8000 series, CP/M, MS-DOS) usually had 80 columns, several solutions – both hardware and software – exist to allow 80 columns on a C64 as well. Let’s look at how this is done in software! At the end of this article, I present a fast and full-featured open source implementation with several different character sets.

Regular 40×25 Mode

First, we need to understand how 40 columns are done. The VIC-II video chip has dedicated support for a text mode. There is a 1000 (= 40 * 25) byte “Screen RAM”, each byte of which contains the character to be displayed at that location in the form of an index into the 2 KB character set, which contains 8 bytes (for 8×8 pixels) for each of the 256 characters. In addition, there is a “Color RAM”, which contains 1000 (again 40 * 25) 4-bit values, one representing the color of each character on the screen.

Putting a character onto the screen is quite trivial: Just write the index of it to offset column + 40 * line into Screen RAM, and its color to the same offset in Color RAM. An application can load its own character set, but the Commodore 64 already comes with two character sets in ROM: One with uppercase characters and lots of graphical symbols (“GRAPHICS”), and one with upper and lower case (“TEXT”). You can switch these by pressing the Commodore and the Shift key at the same time.
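
For illustration, here is that in cc65-style C (this snippet is mine; a real implementation, including the one discussed below, would do it in assembly), with Screen RAM at its default location of $0400 and Color RAM at $d800:

/* Put a character with a given color at (column, line) in 40x25 text mode. */
#define SCREEN_RAM ((unsigned char *)0x0400)   /* default Screen RAM location */
#define COLOR_RAM  ((unsigned char *)0xd800)   /* Color RAM, low 4 bits used  */

void put_char(unsigned char column, unsigned char line,
              unsigned char screen_code, unsigned char color)
{
    unsigned int offset = column + 40U * line;
    SCREEN_RAM[offset] = screen_code;   /* index into the character set */
    COLOR_RAM[offset]  = color;         /* one of the 16 VIC-II colors  */
}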

Bitmap Mode

There is no hardware support for an 80 column mode, but such a mode can be implemented in software by using bitmap mode. In bitmap mode, all 320 by 200 pixels on the screen can be freely addressed through an 8000 byte bitmap, which contains one bit for every pixel on the screen. Luckily for us, the layout of the bitmap in memory is not linear, but is reminiscent of the text mode encoding: The first 8 bytes in Bitmap RAM don’t describe, as you would expect, the leftmost 64 pixels on the first line. Instead, they describe the top left 8×8 block. The next 8 bytes describe the 8×8 block to the right of it, and so on.

This is the same layout as the character set’s: The first 8 bytes correspond to the first character, the next 8 bytes to the second character and so on. Drawing an 8×8 character onto a bitmap (aligned to the 8×8 grid) is as easy as copying 8 consecutive bytes.

This is what an 8×8 font looks like in memory:

0000 ··████··  0008 ···██···  0010 ·█████··
0001 ·██··██·  0009 ··████··  0011 ·██··██·
0002 ·██·███·  000a ·██··██·  0012 ·██··██·
0003 ·██·███·  000b ·██████·  0013 ·█████··
0004 ·██·····  000c ·██··██·  0014 ·██··██·
0005 ·██···█·  000d ·██··██·  0015 ·██··██·
0006 ··████··  000e ·██··██·  0016 ·█████··
0007 ········  000f ········  0017 ········

For an 80 column screen, every character is 4×8 pixels. So we could describe the character set like this:

0000 ········  0008 ········  0010 ········
0001 ··█·····  0009 ··█·····  0011 ·██·····
0002 ·█·█····  000a ·█·█····  0012 ·█·█····
0003 ·███····  000b ·███····  0013 ·██·····
0004 ·███····  000c ·█·█····  0014 ·█·█····
0005 ·█······  000d ·█·█····  0015 ·█·█····
0006 ··██····  000e ·█·█····  0016 ·██·····
0007 ········  000f ········  0017 ········

Every 4×8 character on the screen is either in the left half or the right half of an 8×8 block, so drawing a 4×8 character is as easy as copying the bit pattern into the 8×8 block – and shifting it 4 bits to the right for characters at odd positions.

Color

In bitmap mode, it is only possible to use two out of the 16 colors per 8×8 block, because there are only 1000 (40 * 25) entries for the color matrix. This is a problem, since we need three colors per 8×8: Two for the two characters and one for the background. We will have to compromise: Whenever a character gets drawn into an 8×8 block and the other character in the block has a different color, that other character will be changed to the same color as the new character.

Scrolling

Scrolling on a text screen is easy: 960 bytes of memory have to be copied to move the character indexes to their new location. In bitmap mode, 7680 bytes have to be copied – 8 times more. Even with the most optimized implementation (73ms, about 3.5 frames), scrolling will be slower, and tearing artifacts are unavoidable.

Character Set

Creating a 4×8 character set that is both readable and looks good is not easy. There has to be a one-pixel gap between characters, so characters can effectively only be 3 pixels wide. For characters like “M” and “N”, this is a challenge.

These are the character sets of four different software solutions for 80 columns:

  • COLOR 80 by Richvale Telecommunications
  • 80COLUMNS
  • SCREEN-80 by Compute’s Gazette
  • Highspeed80 by CKtwo

Some observations:

  • Highspeed80 and SCREEN-80 have capitals of a height of 7 pixels (more detail, but very tight vertical spacing, making it hard to read), while COLOR 80 uses only 5 pixels (more square like the original 8×8 font, but less detail). 6 pixels, as used by 80COLUMNS, seems like a good compromise.
  • 80COLUMNS has the empty column at the left, which makes single characters in reverse mode more readable, since most characters have their stems at the left.
  • Except for Highspeed80, the graphical symbols are very similar between the different character sets.
  • All four character sets use the same strategy to distinguish between “M” and “N”.

The Editor

The “EDITOR” is the part of the C64's ROM operating system (“KERNAL”) that handles printing characters to the screen (and interpreting control characters), as well as converting on-screen contents back into a PETSCII string – yes, text input on CBM computers is done by feeding the keyboard input directly into character output, and reading back the screen contents when the user presses the return key. This way, the user can use the cursor keys to navigate to existing text anywhere on the screen (even to the output of previous commands), edit it, and have the system interpret it as input when pressing return.

The C64's EDITOR can only deal with 40 columns (but has a very nice feature that allows using two 40 character lines as one virtual 80 character input line), and has no idea how to draw into a bitmap, so a software 80 column solution basically has to provide a complete reimplementation of this KERNAL component.

The KERNAL provides user vectors for lots of its functionality, so both character output, and reading back characters from the screen can be hooked (vectors IBSOUT at $0326 and IBASIN at $0324). In addition to drawing characters into the bitmap, the character codes have to be cached in a 80×25 virtual Screen RAM, so the input routine can read them back.

The PETSCII code contains control codes for changing the current color, moving the cursor, clearing the screen, turning reverse on and off, and switching between the “GRAPHICS” and “TEXT” character sets. The new editor has to provide code to interpret these. There are two special cases though: When in quote mode (the user is typing text between quotes) or insert mode (the user has typed shift+delete), most special characters show up as inverted graphical characters instead of being interpreted. This way, control characters can be included e.g. in strings in BASIC programs.

There are two functions though that cannot be intercepted through vectors: Applications (and BASIC programs) change the screen background color by writing the color’s value to $d020, since there is no KERNAL function or BASIC command for it, and the KERNAL itself switches between the two character sets (when the user presses the Commodore and the Shift key at the same time) by directly writing to $d018. The only way to intercept these is by hooking the timer interrupt vector and detecting a change in these VIC-II registers. If the background color has changed, the whole 1000 byte color matrix for bitmap mode has to be updated, and if the character set has changed, the whole screen has to be redrawn.

The Implementation

I looked at all existing software implementations I could find and concluded that “80COLUMNS” (by an unknown author) had the best design and was the only one to implement the full feature set of the original EDITOR. I reverse-engineered it into structured, easy to read code, added Ilker Ficicilar’s fast scrolling patch as well as my own minor cleanups, fixes and optimizations.

https://www.github.com/mist64/80columns

The project requires cc65 and exomizer to build. Running make will produce 80columns-compressed.prg, which is about 2.2 KB in size and can be started using LOAD/RUN.

The source contains several character sets (charset.s, charset2.s etc.) from different 80 column software solutions, which can be selected by changing the reference to the filename in the Makefile.

The object code resides at $c800-$cfff. The two character sets are located at $d000-$d7ff. The virtual 80×25 Screen RAM (in order to read back screen contents) is at $c000-$c7ff. The bitmap is at $e000-$ff40, and the color matrix for bitmap mode is at $d800-$dbe8. All this lies beyond the top of BASIC RAM, so BASIC continues to have 38911 bytes free.

In order to speed up drawing, the character set contains all characters duplicated like this:

0000 ········  0008 ········  0010 ········
0001 ··█···█·  0009 ··█···█·  0011 ·██··██·
0002 ·█·█·█·█  000a ·█·█·█·█  0012 ·█·█·█·█
0003 ·███·███  000b ·███·███  0013 ·██··██·
0004 ·███·███  000c ·█·█·█·█  0014 ·█·█·█·█
0005 ·█···█··  000d ·█·█·█·█  0015 ·█·█·█·█
0006 ··██··██  000e ·█·█·█·█  0016 ·██··██·
0007 ········  000f ········  0017 ········

This way, the drawing code only has to mask the value instead of shifting it. In addition, parts of character drawing and all of screen scrolling are using unrolled loops for performance.
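
As a sketch of the idea (again mine, in C; the actual 80columns code does this in assembly with unrolled loops), drawing one 4×8 character with the doubled character set boils down to masking:

/* Draw one 4x8 character at (column 0-79, line 0-24) into the bitmap at
 * $e000, given a glyph whose 4-bit rows are stored in both nibbles. */
#define BITMAP ((unsigned char *)0xe000)

void draw_char(unsigned char column, unsigned char line,
               const unsigned char *glyph)   /* 8 bytes, doubled nibbles */
{
    /* Two 4x8 characters share each 8x8 block: even columns use the left
     * nibble (0xf0), odd columns the right nibble (0x0f). */
    unsigned char mask = (column & 1) ? 0x0f : 0xf0;
    unsigned char *dst = BITMAP + ((unsigned int)line * 40 + column / 2) * 8;
    unsigned char i;

    for (i = 0; i < 8; ++i)
        dst[i] = (dst[i] & ~mask) | (glyph[i] & mask);
}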

Contributions to the project are very welcome. It would be especially interesting to add new character sets, both existing 4×8 fonts from other projects (including hinted TrueType fonts!), and new ones that combine the respective strengths of the existing ones.

80×33 Mode?

Reducing the size of characters to 4×6 would allow a text mode resolution of 80×33 characters. Novaterm 10 has an implementation. At this resolution, logical characters don’t end at vertical 8×8 boundaries any more, making color impossible, and the drawing routine a little slower. It would be interesting to add an 80×33 mode as a compile time option to “80columns”.

by Michael Steil at July 10, 2017 09:45 PM

Steve Kemp's Blog

bind9 considered harmful

Recently there was another bind9 security update released by the Debian Security Team. I thought that was odd, so I've scanned my mailbox:

  • 11 January 2017
    • DSA-3758 - bind9
  • 26 February 2017
    • DSA-3795-1 - bind9
  • 14 May 2017
    • DSA-3854-1 - bind9
  • 8 July 2017
    • DSA-3904-1 - bind9

So in the year to date there have been 7 months: in 3 of them nothing happened, but in 4 of them we had bind9 updates. If this trend continues we'll have another 2.5 updates before the end of the year.

I don't run a nameserver. The only reason I have bind-packages on my system is for the dig utility.

Rewriting a compatible version of dig in Perl should be trivial, thanks to the Net::DNS::Resolver module:

These are about the only commands I ever run:

dig -t a    steve.fi +short
dig -t aaaa steve.fi +short
dig -t a    steve.fi @8.8.8.8

I should do that then. Yes.
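
For what it's worth, a rough, untested sketch of such a tool using the standard Net::DNS::Resolver interface might look something like this (the script name and the exact accessors chosen are my assumptions, not from the post):

#!/usr/bin/perl
use strict;
use warnings;
use Net::DNS;

# Usage: mini-dig name [type] [@server] -- roughly the subset of dig used above.
my $name   = shift or die "usage: $0 name [type] [\@server]\n";
my $type   = uc(shift || 'A');
my $server = shift;

my $res = Net::DNS::Resolver->new;
if (defined $server) {
    $server =~ s/^@//;                 # allow dig-style "@8.8.8.8"
    $res->nameservers($server);
}

my $reply = $res->query($name, $type)
    or die "query failed: ", $res->errorstring, "\n";

foreach my $rr ($reply->answer) {
    next unless $rr->type eq $type;
    # A/AAAA records have an address method; fall back to the raw rdata text.
    my $out = $rr->can('address') ? $rr->address : $rr->rdstring;
    print "$out\n";
}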

July 10, 2017 09:00 PM

That grumpy BSD guy

OpenBSD and the modern laptop

Did you think that OpenBSD is suitable only for firewalls and high-security servers? Think again. Here are my steps to transform a modern mid to high range laptop into a useful Unix workstation with OpenBSD.

One thing that never ceases to amaze me is that whenever I'm out and about with my primary laptop at conferences and other places where geeks gather, a significant subset of the people I meet have a hard time believing that my laptop runs OpenBSD, and that it's the only system installed.

A typical exchange runs something like,
"So what system do you run on that laptop there?"
"It's OpenBSD. xfce is the window manager, and on this primary workstation I tend to just upgrade from snapshot to snapshot."
"Really? But ..."
and then it takes a bit of demonstrating that yes, the graphics runs with the best available resolution the hardware can offer, the wireless network is functional, suspend and resume does work, and so forth. And of course, yes, I do use that system when writing books and articles too. Apparently heavy users of other free operating systems do not always run them on their primary workstations.

I'm not sure at what time I permanently converted my then-primary workstation to run OpenBSD exclusively, but I do remember that when I took delivery of the ThinkPad R60 (mentioned in this piece) in 2006, the only way forward was to install the most recent OpenBSD snapshot. By mid-2014 the ThinkPad SL500 started falling to pieces, and its replacement was a Multicom Ultrabook W840, manufactured by Clevo. The Clevo Ultrabook has weathered my daily abuse and being dragged to various corners of the world for conferences well, but on the trek to BSDCan 2017 cracks started appearing in the glass on the display and the situation worsened on the return trip.

So the time came to look for a replacement. After a bit of shopping around I came back to Multicom, a small computer and parts supplier in rural Åmli in southern Norway, the same place I had sourced the previous one.

One of the things that attracted me to that particular shop and their own-branded offerings is that they will let you buy those computers with no operating system installed. That is of course what you want to do when you source your operating system separately, as we OpenBSD users tend to do.

The last time around I had gone for a "Thin and lightweight" 14 inch model (Thickness 20mm, weight 2.0kg) with 16GB RAM, 240GB SSD for system disk and 1TB HD for /home (since swapped out for a same-size SSD, as the dmesg will show).

Three years later, the rough equivalent with some added oomph to keep me comfortable for some years to come ended up being a 13.3 inch model, 18mm thick and advertised as 1.3kg (but actually weighing in at 1.5kg, possibly due to extra components), with 32GB RAM, a 512GB SSD and a 2TB hard disk. For now the specification can be viewed online here (the site language is Norwegian, but product names and units of measure are not in fact different).

That system arrived today, in a slender box:



Here are the two machines, the old (2014-vintage) and the new side by side:



The OpenBSD installer is a wonder of straightforward, no-nonsense simplicity that simply gets the job done. Even so, if you are not yet familiar with OpenBSD, it is worth spending some time reading the OpenBSD FAQ's installation guidelines and the INSTALL.platform file (in our case, INSTALL.amd64) to familiarize yourself with the procedure. If you're following this article to the letter and will be installing a snapshot, it is worth reading the notes on following -current too.

The main hurdle back when I was installing the 2014-vintage 14" model was getting the system to consider the SSD which showed up as sd1 the automatic choice for booting (I solved that by removing the MBR, setting the size of the MBR on the hard drive that showed up as sd0 to 0 and enlarging the OpenBSD part to fill the entire drive).

Let's see how the new one is configured, then. I tried running with the default UEFI "Secure boot" option enabled, and it worked.

Here we see the last part of the messages that scroll across the screen when the new laptop boots from the USB thumbdrive that has had the most recent OpenBSD/amd64 install61.fs dd'ed onto it:



And as the kernel messages showed us during boot (yes, that scrolled off the top before I got around to taking the picture), the SSD came up as sd1 while the hard drive registered as sd0. Keep that in mind for later.



After the initial greeting from the installer, the first menu asks what we want to do. This is a new system, so only (A)utoinstall and (I)nstall would have any chance of working. I had not set up for automatic install this time around, so choosing (I)nstall was the obvious thing to do.

The next item the installer wants to know is which keyboard layout to use and to set as the default on the installed system. I'm used to using Norwegian keyboards, so no is the obvious choice for me here. If you want to see the list of available options, you press ? and then choose the one you find the most suitable.

Once you've chosen the keyboard layout, the installer prompts you for the system's host name. This is only the host part; the domain part comes later. I'm sure your site or organization has some kind of policy in place for the choice of host names. Make sure you stay inside any local norms; the name illustrated here conforms with ours.

Next up the installer asks which network interfaces to configure. A modern laptop such as this one comes with at least two network interfaces: a wireless interface, in this case an Intel part that is supported in OpenBSD with the iwm(4) driver, and a wired gigabit ethernet interface which the installer kernel recognized as re0.

Quite a few pieces of the hardware in a typical modern laptop require the operating system to load firmware onto the device before it can start interacting meaningfully with the kernel. The Intel wireless network parts supported by the iwm(4) driver and the earlier iwn(4) all have that requirement. However, for some reason the OpenBSD project has not been granted permission to distribute the Intel firmware files, so with only the OpenBSD installer it is not possible to use iwm(4) devices during an initial install. So in this initial round I only configure the re0 interface. During the initial post-install boot the rc.firsttime script will run the fw_update(1) command, which will identify devices that require firmware files and download them from the most convenient OpenBSD firmware mirror site.

My network here has a DHCP server in place, so I simply choose the default dhcp for IPv4 address assignment and autoconf for IPv6.

With the IPv4 and IPv6 addresses set, the installer prompts for the domain name. Once again, the choice was not terribly hard in my case.



On OpenBSD, root is a real user, and you need to set that user's password even if you will rarely if ever log in directly as root. You will need to type that password twice, and as the install documentation states, the installer will only check that the passwords match. It's up to you to set a usefully strong password, and this too is one of the things organizations are likely to have specific guidelines for.

Once root's password is set, the installer asks whether you want to start sshd(8) by default. The default is the sensible yes, but if you answer no here, the installed system will not have any services listening on the system's reachable interfaces.

The next question is whether the machine will run the X Windows system. This is a laptop with a "Full HD" display and well supported hardware to drive it, so the obvious choice here is yes.

I've gotten used to running with xenodm(1) display manager and xfce as the windowing environment, so the question about xenodm is a clear yes too, in my case.

The next question is whether to create at least one regular user during the install. Creating a user for your systems administrator during install has one important advantage: the user you create at this point will be a member of the wheel group, which makes it slightly easier to move to other privilege levels via doas(1) or similar.

Here I create a user for myself, and it is added, behind the scenes, to the wheel group.

With a user in place, it is time to decide whether root will be able to log in via ssh. The sensible default is no, which means you too should just press enter here.

The installer guessed correctly for my time zone, so it's another Enter to move forward.

Next up is the part that people have traditionally found the most scary in OpenBSD installing: Disk setup.

If the machine had come with only one storage device, this would have been a no-brainer. But I have a fast SSD that I want to use as the system disk, and a slightly slower and roomier rotating rust device aka hard disk that I want primarily as the /home partition.

I noted during the bsd.rd boot that the SSD came up as sd1 and the hard drive came up as sd0, so we turn to the SSD (sd1) first.

Since the system successfully booted with the "Secure boot" options in place, I go for the Whole disk GPT option and move on to setting partition sizes.

The default suggestion for disk layout makes a lot of sense and will set sensible mount options, but I will be storing /home on a separate device, so I choose the (E)dit auto layout option and use the R for Resize option to redistribute the space left over to the other partitions.

This is also where you decide the size of the swap space, traditionally on the boot device's b partition. Both crashdumps and suspend to disk use swap space for their storage needs, so if you care about any of these, you will need to allocate at least as much space as the amount of physical RAM installed in the system. Because I could, I allocated double that, or 64GB.

For sd0, I once again choose the Whole disk GPT option and make one honking big /home partition for myself.

The installer then goes on to create the file systems, and returns with the prompt to specify where to find install sets.

The USB drive that I dd'ed the install61.fs image to is the system's third sd device (sd2), so choosing disk and specifying sd2 with the subdirectory 6.1/amd64 makes sense here. On the other hand, if your network and the path to the nearest mirror are fast enough, you may actually save time choosing an http network install over installing from a relatively slow USB drive.

Anyway, the sets install proceeds and trundles through what is likely the longest period of forced inactivity that you will have during an OpenBSD install.

The installer verifies the signed sets and installs them.



Once the sets install is done, you get the offer of specifying more sets -- your site could have site-specific items in an install set -- but I don't have any of those handy, so I just press enter to accept the default done.

If you get the option to correct system time here, accept it and have ntpd(8) set your system clock to a sane setting gleaned from well known NTP servers.

With everything else in place, the installer links the kernel with a unique layout, in what is right now a -current-only feature, but one that will most likely be one of the more talked-about items in the OpenBSD 6.2 release some time in the not too distant future.

With all items on the installer's agenda done, the installer exits and leaves you at a root shell prompt where the only useful action is to type reboot and press enter. Unless, of course, you have specific items that you know will need to be edited into the configuration before the reboot.

After completing the reboot, the system unfortunately did not immediately present the xenodm login screen as expected, but rather the text login prompt.

Looking at the /var/log/Xorg.0.log file pointed to driver problems, but after a little web searching on the obvious keywords, I found this gist note from notable OpenBSD developer Reyk Flöter that gave me the things to paste into my /etc/xorg.conf to yield a usable graphics display for now.

My task for this evening is to move my working environment to new hardware, so after install there are really only two items remaining, in no particular order:
  • move my (too large) accumulation of /home/ data to the new system, and
  • install the same selection of packages on the old machine to the new system.
The first item will take longer, so I shut down all the stuff I normally have running on the laptop such as web browsers, editors and various other client programs, and use pkg_info(1) to create the list of installed packages on the 'from' system:

$ pkg_info -mz >installed_packages

then I transfer the installed_packages file to the fresh system, but not before recording the df -h status of the pristine fresh install:

$ df -h
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/sd1a     1005M   76.4M    878M     8%    /
/dev/sd0d      1.8T    552K    1.7T     0%    /home
/dev/sd1d     31.5G   12.0K   29.9G     0%    /tmp
/dev/sd1f     98.4G    629M   92.9G     1%    /usr
/dev/sd1g      9.8G    177M    9.2G     2%    /usr/X11R6
/dev/sd1h      108G    218K    103G     0%    /usr/local
/dev/sd1k      9.8G    2.0K    9.3G     0%    /usr/obj
/dev/sd1j     49.2G    2.0K   46.7G     0%    /usr/src
/dev/sd1e     98.4G    5.6M   93.5G     0%    /var

Not directly visible here is the amount of swap configured in the sd1b partition. As I mentioned earlier, crashdumps and suspend to disk both use swap space for their storage needs, so if you care about any of these, you will need to allocate at least as much space as the amount of physical RAM installed in the system. Because I could, I allocated double that, or 64GB.

I also take a peek at the old system's /etc/doas.conf and enter the same content on the new system to get the same path to higher privilege that I'm used to having. With those in hand, recreating the set of installed packages on the fresh system is then a matter of a single command:

$ doas pkg_add -l installed_packages

and pkg_add(1) proceeds to fetch and install the same packages I had on the old system.

Then there is the matter of transferring the I-refuse-to-admit-the-actual-number-of gigabytes that make up the content of my home directory. In many environments it would make sense to just restore from the most recent backup, but in my case where the source and destination sit side by side, I chose to go with a simple rsync transfer:

$ rsync  -rcpPCavu 192.168.103.69:/home/peter . | tee -a 20170710-transferlog.txt

(Yes, I'm aware that I could have done something similar with nc and tar, which are both in the base system. But rsync wins by being more easily resumable.)

While the data transfers, there is ample time to check for parts of the old system's configuration that should be transferred to the new one. Setting up the hostname.iwm0 file to hold the config for the wireless networks (see the hostname.if man page) by essentially copying across the previous one is an obvious thing, and this is the time when you discover tweaks you love that were not part of that package's default configuration.

Some time went by while the content transferred, and I can now announce that I'm typing away on the machine that is at the same time both the most lightweight and the most powerful machine I have ever owned.

I am slowly checking and finding that the stuff I care about just works, though I haven't bothered to check whether the webcam works yet. I know you've been dying to see the dmesg, which can be found here. I'm sure I'll get to the bottom of the 'not configured' items (yes, there are some) fairly soon. Look for updates that will be added to the end of this column.

And after all these years, I finally have a machine that matches my beard color:



If you have any questions on running OpenBSD as a primary working environment, I'm generally happy to answer but in almost all cases I would prefer that you use the mailing lists such as misc@openbsd.org or the OpenBSD Facebook group so the question and hopefully useful answers become available to the general public. Browsing the slides for my recent OpenBSD and you user group talk might be beneficial if you're not yet familiar with the system. And of course, comments on this article are welcome.


Update 2017-07-18: One useful thing to do once you have your system up and running is to submit your dmesg to the NYCBUG dmesg database. The one for the system described here is up as http://dmesgd.nycbug.org/index.cgi?do=view&id=3227

by Peter N. M. Hansteen (noreply@blogger.com) at July 10, 2017 07:53 PM

July 09, 2017

Errata Security

Burner laptops for DEF CON

Hacker summer camp (Defcon, Blackhat, BSidesLV) is upon us, so I thought I'd write up some quick notes about bringing a "burner" laptop. A Chromebook is your best choice in terms of security, but I need Windows/Linux tools, so I got a Windows laptop.

I chose the Asus e200ha for $199 from Amazon with free (and fast) shipping. There are similar notebooks with roughly the same hardware and price from other manufacturers (HP, Dell, etc.), so I'm not sure how this compares against those other ones. However, it fits my needs as a "burner" laptop, namely:
  • cheap
  • lasts 10 hours easily on battery
  • weighs 2.2 pounds (1 kilogram)
  • 11.6 inch and thin
Some other specs are:
  • 4 gigs of RAM
  • 32 gigs of eMMC flash memory
  • quad core 1.44 GHz Intel Atom CPU
  • Windows 10
  • free Microsoft Office 365 for one year
  • good, large keyboard
  • good, large touchpad
  • USB 3.0
  • microSD
  • WiFi ac
  • no fans, completely silent
There are compromises, of course.
  • The Atom CPU is slow, though it's only noticeable when churning through heavy webpages. Adblocking addons or Brave are a necessity. Most things are usably fast, such as using Microsoft Word.
  • Crappy sound and video, though VLC does a fine job playing movies with headphones on the airplane. Using in bright sunlight will be difficult.
  • micro-HDMI; keep in mind that if you intend to do presos from it, you'll need an HDMI adapter
  • It has limited storage: 32 gigs in theory, about half of that usable.
  • It uses a special compressed Windows 10 install that you can't actually upgrade without a completely new install. It doesn't have the latest Windows 10 Creators Update. I lost a gig thinking I could compress system files.

Copying files across the 802.11ac WiFi to the disk was quite fast, several hundred megabits per second. The eMMC isn't as fast as an SSD, but it's a lot faster than typical SD card speeds.

The first thing I did once I got the notebook was to install the free VeraCrypt full disk encryption. The CPU has AES acceleration, so it's fast. There is a problem with the keyboard driver during boot that makes it really hard to enter long passwords -- you have to carefully type one key at a time to prevent extra keystrokes from being entered.

You can't really install Linux on this computer, but you can use virtual machines. I installed VirtualBox and downloaded the Kali VM. I had some problems attaching USB devices to the VM. First of all, VirtualBox requires a separate downloaded extension to get USB working. Second, it conflicts with USBpcap that I installed for Wireshark.

It comes with one year of free Office 365. Obviously, Microsoft is hoping to hook the user into a longer term commitment, but in practice next year at this time I'd get another burner $200 laptop rather than spend $99 on extending the Office 365 license.

Let's talk about the CPU. It's Intel's "Atom" processor, not their mainstream (Core i3 etc.) processor. Even though it has roughly the same GHz as the processor in an 11-inch MacBook Air and twice the cores, it's noticeably and painfully slower. This is especially noticeable on ad-heavy web pages, while other things seem to work just fine. It has hardware acceleration for most video formats, though I had trouble getting Netflix to work.

The tradeoff for a slow CPU is phenomenal battery life. It seems to last forever on battery. It's really pretty cool.

Conclusion

A Chromebook is likely more secure, but for my needs, this $200 laptop is perfect.


by Robert Graham (noreply@blogger.com) at July 09, 2017 03:14 AM

July 08, 2017

Electricmonk.nl

Two-factor authentication via SMS is worse than no two-factor authentication at all

Another case of online theft whereby the attacker takes over a victim's phone and performs an account reset through SMS has just hit the web. This is the sixth case I've read about, but undoubtedly there are many many more. In this case, the victim only lost $200. In other cases, victims have lost thousands of dollars worth of bitcoins in a very similar method of attack.

Basically every site in existence, including banks, PayPal, and bitcoin wallets, offers password resets via email. This makes your email account an extremely important weak link in the chain of your security. If an attacker manages to get into your email account, you're basically done for. But you're using Gmail, and their security is state-of-the-art, right? Well, no... read on.

So how does the attack work? It's really simple. An attacker doesn't need to break any military-grade crypto or perform magic man-in-the-middle DNS cache poisoning voodoo. All they have to do is:

  1. Call your cellphone provider.
  2. Get them to forward your phone number to a different phone (there are several ways to do this)
  3. Request a password reset on your mail account.
  4. Receive the recovery code via SMS
  5. Reset the password on your mail account with the recovery code
  6. Reset any other account you own through your email box.

That's all it takes. Even the most unsophisticated attacker can pull this off and get unrestricted access to every account you own! 

The weakness lies in step 2: getting control over your phone number. While you may think this is difficult, it really isn't. Telcos, like nearly all big companies, are horrible at security. They'll get security audits up the wazoo every few months, but the auditors themselves are usually way behind the curve when it comes to the latest attack vectors. Most security auditors still insist on nonsense such as a minimum of 8-character passwords (16 is more like it) and changing them every few months. Because, you know, the advice of not using the same password for different services is way beyond them.

In the story linked above, the human element, as always, is the problem:

The man on the phone reads through the notes and explains that yes, someone has been dialing the AT&T call center all day trying to get into my phone but was repeatedly rejected because they didn't know my passcode, until someone broke protocol and didn't require the passcode.

Someone broke protocol. Naturally that is only possible if employees of AT&T have unrestricted privileges to override the passcode requirement and modify anybody's data. Exactly the sort of thing that receives no scrutiny in a security audit. They'll require background checks on employees to see if they're trustworthy, while completely neglecting the fact that it's much more likely that trustworthy employees merely make mistakes.

A few weeks ago I was the "victim" of a similar incident. I suddenly started receiving emails from a big Dutch online shop regarding my account there. The email address of my account had been changed and my password had been reset. I immediately called the helpdesk and within a few minutes the helpdesk employee had put everything straight. It turns out that someone in the Netherlands with the exact same name as me (which is very uncommon) had mistaken my account for his and had requested his email address be reset through some social media support method. Again, no restrictions or verifications that he was the owner of the account were required. If it hadn't been for their warning emails, the promptness of the helpdesk employee and the fact that my "attacker" meant no harm, I could have easily been in trouble. 

The lessons here are clear:

  • Users: don't use your phone number as a recovery device. It may seem safe, but it's not. Even Google / Gmail doesn't just allow you to do this, it actively encourages this bad security practice. Delete your phone number as a recovery device and use downloadable backup codes.
  • Users: don't use two-factor authentication through SMS. It's better to not use two-factor authentication at all if SMS is the only option. Without two-factor authentication, they have to guess your password. With two-factor authentication through SMS, they only have to place a call or two to your provider. And your provider will do as they ask, make no mistake about it.
  • Companies: stop offering authentication, verification and account recovery through phone numbers! Use TOTP (RFC 6238) for two-factor authentication (see the sketch after this list) and offer backup codes or a secret key (no, that does not mean those idiotic security questions asking for my mother's maiden name) or something.
  • Companies: Do NOT allow employees to override such basic security measures as Account Owner Verification! This really should go without saying, but it seems big companies are just too clueless to get this right. A person has to be able to prove they're the owner of the account! And for Pete's sake, please stop using easily obtained info such as my birth date and address as verification! If you really must have a way of overriding such things, it should only be possible for a single senior account manager with a good grasp of security.
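For what it's worth, the core of TOTP fits in a dozen lines of standard-library Python. Here is a minimal sketch of RFC 6238 (the base32 secret below is a placeholder example, not a real value):

import base64, hashlib, hmac, struct, time

def hotp(secret, counter, digits=6):
    # HOTP (RFC 4226): HMAC-SHA1 over the big-endian counter, then
    # "dynamic truncation" down to a short decimal code.
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

def totp(secret, period=30, digits=6):
    # TOTP (RFC 6238): the counter is the number of periods since the Unix epoch.
    return hotp(secret, int(time.time()) // period, digits)

secret = base64.b32decode("JBSWY3DPEHPK3PXP")   # placeholder base32 secret
print(totp(secret))

Verifying a submitted code is then just comparing it (ideally with hmac.compare_digest) against the values for the current and neighbouring time steps.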

As more and more aspects of our lives are managed online, the potential for damage to our real lives keeps getting bigger. Government institutes and companies are scrambling to go online with their services. It's more cost efficient and convenient for the customer. But the security is severely lacking.

The online world is not like the real world, where it takes a large amount of risky work for an attacker to obtain a small reward. On the internet, anyone with malicious intent and the most basic level of literacy can figure out how to reap big rewards at nearly zero risk. As we've seen with the recent ransomware outbreaks and other attacks, those people are out there and are actively abusing our bad security practices. If you, the reader, had any idea how horrible the security of everything in our daily lives is, from your online accounts to the locks on your cars, you'd be surprised that digital crime isn't much, much more widespread. 

Let's pull our head out of our asses and give online security the priority it deserves.

by admin at July 08, 2017 10:12 AM

July 07, 2017

HolisticInfoSec.org

Toolsmith #126: Adversary hunting with SOF-ELK

As we celebrate Independence Day, I'm reminded that we honor what was, of course, an armed conflict. Today's realities, when we think about conflict, are quite different than the days of lining troops up across the field from each other, loading muskets, and flinging balls of lead into the fray.
We live in a world of asymmetrical battles, often conflicts that aren't always obvious in purpose and intent, and likely fought on multiple fronts. For one of the best reads on the topic, take the well spent time to read TJ O'Connor's The Jester Dynamic: A Lesson in Asymmetric Unmanaged Cyber Warfare. If you're reading this post, it's highly likely that your front is that of 1s and 0s, either as a blue team defender, or as a red team attacker. I live in this world every day of my life as a blue teamer at Microsoft, and as a joint forces cyber network operator. We are faced, each day, with overwhelming, excessive amounts of data, of varying quality, where the answers to questions are likely hidden, but available to those who can dig deeply enough.
New platforms continue to emerge to help us in this cause. At Microsoft we have a variety of platforms that make the process of digging through the data easier for us, though no less arduous, and the commercial sector continues to expand its offerings. For those with limited budgets and resources, but a strong drive for discovery, there have been outstanding offerings as well. Security Onion has been at the forefront for years, and is under constant development and improvement in the care of Doug Burks.
Another emerging platform, to be discussed here, is SOF-ELK, part of the SANS Forensics community, created by SANS FOR572, Advanced Network Forensics and Analysis author and instructor Phil Hagen. Count SOF-ELK in the NFAT family for sure, a strong player in the Network Forensic Analysis Tool category.
SOF-ELK has a great README, don't be that person, read it. It's everything you need to get started, in one place. What!? :-)
Better yet, you can download a fully realized VM with almost no configuration requirements, so you can hit the ground running. I ran my SOF-ELK instance with VMWare Workstation 12 Pro and no issues other than needing to temporarily disable Device Guard and Credential Guard on Windows 10.
SOF-ELK offers some good test data to get you started right out of the gate, in /home/elk_user/exercise_source_logs, including Syslog from a firewall, router, converted Windows events, a Squid proxy, and a server named muse. You can drop these on your SOF-ELK server in the /logstash/syslog/ ingestion point for syslog-formatted data. Additionally, utilize /logstash/nfarch/ for archived NetFlow output, /logstash/httpd/ for Apache logs, /logstash/passivedns/ for logs from the passivedns utility, /logstash/plaso/ for log2timeline, and /logstash/bro/ for, yeah, you guessed it.
I mixed things up a bit and added my own Apache logs for the month of May to /logstash/httpd/. The muse log set in the exercise offering also included a DNS log (named_log); for grins I threw that into /logstash/syslog/ as well, just to see how it would play.
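If you'd rather script that than copy by hand, dropping files into the ingestion points is trivial. A throwaway sketch, run on the VM itself (the source paths and filename patterns here are made up; the destination directories are the ingestion points listed above):

import glob, shutil

# Hypothetical source globs mapped to SOF-ELK ingestion points.
TO_INGEST = {
    "/home/elk_user/mylogs/access_log-2017-05*": "/logstash/httpd/",    # Apache logs
    "/home/elk_user/exercise_source_logs/*_log": "/logstash/syslog/",   # syslog-formatted data
}

for pattern, dest in TO_INGEST.items():
    for path in glob.glob(pattern):
        shutil.copy(path, dest)
        print("copied %s -> %s" % (path, dest))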
Run down a few data rabbit holes with me, I swear I can linger for hours on end once I latch on to something to chase. We'll begin with a couple of highlights from my Apache logs. The SOF-ELK VM comes with three pre-configured dashboards including Syslog, NetFlow, and HTTPD. You can learn more in the start page for the SOF-ELK UI; my instance is at http://192.168.50.110:5601/app/kibana. There are three panels, or blocks, for each dashboard's details, at the bottom of the UI. I drilled through to the HTTPD Log Dashboard for this experiment, and immediately reset the time period for analysis (click the time marker in the upper right hand part of the UI). It defaults to the last 15 minutes; if you're reviewing older data it won't show until you adjust the range to match your time stamps. My data is from the month of May so I selected an absolute window from the beginning of May to its end. You can also select quick or relative time options; it's worth getting comfortable here quickly and early. The resulting opening visualizations made me very happy, as seen in Figure 1.
Figure 1: HTTPD Log Dashboard
Nice! An event count summary, source ASNs by count (you can immediately see where I scanned myself from work), a fantastic Access Source map, a records graph by HTTP verbs, and one by response codes.
The beauty of these SOF-ELK dashboards is that they're immediately interactive and allow you to drill right into interesting data points. The holisticinfosec.org website is intentionally flat and includes no active PHP or dynamic content. As a result, my favorite response code as a web application security tester, the 500 error, is notably missing. But, in both the timeline graphs we note a big traffic spike on 8 MAY 2017, which correlates nicely with my above-mentioned scan from work, as noted in the ASN hit count, and seen here in Figure 2.

Figure 2: Traffic spike from scan
This visualizes well but isn't really all that interesting or uncommon, particularly given that I know I personally ran the scan, and scans from the Intarwebs are a dime a dozen. What did jump out for me though, as seen back in Figure 1, was the presence of four PUT requests. That's usually a "bad thing" where some @$$h@t is trying to drop something on my server. Let's drill in a bit, shall we? After clicking the graph line with the four PUT requests, I quickly learned that two requests came from 204.12.194.234 AS32097: WholeSale Internet in Kansas City, MO and two came from 119.23.233.9 AS37963: Hangzhou Alibaba Advertising in Hangzhou, China. This is well represented in the HTTPD Access Source panel map (Figure 3).

Figure 3: Access Source
The PUT request from each included a txt file attempt, specifically dbhvf99151.txt and htjfx99555.txt, both were rejected, redirected (302), and sent to my landing page (200).
Research on the IPs found that 119.23.233.9 was on the "real time suspected malware list as detected by InterServer's intrusion systems" as seen 22 MAY, and 204.12.194.234 was found twice in the AbuseIPDB, flagged on 18 MAY 2017 for Cknife Webshell Detected. Now we're talking. It's common to attempt a remote file include attack or a PUT of what is, in effect, a web shell. I opened up SOF-ELK on that IP address and found eight total hits in my logs, all looking for common PHP opportunities with the likes of GET and POST for /plus/mytag_js.php, noted in PHP injection attack attempts.
SOF-ELK made it incredibly easy to hunt down these details, as seen in Figure 4 from the HTTPD Discovery panel.
Figure 4: Discovery
That's a groovy little hunting trip through HTTPD logs, but how about a bit of Syslog? I spotted a likely oddity that could be correlated across a number of the exercise logs; we'll see if the correlation is real. You'll notice tabs at the top of your SOF-ELK UI; we'll use Discover for this experiment. I started from the Syslog Dashboard with my time range set broadly to the last two months. 7606 records presented themselves, sliced neatly by hosts and programs, as seen in Figure 5.

Figure 5: Syslog Dashboard
Squid proxy logs showed the predominance of host entries (6778 or 57.95% of 11,696 to be specific), so I started there. Don't laugh, but I'll often do keyword queries just to see what comes up; sometimes you land a pointer to a good rabbit hole. Within the body of 6778 proxy events, I searched for "malware". Two hits came back, for GET requests via a JS redirector to bleepingcomputer.com for your basic how-to based on "random websites opening in Chrome". Ruh-roh.
Figure 6: Malware keyword
More importantly, we have an IP address to pivot on: 10.3.59.53. A search of that IP across the same 6778 Squid logs yielded 3896 entries specific to this IP, and lots to be curious about:
  • datingukrainewomen.com 
  • anastasiadate.com
  • YouTube videos for hair loss
  • crowdscience.com for "random pop-ups driving me nuts"
Do I need to build this user profile out for you, or are you with me? Proxy logs tell us so much, and are deeply worthy of your blue team efforts to collect and review.
I jumped over to the named_log from the muse host to see what else might reveal itself. Here's where I jumped to Discover, the Splunk-like query functionality inherent to SOF-ELK (and ELK implementations). I did a reductive query to see what other oddities might surface: 10.3.59.53 AND dns_query: (*.co.uk OR *.de OR *.eu OR *.info OR *.cc OR *.online OR *.website). I used these TLDs based on the premise that bots using Domain Generation Algorithms (DGA) will often use these TLDs. See The DGA of PadCrypt to learn more, as well as ISC Diary handler John Bambanek's OSINT logic. The query results were quite satisfying: 29 hits, including a number of clearly randomly generated domains. Those that were most interesting all included the .cc TLD, so I zoomed in further. Down to five hits with 10.3.59.53 AND dns_query: *.cc, as seen in Figure 7.
Figure 7: CC TLD hits
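If you want to sanity-check the same idea outside of Kibana, that reductive query is easy to approximate over the raw DNS log with a few lines of Python. The file name and the BIND-style query-line regex below are assumptions on my part, not anything SOF-ELK defines:

import re
from collections import Counter

CLIENT = "10.3.59.53"
SUSPECT_TLDS = (".co.uk", ".de", ".eu", ".info", ".cc", ".online", ".website")
# Assumed BIND-ish query log format: "client 10.3.59.53#1234 ... query: example.cc IN A ..."
QUERY_RE = re.compile(r"client (\S+)#\d+.*query: (\S+) IN", re.IGNORECASE)

hits = Counter()
with open("named_log") as log:
    for line in log:
        m = QUERY_RE.search(line)
        if not m:
            continue
        client, qname = m.group(1), m.group(2).rstrip(".").lower()
        if client == CLIENT and qname.endswith(SUSPECT_TLDS):
            hits[qname] += 1

for domain, count in hits.most_common():
    print(count, domain)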
Oh man, not good. I had a hunch now, and went back to the proxy logs with 10.3.59.53 AND squid_request:*.exe. And there you have it, ladies and gentlemen, hunch rewarded (Figure 8).

Figure 8: taxdocs.exe
If taxdocs.exe isn't malware, I'm a monkey's uncle. Unfortunately, I could find no online references to these .cc domains or the .exe sample or URL, but you get the point. Given that it's exercise data, Phil may have generated it to entice us to dig deeper.
When we think about the IOC patterns for Petya, a hunt like this is pretty revealing. Petya's "initial infection appears to involve a software supply-chain threat involving the Ukrainian company M.E.Doc, which develops tax accounting software, MEDoc". This is not Petya (as far as I know) specifically but we see pattern similarities for sure, one can learn a great deal about the sheep and the wolves. Be the sheepdog!
There are few tools better in the free and open source arsenal to help you train and enhance your inner digital sheepdog than SOF-ELK. "I'm a sheepdog. I live to protect the flock and confront the wolf." ~ LTC Dave Grossman, from On Combat.

Believe it or not, there's a ton more you can do with SOF-ELK, consider this a primer and a motivator.
I LOVE SOF-ELK. Phil, well done, thank you. Readers rejoice, this is really one of my favorites for toolsmith, hands down, out of the now 126 unique tools discussed over more than ten years. Download the VM, and get to work herding. :-)
Cheers...until next time.

by Russ McRee (noreply@blogger.com) at July 07, 2017 10:35 PM

Evaggelos Balaskas

PHP Sorting Iterators

Iterator

A few months ago, I wrote an article on RecursiveDirectoryIterator; you can find it here: PHP Recursive Directory File Listing. If you run the code example, you'll see that the output is not sorted.

Object

A recursive iterator is actually an object, a special object that lets us perform iterations over a sequence (collection) of data. That makes it a little difficult to sort using the usual PHP functions. Let me give you an example:

$Iterator = new RecursiveDirectoryIterator('./');
foreach ($Iterator as $file)
    var_dump($file);
object(SplFileInfo)#7 (2) {
  ["pathName":"SplFileInfo":private]=>
  string(12) "./index.html"
  ["fileName":"SplFileInfo":private]=>
  string(10) "index.html"
}

You can see here that each item the iterator returns is an object of the SplFileInfo class.

Internet Answers

Unfortunately stackoverflow and other related online results provide the most complicated answers on this matter. Of course this is not stackoverflow's error, and it is really not an easy subject to discuss or understand, but personally I don't get the extra fuss (complexity) in some of the responses.

Back to basics

So let us go back a few steps and understand what an iterator really is. An iterator is an object that we can iterate! That means we can use a loop to walk through the data of an iterator. Reading the above output you can get (hopefully) a better idea.

We can also loop over the Iterator as if it were a simple array.

eg.

$It = new RecursiveDirectoryIterator('./');
foreach ($It as $key=>$val)
    echo $key.":".$val."\n";

output:

./index.html:./index.html

Arrays

It is difficult to sort Iterators, but it is really easy to sort arrays!
We just need to convert the Iterator into an Array:

// Copy the iterator into an array
$array = iterator_to_array($Iterator);

that’s it!

Sorting

For my needs I need to reverse sort the array by key (filename on a recursive directory), so my sorting looks like:

krsort( $array );

easy, right?

Just remember that you can't use krsort (or ksort) before the array has been defined. You need to take two steps, and that is OK.

Convert to Iterator

After sorting, we need to convert the array back into an iterator object:

// Convert Array to an Iterator
$Iterator = new ArrayIterator($array);

and that’s it !

Full Code Example

the entire code in one paragraph:

<?php
// ebal, Fri, 07 Jul 2017 22:01:48 +0300

// Directory to Recursive search
$dir = "/tmp/";

// Iterator Object
$files =  new RecursiveIteratorIterator(
          new RecursiveDirectoryIterator($dir)
          );

// Convert to Array
$Array = iterator_to_array ( $files );
// Reverse Sort by key the array
krsort ( $Array );
// Convert to Iterator
$files = new ArrayIterator( $Array );

// Print the file name
foreach($files as $name => $object)
    echo "$name\n";

?>
Tag(s): php, iterator

July 07, 2017 08:24 PM

Anton Chuvakin - Security Warrior

Monthly Blog Round-Up – June 2017

Here is my next monthly "Security Warrior" blog round-up of top 5 popular posts/topics this month:
  1. “Why No Open Source SIEM, EVER?” contains some of my SIEM thinking from 2009. Is it relevant now? You be the judge.  Succeeding with SIEM requires a lot of work, whether you paid for the software, or not. BTW, this post has an amazing “staying power” that is hard to explain – I suspect it has to do with people wanting “free stuff” and googling for “open source SIEM” … 
  2. “Simple Log Review Checklist Released!” is often at the top of this list – this aging checklist is still a very useful tool for many people. “On Free Log Management Tools” (also aged a bit by now) is a companion to the checklist (updated version)
  3. “New SIEM Whitepaper on Use Cases In-Depth OUT!” (dated 2010) presents a whitepaper on select SIEM use cases described in depth with rules and reports [using now-defunct SIEM product]; also see this SIEM use case in depth and this for a more current list of popular SIEM use cases. Finally, see our 2016 research on developing security monitoring use cases here!
  4. This month, my classic PCI DSS Log Review series is extra popular! The series of 18 posts cover a comprehensive log review approach (OK for PCI DSS 3+ even though it predates it), useful for building log review processes and procedures, whether regulatory or not. It is also described in more detail in our Log Management book and mentioned in our PCI book (now in its 4th edition!) – note that this series is mentioned in some PCI Council materials. 
  5. “SIEM Resourcing or How Much the Friggin’ Thing Would REALLY Cost Me?” is a quick framework for assessing the SIEM project (well, a program, really) costs at an organization (a lot more details on this here in this paper).
In addition, I’d like to draw your attention to a few recent posts from my Gartner blog [which, BTW, now has more than 5X of the traffic of this blog]: 

Current research on vulnerability management:
Current research on cloud security monitoring:
Recent research on security analytics and UBA / UEBA:
Miscellaneous fun posts:

(see all my published Gartner research here)
Also see my past monthly and annual “Top Popular Blog Posts” – 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016.

Disclaimer: most content at SecurityWarrior blog was written before I joined Gartner on August 1, 2011 and is solely my personal view at the time of writing. For my current security blogging, go here.

Previous post in this endless series:

by Anton Chuvakin (anton@chuvakin.org) at July 07, 2017 06:35 PM

July 04, 2017

Steve Kemp's Blog

I've never been more proud

This morning I remembered I had a beefy virtual-server setup to run some kernel builds upon (when I was playing with Linux security modules), and I figured before I shut it down I should use the power to run some fuzzing.

As I was writing some code in Emacs at the time I figured "why not fuzz emacs?"

After a few hours this was the result:

 deagol ~ $ perl -e 'print "`" x ( 1024 * 1024  * 12);' > t.el
 deagol ~ $ /usr/bin/emacs --batch --script ./t.el
 ..
 ..
 Segmentation fault (core dumped)

Yup, evaluating a lisp file caused a segfault, due to a stack-overflow (no security implications). I've never been more proud, even though I have contributed code to GNU Emacs in the past.

July 04, 2017 09:00 PM

Evaggelos Balaskas

Malicious ReplyTo

Prologue

Part of my day job is to protect a large mail infrastructure. That means that on a daily basis we are fighting SPAM and trying to protect our customers from any suspicious/malicious mail traffic. This is not an easy job. Actually, globally it is not an easy job. But we are trying, and trying hard.

ReplyTo

Over the last couple of months, I have been running a project on gitlab, gathering the malicious ReplyTo addresses from already identified spam emails. I was looking for a pattern, or something I could feed our antispam engines with, so that we can identify spam more accurately. It doesn't seem to work as I thought. Spammers can alter their ReplyTo in a matter of minutes!

TheList

Here is the list for the last couple of months: ReplyTo
I will, from time to time, try to update it, and hopefully someone will find it useful.

Free domains

It’s not much yet, but even with this small sample you can see that ~ 50% of phishing goes back to gmail !

    105 gmail.com
     49 yahoo.com
     18 hotmail.com
     17 outlook.com
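A tally like the one above is easy to reproduce from a folder of saved spam samples; a rough sketch follows (this is not the author's actual tooling, and the sample directory path is made up):

import glob
from collections import Counter
from email import message_from_binary_file
from email.utils import parseaddr

domains = Counter()
for path in glob.glob("/var/spool/spam-samples/*"):
    with open(path, "rb") as fh:
        msg = message_from_binary_file(fh)
    # Pull just the address part of the Reply-To header, if present.
    addr = parseaddr(msg.get("Reply-To", ""))[1]
    if "@" in addr:
        domains[addr.rsplit("@", 1)[1].lower()] += 1

for domain, count in domains.most_common(10):
    print("%7d %s" % (count, domain))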

More Info

You can contact me in various ways if you are interested in more details.

Preferably via encrypted email: PGP: 0x1c8968af8d2c621f
or via DM in twitter: @ebalaskas

PS

I also keep another list, of suspicious fwds
but keep in mind that it might have some false positives.

Tag(s): spam

July 04, 2017 07:44 AM

July 03, 2017

Vincent Bernat

Performance progression of IPv4 route lookup on Linux

TL;DR: Each of Linux 2.6.39, 3.6 and 4.0 brings notable performance improvements for the IPv4 route lookup process.


In a previous article, I explained how Linux implements an IPv4 routing table with compressed tries to offer excellent lookup times. The following graph shows the performance progression of Linux through history:

IPv4 route lookup performance

Two scenarios are tested:

  • 500,000 routes extracted from an Internet router (half of them are /24), and
  • 500,000 host routes (/32) tightly packed in 4 distinct subnets.

All kernels are compiled with GCC 4.9 (from Debian Jessie). This version is able to compile older kernels [1] as well as current ones. The kernel configuration used is the default one with CONFIG_SMP and CONFIG_IP_MULTIPLE_TABLES options enabled (however, no IP rules are used). Some other unrelated options are enabled to be able to boot them in a virtual machine and run the benchmark.

The measurements are done in a virtual machine with one vCPU [2]. The host is an Intel Core i5-4670K and the CPU governor was set to “performance”. The benchmark is single-threaded. Implemented as a kernel module, it calls fib_lookup() with various destinations in 100,000 timed iterations and keeps the median. Timings of individual runs are computed from the TSC (and converted to nanoseconds by assuming a constant clock).

The following kernel versions bring a notable performance improvement:

  • In Linux 2.6.39, commit 3630b7c050d9, David Miller removes the hash-based routing table implementation to switch to the LPC-trie implementation (available since Linux 2.6.13 as a compile-time option). This brings a small regression for the scenario with many host routes but boosts the performance for the general case.

  • In Linux 3.0, commit 281dc5c5ec0f, the improvement is not related to the network subsystem. Linus Torvalds disables the compiler size-optimization from the default configuration. It was believed that optimizing for size would help keeping the instruction cache efficient. However, compilers generated under-performing code on x86 when this option was enabled.

  • In Linux 3.6, commit f4530fa574df, David Miller adds an optimization to not evaluate IP rules when they are left unconfigured. From this point, the use of the CONFIG_IP_MULTIPLE_TABLES option doesn’t impact the performances unless some IP rules are configured. This version also removes the route cache (commit 5e9965c15ba8). However, this has no effect on the benchmark as it directly calls fib_lookup() which doesn’t involve the cache.

  • In Linux 4.0, notably commit 9f9e636d4f89, Alexander Duyck adds several optimizations to the trie lookup algorithm. It really pays off!

  • In Linux 4.1, commit 0ddcf43d5d4a, Alexander Duyck collapses the local and main tables when no specific IP rules are configured. For non-local traffic, both those tables were looked up.


  1. Compiling old kernels with an updated userland may still require some small patches

  2. The kernels are compiled with the CONFIG_SMP option to use the hierarchical RCU and activate more of the same code paths as actual routers. However, progress on parallelism goes unnoticed. 

by Vincent Bernat at July 03, 2017 01:25 PM

July 02, 2017

Cryptography Engineering

Beyond public key encryption

One of the saddest and most fascinating things about applied cryptography is how little cryptography we actually use. This is not to say that cryptography isn’t widely used in industry — it is. Rather, what I mean is that cryptographic researchers have developed so many useful technologies, and yet industry on a day to day basis barely uses any of them. In fact, with a few minor exceptions, the vast majority of the cryptography we use was settled by the early-2000s.*

Most people don’t sweat this, but as a cryptographer who works on the boundary of research and deployed cryptography it makes me unhappy. So while I can’t solve the problem entirely, what I can do is talk about some of these newer technologies. And over the course of this summer that’s what I intend to do: talk. Specifically, in the next few weeks I’m going to write a series of posts that describe some of the advanced cryptography that we don’t generally see used.

Today I’m going to start with a very simple question: what lies beyond public key cryptography? Specifically, I’m going to talk about a handful of technologies that were developed in the past 20 years, each of which allows us to go beyond the traditional notion of public keys.

This is a wonky post, but it won’t be mathematically-intense. For actual definitions of the schemes, I’ll provide links to the original papers, and references to cover some of the background. The point here is to explain what these new schemes do — and how they can be useful in practice.

Identity Based Cryptography

In the mid-1980s, a cryptographer named Adi Shamir proposed a radical new idea. The idea, put simply, was to get rid of public keys.

To understand where Shamir was coming from, it helps to understand a bit about public key encryption. You see, prior to the invention of public key crypto, all cryptography involved secret keys. Dealing with such keys was a huge drag. Before you could communicate securely, you needed to exchange a secret with your partner. This process was fraught with difficulty and didn’t scale well.

Public key encryption (beginning with Diffie-Hellman and Shamir’s RSA cryptosystem) hugely revolutionized cryptography by dramatically simplifying this key distribution process. Rather than sharing secret keys, users could now transmit their public key to other parties. This public key allowed the recipient to encrypt to you (or verify your signature) but it could not be used to perform the corresponding decryption (or signature generation) operations. That part would be done with a secret key you kept to yourself.

While the use of public keys improved many aspects of using cryptography, it also gave rise to a set of new challenges. In practice, it turns out that having public keys is only half the battle — people still need to distribute them securely.

For example, imagine that I want to send you a PGP-encrypted email. Before I can do this, I need to obtain a copy of your public key. How do I get this? Obviously we could meet in person and exchange that key on physical media — but nobody wants to do this. It would much more desirable to obtain your public key electronically. In practice this means either (1) we have to exchange public keys by email, or (2) I have to obtain your key from a third piece of infrastructure, such as a website or key server. And now we come to the  problem: if that email or key server is untrustworthy (or simply allows anyone to upload a key in your name), I might end up downloading a malicious party’s key by accident. When I send a message to “you”, I’d actually be encrypting it to Mallory.

Mallory.

Solving this problem — of exchanging public keys and verifying their provenance — has motivated a huge amount of practical cryptographic engineering, including the entire web PKI. In most cases, these systems work well. But Shamir wasn’t satisfied. What if, he asked, we could do it better? More specifically, he asked: could we replace those pesky public keys with something better?

Shamir’s idea was exciting. What he proposed was a new form of public key cryptography in which the user’s “public key” could simply be their identity. This identity could be a name (e.g., “Matt Green”) or something more precise like an email address. Actually, it didn’t really matter. What did matter was that the public key would be some arbitrary string — and not a big meaningless jumble of characters like “7cN5K4pspQy3ExZV43F6pQ6nEKiQVg6sBkYPg1FG56Not”.

Of course, using an arbitrary string as a public key raises a big problem. Meaningful identities sound great — but I don’t own them. If my public key is “Matt Green”, how do I get the corresponding private key? And if I can get out that private key, what stops some other Matt Green from doing the same, and thus reading my messages? And ok, now that I think about this, what stops some random person who isn’t named Matt Green from obtaining it? Yikes. We’re headed straight into Zooko’s triangle.

Shamir’s idea thus requires a bit more finesse. Rather than expecting identities to be global, he proposed a special server called a “key generation authority” that would be responsible for generating the private keys. At setup time, this authority would generate a single master public key (MPK), which it would publish to the world. If you wanted to encrypt a message to “Matt Green” (or verify my signature), then you could do so using my identity and the single MPK of an authority we’d both agree to use. To decrypt that message (or sign one), I would have to visit the same key authority and ask for a copy of my secret key. The key authority would compute my key based on a master secret key (MSK), which it would keep very secret.

With all algorithms and players specified, the whole system looks like this:

Overview of an Identity-Based Encryption (IBE) system. The Setup algorithm of the Key Generation Authority generates the master public key (MPK) and master secret key (MSK). The authority can use the Extract algorithm to derive the secret key corresponding to a specific ID. The encryptor (left) encrypts using only the identity and MPK. The recipient requests the secret key for her identity, and then uses it to decrypt. (Icons by Eugen Belyakoff)
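In symbols (my own notation, a generic sketch rather than any particular construction), the four algorithms in the figure look roughly like this:

\begin{align*}
\mathsf{Setup}(1^\lambda) &\rightarrow (MPK, MSK)\\
\mathsf{Extract}(MSK, ID) &\rightarrow sk_{ID}\\
\mathsf{Encrypt}(MPK, ID, m) &\rightarrow C\\
\mathsf{Decrypt}(sk_{ID}, C) &\rightarrow m
\end{align*}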

This design has some important advantages — and more than a few obvious drawbacks. On the plus side, it removes the need for any key exchange at all with the person you’re sending the message to. Once you’ve chosen a master key authority (and downloaded its MPK), you can encrypt to anyone in the entire world. Even cooler: at the time you encrypt, your recipient doesn’t even need to have contacted the key authority yet. She can obtain her secret key after I send her a message.

Of course, this “feature” is also a bug. Because the key authority generates all the secret keys, it has an awful lot of power. A dishonest authority could easily generate your secret key and decrypt your messages. The polite way to say this is that standard IBE systems effectively “bake in” key escrow.**

Putting the “E” in IBE

All of these ideas and more were laid out by Shamir in his 1984 paper. There was just one small problem: Shamir could only figure out half the problem.

Specifically, Shamir’s proposed a scheme for identity-based signature (IBS) — a signature scheme where the public verification key is an identity, but the signing key is generated by the key authority. Try as he might, he could not find a solution to the problem of building identity-based encryption (IBE). This was left as an open problem.***

It would take more than 16 years before someone answered Shamir’s challenge. Surprisingly, when the answer came it came not once but three times.

The first, and probably most famous realization of IBE was developed by Dan Boneh and Matthew Franklin much later — in 2001. The timing of Boneh and Franklin’s discovery makes a great deal of sense. The Boneh-Franklin scheme relies fundamentally on elliptic curves that support an efficient “bilinear map” (or “pairing”).**** The algorithms needed to compute such pairings were not known when Shamir wrote his paper, and weren’t employed constructively — that is, as a useful thing rather than an attack — until about 2000. The same can be said about a second scheme called Sakai-Kasahara, which would be independently discovered around the same time.

(For a brief tutorial on the Boneh-Franklin IBE scheme, see this page.)

The third realization of IBE was not as efficient as the others, but was much more surprising. This scheme was developed by Clifford Cocks, a senior cryptologist at Britain’s GCHQ. It’s noteworthy for two reasons. First, Cocks’ IBE scheme does not require bilinear pairings at all — it is based in the much older RSA setting, which means in principle it spent all those years just waiting to be found. Second, Cocks himself had recently become known for something even more surprising: discovering the RSA cryptosystem, nearly five years before RSA did. To bookend that accomplishment with a second major advance in public key cryptography was a pretty impressive accomplishment.

In the years since 2001, a number of additional IBE constructions have been developed, using all sorts of cryptographic settings. Nonetheless, Boneh and Franklin’s early construction remains among the simplest and most efficient.

Even if you’re not interested in IBE for its own sake, it turns out that this primitive is really useful to cryptographers for many things beyond simple encryption. In fact, it’s more helpful to think of IBE as a way of “pluralizing” a single public/secret master keypair into billions of related keypairs. This makes it useful for applications as diverse as blocking chosen ciphertext attacks, forward-secure public key encryption, and short signature schemes.

Attribute Based Encryption

Of course, if you leave cryptographers alone with a tool like IBE, the first thing they’re going to do is find a way to make things more complicated... er, improve on it.

One of the biggest such improvements is due to Sahai and Waters. It’s called Attribute-Based Encryption, or ABE.

The origin of this idea was not actually to encrypt with attributes. Instead Sahai and Waters were attempting to develop an Identity-Based encryption scheme that could encrypt using biometrics. To understand the problem, imagine I decide to use a biometric like your iris scan as the “identity” to encrypt a ciphertext to you. Later on you’ll ask the authority for a decryption key that corresponds to your own iris scan — and if everything matches up, you’ll be able to decrypt.

The problem is that this will almost never work.

The issue here is that biometric readings (like iris scans or fingerprint templates) are inherently error-prone. This means every scan will typically be very close, but often there will be a few bits that disagree. With standard IBE this is fatal: if the encryption identity differs from your key identity by even a single bit, decryption will not work. You’re out of luck.

Tell me this isn’t giving you nightmares.

Sahai and Waters decided that the solution to this problem was to develop a form of IBE with a “threshold gate”. In this setting, each bit of the identity is represented as a different “attribute”. Think of each of these as components you’d encrypt under — something like “bit 5 of your iris scan is a 1” and “bit 23 of your iris scan is a 0”. The encrypting party lists all of these bits and encrypts under each one. The decryption key generated by the authority embeds a similar list of bit values. The scheme is defined so that decryption will work if and only if the number of matching attributes (between your key and the ciphertext) exceeds some pre-defined threshold: e.g., any 2024 out of 2048 bits must be identical in order to decrypt.

The beautiful thing about this idea is not fuzzy IBE. It’s that once you have a threshold gate and a concept of “attributes”, you can do more interesting things. The main observation is that a threshold gate can be used to implement the boolean AND and OR gates, like so:

Even better, you can stack these gates on top of one another to assign a fairly complex boolean formula — which will itself determine what conditions your ciphertext can be decrypted under. For example, switching to a more realistic set of attributes, you could encrypt a medical record so that either a pediatrician in a hospital could read it, or an insurance adjuster could. All you’d need is to make sure people received keys that correctly described their attributes (which are just arbitrary strings, like identities).

A simple “ciphertext policy”, in which the ciphertext can be decrypted if and only if a key matches an appropriate set of attributes. In this case, the key satisfies the formula and thus the ciphertext decrypts. The remaining key attributes are ignored.

The other direction can be implemented as well. It’s possible to encrypt a ciphertext under a long list of attributes, such as creation time, file name, and even GPS coordinates indicating where the file was created. You can then have the authority hand out keys that correspond to a very precise slice of your dataset — for example, “this key decrypts any radiology file encrypted in Chicago between November 3rd and December 12th that is tagged with ‘pediatrics’ or ‘oncology'”.

Functional Encryption

Once you have a family of related primitives like IBE and ABE, the researchers’ instinct is to both extend and generalize. Why stop at simple boolean formulae? Can we make keys (or ciphertexts) that embed arbitrary computer programs? The answer, it turns out, is yes — though not terribly efficiently. A set of recent works shows that it is possible to build ABE that works over arbitrary polynomial-size circuits, using various lattice-based assumptions. So there is certainly a lot of potential here.

This potential has inspired researchers to generalize all of the above ideas into a single class of encryption called “functional encryption”. Functional encryption is more conceptual than concrete — it’s just a way to look at all of these systems as instances of a specific class. The basic idea is to represent the decryption procedure as an algorithm that computes an arbitrary function F over (1) the plaintext inside of a ciphertext, and (2) the data embedded in the key. This function has the following profile:

output = F(key data, plaintext data)

In this model IBE can be expressed by having the encryption algorithm encrypt (identity, plaintext) and defining the function F such that, if “key input == identity”, it outputs the plaintext, and outputs an empty string otherwise. Similarly, ABE can be expressed by a slightly more complex function. Following this paradigm, one can envision all sorts of interesting functions that might be computed and realized by future schemes.
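As a purely illustrative toy (plain Python, not a cryptographic construction), the IBE instance of F described above is just:

def F_ibe(key_identity, ciphertext_data):
    # key data: the identity baked into the secret key
    # plaintext data: the (identity, message) pair the sender encrypted
    identity, message = ciphertext_data
    return message if key_identity == identity else ""

print(F_ibe("matt@example.com", ("matt@example.com", "hi")))     # -> "hi"
print(F_ibe("mallory@example.com", ("matt@example.com", "hi")))  # -> ""

Real schemes compute this function cryptographically, of course; nobody ever sees the raw inputs laid out like this.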

But those will have to wait for another time. We’ve gone far enough for today.

So what’s the point of all this?

For me, the point is just to show that cryptography can do some pretty amazing things. We rarely see this on a day-to-day basis when it comes to industry and “applied” cryptography, but it’s all there waiting to be used.

Perhaps the perfect application is out there. Maybe you’ll find it.

Notes:

* An earlier version of this post said “mid-1990s”. In comments below, Tom Ristenpart takes issue with that and makes the excellent point that many important developments post-date that. So I’ve moved the date forward about five years, and I’m thinking about how to say this in a nicer way.

** There is also an intermediate form of encryption known as “certificateless encryption“. Proposed by Al-Riyami and Paterson, this idea uses a combination of standard public key encryption and IBE. The basic idea is to encrypt each message using both a traditional public key (generated by the recipient) and an IBE identity. The recipient must then obtain a copy of the secret key from the IBE authority to decrypt. The advantages here are twofold: (1) the IBE key authority can’t decrypt the message by itself, since it does not have the corresponding secret key, which solves the “escrow” problem. And (2) the sender does not need to verify that the public key really belongs to the sender (e.g., by checking a certificate), since the IBE portion prevents imposters from decrypting the resulting message. Unfortunately this system is more like traditional public key cryptography than IBE, and does not have the neat usability features of IBE.

*** A part of the challenge of developing IBE is the need to make a system that is secure against “collusion” between different key holders. For example, imagine a very simple system that has only 2-bit identities. This gives four possible identities: “00”, “01”, “10”, “11”. If I give you a key for the identity “01” and I give Bob a key for “10”, we need to ensure that you two cannot collude to produce a key for identities “00” and “11”. Many earlier proposed solutions have tried to solve this problem by gluing together standard public encryption keys in various ways (e.g., by having a separate public key for each bit of the identity and giving out the secret keys as a single “key”). However these systems tend to fail catastrophically when just a few users collude (or their keys are stolen). Solving the collusion problem is fundamentally what separates real IBE from its faux cousins.

**** A full description of Boneh and Franklin’s scheme can be found here, or in the original paper. Some code is here and here and here. I won’t spend more time on it, except to note that the scheme is very efficient. It was patented and implemented by Voltage Security, now part of HPE.


by Matthew Green at July 02, 2017 03:28 PM

Errata Security

NonPetya: no evidence it was a "smokescreen"

Many well-regarded experts claim that the not-Petya ransomware wasn't "ransomware" at all, but a "wiper" whose goal was to destroy files, without any intent at letting victims recover their files. I want to point out that there is no real evidence of this.


Certainly, things look suspicious. For one thing, it certainly targeted the Ukraine. For another thing, it made several mistakes that prevent them from ever decrypting drives. Their email account was shut down, and it corrupts the boot sector.

But these things aren't evidence, they are problems. They are things needing explanation, not things that support our preferred conspiracy theory.

The simplest, Occam's Razor explanation is that they were simple mistakes. Such mistakes are common among ransomware. We think of virus writers as professional software developers who thoroughly test their code. Decades of evidence show the opposite, that such software is of poor quality with shockingly bad bugs.

It's true that effectively, nPetya is a wiper. Matthieu Suiche does a great job describing one flaw that prevents it from working. @hasherezade does a great job explaining another flaw. But the best explanation isn't that this is intentional. Even if these bugs didn't exist, it'd still be a wiper if the perpetrators simply ignored the decryption requests. They need not intentionally make the decryption fail.

Thus, the simpler explanation is that it's simply a bug. Ransomware authors test the bits they care about, and test less well the bits they don't. It's quite plausible to believe that just before shipping the code, they'd add a few extra features, and forget to regression test the entire suite. I mean, I do that all the time with my code.

Some have pointed to the sophistication of the code as proof that such simple errors are unlikely. This isn't true. While it's more sophisticated than WannaCry, it's about average for the current state-of-the-art for ransomware in general. What people point to, such as the Petya base or the use of PsExec to spread throughout a Windows domain, is already at least a year old.

Indeed, the use of PsExec itself is a bit clumsy, when the code for doing the same thing is already public. It's just a few calls to basic Windows networking APIs. A sophisticated virus would do this itself, rather than clumsily use PsExec.

Infamy doesn't mean skill. People keep making the mistake that the more widespread something is in the news, the more skill, the more of a "conspiracy" there must be behind it. This is not true. Virus/worm writers often do newsworthy things by accident. Indeed, the history of worms, starting with the Morris Worm, has been one of things running further out of control than their authors expected.

What makes nPetya newsworthy isn't the EternalBlue exploit or the wiper feature. Instead, the creators got lucky with MeDoc. The software is used by every major organization in the Ukraine, and at the same time, their website was horribly insecure -- laughably insecure. Furthermore, its autoupdate feature didn't check cryptographic signatures. No hacker can plan for this level of widespread incompetence -- it's just extreme luck.

Thus, the effect of bumbling around is something that hit the Ukraine pretty hard, but it's not necessarily the intent of the creators. It's like how the Slammer worm hit South Korea pretty hard, or how the Witty worm hit the DoD pretty hard. These things look "targeted", especially to the victims, but it was by pure chance (provably so, in the case of Witty).

Certainly, MeDoc was targeted. But then, targeting a single organization is the norm for ransomware. They have to do it that way, giving each target a different Bitcoin address for payment. That it then spread to the entire Ukraine, and further, is the sort of thing that typically surprises worm writers.

Finally, there's little reason to believe that there needs to be a "smokescreen". Russian hackers are targeting the Ukraine all the time. Whether Russian hackers are to blame for "ransomware" vs. "wiper" makes little difference.


Conclusion

We know that Russian hackers are constantly targeting the Ukraine. Therefore, the theory that this was nPetya's goal all along, to destroy Ukraine's computers, is a good one.

Yet, there's no actual "evidence" of this. nPetya's issues are just as easily explained by normal software bugs. The smokescreen isn't needed. The boot record bug isn't needed. The single email address that was shut down isn't significant, since half of all ransomware uses the same technique.

The experts who disagree with me are really smart/experienced people who you should generally trust. It's just that I can't see their evidence.

Update: I wrote another blogpost about "survivorship bias", refuting the claim by many experts talking about the sophistication of the spreading feature.



Update: a comment asks "why is there no Internet spreading code?". The answer is "I don't know", but unanswerable questions aren't evidence of a conspiracy. "Why aren't there any stars in the background?" isn't proof the moon landings are fake just because you can't answer the question. One guess is that you never want ransomware to spread that far, until you've figured out how to get payment from so many people.







by Robert Graham (noreply@blogger.com) at July 02, 2017 12:22 AM

Yet more reasons to disagree with experts on nPetya

In WW II, they looked at planes returning from bombing missions that were shot full of holes. Their natural conclusion was to add more armor to the sections that were damaged, to protect them in the future. But wait, said the statisticians. The original damage is likely spread evenly across the plane. Damage on returning planes indicates where they could take damage and still return. The undamaged areas are where the planes that didn't return were hit. Thus, it's the undamaged areas you need to protect.

This is called survivorship bias.

Many experts are making the same mistake with regards to the nPetya ransomware. 

I hate to point this out, because they are all experts I admire and respect, especially @MalwareJake, but it's still an error. An example is this tweet:


The context of this tweet is the discussion of why nPetya was well written with regards to spreading, but full of bugs with regards to collecting on the ransom. The conclusion, therefore, is that it wasn't intended to be ransomware, but was intended simply to be a "wiper", to cause destruction.

But this is just survivorship bias. If nPetya had been written the other way, with excellent ransomware features and poor spreading, we would not now be talking about it. Even that initial seeding with the trojaned MeDoc update wouldn't have spread it far enough.

In other words, all malware samples we get are good at spreading, either on their own, or because the creator did a good job seeding them. It's because we never see the ones that didn't spread.


With regards to nPetya, a lot of experts are making this claim: since it spread so well but had hopelessly crippled ransomware features, that must have been the intent all along. Yet, as we see from survivorship bias, none of us would've seen nPetya had it not been for the spreading feature.










by Robert Graham (noreply@blogger.com) at July 02, 2017 12:21 AM

June 28, 2017

Steve Kemp's Blog

Yet more linux security module craziness ..

I've recently been looking at linux security modules. My first two experiments helped me learn:

My First module - whitelist_lsm.c

This looked for the presence of an xattr, and if present allowed execution of binaries.

I learned about the Kernel build-system, and how to write a simple LSM.

My second module - hashcheck_lsm.c

This looked for the presence of a "known-good" SHA1 hash xattr, and if it matched the actual hash of the file on-disk allowed execution.

I learned how to hash the contents of a file, from kernel-space.

Both allowed me to learn things, but both were a little pointless. They were not fine-grained enough to allow different things to be done by different users. (i.e. If you allowed "alice" to run "wget" you'd also allow www-data to do the same.)

So, assuming you wanted to do your security job more neatly what would you want? You'd want to allow/deny execution of commands based upon:

  • The user who was invoking them.
  • The path of the binary itself.

So your local users could run "bad" commands, but "www-data" (post-compromise) couldn't.

Obviously you don't want to have to recompile your kernel to change the rules of who can execute what. So you think to yourself "I'll write those rules down in a file". But of course reading a file from kernel-space is tricky. And parsing any list of rules, in a file, from kernel-space would be prone to buffer-related problems.

So I had a crazy idea:

  • When a user attempts to execute a program.
  • Call back to user-space to see if that should be permitted.
    • Give the user-space binary the UID of the invoker, and the path to the command they're trying to execute.

Calling userspace? Every time a command is to be executed? Crazy. But it just might work.

One problem I had with this approach is that userspace might not even be available, when you're booting. So I setup a flag to enable this stuff:

# echo 1 >/proc/sys/kernel/can-exec/enabled

Now the kernel will invoke the following on every command:

/sbin/can-exec $UID $PATH

Because the kernel waits for this command to complete - as it reads the exit-code - you cannot execute any child-processes from it as you'd end up in recursive hell, but you can certainly read files, write to syslog, etc. My initial implementation was as basic as this:

int main( int argc, char *argv[] )
{
...

  // Get the UID + Program
  int uid = atoi( argv[1] );
  char *prg = argv[2];

  // syslog
  openlog ("can-exec", LOG_CONS | LOG_PID | LOG_NDELAY, LOG_LOCAL1);
  syslog (LOG_NOTICE, "UID:%d CMD:%s", uid, prg );

  // root can do all.
  if ( uid == 0 )
    return 0;

  // nobody
  if ( uid == 65534 ) {
    if ( ( strcmp( prg , "/bin/sh" ) == 0 ) ||
         ( strcmp( prg , "/usr/bin/id" ) == 0 ) ) {
      fprintf(stderr, "Allowing 'nobody' access to shell/id\n" );
      return 0;
    }
  }

  fprintf(stderr, "Denied\n" );
  return -1;
}

Although the UIDs are hard-coded, it actually worked! Yay!

I updated the code to convert the UID to a username, then check executables via the file /etc/can-exec/$USERNAME.conf, and this also worked.
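The post doesn't show what goes inside those files, but conceptually each one is just a per-user whitelist of absolute paths. A hypothetical /etc/can-exec/nobody.conf mirroring the hard-coded rules above might be nothing more than:

/bin/sh
/usr/bin/id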

I don't expect anybody to actually use this code, but I do think I've reached a point where I can pretend I've written a useful (or non-pointless) LSM at last. That means I can stop.

June 28, 2017 09:00 PM

June 21, 2017

Vincent Bernat

IPv4 route lookup on Linux

TL;DR: With its implementation of IPv4 routing tables using LPC-tries, Linux offers good lookup performance (50 ns for a full view) and low memory usage (64 MiB for a full view).


During the lifetime of an IPv4 datagram inside the Linux kernel, one important step is the route lookup for the destination address through the fib_lookup() function. From essential information about the datagram (source and destination IP addresses, interfaces, firewall mark, …), this function should quickly provide a decision. Some possible options are:

  • local delivery (RTN_LOCAL),
  • forwarding to a supplied next hop (RTN_UNICAST),
  • silent discard (RTN_BLACKHOLE).

Since 2.6.39, Linux stores routes in a compressed prefix tree (commit 3630b7c050d9). In the past, a route cache was maintained, but it was removed[1] in Linux 3.6.

Route lookup in a trie

Looking up a route in a routing table means finding the most specific prefix matching the requested destination. Let's assume the following routing table:

$ ip route show scope global table 100
default via 203.0.113.5 dev out2
192.0.2.0/25
        nexthop via 203.0.113.7  dev out3 weight 1
        nexthop via 203.0.113.9  dev out4 weight 1
192.0.2.47 via 203.0.113.3 dev out1
192.0.2.48 via 203.0.113.3 dev out1
192.0.2.49 via 203.0.113.3 dev out1
192.0.2.50 via 203.0.113.3 dev out1

Here are some examples of lookups and the associated results:

Destination IP Next hop
192.0.2.49 203.0.113.3 via out1
192.0.2.50 203.0.113.3 via out1
192.0.2.51 203.0.113.7 via out3 or 203.0.113.9 via out4 (ECMP)
192.0.2.200 203.0.113.5 via out2

A common structure for route lookup is the trie, a tree structure where each node's key has its parent's key as a prefix.

Lookup with a simple trie

The following trie encodes the previous routing table:

Simple routing trie

For each node, the prefix is known by its path from the root node and the prefix length is the current depth.

A lookup in such a trie is quite simple: at each step, fetch the nth bit of the IP address, where n is the current depth. If it is 0, continue with the first child. Otherwise, continue with the second. If a child is missing, backtrack until a routing entry is found. For example, when looking for 192.0.2.50, we will find the result in the corresponding leaf (at depth 32). However for 192.0.2.51, we will reach 192.0.2.50/31 but there is no second child. Therefore, we backtrack until the 192.0.2.0/25 routing entry.
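
In code, such a lookup is short. The following sketch uses illustrative structures (not the kernel's) and, instead of physically backtracking, simply remembers the most specific entry seen on the way down, which is also what Linux does (see footnote 3):

#include <stdint.h>
#include <stddef.h>

struct route;                      /* an opaque routing entry */

struct node {
    struct node *child[2];         /* child[0] for a 0 bit, child[1] for a 1 bit */
    const struct route *entry;     /* non-NULL if a prefix ends at this node */
};

static const struct route *lpm_lookup(const struct node *root, uint32_t dst)
{
    const struct node *n = root;
    const struct route *best = NULL;   /* most specific entry seen so far */
    int depth;

    for (depth = 0; n != NULL; depth++) {
        if (n->entry)
            best = n->entry;
        if (depth == 32)
            break;                     /* no more bits to test */
        /* fetch the depth-th bit of the address, most significant bit first */
        n = n->child[(dst >> (31 - depth)) & 1];
    }
    return best;                       /* NULL means no prefix matched */
}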

Adding and removing routes is quite easy. From a performance point of view, the lookup is done in constant time relative to the number of routes (since the maximum depth is capped at 32).

Quagga is an example of routing software still using this simple approach.

Lookup with a path-compressed trie

In the previous example, most nodes only have one child. This leads to a lot of unneeded bitwise comparisons and memory is also wasted on many nodes. To overcome this problem, we can use path compression: each node with only one child is removed (except if it also contains a routing entry). Each remaining node gets a new property telling how many input bits should be skipped. Such a trie is also known as a Patricia trie or a radix tree. Here is the path-compressed version of the previous trie:

Patricia trie

Since some bits have been ignored, on a match, a final check is executed to ensure all bits from the found entry match the input IP address. If not, we must act as if the entry wasn't found (and backtrack to find a matching prefix). The following figure shows two IP addresses matching the same leaf:

Lookup in a Patricia trie

The reduction in the average depth of the tree compensates for the need to handle those false positives. Insertion and deletion of a routing entry are still easy enough.
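
That final check is cheap; a sketch of it (with illustrative names, not the kernel's code) is just a masked comparison of the candidate's first prefix-length bits:

#include <stdint.h>

/* Illustrative sketch of the final check: compare only the first plen bits
 * of the destination with the candidate entry's key. */
static int prefix_matches(uint32_t dst, uint32_t key, unsigned int plen)
{
    if (plen == 0)
        return 1;                            /* 0.0.0.0/0 matches everything */
    return ((dst ^ key) >> (32 - plen)) == 0;
}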

Many routing systems use Patricia trees.

Lookup with a level-compressed trie

In addition to path compression, level compression[2] detects parts of the trie that are densely populated and replaces them with a single node and an associated vector of 2^k children. This node will handle k input bits instead of just one. For example, here is a level-compressed version of our previous trie:

Level-compressed trie

Such a trie is called an LC-trie or LPC-trie and offers better lookup performance than a radix tree.

A heuristic is used to decide how many bits a node should handle. On Linux, if the ratio of non-empty children to all children would be above 50% when the node handles an additional bit, the node gets this additional bit. On the other hand, if the current ratio is below 25%, the node gives up responsibility for one bit. These values are not tunable.

Insertion and deletion become more complex, but lookup times are also improved.

Implementation in Linux

The implementation for IPv4 in Linux exists since 2.6.13 (commit 19baf839ff4a) and is enabled by default since 2.6.39 (commit 3630b7c050d9).

Here is the representation of our example routing table in memory[3]:

Memory representation of a trie

Several structures are involved in this representation.

The trie can be retrieved through /proc/net/fib_trie:

$ cat /proc/net/fib_trie
Id 100:
  +-- 0.0.0.0/0 2 0 2
     |-- 0.0.0.0
        /0 universe UNICAST
     +-- 192.0.2.0/26 2 0 1
        |-- 192.0.2.0
           /25 universe UNICAST
        |-- 192.0.2.47
           /32 universe UNICAST
        +-- 192.0.2.48/30 2 0 1
           |-- 192.0.2.48
              /32 universe UNICAST
           |-- 192.0.2.49
              /32 universe UNICAST
           |-- 192.0.2.50
              /32 universe UNICAST
[...]

For internal nodes, the numbers after the prefix are:

  1. the number of bits handled by the node,
  2. the number of full children (they only handle one bit),
  3. the number of empty children.

Moreover, if the kernel was compiled with CONFIG_IP_FIB_TRIE_STATS, some interesting statistics are available in /proc/net/fib_triestat[4]:

$ cat /proc/net/fib_triestat
Basic info: size of leaf: 48 bytes, size of tnode: 40 bytes.
Id 100:
        Aver depth:     2.33
        Max depth:      3
        Leaves:         6
        Prefixes:       6
        Internal nodes: 3
          2: 3
        Pointers: 12
Null ptrs: 4
Total size: 1  kB
[...]

When a routing table is very dense, a node can handle many bits. For example, a densely populated routing table with 1 million entries packed in a /12 can have one internal node handling 20 bits. In this case, route lookup is essentially reduced to a lookup in a vector.
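
A sketch of that lookup step (with illustrative names rather than the kernel's fields) shows why it is so cheap:

#include <stdint.h>

/* Illustrative sketch: at an internal node handling "bits" bits starting at
 * bit position "pos" (counted from the most significant bit), the next child
 * is found by extracting those bits from the destination address and using
 * them as an index into the node's child vector. */
static unsigned int child_index(uint32_t dst, unsigned int pos, unsigned int bits)
{
    return (dst >> (32 - pos - bits)) & ((1u << bits) - 1u);
}

For the /12 example above, such a node would have pos = 12 and bits = 20, so finding the next child boils down to one shift-and-mask and one array access.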

The following graph shows the number of internal nodes used relative to the number of routes for different scenarios (routes extracted from an Internet full view, /32 routes spread over 4 different subnets with various densities). When routes are densely packed, the number of internal nodes is quite limited.

Internal nodes and null pointers

Performance

So how fast is a route lookup? The maximum depth stays low (about 6 for a full view), so a lookup should be quite fast. With the help of a small kernel module, we can accurately benchmark[5] the fib_lookup() function:

Maximum depth and lookup time

The lookup time is loosely tied to the maximum depth. When the routing table is densely populated, the maximum depth is low and the lookup times are fast.

When forwarding at 10 Gbps, the time budget for a packet would be about 50 ns. Since this is also the time needed for the route lookup alone in some cases, we wouldn’t be able to forward at line rate with only one core. Nonetheless, the results are pretty good and they are expected to scale linearly with the number of cores.

The measurements are done with a Linux kernel 4.11 from Debian unstable. I have gathered performance metrics across kernel versions in “Performance progression of IPv4 route lookup on Linux”.

Another interesting figure is the time it takes to insert all those routes into the kernel. Linux is also quite efficient in this area since you can insert 2 million routes in less than 10 seconds:

Insertion time

Memory usage

The memory usage is available directly in /proc/net/fib_triestat. The statistics provided don't account for the fib_info structures, but you should only have a handful of them (one for each possible next hop). As you can see in the graph below, memory use grows linearly with the number of routes inserted, regardless of the shape of the routes.

Memory usage

The results are quite good. With only 256 MiB, about 2 million routes can be stored!

Routing rules

Unless the kernel is compiled without CONFIG_IP_MULTIPLE_TABLES, Linux supports several routing tables and has a system of configurable rules to select the table to use. These rules can be configured with ip rule. By default, there are three of them:

$ ip rule show
0:      from all lookup local
32766:  from all lookup main
32767:  from all lookup default

Linux first looks for a match in the local table. If it doesn't find one, it looks in the main table and, as a last resort, in the default table.

Builtin tables

The local table contains routes for local delivery:

$ ip route show table local
broadcast 127.0.0.0 dev lo proto kernel scope link src 127.0.0.1
local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1
local 127.0.0.1 dev lo proto kernel scope host src 127.0.0.1
broadcast 127.255.255.255 dev lo proto kernel scope link src 127.0.0.1
broadcast 192.168.117.0 dev eno1 proto kernel scope link src 192.168.117.55
local 192.168.117.55 dev eno1 proto kernel scope host src 192.168.117.55
broadcast 192.168.117.63 dev eno1 proto kernel scope link src 192.168.117.55

This table is populated automatically by the kernel when addresses are configured. Let's look at the last three lines. When the IP address 192.168.117.55 was configured on the eno1 interface, the kernel automatically added the appropriate routes:

  • a route for 192.168.117.55 for local unicast delivery to the IP address,
  • a route for 192.168.117.63 for broadcast delivery to the broadcast address,
  • a route for 192.168.117.0 for broadcast delivery to the network address.

When 127.0.0.1 was configured on the loopback interface, the same kind of routes were added to the local table. However, a loopback address receives special treatment and the kernel also adds the whole subnet to the local table. As a result, you can ping any IP in 127.0.0.0/8:

$ ping -c1 127.42.42.42
PING 127.42.42.42 (127.42.42.42) 56(84) bytes of data.
64 bytes from 127.42.42.42: icmp_seq=1 ttl=64 time=0.039 ms

--- 127.42.42.42 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.039/0.039/0.039/0.000 ms

The main table usually contains all the other routes:

$ ip route show table main
default via 192.168.117.1 dev eno1 proto static metric 100
192.168.117.0/26 dev eno1 proto kernel scope link src 192.168.117.55 metric 100

The default route has been configured by some DHCP daemon. The connected route (scope link) has been automatically added by the kernel (proto kernel) when configuring an IP address on the eno1 interface.

The default table is empty and has little use. It was kept when the current incarnation of advanced routing was introduced in Linux 2.1.68, after a first attempt using “classes” in Linux 2.1.15[6].

Performance

Since Linux 4.1 (commit 0ddcf43d5d4a), when the set of rules is left unmodified, the main and local tables are merged and the lookup is done with this single table (and the default table if not empty). Moreover, since Linux 3.0 (commit f4530fa574df), without specific rules, there is no performance hit when enabling the support for multiple routing tables. However, as soon as you add new rules, some CPU cycles will be spent for each datagram to evaluate them. Here are a couple of graphs demonstrating the impact of routing rules on lookup times:

Routing rules impact on performance

For some reason, the relation is linear when the number of rules is between 1 and 100 but the slope increases noticeably past this threshold. The second graph highlights the negative impact of the first rule (about 30 ns).

A common use of rules is to create virtual routers: interfaces are segregated into domains and when a datagram enters through an interface from domain A, it should use routing table A:

# ip rule add iif vlan457 table 10
# ip rule add iif vlan457 blackhole
# ip rule add iif vlan458 table 20
# ip rule add iif vlan458 blackhole

The blackhole rules may be removed if you are sure there is a default route in each routing table. For example, we can add a blackhole default route with a high metric so that it does not override a regular default route:

# ip route add blackhole default metric 9999 table 10
# ip route add blackhole default metric 9999 table 20
# ip rule add iif vlan457 table 10
# ip rule add iif vlan458 table 20

To reduce the impact on performance when many interface-specific rules are used, interfaces can be attached to VRF instances and a single rule can be used to select the appropriate table:

# ip link add vrf-A type vrf table 10
# ip link set dev vrf-A up
# ip link add vrf-B type vrf table 20
# ip link set dev vrf-B up
# ip link set dev vlan457 master vrf-A
# ip link set dev vlan458 master vrf-B
# ip rule show
0:      from all lookup local
1000:   from all lookup [l3mdev-table]
32766:  from all lookup main
32767:  from all lookup default

The special l3mdev-table rule was automatically added when configuring the first VRF interface. This rule will select the routing table associated with the VRF owning the input (or output) interface.

VRF was introduced in Linux 4.3 (commit 193125dbd8eb), the performance was greatly enhanced in Linux 4.8 (commit 7889681f4a6c) and the special routing rule was also introduced in Linux 4.8 (commit 96c63fa7393d, commit 1aa6c4f6b8cd). You can find more details about it in the kernel documentation.

Conclusion

The takeaways from this article are:

  • route lookup times hardly increase with the number of routes,
  • densely packed /32 routes lead to amazingly fast route lookups,
  • memory use is low (128 MiB per million routes),
  • no optimization is done on routing rules.

  1. The routing cache was subject to reasonably easy to launch denial-of-service attacks. It was also believed not to be efficient for high-volume sites like Google, but I have first-hand experience that this was not the case for moderately high-volume sites. 

  2. “IP-address lookup using LC-tries”, IEEE Journal on Selected Areas in Communications, 17(6):1083-1092, June 1999. 

  3. For internal nodes, the key_vector structure is embedded into a tnode structure. This structure contains information rarely used during lookup, notably the reference to the parent that is usually not needed for backtracking as Linux keeps the nearest candidate in a variable. 

  4. One leaf can contain several routes (struct fib_alias is a list). The number of “prefixes” can therefore be greater than the number of leaves. The system also keeps statistics about the distribution of the internal nodes relative to the number of bits they handle. In our example, all three internal nodes handle 2 bits. 

  5. The measurements are done in a virtual machine with one vCPU. The host is an Intel Core i5-4670K running at 3.7 GHz during the experiment (CPU governor was set to performance). The benchmark is single-threaded. It runs a warm-up phase, then executes about 100,000 timed iterations and keeps the median. Timings of individual runs are computed from the TSC.

  6. Fun fact: the documentation of this first attempt at more flexible routing is still available in today's kernel tree and explains the usage of the “default class”.

by Vincent Bernat at June 21, 2017 09:19 AM

June 20, 2017

Sean's IT Blog

Introducing Horizon 7.2

What's new in Horizon 7.2?  A lot, actually.  There are some big features that will impact customers greatly, some beta features that are now generally available, and some scalability enhancements. 

Horizon 7 Helpdesk Tool

One of the big drawbacks of Horizon is that it doesn’t have a very good tool for Helpdesk staff to support the environment.  While Horizon Administrator has role-based access control to limit what non-administrators can access and change, it doesn’t provide a good tool to enable a helpdesk technician to look up what desktop a user is in, grab some key metrics about their session, and remote in to assist the user.

VMware has provided a fling that can do some of this – the Horizon Toolbox.  But flings aren’t officially supported and aren’t always updated to support the latest version. 

Horizon 7.2 addresses this issue with the addition of the Horizon Helpdesk tool.  Horizon Helpdesk is an HTML5-based web interface designed for level 1 and 2 support desk engineers.  Unlike Citrix Director, which offers similar features for XenApp and XenDesktop environments, Horizon Helpdesk is integrated into the Connection Servers and installed by default.

Horizon Helpdesk provides a tool for helpdesk technicians who are supporting users with Horizon virtual desktops and published applications.  They're able to log into the web-based tool, search for specific users, and view their active sessions and desktop and application entitlements.  Technicians can view details about existing sessions, including various metrics from within the virtual desktops, and use the tool to launch a remote assistance session.

Skype for Business Support is Now GA

The Horizon Virtualization Pack for Skype for Business was released under Technical Preview with Horizon 7.1.  It’s now fully supported on Windows endpoints in Horizon 7.2, and a private beta will be available for Linux-based clients soon. 

The Virtualization Pack allows Horizon clients to offload some of the audio and video features of Skype for business to the endpoint, and it provides optimized communication paths directly between the endpoints for multimedia calls.  What exactly does that mean?  Well, prior to the virtualization pack, multimedia streams would need to be sent from the endpoint to the virtual desktop, processed inside the VM, and then sent to the recipient.  The virtualization pack offloads the multimedia processing to the endpoint, and all media is sent between endpoints using a separate point-to-point stream.  This happens outside of the normal display protocol virtual channels.


Skype for Business has a number of PBX features, but not all of these features are supported in the first production-ready version of this plugin.  The following features are not supported:

  • Multiparty Conferencing
  • E911 Location Services
  • Call to Response Group
  • Call Park and Call Pickup from Park
  • Call via X (Work/Home, Cell, etc)
  • Customized Ringtones
  • Meet Now Conferencing
  • Call Recording

Instant Clone Updates

A few new features have been added to Horizon Instant Clones.  These new features are the ability to reuse existing Active Directory Computer accounts, support for configuring SVGA settings – such as video memory, resolution and number of monitors supported –  on the virtual desktop parent, and support for placing Instant Clones on local datastores.

Scalability Updates

VMware has improved Horizon scalability in each of the previous releases, usually by expanding the number of connections, Horizon Pods, and sites that are supported in a Cloud Pod Architecture.  This release is no exception.  CPA now supports up to 120,000 sessions across 12 Horizon Pods in five sites.  While the number of sites has not increased, the number of supported sessions has increased by 40,000 compared to Horizon 7.1.

This isn’t the only scalability enhancement in Horizon 7.2.  The largest enhancement is in the number of desktops that will be supported per vCenter Server.  Prior to this version of Horizon, only 2000 desktops of any kind were supported per vCenter.  This means that a Horizon Pod that supported that maximum number of sessions would require five vCenter servers.  Horizon 7.2 doubles the number of supported desktops per vCenter, meaning that each vCenter Server can now support up to 4000 desktops of any type. 

Other New Features

Some of the other new features in Horizon 7.2 are:

  • Support for deploying Full Clone desktops to DRS Storage Clusters
  • A Workspace One mode to configure access policies around desktops and published applications
  • Client Drive Redirection and USB Redirection are now supported on Linux
  • The ability to rebuild full clone desktops and reuse existing machine accounts.

Key Takeaways

The major features of Horizon 7.2 are very attractive, and they address things that customers have been asking about for a while.  If you're planning a greenfield environment, or it's been a while since you've upgraded, there are a number of compelling reasons to look at implementing this version.


by seanpmassey at June 20, 2017 05:17 PM

June 19, 2017

Sarah Allen

customer-driven development

Sometimes when I talk about customer use cases for software that I’m building, it confuses the people I work with. What does that have to do with engineering? Is my product manager out on leave and I’m standing in?

I’ve found that customer stories connect engineers to the problems we’re solving. When we’re designing a system, we ask ourselves “what if” questions all the time. We need to explore the bounds of the solutions we’re creating, understanding the edge cases. We consider the performance characteristics, scalability requirements and all sorts of other important technical details. Through all this we imagine how the system will interact. It is easier when the software has a human interface, when it interacts with regular people. It is harder, but just as important, when we write software that only interacts with other software systems.

Sometimes the software that we are building can seem quite unrelated to the human world, but it isn't at all. We then need to understand the bigger system we're building. At some point, the software has a real-world impact, and we need to understand that, or we can miss creating the positive effects we intend.

On many teams over many years, I’ve had the opportunity to work with other engineers who get this. When it works there’s a rhythm to it, a heartbeat that I stop hearing consciously because it is always there. We talk to customers early and often. We learn about their problems, the other technologies they use, and what is the stuff they understand that we don’t. We start describing our ideas for solutions in their own words. This is not marketing. This influences what we invent. Understanding how to communicate about the thing we’re building changes what we build. We imagine this code we will write, that calls some other code, which causes other software to do a thing, and through all of the systems and layers there is some macro effect, which is important and time critical. Our software may have the capability of doing a thousand things, but we choose the scenarios for performance testing, we decide what is most normal, most routine, and that thing needs to be tied directly to those real effects.

Sometimes we refer to “domain knowledge” if our customers have special expertise, and we need to know that stuff, at least a little bit, so we can relate to our customers (and it’s usually pretty fun to explore someone else’s world a bit). However, the most important thing our customers know, that we need to discover, is what will solve their problems. They don’t know it in their minds — what they describe that they think will solve their problems often doesn’t actually. They know it when they feel it, when they somehow interact with our software and it gives them agency and amplifies their effect in the world.

Our software works when it empowers people to solve their problems and create impact. We can measure that. We can watch that happen. For those of us who have experienced that as software engineers, there’s no going back. We can’t write software any other way.

Customer stories, first hand knowledge from the people whose problems I’m solving spark my imagination, but I’m careful to use those stories with context from quantitative data. I love the product managers who tell me about rapidly expanding markets and how they relate to the use cases embedded in those stories, and research teams who validate whether the story I heard the other day is common across our customers or an edge case. We build software on a set of assumptions. Those assumptions need to be based on reality, and we need to validate early and often whether the thing we’re making is actually going to have a positive effect on reality.

Customer stories are like oxygen to a development team that works like this. Research and design teams who work closely with product managers are like water. When other engineers can't explain the customer use cases for their software, when they argue about what the solution should be based only on the technical requirements, sometimes I can't breathe. Then I find the people who are like me, who work like this, and I can hear a heartbeat, I can breathe again, and it feels like this thing we are making just might be awesome.

by sarah at June 19, 2017 01:08 PM

June 18, 2017

Debian Administration

Debian Stretch Released

Today the Debian project is pleased to announce the release of the next stable release of Debian GNU/Linux, code-named Stretch.

by Steve at June 18, 2017 03:44 AM

June 17, 2017

OpenSSL

Removing Some Code

This is another update on our effort to re-license the OpenSSL software. Our previous post in March was about the launch of our effort to reach all contributors, with the hope that they would support this change.

So far, about 40% of the people have responded. For a project that is as old as OpenSSL (including its predecessor, SSLeay, it’s around 20 years) that’s not bad. We’ll be continuing our efforts over the next couple of months to contact everyone.

Of those people responding, the vast majority have been in favor of the license change – fewer than a dozen objected. This post describes what we're doing about those objections and how we came to our conclusions. The goal is to be very transparent and open with our processes.

First, we want to mention that we respect their decision. While it is inconvenient to us, we acknowledge their rights to keep their code under the terms that they originally agreed to. We are asking permission to change the license terms, and it is fully within their rights to say no.

The license website is driven by scripts that are freely available in the license section of our tools repository on GitHub. When we started, we imported every single git commit, looked for anything that resembled an email address, and created tables for each commit, each user we found, and a log that connects users to commits. This did find false positives: sometimes email Message-ID’s showed up, and there were often mentions of folks just as a passing side-comment, or because they were in the context diff. (That script is also in the tools repository, of course.)

The user table also has columns to record approval or rejection of the license change, and comments. Most of the comments have been supportive, and (a bit surprisingly) only one or two were obscene.

The whattoremove script finds the users who refused, and all commits they were named in. We then examined each commit – there were 86 in all – and filtered out those that were cherry-picks to other releases. We are planning on only changing the license for the master branch, so the other releases branches aren’t applicable. There were also some obvious false positives. At the end of this step, we had 43 commits to review.

We then re-read every single commit, looking for what we could safely ignore as not relevant. We found the following:

  • Nine of them showed up only because the person was mentioned in a comment.
  • Sixteen of them were changes to the OpenBSD configuration data. The OpenSSL team had completely rewritten this, refactoring all BSD configurations into a common hierarchy, and the config stuff changed a great deal for 1.1.0.
  • Seven were not copyrightable because they were minimal changes (e.g., fixing a typo in a comment).
  • One was a false positive.

This left us with 10 commits. Two of them were about the CryptoDev engine. We are removing that engine, as can be seen in this PR, but we expect to have a replacement soon (for at least Linux and FreeBSD). As for the other commits, we are removing that code, as can be seen in this first PR and this second PR. Fortunately, none of them are particularly complex.

Copyright, however, is a complex thing – at times it makes debugging look simple. If there is only one way to write something, then that content isn’t copyrightable. If the work is very small, such as a one-line patch, then it isn’t copyrightable. We aren’t lawyers, and have no wish to become such, but those viewpoints are valid. It is not official legal advice, however.

For the next step, we hope that we will be able to “put back” the code that we removed. We are looking to the community for help, and encourage you to create the appropriate GitHub pull requests. We ask the community to look at that second PR and provide fixes.

And finally, we ask everyone to look at the list of authors and see if their name, or the name of anyone they know, is listed. If so please email us.

June 17, 2017 12:00 PM

June 14, 2017

Colin Percival

Oil changes, safety recalls, and software patches

Every few months I get an email from my local mechanic reminding me that it's time to get my car's oil changed. I generally ignore these emails; it costs time and money to get this done (I'm sure I could do it myself, but the time it would cost is worth more than the money it would save) and I drive little enough — about 2000 km/year — that I'm not too worried about the consequences of going for a bit longer than nominally advised between oil changes. I do get oil changes done... but typically once every 8-12 months, rather than the recommended 4-6 months. From what I've seen, I don't think I'm alone in taking a somewhat lackadaisical approach to routine oil changes.

June 14, 2017 04:40 AM

June 13, 2017

OpenSSL

New Committers

We announced back in October that we would be changing from a single OpenSSL Project Team to having an OpenSSL management committee and a set of committers, which we planned to expand to enable greater involvement from the community.

Now that we have in place committer guidelines, we have invited the first set of external (non-OMC) community members to become committers and they have each accepted the invitation.

What this means for the OpenSSL community is that there is now a larger group of active developers who have the ability to review and commit code to our source code repository. These new committers will help the existing committers with our review process (i.e., their reviews count towards approval) which we anticipate will help us keep on top of the github issues and pull request queues.

The first group of non-OMC committers (in alphabetical order) are:

As always, we welcome comments on submissions and technical reviews of pull requests from anyone in the community.

Note: All submissions must be reviewed and approved by at least two committers, one of whom must also be an OMC member as described in the project bylaws.

June 13, 2017 12:00 PM

pagetable

Building the Original Commodore 64 KERNAL Source

Many reverse-engineered versions of “KERNAL”, the C64’s ROM operating system, exist, and some of them even come in a form that can be built into the original binary. But how about building the original C64 KERNAL source with the original tools?

The Commodore engineer Dennis Jarvis rescued many disks with Commodore source, which he gave to Steve Gray for preservation. One of these disks, c64kernal.d64, contains the complete source of the original KERNAL version (901227-01), including all build tools. Let’s build it!

Building KERNAL

First, we need a Commodore PET. Any PET will do, as long as it has enough RAM. 32 KB should be fine. For a disk drive, a 2040 or 4040 is recommended if you want to create the cross-reference database (more RAM needed for many open files), otherwise a 2031 will do just fine. The VICE emulator will allow you to configure such a system.

First you should delete the existing cross-reference files on disk. With the c1541 command line tool that comes with VICE, this can be done like this:

c1541 c64kernal.d64 -delete xr* -validate

Now attach the disk image and either have the emulator autostart it, or

LOAD"ASX4.0",8
RUN

This will run the “PET RESIDENT ASSEMBLER”. Now answer the questions like this:

PET RESIDENT ASSEMBLER V102780
(C) 1980 BY COMMODORE BUSINESS MACHINES

OBJECT FILE (CR OR D:NAME): KERNAL.HEX
HARD COPY (CR/Y OR N)? N
CROSS REFERENCE (CR/NO OR Y)? N
SOURCE FILE NAME? KERNAL

The assembler will now do its work and create the file “KERNAL.HEX” on disk. This will take about 16 minutes, so you probably want to switch your emulator into “Warp Mode”. When it is done, quit the emulator to make sure the contents of the disk image are flushed.

Using c1541, we can extract the output of the assembler:

c1541 c64kernal.d64 -read kernal.hex

The file is in MOS Technology Hex format and can be converted using srecord:

tr '\r' '\n' < kernal.hex > kernal_lf.hex
srec_cat kernal_lf.hex -MOS_Technologies \
-offset -0xe000 \
-fill 0xaa 0x0000 0x1fff \
-o kernal.bin -Binary

This command makes sure to fill the gaps with a value of 0xAA like the original KERNAL ROM.

The resulting file is identical to 901227-01 except for the following:

  • $E000-$E4AB is empty instead of containing the overflow BASIC code
  • $E4AC does not contain the checksum (the BASIC program “chksum” on the disk is used to calculate it)
  • $FFF6 does not contain the “RRBY” signature

Also note that $FF80, the KERNAL version byte, is $AA – the first version of the KERNAL did not yet use this location for the version byte, so the effective version byte is the fill byte.

Browsing the Source

An ASCII/LF version of the source code is available at https://github.com/mist64/cbmsrc.

;****************************************
;*                                      *
;* KK  K EEEEE RRRR  NN  N  AAA  LL     *
;* KK KK EE    RR  R NNN N AA  A LL     *
;* KKK   EE    RR  R NNN N AA  A LL     *
;* KKK   EEEE  RRRR  NNNNN AAAAA LL     *
;* KK K  EE    RR  R NN NN AA  A LL     *
;* KK KK EE    RR  R NN NN AA  A LL     *
;* KK KK EEEEE RR  R NN NN AA  A LLLLL  *
;*                                      *
;***************************************
;
;***************************************
;* PET KERNAL                          *
;*   MEMORY AND I/O DEPENDENT ROUTINES *
;* DRIVING THE HARDWARE OF THE         *
;* FOLLOWING CBM MODELS:               *
;*   COMMODORE 64 OR MODIFED VIC-40    *
;* COPYRIGHT (C) 1982 BY               *
;* COMMODORE BUSINESS MACHINES (CBM)   *
;***************************************
.SKI 3
;****LISTING DATE --1200 14 MAY 1982****
.SKI 3
;***************************************
;* THIS SOFTWARE IS FURNISHED FOR USE  *
;* USE IN THE VIC OR COMMODORE COMPUTER*
;* SERIES ONLY.                        *
;*                                     *
;* COPIES THEREOF MAY NOT BE PROVIDED  *
;* OR MADE AVAILABLE FOR USE ON ANY    *
;* OTHER SYSTEM.                       *
;*                                     *
;* THE INFORMATION IN THIS DOCUMENT IS *
;* SUBJECT TO CHANGE WITHOUT NOTICE.   *
;*                                     *
;* NO RESPONSIBILITY IS ASSUMED FOR    *
;* RELIABILITY OF THIS SOFTWARE. RSR   *
;*                                     *
;***************************************
.END

by Michael Steil at June 13, 2017 07:44 AM

June 01, 2017

Anton Chuvakin - Security Warrior

Monthly Blog Round-Up – May 2017

Here is my next monthly "Security Warrior" blog round-up of top 5 popular posts/topics this month:
  1. “New SIEM Whitepaper on Use Cases In-Depth OUT!” (dated 2010) presents a whitepaper on select SIEM use cases described in depth with rules and reports [using now-defunct SIEM product]; also see this SIEM use case in depth and this for a more current list of popular SIEM use cases. Finally, see our 2016 research on developing security monitoring use cases here!
  2. “Simple Log Review Checklist Released!” is often at the top of this list – this aging checklist is still a very useful tool for many people. “On Free Log Management Tools” (also aged a bit by now) is a companion to the checklist (updated version)
  3. “Why No Open Source SIEM, EVER?” contains some of my SIEM thinking from 2009. Is it relevant now? You be the judge.  Succeeding with SIEM requires a lot of work, whether you paid for the software, or not. BTW, this post has an amazing “staying power” that is hard to explain – I suspect it has to do with people wanting “free stuff” and googling for “open source SIEM” … 
  4. This month, my classic PCI DSS Log Review series is extra popular! The series of 18 posts cover a comprehensive log review approach (OK for PCI DSS 3+ even though it predates it), useful for building log review processes and procedures, whether regulatory or not. It is also described in more detail in our Log Management book and mentioned in our PCI book (now in its 4th edition!) – note that this series is mentioned in some PCI Council materials. 
  5. “SIEM Resourcing or How Much the Friggin’ Thing Would REALLY Cost Me?” is a quick framework for assessing the SIEM project (well, a program, really) costs at an organization (a lot more details on this here in this paper).
In addition, I’d like to draw your attention to a few recent posts from my Gartner blog [which, BTW, now has more than 5X of the traffic of this blog]: 

EPIC WIN post:
Current research on vulnerability management:
Current research on cloud security monitoring:
Recent research on security analytics and UBA / UEBA:
Miscellaneous fun posts:

(see all my published Gartner research here)
Also see my past monthly and annual “Top Popular Blog Posts” – 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016.

Disclaimer: most content at SecurityWarrior blog was written before I joined Gartner on August 1, 2011 and is solely my personal view at the time of writing. For my current security blogging, go here.

Previous post in this endless series:

by Anton Chuvakin (anton@chuvakin.org) at June 01, 2017 04:40 PM

May 25, 2017

Raymii.org

Openstack Horizon, remove the loading modal with uBlock Origin

The OpenStack dashboard, Horizon, is a great piece of software to manage your OpenStack resources via the web. However, it has, in my opinion, a very big usability issue: the loading dialog that appears after you click a link. It blocks the entire page and all other links, so whenever I click, I have to wait three to five seconds before I can do anything else. Clicked the wrong menu item? Sucks to be you, here, have some loading. Clicked a link and quickly want to open something in a new tab while the page is still loading? Nope, not today. It's not that browsers lack a built-in way to show that a page is loading; no, of course, the loading indication that has been there forever is not good enough. Let's re-invent the wheel and significantly impact the user experience. With two rules in uBlock Origin this loading modal is removed and you can work normally again in Horizon.

May 25, 2017 12:00 AM

May 22, 2017

HolisticInfoSec.org

Toolsmith #125: ZAPR - OWASP ZAP API R Interface

It is my sincere hope that when I say OWASP Zed Attack Proxy (ZAP), you say "Hell, yeah!" rather than "What's that?". This publication has been a longtime supporter, and so many brilliant contributors and practitioners have lent their talents to OWASP ZAP's growth, in addition to @psiinon's extraordinary project leadership. OWASP ZAP has been 1st or 2nd in the last four years of @ToolsWatch best tool surveys for a damned good reason. OWASP ZAP usage has been well documented and presented over the years, and the wiki gives you tons to consider as you explore OWASP ZAP user scenarios.
One of the scenarios I've sought to explore recently is use of the OWASP ZAP API. The OWASP ZAP API is also well documented, more than enough detail to get you started, but consider a few use case scenarios.
First, there is a functional, clean OWASP ZAP API UI that gives you a viewer's perspective as you contemplate programmatic opportunities. OWASP ZAP API interaction is URL based, and you can both access views and invoke actions. Explore any component and you'll immediately find related views or actions. Drilling into core via http://localhost:8067/UI/core/ (I run OWASP ZAP on 8067, your install will likely be different) gives me a ton to choose from.
You'll need your API key in order to build queries. You can find yours via Tools | Options | API | API Key. As an example, drill into numberOfAlerts (baseurl), which gets the number of alerts, optionally filtering by URL. You'll then be presented with the query builder, where you can enter your key, define specific parameters, and decide your preferred output format, including JSON, HTML, and XML.
Sure, you'll receive results in your browser (this query will provide answers in HTML tables), but these aren't necessarily optimal for programmatic data consumption and manipulation. That said, you learn the most important part of this lesson: a fully populated OWASP ZAP API GET URL: http://localhost:8067/HTML/core/view/numberOfAlerts/?zapapiformat=HTML&apikey=2v3tebdgojtcq3503kuoq2lq5g&formMethod=GET&baseurl=.
This request returns its results in HTML. Very straightforward and easy to modify per your preferences, but HTML results aren't very machine friendly. Want JSON results instead? Just swap out HTML with JSON in the URL, or just choose JSON in the builder. I'll tell you that I prefer working with JSON when I use the OWASP ZAP API via the likes of R. It's certainly the cleanest, machine-consumable option, though others may argue with me in favor of XML.
Allow me to provide you an example with which you can experiment, one I'll likely continue to develop against, as it's kind of cool for active reporting on OWASP ZAP scans in flight or on results when the session is complete. Note, all my code, crappy as it may be, is available for you on GitHub. I mean to say, this is really v0.1 stuff, so contribute and debug as you see fit. It's also important to note that OWASP ZAP needs to be running, either with an active scanning session or a stored session you saved earlier. I scanned my website, holisticinfosec.org, and saved the session for regular use as I wrote this. You can even see a reference to the saved session by location below.
R users are likely aware of Shiny, a web application framework for R, and its dashboard capabilities. I also discovered that rCharts are designed to work interactively and beautifully within Shiny.
R includes packages that make parsing JSON rather straightforward, as I learned from Zev Ross. RJSONIO makes it as easy as fromJSON("http://localhost:8067/JSON/core/view/alerts/?zapapiformat=JSON&apikey=2v3tebdgojtcq3503kuoq2lq5g&formMethod=GET&baseurl=&start=&count=") to pull data from the OWASP ZAP API. We use the fromJSON function and its methods to "read content in JSON format and de-serialize it into R objects", where the ZAP API URL is that content.
I further parsed alert data using Zev's grabInfo function and organized the results into a data frame (ZapDataDF). I then further sorted the alert content from ZapDataDF into objects useful for reporting and visualization. Within each alert object are values such as the risk level, the alert message, the CWE ID, the WASC ID, and the Plugin ID. Defining each of these values into parameters useful to R takes just a few more lines.
I then combined all those results into another data frame I called reportDF, the results of which are seen in the figure below.
reportDF results
Now we've got some content we can pivot on.
First, let's summarize the findings and present them in their resplendent glory via ZAPR: OWASP ZAP API R Interface.
Code first, truly simple stuff it is:
Summary overview API calls

You can see that we're simply using RJSONIO's fromJSON to make specific ZAP API call. The results are quite tidy, as seen below.
ZAPR Overview
One of my favorite features in Shiny is the renderDataTable function. When utilized in a Shiny dashboard, it makes filtering results a breeze, and thus is utilized as the first real feature in ZAPR. The code is tedious; review or play with it from GitHub, but the results should speak for themselves. I filtered the view by CWE ID 89, which in this case is a bit of a false positive; I have a very flat web site, no database, thank you very much. Nonetheless, it is good to have an example of what would definitely be a high risk finding.


Alert filtering

Alert filtering is nice, and I'll add more results capabilities as I develop this further, but visualizations are important too. This is where rCharts really come to bear in Shiny, as they are interactive. I've used the simplest examples, but you'll get the point. First, a few wee lines of R, as seen below.
Chart code
The results are much more satisfying to look at, and allow interactivity. Ramnath Vaidyanathan has done really nice work here. First, OWASP ZAP alerts pulled via the API are counted by risk in a bar chart.
Alert counts

Mousing over Medium, we can see that there were specifically 17 results from my OWASP ZAP scan of holisticinfosec.org.
Our second visualization is the CWE ID results by count, in an oft-disdained but interactive pie chart (yes, I have some work to do on layout).


CWE IDs by count

As we learned earlier, I only had one CWE ID 89 hit during the session, and the visualization supports what we saw in the data table.
The possibilities for pulling data from the OWASP ZAP API and incorporating the results into any number of applications or reporting scenarios are endless. I have a feeling there is a rich opportunity here with PowerBI, which I intend to explore. All the code is here, along with the OWASP ZAP session I refer to, so you can play with it for yourself. You'll need OWASP ZAP, R, and RStudio to make it all work together; let me know if you have questions or suggestions.
Cheers, until next time.

by Russ McRee (noreply@blogger.com) at May 22, 2017 06:35 AM

May 20, 2017

Sarah Allen

serverless patterns at google i/o 2017

At Google I/O last week, we presented “how to build robust mobile applications for the distributed cloud,” a talk about building mobile apps in this new world of “serverless development.” When people hear “serverless” sometimes they think we can write code on the server that is just like client-side code, but that's not really the point.

We have many of the same concerns in developing the code that we always have when we write client-server software — we just don't need to manage servers directly, which drastically reduces operational challenges. In this talk we dive into some specific software development patterns that help us write robust code that scales well.

Speakers:

  • Sarah Allen (me!) I lead the engineering team that works at the intersection of Firebase and Google Cloud on the server side.
  • Judy Tuan introduced me to Firebase over 5 years ago, when she led our team at AngelHack SF (on May 3, 2012) to build an iPhone app that would paint 3D shapes by waving your phone around using the accelerometer. That event happened to be Firebase’s first public launch, and she met Andrew Lee who convinced her to use Firebase in our app. She’s an independent software developer, and also working with Tech Workers Coalition.
  • Jen-Mei Wu is a software architect at Indiegogo, and also volunteers at Liberating Ourselves Locally, a people-of-color-led, gender-diverse, queer and trans inclusive hacker/maker space in East Oakland.

Jen-Mei kicked off the talk by introducing a use case for serverless development based on her work at the maker space, which often works to help non-profits. They are limited in their ability to deploy custom software because they work with small organizations that are staffed by non-technical folk. It doesn't make sense to set them up with a need to devote resources to updating the underlying software and operating systems needed to run a web application. With Firebase, the server software is automatically kept up to date with all the needed security patches for all of the required dependencies, it scales up when needed, and it scales down to zero cost when not in active use.

The primary use case that motivated the talk from my perspective is for businesses that need to get started quickly, then scale up as usage grows. I found it inspiring to hear about how Firebase supports very small organizations with the same products and infrastructure that auto-scale to a global, distributed network supporting businesses with billions of users.

The concept for this talk was for some real-world developers (Judy and Jen-Mei) to envision an application that would illustrate common patterns in mobile application development, then I recruited a few Firebase engineers to join their open source team and we built the app together.

Severless Patterns

We identified some key needs that are common to mobile applications:

  • People use the app
  • The app stores data
  • The app shows the data to people
  • There is some core business logic (“secret sauce”)
  • People interact with each other in some way

The talk details some of the core development patterns:

  • Server-side “special sauce”
  • Authentication & access control
  • Databinding to simplify UI development

The App: Hubbub

Ever go to a conference and feel like you haven’t connected to the people you would have liked to meet? This app seeks to solve that! Using your public GitHub data, we might be able to connect you with other people who share your technical interests. Maybe not, but at least you’ll have lunch with somebody.

You authenticate with GitHub, then the app creates a profile for you by pulling a lot of data from the GitHub API. Your profile might include languages you code in, a list of connections based on your pull request history or the other committers on projects that you have contributed to. Then there’s a list of events, which have a time and place, which people can sign up for. Before the event, people will be placed in groups with a topic they might be interested in. They show up at the appointed time and place, and then to find their assigned group, they open a beacon screen in the app which shows an image that is unique to their group (a pattern of one or more hubbubs with their topic name) and they find each other by holding up their phones.

We built most of the app, enough to really work through the key development patterns, but didn’t have time to hook up the profile generation data collection and implement a good matching algorithm. We ended up using a simple random grouping and were able to test the app at Google I/O for lunch on Friday. Some people showed up and it worked!

You can see the code at: https://github.com/all-the-hubbub


Many thanks to all the people who joined our open source team to develop the app:

and who helped with the presentation:

  • Virginia Poltrack, graphics
  • Yael Kazaz, public speaking, story-telling coach
  • Puf and Megan, rehearsal feedback
  • Todd Kerpelman our Google I/O mentor
  • Laura Nozay and all of the wonderful Google I/O staff who made the event possible

by sarah at May 20, 2017 05:01 PM

May 15, 2017

Sean's IT Blog

Integrating Duo MFA and Amazon WorkSpaces

I recently did some work integrating Duo MFA with Amazon WorkSpaces.  The integration work ran into a few challenges, and I wanted to blog about those challenges to help others in the future.

Understanding Amazon WorkSpaces

If you couldn’t tell by the name, Amazon WorkSpaces is a cloud-hosted Desktop-as-a-Service offering that runs on Amazon Web Services (AWS).  It utilizes AWS’s underlying infrastructure to deploy desktop workloads, either using licensing provided by Amazon or the customer.  The benefit of this over on-premises VDI solutions is that Amazon manages the management infrastructure.  Amazon also provides multiple methods to integrate with the customer’s Active Directory environment, so users can continue to use their existing credentials to access the deployed desktops.

WorkSpaces uses an implementation of the PCoIP protocol for accessing desktops from the client application, and the client application is available on Windows, Linux, MacOS, iOS, and Android.  Amazon also has a web-based client that allows users to access their desktop from a web browser.  This feature is not enabled by default.

Anyway, that’s a very high-level overview of WorkSpaces.

Understanding How Users Connect to WorkSpaces

Before we can talk about how to integrate any multi-factor authentication solution into WorkSpaces, let’s go through the connection path and how that impacts how MFA is used.  WorkSpaces differs from traditional on-premises VDI in that all user connections to the desktop go through a public-facing service.  This happens even if I have a VPN tunnel or use DirectConnect.

The following image illustrates the connection path between the users and the desktop when a VPN connection is in place between the on-premises environment and AWS:

ad_connector_architecture_vpn

Image courtesy of AWS, retrieved from http://docs.aws.amazon.com/workspaces/latest/adminguide/prep_connect.html

As you can see from the diagram, on-premises users don’t connect to the WorkSpaces over the VPN tunnel – they do it over the Internet.  They are coming into their desktop from an untrusted connection, even if the endpoint they are connecting from is on a trusted network.

Note: The page that I retrieved this diagram from shows users connecting to WorkSpaces over a DirectConnect link when one is present.  This diagram shows a DirectConnect link that has a public endpoint, and all WorkSpaces connections are routed through this public endpoint.  In this configuration, logins are treated the same as logins coming in over the Internet.

When multi-factor authentication is configured for WorkSpaces, it is required for all user logins.  There is no way to configure rules to apply MFA to users connecting from untrusted networks because WorkSpaces sees all connections coming in on a public, Internet-facing service.

WorkSpaces MFA Requirements

WorkSpaces only works with MFA providers that support RADIUS.  The RADIUS servers can be located on-premises or in AWS.  I recommend placing the RADIUS proxies in another VPC in AWS where other supporting infrastructure resources, like AD Domain Controllers, are located.  These servers can be placed in the WorkSpaces VPC, but I wouldn’t recommend it.

WorkSpaces can use multiple RADIUS servers, and it will load-balance requests across all configured RADIUS servers.  RADIUS can be configured on both the Active Directory Connectors and the managed Microsoft AD directory services options. 

Note: The Duo documentation states that RADIUS can only be configured on the Active Directory Connector. Testing shows that RADIUS works on both enterprise WorkSpaces Directory Services options.

Configuring Duo

Before we can enable MFA in WorkSpaces, Duo needs to be configured.  At least one Duo proxy needs to be reachable from the WorkSpaces VPC.  This could be on a Windows or Linux EC2 instance in the WorkSpaces VPC, but it will most likely be an EC2 instance in a shared services or management VPC or in an on-premises datacenter.  The steps for installing the Duo Proxies are beyond the scope of this article, but they can be found in the Duo documentation.

Before the Duo Authentication Proxies are configured, we need to configure a new application in the Duo Admin console.  This step generates an integration key and secret key that will be used by the proxy and configures how new users will be handled.  The steps for creating the new application in Duo are:

1. Log into your Duo Admin Console at https://admin.duosecurity.com

2. Log in as your user and select the two-factor authentication method for your account.  Once you've completed the two-factor sign-on, you will have access to your Duo Admin panel.

3. Click on Applications in the upper left-hand corner.

4. Click on Protect an Application

5. Search for RADIUS.

6. Click Protect this Application  underneath the RADIUS application.


7. Record the Integration Key, Secret Key, and API Hostname.  These will be used when configuring the Duo Proxy in a few steps.


8. Provide a Name for the Application.  Users will see this name in their mobile device if they use Duo Push.


9. Set the New User Policy to Deny Access.  Deny Access is required because the WorkSpaces setup process will fail if Duo is configured to fail open or auto-enroll.


10. Click Save Changes to save the new application.

Get the AWS Directory Services IP Addresses

The RADIUS Servers need to know which endpoints to accept RADIUS requests from.  When WorkSpaces is configured for MFA, these requests come from the Directory Services instance.  This can either be the Active Directory Connectors or the Managed Active Directory domain controllers.

The IP addresses that will be required can be found in the AWS Console under WorkSpaces –> Directories.  The field to use depends on the directory service type you’re using.  If you’re using Active Directory Connectors, the IP addresses are in the Directory IP Address field.  If you’re using the managed Microsoft AD service, the Directory IP address field has a value of “None,” so you will need to use the IP addresses in the DNS Address field.
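
If you would rather pull these addresses programmatically, the sketch below uses boto3 to list directories and print their IP addresses.  The field names (ConnectSettings/ConnectIps for AD Connectors, DnsIpAddrs for managed Microsoft AD) are taken from the Directory Service DescribeDirectories API as I understand it; verify them against the current boto3 documentation before relying on this.

import boto3

ds = boto3.client("ds")

for directory in ds.describe_directories()["DirectoryDescriptions"]:
    # AD Connector directories expose their IPs under ConnectSettings;
    # managed Microsoft AD directories expose them as DnsIpAddrs.
    connect_ips = directory.get("ConnectSettings", {}).get("ConnectIps")
    ips = connect_ips or directory.get("DnsIpAddrs")
    print(directory["DirectoryId"], directory.get("Type"), ips)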

Configure AWS Security Group Rules for RADIUS

By default, the AWS security group for the Directory Services instance heavily restricts inbound and outbound traffic.  In order to enable RADIUS, you’ll need to add inbound and outbound UDP 1812 rules for the IPs or subnet where the proxies are located.  (A scripted alternative using boto3 is sketched after the steps below.)

The steps for updating the AWS security group rules are:

1. Log into the AWS Console.

2. Click on the Services menu and select WorkSpaces.

3. Click on Directories.

4. Copy the value in the Directory ID field of the directory that will have RADIUS configured.

5. Click on the Services menu and select VPC.

6. Click on Security Groups.

7. Paste the Directory ID into the Search field to find the Security Group attached to the Directory Services instance.

8. Select the Security Group.

9. Click on the Inbound Rules tab.

10. Click Edit.

11. Click Add Another Rule.

12. Select Custom UDP Rule in the Type dropdown box.

13. Enter 1812 in the port range field.

14. Enter the IP Address of the RADIUS Server in the Source field.

15. Repeat Steps 11 through 14 as necessary to create inbound rules for each Duo Proxy.

16. Click Save.

17. Click on the Outbound Rules tab.

18. Click Edit.

19. Click Add Another Rule.

20. Select Custom UDP Rule in the Type dropdown box.

21. Enter 1812 in the port range field.

22. Enter the IP Address of the RADIUS Server in the Destination field.

23. Repeat Steps 19 through 22 as necessary to create outbound rules for each Duo Proxy.

24. Click Save.
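
If you prefer to script these rules instead of clicking through the console, here is a minimal boto3 sketch that adds the same inbound and outbound UDP 1812 rules.  The security group ID and proxy CIDRs are placeholders; adjust them for your environment.

import boto3

# Placeholders: the security group attached to the Directory Services
# instance and the Duo Authentication Proxy addresses.
SECURITY_GROUP_ID = "sg-0123456789abcdef0"
RADIUS_PROXIES = ["10.0.1.10/32", "10.0.1.11/32"]

ec2 = boto3.client("ec2")

udp_1812 = [{
    "IpProtocol": "udp",
    "FromPort": 1812,
    "ToPort": 1812,
    "IpRanges": [{"CidrIp": cidr} for cidr in RADIUS_PROXIES],
}]

# Inbound UDP 1812 from the proxies...
ec2.authorize_security_group_ingress(GroupId=SECURITY_GROUP_ID, IpPermissions=udp_1812)
# ...and outbound UDP 1812 to the proxies.
ec2.authorize_security_group_egress(GroupId=SECURITY_GROUP_ID, IpPermissions=udp_1812)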

Configure the Duo Authentication Proxies

Once the Duo side is configured and the AWS security rules are set up, we need to configure our Authentication Proxy or Proxies.  This needs to be done locally on each Authentication Proxy.  The steps for configuring the Authentication Proxy are:

Note: This step assumes the authentication proxy is running on Windows.  The steps for installing the authentication proxy are beyond the scope of this article.  You can find the directions for installing the proxy at: https://duo.com/docs/

1. Log into your Authentication Proxy.

2. Open the authproxy.cfg file located in C:\Program Files (x86)\Duo Security Authentication Proxy\conf.

Note: The authproxy.cfg file uses Unix-style line endings, so it may require a text editor other than Notepad.  If you can install it on your server, I recommend Notepad++ or Sublime Text.

3. Add the following lines to your config file.  This requires the Integration Key, Secret Key, and API Hostname from Duo and the IP Addresses of the WorkSpaces Active Directory Connectors or Managed Active Directory domain controllers.  Both RADIUS server entries need to use the same RADIUS shared secret, and it should be a complex string; this is the same value you will enter into the AWS console later.

[duo_only_client]

[radius_server_duo_only]
ikey=<Integration Key>
skey=<Secret Key>
api_host=<API Hostname>.duosecurity.com
failmode=safe
radius_ip_1=<Directory Service IP 1>
radius_secret_1=<RADIUS Shared Secret>
radius_ip_2=<Directory Service IP 2>
radius_secret_2=<RADIUS Shared Secret>
port=1812

4. Save the configuration file.

5. Restart the Duo Authentication Proxy Service to apply the new configuration.

Integrating WorkSpaces with Duo

Now that our base infrastructure is configured, it’s time to set up Multi-Factor Authentication in WorkSpaces.  MFA is configured on the Directory Services instance.  If multiple directory services instances are required to meet different WorkSpaces use cases, then this configuration will need to be performed on each directory service instance. 

The steps for enabling WorkSpaces with Duo are:

1. Log into the AWS Console.

2. Click on the Services menu and select Directory Service.

3. Click on the Directory ID of the directory where MFA will be enabled.

4. Click the Multi-Factor authentication tab.

5. Click the Enable Multi-Factor Authentication checkbox.

6. Enter the IP address or addresses of your Duo Authentication Proxies in the RADIUS Server IP Address(es) field.

7. Enter the RADIUS Shared Secret in the Shared Secret Code and Confirm Shared Secret Code fields.

8. Change the Protocol to PAP.

9. Change the Server Timeout to 20 seconds.

10. Change the Max Retries to 3.

11. Click Update Directory.

12. If RADIUS setup is successful, the RADIUS Status should say Completed.
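
For completeness, the same RADIUS settings can also be applied through the Directory Service API rather than the console.  The sketch below assumes boto3’s enable_radius call and the RadiusSettings field names from the EnableRadius API documentation; double-check the field names and limits against the current docs, and treat the IDs, IPs, and secret as placeholders.

import boto3

ds = boto3.client("ds")

ds.enable_radius(
    DirectoryId="d-1234567890",        # placeholder directory ID
    RadiusSettings={
        "RadiusServers": ["10.0.1.10", "10.0.1.11"],  # Duo Authentication Proxies
        "RadiusPort": 1812,
        "RadiusTimeout": 20,
        "RadiusRetries": 3,
        "SharedSecret": "<RADIUS Shared Secret>",     # same value as in authproxy.cfg
        "AuthenticationProtocol": "PAP",
        "DisplayLabel": "Duo",
        "UseSameUsername": True,
    },
)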


Using WorkSpaces with Duo

Once RADIUS is configured, the WorkSpaces client login prompt will change.  Users will be prompted for an MFA code along with their username and password.  This can be the one-time password generated in the mobile app, a push, or a text message with a one-time password.  If the SMS option is used, the user’s login will fail, but they will receive a text message with a one-time password. They will need to log in again using one of the codes received in the text message.


Troubleshooting

If the RADIUS setup does not complete successfully, there are a few things that you can check.  The first step is to verify that port 1812 is open on the Duo Authentication Proxies.  Also verify that the security group attached to the directory services instance is configured to allow port 1812 inbound and outbound.

One issue that I ran into in previous setups was the New User Policy in the Duo settings.  If it is not set to Deny Access, the setup process will fail.  WorkSpaces is expecting a response code of Access-Reject.  If it receives anything else, the setup stalls out and fails about 15 minutes later.

Finally, review the Duo Authentication Proxy logs.  These can be found in C:\Program Files (x86)\Duo Security Authentication Proxy\log on Authentication Proxies running Windows.  A tool like WinTail can be useful for watching the logs in real-time when trying to troubleshoot issues or verify that the RADIUS Services are functioning correctly.


by seanpmassey at May 15, 2017 06:08 PM

May 14, 2017

SysAdmin1138

Estimates and compounding margin of error

The two sides of this story are:

  • Software requirements are often... mutable over time.
  • Developer work-estimates are kind of iffy.

Here is a true story about the latter. Many have similar tales to tell, but this one is mine. It's about metrics, agile, and large organizations. If you're expecting this to turn into a frAgile post, you're not wrong. But it's about failure-modes.

The setup

My employer at the time decided to go all-in on agile/scrum. It took several months, but every department in the company was moved to it. As they were an analytics company, their first reflex was to try to capture as much workflow data as possible so they could do magic data-analysis things on it. Which meant purchasing seats in an Agile SaaS product so everyone could track their work in it.

The details

By fiat from on-high, story points were effectively 'days'.

Due to the size of the development organization, there were three levels of Product Managers over the Individual Contributors I counted myself a part of.

The apex Product Managers, of which we had two, were for the two products we sold.

Marketing was also in Agile, as was Sales.

The killer feature

Because we were an analytics company, the CEO wanted a "single pane of glass" to give a snapshot of how close we were to achieving our product goals. Gathering metrics on all of our sprint-velocities, story/epic completion percentages, and story estimates allowed us to give him that pane of glass. It had:

  • Progress bars for how close our products were to their next major milestones.
  • How many sprints it would take to get there.
  • How many sprints it would take to get to the milestone beyond it.

Awesome!

The failure

That pane of glass was a lying piece of shit.

The dashboard we had to build was based on so many fuzzy measurements that it was a thumb in the wind approximation for how fast we were going, and in what direction. The human bias to trust numbers derived using Science! is a strong one, and they were inappropriately trusted. Which led to pressure from On High for highly accurate estimates, as the various Product Managers realized what was going on and attempted to compensate (remove uncertainty from one of the biggest sources of it).

Anyone who has built software knows that problems come in three types:

  1. Stuff that was a lot easier than you thought
  2. Stuff that was pretty much as bad as you thought.
  3. Hell-projects of tar-pits, quicksand, ambush-yaks, and misery.

In the absence of outside pressures, story estimates usually are pitched at type 2 efforts; the honest estimate. Workplace culture introduces biases to this, urging devs to skew one way or the other. Skew 'easier', and you'll end up overshooting your estimates a lot. Skew 'harder' and your velocity will look great, but capacity planning will suffer.

This leads to an interesting back and forth! Dev-team skews harder for estimates. PM sees that team typically exceeds its capacity in productivity, so adds more capacity in later sprints. In theory equilibrium is reached between work-estimation and work-completion-rate. In reality, it means that the trustability of the numbers is kind of low and always will be.

The irreducible complexity

See, the thing is, marketing and sales both need to know when a product will be released so they can kick off marketing campaigns and start warming up the sales funnel. Some kinds of ad-buys are done weeks or more in advance, so slipping product-ship at the last minute can throw off the whole marketing cadence. Trusting in (faulty) numbers means it may look like release will be in 8 weeks, so it's safe to start baking that in.

Except those numbers aren't etched in stone. They're graven in the finest of morning dew.

As that 8 week number turns into 6, then 4, then 2, pressure to hit the mark increases. For a company selling on-prem software, you can afford to miss your delivery deadline so long as you have a hotfix/service-pack process in place to deliver stability updates quickly. You see this a lot with game-dev: the shipping installer is 8GB, but there are 2GB of day-1 patches to download before you can play. SaaS products need to work immediately on release, so all-nighters may become the norm for major features tied to marketing campaigns.

Better estimates would make this process a lot more trustable. But, there is little you can do to actually improve estimate quality.

by SysAdmin1138 at May 14, 2017 03:30 PM

May 13, 2017

Anton Chuvakin - Security Warrior

Monthly Blog Round-Up – April 2017

Here is my next monthly "Security Warrior" blog round-up of top 5 popular posts/topics this month:
  1. “New SIEM Whitepaper on Use Cases In-Depth OUT!” (dated 2010) presents a whitepaper on select SIEM use cases described in depth with rules and reports [using now-defunct SIEM product]; also see this SIEM use case in depth and this for a more current list of popular SIEM use cases. Finally, see our 2016 research on developing security monitoring use cases here!
  2. “Why No Open Source SIEM, EVER?” contains some of my SIEM thinking from 2009. Is it relevant now? You be the judge.  Succeeding with SIEM requires a lot of work, whether you paid for the software, or not. BTW, this post has an amazing “staying power” that is hard to explain – I suspect it has to do with people wanting “free stuff” and googling for “open source SIEM” …
  3. This month, my classic PCI DSS Log Review series is extra popular! The series of 18 posts cover a comprehensive log review approach (OK for PCI DSS 3+ even though it predates it), useful for building log review processes and procedures, whether regulatory or not. It is also described in more detail in our Log Management book and mentioned in our PCI book (now in its 4th edition!) – note that this series is mentioned in some PCI Council materials.
  4. “Simple Log Review Checklist Released!” is often at the top of this list – this aging checklist is still a very useful tool for many people. “On Free Log Management Tools” (also aged a bit by now) is a companion to the checklist (updated version).
  5. “SIEM Resourcing or How Much the Friggin’ Thing Would REALLY Cost Me?” is a quick framework for assessing the SIEM project (well, a program, really) costs at an organization (a lot more details on this here in this paper).
In addition, I’d like to draw your attention to a few recent posts from my Gartner blog [which, BTW, now has about 5X of the traffic of this blog]: 
 
Current research on cloud security monitoring:
 
Recent research on security analytics and UBA / UEBA:
 
EPIC WIN post:
 
Miscellaneous fun posts:

(see all my published Gartner research here)
Also see my past monthly and annual “Top Popular Blog Posts” – 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016.

Disclaimer: most content at SecurityWarrior blog was written before I joined Gartner on August 1, 2011 and is solely my personal view at the time of writing. For my current security blogging, go here.

Previous post in this endless series:

by Anton Chuvakin (anton@chuvakin.org) at May 13, 2017 12:16 AM

May 11, 2017

Colin Percival

A plan for open source software maintainers

I've been writing open source software for about 15 years now; while I'm still wet behind the ears compared to FreeBSD greybeards like Kirk McKusick and Poul-Henning Kamp, I've been around for long enough to start noticing some patterns. In particular:
  • Free software is expensive. Software is expensive to begin with; but good quality open source software tends to be written by people who are recognized as experts in their fields (partly thanks to that very software) and can demand commensurate salaries.
  • While that expensive developer time is donated (either by the developers themselves or by their employers), this influences what their time is used for: Individual developers like doing things which are fun or high-status, while companies usually pay developers to work specifically on the features those companies need. Maintaining existing code is important, but it is neither fun nor high-status; and it tends to get underweighted by companies as well, since maintenance is inherently unlikely to be the most urgent issue at any given time.
  • Open source software is largely a "throw code over the fence and walk away" exercise. Over the past 15 years I've written freebsd-update, bsdiff, portsnap, scrypt, spiped, and kivaloo, and done a lot of work on the FreeBSD/EC2 platform. Of these, I know bsdiff and scrypt are very widely used and I suspect that kivaloo is not; but beyond that I have very little knowledge of how widely or where my work is being used. Anecdotally it seems that other developers are in similar positions: At conferences I've heard variations on "you're using my code? Wow, that's awesome; I had no idea" many times.
  • I have even less knowledge of what people are doing with my work or what problems or limitations they're running into. Occasionally I get bug reports or feature requests; but I know I only hear from a very small proportion of the users of my work. I have a long list of feature ideas which are sitting in limbo simply because I don't know if anyone would ever use them — I suspect the answer is yes, but I'm not going to spend time implementing these until I have some confirmation of that.
  • A lot of mid-size companies would like to be able to pay for support for the software they're using, but can't find anyone to provide it. For larger companies, it's often easier — they can simply hire the author of the software (and many developers who do ongoing maintenance work on open source software were in fact hired for this sort of "in-house expertise" role) — but there's very little available for a company which needs a few minutes per month of expertise. In many cases, the best support they can find is sending an email to the developer of the software they're using and not paying anything at all — we've all received "can you help me figure out how to use this" emails, and most of us are happy to help when we have time — but relying on developer generosity is not a good long-term solution.
  • Every few months, I receive email from people asking if there's any way for them to support my open source software contributions. (Usually I encourage them to donate to the FreeBSD Foundation.) Conversely, there are developers whose work I would like to support (e.g., people working on FreeBSD wifi and video drivers), but there isn't any straightforward way to do this. Patreon has demonstrated that there are a lot of people willing to pay to support what they see as worthwhile work, even if they don't get anything directly in exchange for their patronage.

May 11, 2017 04:10 AM

May 10, 2017

Villagers with Pitchforks

What You Should Know – Air Conditioning Repair 101

To many homeowners, owning an air conditioner is not only a luxury but also a necessity. This important unit keeps your home cool and comfortable during the hottest part of the year. It is, therefore, your responsibility as a homeowner to ensure that the air circulating in your house stays cool and of good quality by keeping an eye on your air conditioning system. When your AC unit starts to show signs of damage, inefficiency or malfunction, the appropriate repairs ought to be done before the problem escalates into something bigger and much more costly. Delaying air conditioning repairs will only lead to more problems, costlier damage, jeopardized comfort, higher energy bills, and reduced equipment lifetime. If you own an AC system in your home, below are some pointers you might want to have a look at in our article, air conditioning repair 101: what homeowners should have in mind.

How Your Air Conditioner Works

Your air conditioner cools the air around your home by removing the heat from the warm air. Warm air is sucked from the indoor space and blown through a set of pipes in the AC called the evaporator coils. The evaporator coils are filled with a liquid called a refrigerant. After absorbing heat from the air, this liquid changes state to gas and is then pumped to the condenser component of the air conditioning system. In the condenser, the absorbed heat is released into the outdoor air while the refrigerant goes back into liquid state and the cycle continues. Most air conditioners have a thermostat, which is the device that regulates the amount and temperature of cool air supplied throughout your home.

A Word about Air Filters And Your Air Conditioner

These are found in the ductwork system or other strategic points within the HVAC system. They help remove particles from the air, both to keep the air conditioning system clean and to keep the air supplied by the unit clean. As the filters do their job, they become loaded with particles and may eventually clog with debris, which leads to a reduction in air flow. This can, in turn, pave the way to energy inefficiency, potentially leading to high electricity bills, increased frequency of repairs, and a reduction in the equipment’s lifetime. This means that regular air filter replacement is not only an important aspect of central air conditioning system maintenance but also plays a huge part in preventing the need for air conditioning repairs in the first place.

Costly Air Conditioner Repairs Are Usually Preventable

Some air conditioning repairs, such as condenser replacement, can be highly expensive. One of the ways you can prevent costly AC repairs is by regularly cleaning your filters, cleaning your condensate drains, and watching for warning signs of a bad or malfunctioning air conditioning unit. Scheduling annual professional maintenance can also help, especially with contractors that are reputed for their preventive maintenance services.

Choose Your AC Repair Contractor Wisely

Having as much information as possible on DIY tips and what to do when something goes wrong is important. However, knowing how to hire an experienced, professional air conditioning repair contractor is even more important. This is because a reputable HVAC expert is well trained and qualified to handle the specific job without hitches and will be able to detect the most complex and hidden problems. When choosing an HVAC service contractor, always be on the lookout for common scams in the HVAC industry and be prepared to avoid them.

According to the manager of the Altitude Comfort Heating and Air – air conditioning repair Denver division, choosing the right HVAC contractor is the absolute most important thing to focus on when you need HVAC service. “Everything else will take care of itself if you choose a really good company to take care of your problem.”

The post What You Should Know – Air Conditioning Repair 101 appeared first on The Keeping Room.

by Level at May 10, 2017 08:56 PM

How To Become An Air Conditioning Repair Tech In Colorado

Colorado is one of the few states that doesn’t issue licenses for HVAC work. While there’s no state-wide license you need to attain, it’s still a good idea to become certified by a reputable trade school before starting work. Not only will you meet national, city, and county requirements, you’ll also have the ability to prove to your employer and clients that you’ve got the skills necessary to work as an HVAC technician.

Local Requirements

Colorado doesn’t have a state-wide license requirement. This doesn’t mean you can legally perform HVAC work without any certification anywhere in the state. It’s up to the local government of each city and county to determine what licenses are required for aspiring technicians.

Notably, Denver, CO has fairly strict requirements when it comes to HVAC licenses. All journeyman licenses require passing an ICC test in a related subject and several years of work in the field. While these licenses aren’t required to perform work, they’re very highly sought after by employers and potential clients.

Licenses are required to perform work in Colorado Springs and the surrounding parts of El Paso County. Applicants have to pass a background check and demonstrate proficiency by passing an exam. Unlike Denver, you can be fined or jailed if you’re caught performing work without an El Paso license. You can find the specific requirements for the Denver, CO Heating, ventilating, A/C Only Supervisor’s Certificate online.

National Qualifications

Students at trade schools in Colorado often focus on getting a certificate that holds value across the country. Often, the first step involves getting certified under section 608 of the EPA. This certification is mandatory for anyone who works with appliances that contain refrigerants — in other words, you need it to do maintenance on air conditioners, and you’ll definitely have to have it to perform air conditioning repair service.

Additionally, many students prepare for various ICC exams for HVAC technicians. These nationwide standards are the basis for many local, state and national certificates. Mastering the material covered by ICC exams gives you the freedom to become certified almost anywhere with just a little bit of review.

Job Training

If you’ve applied for a job recently, you’ve noticed that more and more companies want to hire people with experience. Trade schools like the Lincoln School of Technology help give you some of that experience. It’s much easier to get your foot in the door and find your first job as an HVAC technician when you’ve spent time in a classroom practicing the skills necessary for basic maintenance and repair. Not only will you get a paper certification, you’ll also be able to prove your skills when you ace tricky interview questions.

Further Experience

Most advanced licenses, like Denver’s journeyman licenses, require that you keep a log detailing your employer, your duties, and roughly how many hours you work. You need a minimum of two to four years of on-the-job experience to apply for these licenses. Once you’ve put in the time, however, these expert certificates will make getting a job easy. Potential employers will know that you’ve put in the time and acquired the skills to truly master your craft.

A Rewarding Career

HVAC technicians enjoy a challenging career with lots of opportunities for advancement. By starting your journey at a technical school, you’ll prove to your employer that you’re serious and enter the job with the skills that you need. This often means higher starting pay, faster advancement, and more freedom when it comes to finding the job you want. If you want to earn good money while helping local community members solve their problems, enroll in a reputable trade school like Lincoln Tech today.

The post How To Become An Air Conditioning Repair Tech In Colorado appeared first on The Keeping Room.

by Level at May 10, 2017 08:55 PM

How To Troubleshoot Your Air Conditioner

Air conditioning units are very much sought after especially throughout summertime when it gets very humid. It can get difficult to rest at night when the temperature rises beyond comfort. But like other home appliances, they can easily break down because of excessive use. Below are ways on how to troubleshoot your air conditioner.

Step 1

If you want to troubleshoot your air conditioner, start with the simplest and most commonly overlooked problems. If your AC is not turning on, check the thermostat first. You may find that the thermostat has been turned off or set to a temperature that won’t allow your AC to turn on. You must also ensure that your electrical breaker hasn’t been tripped. At this point, you can flip the breaker off and on again to ensure that it is in the ON position. There are several good air conditioner troubleshooting charts that can be found online.

Step 2

This is where you inspect the blower unit to ensure that it is in good working condition. At this point, check whether the AC is producing adequate air flow in the dwelling. The unit is located in the main AC housing outside of your house and looks like a hamster wheel. Clean the blower unit and check whether any blades are damaged or bent. Bent or damaged blades reduce the unit’s effectiveness. If you find that you need to buy parts for your air conditioner, you can enter your model and serial number into some websites and search for parts for the most common repair problems.

Check the duct system that runs from the AC to the vents in your house. These ducts are mostly located in the basement or attic. Ensure that there are no kinked, pinched, or disconnected ducts. Depending on the condition of the AC filter, you can either clean it or replace it every month. A clogged or dirty filter will affect the air flow, which is why filters require regular maintenance. The AC filter is located along the duct that returns air from the home to the AC.

Step 3

This is a very important step if you want to effectively troubleshoot your air conditioner. Check whether your condenser fan is functioning properly. If it is not, you can try spinning the fan by hand. The blades should spin on their own for 5-10 seconds. If the blades do not spin on their own, there is probably something wrong with the bearings and you’ll need to replace the motor. Some manufacturers also have troubleshooting guides online for their specific air conditioner models.

Step 4

If your AC’s air isn’t cold enough, turn on the unit and allow it to run for 10-15 minutes. Shut the unit off, go to the AC unit, and locate the copper suction line. Feel the line with your hand; it should be ice cold. If the suction line is not cold, your system may require more refrigerant. You can contact an expert who can add the necessary refrigerant and seal any air conditioner leaks. Make sure they are a reputable and licensed air conditioning repair contractor.

Step 5

It is at this step that you should open the AC housing and check whether the compressor and fan are running. The compressor is found in the main AC unit and looks like a round canister with tubing coming out of it. If the fan is running but the compressor is not, air may still be flowing through your home’s vents, but it won’t be cold. If the compressor is hot and not running, the compressor has broken down and you will need a professional.

The post How To Troubleshoot Your Air Conditioner appeared first on The Keeping Room.

by Level at May 10, 2017 08:55 PM

May 08, 2017

TaoSecurity

Latest Book Inducted into Cybersecurity Canon

Thursday evening Mrs B and I were pleased to attend an awards seminar for the Cybersecurity Canon. This is a project sponsored by Palo Alto Networks and led by Rick Howard. The goal is to "identify a list of must-read books for all cybersecurity practitioners."

Rick reviewed my fourth book The Practice of Network Security Monitoring in 2014 and someone nominated it for consideration in 2016. I was unaware earlier this year that my book was part of a 32-title "March Madness" style competition. My book won its five rounds, resulting in its inclusion in the 2017 inductee list! Thank you to all those that voted for my book.

Ben Rothke awarded me the Canon trophy.
Ben Rothke interviewed me prior to the induction ceremony. We discussed some current trends in security and some lessons from the book. I hope to see that interview published by Palo Alto Networks and/or the Cybersecurity Canon project in the near future.

In my acceptance speech I explained how I wrote the book because I had not yet dedicated a book to my youngest daughter, since she was born after my third book was published.

A teaching moment at Black Hat Abu Dhabi in December 2012 inspired me to write the book. While teaching network security monitoring, one of the students asked "but where do I install the .exe on the server?"

I realized this student had no idea of physical access to a wire, or using a system to collect and store network traffic, or any of the other fundamental concepts inherent to NSM. He thought NSM was another magical software package to install on his domain controller.

Four foreign language editions.
Thanks to the interpretation assistance of a local Arabic speaker, I was able to get through to him. However, the experience convinced me that I needed to write a new book that built NSM from the ground up, hence the selection of topics and the order in which I presented them.

While my book has not (yet?) been translated into Arabic, there are two Chinese language editions, a Korean edition, and a Polish edition! I also know of several SOCs who provide a copy of the book to all incoming analysts. The book is also a text in several college courses.

I believe the book remains relevant for anyone who wants to learn the NSM methodology to detect and respond to intrusions. While network traffic is the example data source used in the book, the NSM methodology is data source agnostic.

In 2002 Bamm Visscher and I defined NSM as "the collection, analysis, and escalation of indications and warnings to detect and respond to intrusions." This definition makes no reference to network traffic.

It is the collection-analysis-escalation framework that matters. You could perform NSM using log files, or host-centric data, or whatever else you use for indications and warning.

I have no plans for another cybersecurity book. I am currently editing a book about combat mindset written by the head instructor of my Krav Maga style and his colleague.
Thanks for asking for an autograph!

Palo Alto hosted a book signing and offered free books for attendees. I got a chance to speak with Steven Levy, whose book Hackers was also inducted. I sat next to him during the book signing, as shown in the picture at right.

Thank you to Palo Alto Networks, Rick Howard, Ben Rothke, and my family for making inclusion in the Cybersecurity Canon possible. The awards dinner was a top-notch event. Mrs B and I enjoyed meeting a variety of people, including students in local cybersecurity degree programs.

I closed my acceptance speech with the following from the end of the Old Testament, at the very end of 2nd Maccabees. It captures my goal when writing books:

"So I too will here end my story. If it is well told and to the point, that is what I myself desired; if it is poorly done and mediocre, that was the best I could do."

If you'd like a copy of The Practice of Network Security Monitoring the best deal is to buy print and electronic editions from the publisher's Web site. Use code NSM101 to save 30%. I like having the print version for easy review, and I carry the digital copy on my tablet and phone.

Thank you to everyone who voted and who also bought a copy of my book!

Update: I forgot to thank Doug Burks, who created Security Onion, the software used to demonstrate NSM in the book. Doug also contributed the appendix explaining certain SO commands. Thank you Doug! Also thank you to Bill Pollack and his team at No Starch Press, who edited and published the book!

by Richard Bejtlich (noreply@blogger.com) at May 08, 2017 03:20 PM

May 07, 2017

Electricmonk.nl

Merging two Python dictionaries by deep-updating

Say we have two Python dictionaries:

{
    'name': 'Ferry',
    'hobbies': ['programming', 'sci-fi']
}

and

{
    'hobbies': ['gaming']
}

What if we want to merge these two dictionaries such that "gaming" is added to the "hobbies" key of the first dictionary? I couldn't find anything online that did this already, so I wrote the following function for it:

# Copyright Ferry Boender, released under the MIT license.
import copy


def deepupdate(target, src):
    """Deep update target dict with src
    For each k,v in src: if k doesn't exist in target, it is deep copied from
    src to target. Otherwise, if v is a list, target[k] is extended with
    src[k]. If v is a set, target[k] is updated with v, If v is a dict,
    recursively deep-update it.

    Examples:
    >>> t = {'name': 'Ferry', 'hobbies': ['programming', 'sci-fi']}
    >>> deepupdate(t, {'hobbies': ['gaming']})
    >>> print t
    {'name': 'Ferry', 'hobbies': ['programming', 'sci-fi', 'gaming']}
    """
    for k, v in src.items():
        if type(v) == list:
            if k not in target:
                target[k] = copy.deepcopy(v)
            else:
                target[k].extend(v)
        elif type(v) == dict:
            if k not in target:
                target[k] = copy.deepcopy(v)
            else:
                deepupdate(target[k], v)
        elif type(v) == set:
            if k not in target:
                target[k] = v.copy()
            else:
                target[k].update(v.copy())
        else:
            target[k] = copy.copy(v)

It uses a combination of deepcopy(), updating and self recursion to perform a complete merger of the two dictionaries.
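
For example, here is how the function handles nested dictionaries, a case the docstring doesn't show. This is just a quick sketch with made-up keys:

defaults = {
    'server': {'host': 'localhost', 'ports': [8080]},
    'debug': False,
}
overrides = {
    'server': {'ports': [8081], 'tls': True},
}

deepupdate(defaults, overrides)
print(defaults)
# -> {'server': {'host': 'localhost', 'ports': [8080, 8081], 'tls': True}, 'debug': False}
# (key order may vary depending on the Python version)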

As mentioned in the comment, the above function is released under the MIT license, so feel free to use it in any of your programs.

by admin at May 07, 2017 02:56 PM

May 05, 2017

R.I.Pienaar

Talk about Choria and NATS

Recently I was given the opportunity by the NATS.io folk to talk about Choria and NATS at one of their community events. The recording of the talk as well as the slide deck can be found below.

Thanks again for having me NATS.io team!

by R.I. Pienaar at May 05, 2017 05:54 AM

May 04, 2017

OpenSSL

Using TLS1.3 With OpenSSL

The forthcoming OpenSSL 1.1.1 release will include support for TLSv1.3. The new release will be binary and API compatible with OpenSSL 1.1.0. In theory, if your application supports OpenSSL 1.1.0, then all you need to do to upgrade is to drop in the new version of OpenSSL when it becomes available and you will automatically start being able to use TLSv1.3. However there are some issues that application developers and deployers need to be aware of. In this blog post I am going to cover some of those things.
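
As a quick sanity check after such an upgrade, the following Python sketch reports which OpenSSL your Python is linked against and which protocol version a connection actually negotiates. It assumes a Python interpreter built against your new OpenSSL; the HAS_TLSv1_3 attribute only exists on newer Python versions, hence the getattr, and the host name is just an example.

import socket
import ssl

# Which OpenSSL is this Python linked against, and does it advertise TLSv1.3?
print(ssl.OPENSSL_VERSION)
print(getattr(ssl, "HAS_TLSv1_3", False))

# Connect somewhere and report the negotiated protocol version and ciphersuite.
ctx = ssl.create_default_context()
with socket.create_connection(("www.openssl.org", 443)) as sock:
    with ctx.wrap_socket(sock, server_hostname="www.openssl.org") as tls:
        print(tls.version())  # e.g. 'TLSv1.3' once both ends support it
        print(tls.cipher())   # (ciphersuite name, protocol, secret bits)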

Differences with TLS1.2 and below

TLSv1.3 is a major rewrite of the specification. There was some debate as to whether it should really be called TLSv2.0 - but TLSv1.3 it is. There are major changes and some things work very differently. A brief, incomplete, summary of some things that you are likely to notice follows:

  • There are new ciphersuites that only work in TLSv1.3. The old ciphersuites cannot be used for TLSv1.3 connections.
  • The new ciphersuites are defined differently and do not specify the certificate type (e.g. RSA, DSA, ECDSA) or the key exchange mechanism (e.g. DHE or ECDHE). This has implications for ciphersuite configuration.
  • Clients provide a “key_share” in the ClientHello. This has consequences for “group” configuration.
  • Sessions are not established until after the main handshake has been completed. There may be a gap between the end of the handshake and the establishment of a session (or, in theory, a session may not be established at all). This could have impacts on session resumption code.
  • Renegotiation is not possible in a TLSv1.3 connection
  • More of the handshake is now encrypted.
  • More types of messages can now have extensions (this has an impact on the custom extension APIs and Certificate Transparency)
  • DSA certificates are no longer allowed in TLSv1.3 connections

Note that at this stage only TLSv1.3 is supported. DTLSv1.3 is still in the early days of specification and there is no OpenSSL support for it at this time.

Current status of the TLSv1.3 standard

As of the time of writing TLSv1.3 is still in draft. Periodically a new version of the draft standard is published by the TLS Working Group. Implementations of the draft are required to identify the specific draft version that they are using. This means that implementations based on different draft versions do not interoperate with each other.

OpenSSL 1.1.1 will not be released until (at least) TLSv1.3 is finalised. In the meantime the OpenSSL git master branch contains our development TLSv1.3 code which can be used for testing purposes (i.e. it is not for production use). You can check which draft TLSv1.3 version is implemented in any particular OpenSSL checkout by examining the value of the TLS1_3_VERSION_DRAFT_TXT macro in the tls1.h header file. This macro will be removed when the final version of the standard is released.

In order to compile OpenSSL with TLSv1.3 support you must use the “enable-tls1_3” option to “config” or “Configure”.

Currently OpenSSL has implemented the “draft-20” version of TLSv1.3. Many other libraries are still using older draft versions in their implementations. Notably many popular browsers are using “draft-18”. This is a common source of interoperability problems. Interoperability of the draft-18 version has been tested with BoringSSL, NSS and picotls.

Within the OpenSSL git source code repository there are two branches: “tls1.3-draft-18” and “tls1.3-draft-19”, which implement the older TLSv1.3 draft versions. In order to test interoperability with other TLSv1.3 implementations you may need to use one of those branches. Note that those branches are considered temporary and are likely to be removed in the future when they are no longer needed.

Ciphersuites

OpenSSL has implemented support for five TLSv1.3 ciphersuites as follows:

  • TLS13-AES-256-GCM-SHA384
  • TLS13-CHACHA20-POLY1305-SHA256
  • TLS13-AES-128-GCM-SHA256
  • TLS13-AES-128-CCM-8-SHA256
  • TLS13-AES-128-CCM-SHA256

Of these the first three are in the DEFAULT ciphersuite group. This means that if you have no explicit ciphersuite configuration then you will automatically use those three and will be able to negotiate TLSv1.3.

All the TLSv1.3 ciphersuites also appear in the HIGH ciphersuite alias. The CHACHA20, AES, AES128, AES256, AESGCM, AESCCM and AESCCM8 ciphersuite aliases include a subset of these ciphersuites as you would expect based on their names. Key exchange and authentication properties were part of the ciphersuite definition in TLSv1.2 and below. This is no longer the case in TLSv1.3 so ciphersuite aliases such as ECDHE, ECDSA, RSA and other similar aliases do not contain any TLSv1.3 ciphersuites.

If you explicitly configure your ciphersuites then care should be taken to ensure that you are not inadvertently excluding all TLSv1.3 compatible ciphersuites. If a client has TLSv1.3 enabled but no TLSv1.3 ciphersuites configured then it will immediately fail (even if the server does not support TLSv1.3) with an error message like this:

140460646909376:error:141A90B5:SSL routines:ssl_cipher_list_to_bytes:no ciphers available:ssl/statem/statem_clnt.c:3562:No ciphers enabled for max supported SSL/TLS version

Similarly if a server has TLSv1.3 enabled but no TLSv1.3 ciphersuites it will also immediately fail, even if the client does not support TLSv1.3, with an error message like this:

140547390854592:error:141FC0B5:SSL routines:tls_setup_handshake:no ciphers available:ssl/statem/statem_lib.c:108:No ciphers enabled for max supported SSL/TLS version

For example, setting a ciphersuite selection string of ECDHE:!COMPLEMENTOFDEFAULT will work in OpenSSL 1.1.0 and will only select those ciphersuites that are in DEFAULT and also use ECDHE for key exchange. However no TLSv1.3 ciphersuites are in the ECDHE group so this ciphersuite configuration will fail in OpenSSL 1.1.1 if TLSv1.3 is enabled.

You may want to explicitly list the TLSv1.3 ciphersuites you want to use to avoid problems. For example:

"TLS13-CHACHA20-POLY1305-SHA256:TLS13-AES-128-GCM-SHA256:TLS13-AES-256-GCM-SHA384:ECDHE:!COMPLEMENTOFDEFAULT"

You can test which ciphersuites are included in a given ciphersuite selection string using the openssl ciphers -s -v command:

$ openssl ciphers -s -v "ECDHE:!COMPLEMENTOFDEFAULT"

Ensure that at least one ciphersuite supports TLSv1.3

Groups

In TLSv1.3 the client selects a “group” that it will use for key exchange. At the time of writing, OpenSSL only supports ECDHE groups for this (it is possible that DHE groups will also be supported by the time OpenSSL 1.1.1 is actually released). The client then sends “key_share” information to the server for its selected group in the ClientHello.

The list of supported groups is configurable. It is possible for a client to select a group that the server does not support. In this case the server requests that the client sends a new key_share that it does support. While this means a connection will still be established (assuming a mutually supported group exists), it does introduce an extra server round trip - so this has implications for performance. In the ideal scenario the client will select a group that the server supports in the first instance.

In practice most clients will use X25519 or P-256 for their initial key_share. For maximum performance it is recommended that servers are configured to support at least those two groups and clients use one of those two for their initial key_share. This is the default case (OpenSSL clients will use X25519).

The group configuration also controls the allowed groups in TLSv1.2 and below. If applications have previously configured their groups in OpenSSL 1.1.0 then you should review that configuration to ensure that it still makes sense for TLSv1.3. The first named (i.e. most preferred) group will be the one used by an OpenSSL client in its initial key_share.

Applications can configure the group list by using SSL_CTX_set1_groups() or a similar function (see here for further details). Alternatively, if applications use SSL_CONF style configuration files then this can be configured using the Groups or Curves command (see here).

Sessions

In TLSv1.2 and below a session is established as part of the handshake. This session can then be used in a subsequent connection to achieve an abbreviated handshake. Applications might typically obtain a handle on the session after a handshake has completed using the SSL_get1_session() function (or similar). See here for further details.

In TLSv1.3 sessions are not established until after the main handshake has completed. The server sends a separate post-handshake message to the client containing the session details. Typically this will happen soon after the handshake has completed, but it could be sometime later (or not at all).

The specification recommends that applications only use a session once (although this is not enforced). For this reason some servers send multiple session messages to a client. To enforce the “use once” recommendation applications could use SSL_CTX_remove_session() to mark a session as non-resumable (and remove it from the cache) once it has been used.

The old SSL_get1_session() and similar APIs may not operate as expected for client applications written for TLSv1.2 and below. Specifically if a client application calls SSL_get1_session() before the server message containing session details has been received then an SSL_SESSION object will still be returned, but any attempt to resume with it will not succeed and a full handshake will occur instead. In the case where multiple sessions have been sent by the server then only the last session will be returned by SSL_get1_session().

Client application developers should consider using the SSL_CTX_sess_set_new_cb() API instead (see here). This provides a callback mechanism which gets invoked every time a new session is established. This can get invoked multiple times for a single connection if a server sends multiple session messages.

Note that SSL_CTX_sess_set_new_cb() was also available in OpenSSL 1.1.0. Applications that already used that API will still work, but they may find that the callback is invoked at unexpected times, i.e. post-handshake.

An OpenSSL server will immediately attempt to send session details to a client after the main handshake has completed. To server applications this post-handshake stage will appear to be part of the main handshake, so calls to SSL_get1_session() should continue to work as before.

Custom Extensions and Certificate Transparency

In TLSv1.2 and below the initial ClientHello and ServerHello messages can contain “extensions”. This allows the base specifications to be extended with additional features and capabilities that may not be applicable in all scenarios or could not be foreseen at the time that the base specifications were written. OpenSSL provides support for a number of “built-in” extensions.

Additionally the custom extensions API provides some basic capabilities for application developers to add support for new extensions that are not built-in to OpenSSL.

Built on top of the custom extensions API is the “serverinfo” API. This provides an even more basic interface that can be configured at run time. One use case for this is Certificate Transparency. OpenSSL provides built-in support for the client side of Certificate Transparency but there is no built-in server side support. However this can easily be achieved using “serverinfo” files. A serverinfo file containing the Certificate Transparency information can be configured within OpenSSL and it will then be sent back to the client as appropriate.

In TLSv1.3 the use of extensions is expanded significantly and there are many more messages that can include them. Additionally some extensions that were applicable to TLSv1.2 and below are no longer applicable in TLSv1.3 and some extensions are moved from the ServerHello message to the EncryptedExtensions message. The old custom extensions API does not have the ability to specify which messages the extensions should be associated with. For that reason a new custom extensions API was required.

The old API will still work, but the custom extensions will only be added where TLSv1.2 or below is negotiated. To add custom extensions that work for all TLS versions application developers will need to update their applications to the new API (see here for details).

The “serverinfo” data format has also been updated to include additional information about which messages the extensions are relevant to. Applications using “serverinfo” files may need to update to the “version 2” file format to be able to operate in TLSv1.3 (see here and here for details).

Renegotiation

TLSv1.3 does not have renegotiation so calls to SSL_renegotiate() or SSL_renegotiate_abbreviated() will immediately fail if invoked on a connection that has negotiated TLSv1.3.

The most common use case for renegotiation is to update the connection keys. The function SSL_key_update() can be used for this purpose in TLSv1.3 (see https://www.openssl.org/docs/manmaster/man3/SSL_key_update.html).

DSA certificates

DSA certificates are no longer allowed in TLSv1.3. If your server application is using a DSA certificate then TLSv1.3 connections will fail with an error message similar to the following:

140348850206144:error:14201076:SSL routines:tls_choose_sigalg:no suitable signature algorithm:ssl/t1_lib.c:2308:

Please use an ECDSA or RSA certificate instead.

Conclusion

TLSv1.3 represents a significant step forward and has some exciting new features but there are some hazards for the unwary when upgrading. Mostly these issues have relatively straightforward solutions. Application developers should review their code and consider whether anything should be updated in order to work more effectively with TLSv1.3. Similarly application deployers should review their configuration.

May 04, 2017 11:00 AM

Michael Biven

Chai 2000 - 2017

I lost my little buddy of 17 years this week.

Chai, I’m happy you had a peaceful morning in the patio at our new house doing what you loved. Sitting under bushes and in the sun. Even in your last day you still comforted us. You were relaxed throughout the morning, when I picked you up and held you as we drove to the vet till the end.

Chai, thank you for everything, we love you and you will be missed.

May 04, 2017 08:04 AM

May 03, 2017

Marios Zindilis

Back-end Validation for Django Model Field Choices

In Django, you can provide a list of values that a field can have when creating a model. For example, here's a minimalistic model for storing musicians:

from django.db import models


class Artist(models.Model):

    TYPE_CHOICES = (
        ('Person', 'Person'),
        ('Group', 'Group'),
        ('Other', 'Other'),)

    name = models.CharField(max_length=100)
    type = models.CharField(max_length=20, choices=TYPE_CHOICES)

Using this code in the Admin interface, as well as in Forms, will make the "Type" field be represented as a drop-down box. Therefore, if all input comes from Admin and Forms, the value of "Type" is expected to be one of those defined in the TYPE_CHOICES tuple. There are 2 things worth noting:

  1. Using choices is a presentation convenience. There is no restriction of the value that can be sent to the database, even coming from the front-end. For example, if you use the browser's inspect feature, you can edit the drop-down list, change one of the values and submit it. The back-end view will then happily consume it and save it to the database.

  2. There is no validation at any stage of the input lifecycle. You may have a specified list of values in the Admin or Forms front-end, but your application may accept input from other sources as well, such as management commands, or bulk imports from fixtures. In none of those cases will a value that is not in the TYPE_CHOICES structure be rejected.

Validation

So let's find a way to fix this by adding some validation in the back-end.

Django has this amazing feature called signals. They allow you to attach extra functionality to some actions, by emitting a signal from that action. For example, in this case we will use the pre_save signal, which gets executed just before a model instance is saved to the database.

Here's the example model, with the signal connected to it:

from django.db import models


class Artist(models.Model):

    TYPE_CHOICES = (
        ('Person', 'Person'),
        ('Group', 'Group'),
        ('Other', 'Other'),)

    name = models.CharField(max_length=100)
    type = models.CharField(max_length=20, choices=TYPE_CHOICES)


models.signals.pre_save.connect(validate_artist_name_choice, sender=Artist)

That last line will call a function named validate_artist_name_choice just before an Artist is saved in the database. Let's write that function:


def validate_artist_name_choice(sender, instance, **kwargs):
    valid_types = [t[0] for t in sender.TYPE_CHOICES]
    if instance.type not in valid_types:
        from django.core.exceptions import ValidationError
        raise ValidationError(
            'Artist Type "{}" is not one of the permitted values: {}'.format(
                instance.type,
               ', '.join(valid_types)))

This function can be anywhere in your code, as long as you can import it in the model. Personally, if a pre_save function is specific to one model, I like to put it in the same module as the model itself, but that's just my preference.

So let's try to create a couple of Artists and see what happens. In this example, my App is called "v".

>>> from v.models import Artist
>>> 
>>> some_artist = Artist(name='Some Artist', type='Person')
>>> some_artist.save()
>>> 
>>> some_other_artist = Artist(name='Some Other Artist', type='Alien Lifeform')
>>> some_other_artist.save()
Traceback (most recent call last):
  File "/usr/lib/python3.5/code.py", line 91, in runcode
    exec(code, self.locals)
  File "<console>", line 1, in <module>
  File "/home/vs/.local/lib/python3.5/site-packages/django/db/models/base.py", line 796, in save
    force_update=force_update, update_fields=update_fields)
  File "/home/vs/.local/lib/python3.5/site-packages/django/db/models/base.py", line 820, in save_base
    update_fields=update_fields)
  File "/home/vs/.local/lib/python3.5/site-packages/django/dispatch/dispatcher.py", line 191, in send
    response = receiver(signal=self, sender=sender, **named)
  File "/home/vs/Tests/validator/v/models.py", line 14, in validate_artist_name_choice
    ', '.join(valid_types)))
django.core.exceptions.ValidationError: ['Artist Type "Alien Lifeform" is not one of the permitted values: Person, Group, Other']

Brilliant! We now have an expressive exception that we can catch in our views or scripts or whatever else we want to use to create model instances. This exception is thrown either when we call the .save() method of a model instance, or when we use the .create() method of a model class. Using the example model, we would get the same exception if we ran:

Artist.objects.create(name='Some Other Artist', type='Alien Lifeform')

Or even:

Artist.objects.get_or_create(name='Some Other Artist', type='Alien Lifeform')
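
If you would rather handle the failure gracefully than let it propagate, the exception can be caught like any other. A quick sketch, with the helper function made up for illustration:

from django.core.exceptions import ValidationError

from v.models import Artist


def create_artist(name, artist_type):
    """Create an Artist, returning None if the type is not permitted."""
    try:
        return Artist.objects.create(name=name, type=artist_type)
    except ValidationError as error:
        # Log it, surface it to the user, etc.
        print(error.messages)
        return None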

Test It!

Here's a test that triggers the exception:

from django.test import TestCase
from django.core.exceptions import ValidationError
from .models import Artist


class TestArtist(TestCase):
    def setUp(self):
        self.test_artist = Artist(name='Some Artist', type='Some Type')

    def test_artist_type_choice(self):
        with self.assertRaises(ValidationError):
            self.test_artist.save()

Conclusion

Using Django's pre_save functionality, you can validate your application's input in the back-end, and avoid any surprises. Go forth and validate!

May 03, 2017 11:00 PM

Vincent Bernat

VXLAN: BGP EVPN with Cumulus Quagga

VXLAN is an overlay network to encapsulate Ethernet traffic over an existing (highly available and scalable, possibly the Internet) IP network while accommodating a very large number of tenants. It is defined in RFC 7348. For an uncut introduction on its use with Linux, have a look at my “VXLAN & Linux” post.

VXLAN deployment

In the above example, we have hypervisors hosting virtual machines from different tenants. Each virtual machine is given access to a tenant-specific virtual Ethernet segment. Users are expecting classic Ethernet segments: no MAC restrictions1, total control over the IP addressing scheme they use and availability of multicast.

In a large VXLAN deployment, two aspects need attention:

  1. discovery of other endpoints (VTEPs) sharing the same VXLAN segments, and
  2. avoidance of BUM frames (broadcast, unknown unicast and multicast) as they have to be forwarded to all VTEPs.

A typical solution for the first point is using multicast. For the second point, this is source-address learning.

Introduction to BGP EVPN

BGP EVPN (RFC 7432 and draft-ietf-bess-evpn-overlay for its application to VXLAN) is a standard control protocol that efficiently solves those two aspects without relying on multicast or source-address learning.

BGP EVPN relies on BGP (RFC 4271) and its MP-BGP extensions (RFC 4760). BGP is the routing protocol powering the Internet. It is highly scalable and interoperable. It is also extensible, and one of its extensions is MP-BGP, which can carry reachability information (NLRI) for multiple protocols (IPv4, IPv6, L3VPN and, in our case, EVPN). EVPN is a special address family used to advertise MAC addresses and the remote equipment they are attached to.

There are basically two kinds of reachability information a VTEP sends through BGP EVPN:

  1. the VNIs it has an interest in (type 3 routes), and
  2. for each VNI, its local MAC addresses (type 2 routes).

The protocol also covers other aspects of virtual Ethernet segments (L3 reachability information from ARP/ND caches, MAC mobility and multi-homing2) but we won’t describe them here.

To deploy BGP EVPN, a typical solution is to use several route reflectors (both for redundancy and scalability), like in the picture below. Each VTEP opens a BGP session to at least two route reflectors, sends its information (MACs and VNIs) and receives others’. This reduces the number of BGP sessions to configure.

VXLAN deployment with route reflectors

Compared to other solutions to deploy VXLAN, BGP EVPN has three main advantages:

  • interoperability with other vendors (notably Juniper and Cisco),
  • proven scalability (a typical BGP router handles several million routes), and
  • possibility to enforce fine-grained policies.

On Linux, Cumulus Quagga is a fairly complete implementation of BGP EVPN (type 3 routes for VTEP discovery, type 2 routes with MAC or IP addresses, MAC mobility when a host changes from one VTEP to another one) which requires very little configuration.

This is a fork of Quagga currently used in Cumulus Linux, a network operating system based on Debian powering switches from various brands. At some point, BGP EVPN support will be contributed back to FRR, a community-maintained fork of Quagga3.

It should be noted that the BGP EVPN implementation of Cumulus Quagga currently only supports IPv4.

Route reflector setup

Before configuring each VTEP, we need to configure two or more route reflectors. There are many solutions. I will present three of them:

  • using Cumulus Quagga,
  • using GoBGP, an implementation of BGP in Go,
  • using Juniper JunOS.

For reliability purposes, it's possible (and easy) to use one implementation for some route reflectors and another for the others.

The proposed configurations are quite minimal. However, it is possible to centralize policies on the route reflectors (e.g. routes tagged with some community can only be readvertised to some group of VTEPs).

Using Quagga

The configuration is pretty simple. We assume the route reflector has 203.0.113.254 configured as a loopback IP.

router bgp 65000
  bgp router-id 203.0.113.254
  bgp cluster-id 203.0.113.254
  bgp log-neighbor-changes
  no bgp default ipv4-unicast
  neighbor fabric peer-group
  neighbor fabric remote-as 65000
  neighbor fabric capability extended-nexthop
  neighbor fabric update-source 203.0.113.254
  bgp listen range 203.0.113.0/24 peer-group fabric
  !
  address-family evpn
   neighbor fabric activate
   neighbor fabric route-reflector-client
  exit-address-family
  !
  exit
!

A peer group fabric is defined and we leverage the dynamic neighbor feature of Cumulus Quagga: we don't have to explicitly define each neighbor. Any client from 203.0.113.0/24 presenting itself as part of AS 65000 can connect. All sent EVPN routes will be accepted and reflected to the other clients.

You don’t need to run Zebra, the route engine talking with the kernel. Instead, start bgpd with the --no_kernel flag.

Using GoBGP

GoBGP is a clean implementation of BGP in Go4. It exposes an RPC API for configuration (but accepts a configuration file and comes with a command-line client).

It doesn’t support dynamic neighbors yet, but5 you can use the API, the command-line client or some templating language to automate peer declarations. A configuration with only one neighbor is like this:

global:
  config:
    as: 65000
    router-id: 203.0.113.254
    local-address-list:
      - 203.0.113.254
neighbors:
  - config:
      neighbor-address: 203.0.113.1
      peer-as: 65000
    afi-safis:
      - config:
          afi-safi-name: l2vpn-evpn
    route-reflector:
      config:
        route-reflector-client: true
        route-reflector-cluster-id: 203.0.113.254

More neighbors can be added from the command line:

$ gobgp neighbor add 203.0.113.2 as 65000 \
>         route-reflector-client 203.0.113.254 \
>         --address-family evpn

GoBGP won’t try to interact with the kernel which is fine as a route reflector.

Using Juniper JunOS

A variety of Juniper products can be a BGP route reflector, notably:

  • the Juniper QFX5100 switch6,
  • a Juniper MX router7, and
  • the Juniper vRR virtual appliance8.

The main factor is the CPU and the memory. The QFX5100 is low on memory and won't support large deployments without some additional policing.

Here is a configuration similar to the Quagga one:

interfaces {
    lo0 {
        unit 0 {
            family inet {
                address 203.0.113.254/32;
            }
        }
    }
}

protocols {
    bgp {
        group fabric {
            family evpn {
                signaling {
                    /* Do not try to install EVPN routes */
                    no-install;
                }
            }
            type internal;
            cluster 203.0.113.254;
            local-address 203.0.113.254;
            allow 203.0.113.0/24;
        }
    }
}

routing-options {
    router-id 203.0.113.254;
    autonomous-system 65000;
}

VTEP setup

The next step is to configure each VTEP/hypervisor. Each VXLAN is locally configured using a bridge for local virtual interfaces, as illustrated in the schema below. The bridge takes care of the local MAC addresses (notably, using source-address learning) while the VXLAN interface takes care of the remote MAC addresses (received with BGP EVPN).

Bridged VXLAN device

VXLANs can be provisioned with the following script. Source-address learning is disabled as we will rely solely on BGP EVPN to synchronize FDBs between the hypervisors.

for vni in 100 200; do
    # Create VXLAN interface
    ip link add vxlan${vni} type vxlan \
        id ${vni} \
        dstport 4789 \
        local 203.0.113.2 \
        nolearning
    # Create companion bridge
    brctl addbr br${vni}
    brctl addif br${vni} vxlan${vni}
    brctl stp br${vni} off
    ip link set up dev br${vni}
    ip link set up dev vxlan${vni}
done
# Attach each VM to the appropriate segment
brctl addif br100 vnet10
brctl addif br100 vnet11
brctl addif br200 vnet12
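
To double-check that source-address learning is really disabled on a VXLAN interface, you can inspect its detailed link attributes; a quick sketch (the full output depends on the iproute2 version):

$ ip -d link show dev vxlan100 | grep -o nolearning
nolearning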

The configuration of Cumulus Quagga is similar to the one used for a route reflector, except we use the advertise-all-vni directive to publish all local VNIs.

router bgp 65000
  bgp router-id 203.0.113.2
  no bgp default ipv4-unicast
  neighbor fabric peer-group
  neighbor fabric remote-as 65000
  neighbor fabric capability extended-nexthop
  neighbor fabric update-source dummy0
  ! BGP sessions with route reflectors
  neighbor 203.0.113.253 peer-group fabric
  neighbor 203.0.113.254 peer-group fabric
  !
  address-family evpn
   neighbor fabric activate
   advertise-all-vni
  exit-address-family
  !
  exit
!

If everything works as expected, the instances sharing the same VNI should be able to ping each other. If IPv6 is enabled on the VMs, the ping command shows if everything is in order:

$ ping -c10 -w1 -t1 ff02::1%eth0
PING ff02::1%eth0(ff02::1%eth0) 56 data bytes
64 bytes from fe80::5254:33ff:fe00:8%eth0: icmp_seq=1 ttl=64 time=0.016 ms
64 bytes from fe80::5254:33ff:fe00:b%eth0: icmp_seq=1 ttl=64 time=4.98 ms (DUP!)
64 bytes from fe80::5254:33ff:fe00:9%eth0: icmp_seq=1 ttl=64 time=4.99 ms (DUP!)
64 bytes from fe80::5254:33ff:fe00:a%eth0: icmp_seq=1 ttl=64 time=4.99 ms (DUP!)

--- ff02::1%eth0 ping statistics ---
1 packets transmitted, 1 received, +3 duplicates, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.016/3.745/4.991/2.152 ms

Verification

Step by step, let’s check how everything comes together.

Getting VXLAN information from the kernel

On each VTEP, Quagga should be able to retrieve the information about configured VXLANs. This can be checked with vtysh:

# show interface vxlan100
Interface vxlan100 is up, line protocol is up
  Link ups:       1    last: 2017/04/29 20:01:33.43
  Link downs:     0    last: (never)
  PTM status: disabled
  vrf: Default-IP-Routing-Table
  index 11 metric 0 mtu 1500
  flags: <UP,BROADCAST,RUNNING,MULTICAST>
  Type: Ethernet
  HWaddr: 62:42:7a:86:44:01
  inet6 fe80::6042:7aff:fe86:4401/64
  Interface Type Vxlan
  VxLAN Id 100
  Access VLAN Id 1
  Master (bridge) ifindex 9 ifp 0x56536e3f3470

The important points are:

  • the VNI is 100, and
  • the bridge device was correctly detected.

Quagga should also be able to retrieve information about the local MAC addresses:

# show evpn mac vni 100
Number of MACs (local and remote) known for this VNI: 2
MAC               Type   Intf/Remote VTEP      VLAN
50:54:33:00:00:0a local  eth1.100
50:54:33:00:00:0b local  eth2.100

BGP sessions

Each VTEP has to establish a BGP session to the route reflectors. On the VTEP, this can be checked by running vtysh:

# show bgp neighbors 203.0.113.254
BGP neighbor is 203.0.113.254, remote AS 65000, local AS 65000, internal link
 Member of peer-group fabric for session parameters
  BGP version 4, remote router ID 203.0.113.254
  BGP state = Established, up for 00:00:45
  Neighbor capabilities:
    4 Byte AS: advertised and received
    AddPath:
      L2VPN EVPN: RX advertised L2VPN EVPN
    Route refresh: advertised and received(new)
    Address family L2VPN EVPN: advertised and received
    Hostname Capability: advertised
    Graceful Restart Capabilty: advertised
[...]
 For address family: L2VPN EVPN
  fabric peer-group member
  Update group 1, subgroup 1
  Packet Queue length 0
  Community attribute sent to this neighbor(both)
  8 accepted prefixes

  Connections established 1; dropped 0
  Last reset never
Local host: 203.0.113.2, Local port: 37603
Foreign host: 203.0.113.254, Foreign port: 179

The output includes the following information:

  • the BGP state is Established,
  • the address family L2VPN EVPN is correctly advertised, and
  • 8 routes are received from this route reflector.

The state of the BGP sessions can also be checked from the route reflectors. With GoBGP, use the following command:

# gobgp neighbor 203.0.113.2
BGP neighbor is 203.0.113.2, remote AS 65000, route-reflector-client
  BGP version 4, remote router ID 203.0.113.2
  BGP state = established, up for 00:04:30
  BGP OutQ = 0, Flops = 0
  Hold time is 9, keepalive interval is 3 seconds
  Configured hold time is 90, keepalive interval is 30 seconds
  Neighbor capabilities:
    multiprotocol:
        l2vpn-evpn:     advertised and received
    route-refresh:      advertised and received
    graceful-restart:   received
    4-octet-as: advertised and received
    add-path:   received
    UnknownCapability(73):      received
    cisco-route-refresh:        received
[...]
  Route statistics:
    Advertised:             8
    Received:               5
    Accepted:               5

With JunOS, use the below command:

> show bgp neighbor 203.0.113.2
Peer: 203.0.113.2+38089 AS 65000 Local: 203.0.113.254+179 AS 65000
  Group: fabric                Routing-Instance: master
  Forwarding routing-instance: master
  Type: Internal    State: Established
  Last State: OpenConfirm   Last Event: RecvKeepAlive
  Last Error: None
  Options: <Preference LocalAddress Cluster AddressFamily Rib-group Refresh>
  Address families configured: evpn
  Local Address: 203.0.113.254 Holdtime: 90 Preference: 170
  NLRI evpn: NoInstallForwarding
  Number of flaps: 0
  Peer ID: 203.0.113.2     Local ID: 203.0.113.254     Active Holdtime: 9
  Keepalive Interval: 3          Group index: 0    Peer index: 2
  I/O Session Thread: bgpio-0 State: Enabled
  BFD: disabled, down
  NLRI for restart configured on peer: evpn
  NLRI advertised by peer: evpn
  NLRI for this session: evpn
  Peer supports Refresh capability (2)
  Stale routes from peer are kept for: 300
  Peer does not support Restarter functionality
  NLRI that restart is negotiated for: evpn
  NLRI of received end-of-rib markers: evpn
  NLRI of all end-of-rib markers sent: evpn
  Peer does not support LLGR Restarter or Receiver functionality
  Peer supports 4 byte AS extension (peer-as 65000)
  NLRI's for which peer can receive multiple paths: evpn
  Table bgp.evpn.0 Bit: 20000
    RIB State: BGP restart is complete
    RIB State: VPN restart is complete
    Send state: in sync
    Active prefixes:              5
    Received prefixes:            5
    Accepted prefixes:            5
    Suppressed due to damping:    0
    Advertised prefixes:          8
  Last traffic (seconds): Received 276  Sent 170  Checked 276
  Input messages:  Total 61     Updates 3       Refreshes 0     Octets 1470
  Output messages: Total 62     Updates 4       Refreshes 0     Octets 1775
  Output Queue[1]: 0            (bgp.evpn.0, evpn)

If a BGP session cannot be established, the logs of each BGP daemon should mention the cause.

Sent routes

From each VTEP, Quagga needs to send:

  • one type 3 route for each local VNI, and
  • one type 2 route for each local MAC address.

The best place to check the received routes is on one of the route reflectors. If you are using JunOS, the following command will display the received routes from the provided VTEP:

> show route table bgp.evpn.0 receive-protocol bgp 203.0.113.2

bgp.evpn.0: 10 destinations, 10 routes (10 active, 0 holddown, 0 hidden)
  Prefix                  Nexthop              MED     Lclpref    AS path
  2:203.0.113.2:100::0::50:54:33:00:00:0a/304 MAC/IP
*                         203.0.113.2                  100        I
  2:203.0.113.2:100::0::50:54:33:00:00:0b/304 MAC/IP
*                         203.0.113.2                  100        I
  3:203.0.113.2:100::0::203.0.113.2/304 IM
*                         203.0.113.2                  100        I
  3:203.0.113.2:200::0::203.0.113.2/304 IM
*                         203.0.113.2                  100        I

There is one type 3 route for VNI 100 and another one for VNI 200. There are also two type 2 routes for two MAC addresses on VNI 100. To get more information, you can add the keyword extensive. Here is a type 3 route advertising 203.0.113.2 as a VTEP for VNI 100 (see note 9):

> show route table bgp.evpn.0 receive-protocol bgp 203.0.113.2 extensive

bgp.evpn.0: 11 destinations, 11 routes (11 active, 0 holddown, 0 hidden)
* 3:203.0.113.2:100::0::203.0.113.2/304 IM (1 entry, 1 announced)
     Accepted
     Route Distinguisher: 203.0.113.2:100
     Nexthop: 203.0.113.2
     Localpref: 100
     AS path: I
     Communities: target:65000:268435556 encapsulation:vxlan(0x8)
[...]

Here is a type 2 route announcing the location of the 50:54:33:00:00:0a MAC address for VNI 100:

> show route table bgp.evpn.0 receive-protocol bgp 203.0.113.2 extensive

bgp.evpn.0: 11 destinations, 11 routes (11 active, 0 holddown, 0 hidden)
* 2:203.0.113.2:100::0::50:54:33:00:00:0a/304 MAC/IP (1 entry, 1 announced)
     Accepted
     Route Distinguisher: 203.0.113.2:100
     Route Label: 100
     ESI: 00:00:00:00:00:00:00:00:00:00
     Nexthop: 203.0.113.2
     Localpref: 100
     AS path: I
     Communities: target:65000:268435556 encapsulation:vxlan(0x8)
[...]

With Quagga, you can get a similar output with vtysh:

# show bgp evpn route
BGP table version is 0, local router ID is 203.0.113.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]

   Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 203.0.113.2:100
*>i[2]:[0]:[0]:[48]:[50:54:33:00:00:0a]
                    203.0.113.2                   100      0 i
*>i[2]:[0]:[0]:[48]:[50:54:33:00:00:0b]
                    203.0.113.2                   100      0 i
*>i[3]:[0]:[32]:[203.0.113.2]
                    203.0.113.2                   100      0 i
Route Distinguisher: 203.0.113.2:200
*>i[3]:[0]:[32]:[203.0.113.2]
                    203.0.113.2                   100      0 i
[...]

With GoBGP, use the following command:

# gobgp global rib -a evpn
    Network  Next Hop             AS_PATH              Age        Attrs
*>  [type:macadv][rd:203.0.113.2:100][esi:single-homed][etag:0][mac:50:54:33:00:00:0a][ip:<nil>][labels:[100]]203.0.113.2                               00:00:17   [{Origin: i} {LocalPref: 100} {Extcomms: [VXLAN], [65000:268435556]}]
*>  [type:macadv][rd:203.0.113.2:100][esi:single-homed][etag:0][mac:50:54:33:00:00:0b][ip:<nil>][labels:[100]]203.0.113.2                               00:00:17   [{Origin: i} {LocalPref: 100} {Extcomms: [VXLAN], [65000:268435556]}]
*>  [type:macadv][rd:203.0.113.2:200][esi:single-homed][etag:0][mac:50:54:33:00:00:0a][ip:<nil>][labels:[200]]203.0.113.2                               00:00:17   [{Origin: i} {LocalPref: 100} {Extcomms: [VXLAN], [65000:268435656]}]
*>  [type:multicast][rd:203.0.113.2:100][etag:0][ip:203.0.113.2]203.0.113.2                               00:00:17   [{Origin: i} {LocalPref: 100} {Extcomms: [VXLAN], [65000:268435556]}]
*>  [type:multicast][rd:203.0.113.2:200][etag:0][ip:203.0.113.2]203.0.113.2                               00:00:17   [{Origin: i} {LocalPref: 100} {Extcomms: [VXLAN], [65000:268435656]}]

Received routes

Each VTEP should have received the type 2 and type 3 routes from its fellow VTEPs, through the route reflectors. You can check with the show bgp evpn route command of vtysh.

Does Quagga correctly understand the received routes? The type 3 routes are translated to an association between the remote VTEPs and the VNIs:

# show evpn vni
Number of VNIs: 2
VNI        VxLAN IF              VTEP IP         # MACs   # ARPs   Remote VTEPs
100        vxlan100              203.0.113.2     4        0        203.0.113.3
                                                                   203.0.113.1
200        vxlan200              203.0.113.2     3        0        203.0.113.3
                                                                   203.0.113.1

The type 2 routes are translated to an association between the remote MACs and the remote VTEPs:

# show evpn mac vni 100
Number of MACs (local and remote) known for this VNI: 4
MAC               Type   Intf/Remote VTEP      VLAN
50:54:33:00:00:09 remote 203.0.113.1
50:54:33:00:00:0a local  eth1.100
50:54:33:00:00:0b local  eth2.100
50:54:33:00:00:0c remote 203.0.113.3

FDB configuration

The last step is to ensure Quagga has correctly provided the received information to the kernel. This can be checked with the bridge command:

# bridge fdb show dev vxlan100 | grep dst
00:00:00:00:00:00 dst 203.0.113.1 self permanent
00:00:00:00:00:00 dst 203.0.113.3 self permanent
50:54:33:00:00:0c dst 203.0.113.3 self
50:54:33:00:00:09 dst 203.0.113.1 self

All good! The first two lines are the translation of the type 3 routes (any BUM frame will be sent to both 203.0.113.1 and 203.0.113.3) and the last two are the translation of the type 2 routes.

Interoperability

One of the strengths of BGP EVPN is its interoperability with other network vendors. To demonstrate that it works as expected, we will configure a Juniper vMX to act as a VTEP.

First, we need to configure the physical bridge10. This is similar to the use of ip link and brctl with Linux. We only configure one physical interface with two old-school VLANs paired with matching VNIs.

interfaces {
    ge-0/0/1 {
        unit 0 {
            family bridge {
                interface-mode trunk;
                vlan-id-list [ 100 200 ];
            }
        }
    }
}
routing-instances {
    switch {
        instance-type virtual-switch;
        interface ge-0/0/1.0;
        bridge-domains {
            vlan100 {
                domain-type bridge;
                vlan-id 100;
                vxlan {
                    vni 100;
                    ingress-node-replication;
                }
            }
            vlan200 {
                domain-type bridge;
                vlan-id 200;
                vxlan {
                    vni 200;
                    ingress-node-replication;
                }
            }
        }
    }
}

Then, we configure BGP EVPN to advertise all known VNIs. The configuration is quite similar to the one we did with Quagga:

protocols {
    bgp {
        group fabric {
            type internal;
            multihop;
            family evpn signaling;
            local-address 203.0.113.3;
            neighbor 203.0.113.253;
            neighbor 203.0.113.254;
        }
    }
}

routing-instances {
    switch {
        vtep-source-interface lo0.0;
        route-distinguisher 203.0.113.3:1; # ❶
        vrf-import EVPN-VRF-VXLAN;
        vrf-target {
            target:65000:1;
            auto;
        }
        protocols {
            evpn {
                encapsulation vxlan;
                extended-vni-list all;
                multicast-mode ingress-replication;
            }
        }
    }
}

routing-options {
    router-id 203.0.113.3;
    autonomous-system 65000;
}

policy-options {
    policy-statement EVPN-VRF-VXLAN {
        then accept;
    }
}

We also need a small compatibility patch for Cumulus Quagga11.

The routes sent by this configuration are very similar to the routes sent by Quagga. The main differences are:

  • on JunOS, the route distinguisher is configured statically (in ❶), and
  • on JunOS, the VNI is also encoded as an Ethernet tag ID.

Here is a type 3 route, as sent by JunOS:

> show route table bgp.evpn.0 receive-protocol bgp 203.0.113.3 extensive

bgp.evpn.0: 13 destinations, 13 routes (13 active, 0 holddown, 0 hidden)
* 3:203.0.113.3:1::100::203.0.113.3/304 IM (1 entry, 1 announced)
     Accepted
     Route Distinguisher: 203.0.113.3:1
     Nexthop: 203.0.113.3
     Localpref: 100
     AS path: I
     Communities: target:65000:268435556 encapsulation:vxlan(0x8)
     PMSI: Flags 0x0: Label 6: Type INGRESS-REPLICATION 203.0.113.3
[...]

Here is a type 2 route:

> show route table bgp.evpn.0 receive-protocol bgp 203.0.113.3 extensive

bgp.evpn.0: 13 destinations, 13 routes (13 active, 0 holddown, 0 hidden)
* 2:203.0.113.3:1::200::50:54:33:00:00:0f/304 MAC/IP (1 entry, 1 announced)
     Accepted
     Route Distinguisher: 203.0.113.3:1
     Route Label: 200
     ESI: 00:00:00:00:00:00:00:00:00:00
     Nexthop: 203.0.113.3
     Localpref: 100
     AS path: I
     Communities: target:65000:268435656 encapsulation:vxlan(0x8)
[...]

We can check that the vMX is able to make sense of the routes it receives from its peers running Quagga:

> show evpn database l2-domain-id 100
Instance: switch
VLAN  DomainId  MAC address        Active source                  Timestamp        IP address
     100        50:54:33:00:00:0c  203.0.113.1                    Apr 30 12:46:20
     100        50:54:33:00:00:0d  203.0.113.2                    Apr 30 12:32:42
     100        50:54:33:00:00:0e  203.0.113.2                    Apr 30 12:46:20
     100        50:54:33:00:00:0f  ge-0/0/1.0                     Apr 30 12:45:55

On the other end, if we look at one of the Quagga-based VTEPs, we can check that the received routes are correctly understood:

# show evpn vni 100
VNI: 100
 VxLAN interface: vxlan100 ifIndex: 9 VTEP IP: 203.0.113.1
 Remote VTEPs for this VNI:
  203.0.113.3
  203.0.113.2
 Number of MACs (local and remote) known for this VNI: 4
 Number of ARPs (IPv4 and IPv6, local and remote) known for this VNI: 0
# show evpn mac vni 100
Number of MACs (local and remote) known for this VNI: 4
MAC               Type   Intf/Remote VTEP      VLAN
50:54:33:00:00:0c local  eth1.100
50:54:33:00:00:0d remote 203.0.113.2
50:54:33:00:00:0e remote 203.0.113.2
50:54:33:00:00:0f remote 203.0.113.3

Get in touch if you have some success with other vendors!


  1. For example, they may use bridges to connect containers together. 

  2. Such a feature can replace proprietary implementations of MC-LAG, allowing several VTEPs to act as an endpoint for a single link aggregation group. This is not needed in our scenario, where hypervisors act as VTEPs. 

  3. The development of Quagga is slow and “closed”. New features are often stalled. FRR is placed under the umbrella of the Linux Foundation, has a GitHub-centered development model and an election process. It already has several interesting enhancements (notably, BGP add-path, BGP unnumbered, MPLS and LDP). 

  4. I am unenthusiastic about projects whose sole purpose is to rewrite something in Go. However, while quite young, GoBGP is quite valuable on its own (good architecture, good performance). 

  5. Since GoBGP 1.21, dynamic neighbors are supported. Have a look at this configuration example. 

  6. The 48-port version is around $10,000 with the BGP license. 

  7. An empty chassis with a dual routing engine (RE-S-1800X4-16G) is around $30,000. 

  8. I don’t know how pricey the vRR is. For evaluation purposes, it can be downloaded for free if you are a customer. 

  9. The value 100 used in the route distinguisher (203.0.113.2:100) is not the one used to encode the VNI. The VNI is encoded in the route target (65000:268435556), in the 24 least significant bits (268435556 & 0xffffff equals 100). As long as VNIs are unique, we don't have to understand those details. 

  10. For some reason, the use of a virtual switch is mandatory. This is specific to this platform: a QFX doesn’t require this. 

  11. The encoding of the VNI into the route target is being standardized in draft-ietf-bess-evpn-overlay. Juniper already implements this draft. 

by Vincent Bernat at May 03, 2017 09:56 PM

May 01, 2017

Ben's Practical Admin Blog

It’s Nice to be Appreciated

Quite out of the blue I had a parcel arrive on my doorstep while I was on leave. It was notable in that I had already received all the parcels I thought I was expecting.

As it turns out, the fine administrators over at ITPA felt that I do quite a bit to help out the organisation and wanted to thank me with a signed copy of Tom Limoncelli's tome, The Practice of System and Network Administration.

I feel suitably humble. It’s nice to be appreciated 🙂


by Ben at May 01, 2017 09:20 AM

Electricmonk.nl

ScriptForm v1.3: Webserver that automatically generates forms to serve as frontends to scripts

I've just released v1.3 of Scriptform.

ScriptForm is a stand-alone webserver that automatically generates forms from JSON to serve as frontends to scripts. It takes a JSON file which contains form definitions, constructs web forms from this JSON and serves these to users over HTTP. The user can select a form and fill it out. When the user submits the form, it is validated and the associated script is called. Data entered in the form is passed to the script through the environment.

You can get the new release from the Releases page.

More information, including screenshots, can be found on the Github page.

by admin at May 01, 2017 06:51 AM

April 28, 2017

The Lone Sysadmin

Intel’s Memory Drive Implementation for Optane Guarantees its Doom

A few weeks ago Intel started releasing their Optane product, a commercialization of the 3D Xpoint (Crosspoint) technology they’ve been talking about for a few years. Predictably, there has been a lot of commentary in all directions. Did you know it’s game changing, or that it’s a solution looking for a problem? It’s storage. It […]

The post Intel’s Memory Drive Implementation for Optane Guarantees its Doom appeared first on The Lone Sysadmin. Head over to the source to read the full post!

by Bob Plankers at April 28, 2017 08:26 PM

April 22, 2017

Pantz.org

DNS over TLS. Secure the DNS!

Secure the Web

A while back we had the "secure the web" initiative, where everyone was inspired to enable encryption (https) on their websites. This was so we could thwart things like eavesdropping and content hijacking. In 2016 about half of website visits were https. This is great and things seem to be only getting better. ISPs can not see the content of https traffic. Not seeing your traffic content anymore makes them sad. What makes them happy? They can still see all of your DNS requests.

Your ISP can see the websites you visit

Every ISP assigns you some of their DNS servers to use when you connect to them for your internet connection. Every time you type a website name into your browser bar, a request goes to their DNS servers to look up a number called an IP address. That IP address is returned to your computer, and the connection to the website is made. Your ISP now has a log of the website you requested, attached to the rest of the information they have about you. Then they build profiles about you and sell that info to 3rd parties to target advertising at you in many different ways. Think you'll be slick and switch out their DNS servers for someone else's, like Google's free DNS servers (8.8.8.8)? Think again. Any request through your ISP to any DNS server on the internet is unencrypted. Your ISP can slurp up all those requests and get the same info they did when you were using their DNS servers, just like when they could see all of your content with http before https. This also means that your DNS traffic could possibly be intercepted and hijacked by other people or entities. TLS prevents this.

Securing the DNS

The thing that secures https is Transport Layer Security (TLS). It is a set of cryptographic protocols that provides communications security over a computer network. Now that we are beginning to secure websites, I think it is high time we secure the DNS. Others seem to agree. In 2016 RFC 7858 and RFC 8094 were submitted to the Internet Engineering Task Force (IETF), describing the use of DNS over TLS and DNS over DTLS. Hopefully these will eventually become standards and all DNS traffic will be more secure in transit. Can you have DNS over TLS today? Yes you can!

Trying DNS over TLS now

DNS over TLS is in its infancy currently, but there are ways to try it out now. You could try using Stubby, a program that acts as a local DNS Privacy stub resolver (using DNS-over-TLS). You will have to compile Stubby on Linux or OS X to use it. You could also set up your own DNS server at home and point it to some upstream forwarders that support DNS over TLS. This is what I have done to start testing this. I use Unbound as my local DNS server on my lan for all of my client machines, and I used the configuration settings provided by our friends over at Calomel.org to set up my Unbound server to use DNS over TLS. Here is a list of other open source DNS software that can use DNS over TLS. With my Unbound setup, all of my DNS traffic is secured from interception and modification. So how is my testing going?
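
Before getting to that, here is the gist of such an Unbound setup: a forward-zone pointing at resolvers that listen on the DNS over TLS port (853), with TLS forced for upstream queries. A minimal sketch, using a placeholder forwarder address you would replace with a real DNS over TLS provider (newer Unbound releases spell ssl-upstream as tls-upstream):

server:
    # force TLS for all queries sent to upstream resolvers
    ssl-upstream: yes

forward-zone:
    name: "."
    # placeholder address; put the address of a DNS over TLS resolver here
    forward-addr: 203.0.113.53@853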

The early days

Since this is not an IETF standard yet, there are not a lot of providers of DNS over TLS resolvers. I have had to rearrange my list of DNS over TLS providers a few times when some of the servers were just not resolving hostnames. The latency is also higher than using your local ISP's DNS servers or a service like Google's DNS servers. This is not very noticeable since my local DNS server caches the lookups. I have a feeling the generous providers of these DNS over TLS services are being overwhelmed and cannot handle the load of requests. This is where bigger companies come into play.

Places like Google or OpenDNS do not support DNS over TLS yet, but I'm hoping that they will get on board with this. Google especially, since they have been big proponents of making all websites https. They also have the infrastructure to pull this off. Even if someone like Google turned this on, that means they get your DNS traffic instead of your ISP. Will this ever end?

Uggh, people can still see my traffic

Let's face it, if you're connected to the internet, at some point someone gets to see where you're going. You just have to choose who you want/trust to give this information to. If you point your DNS servers to Google, they get to see your DNS requests. If I point my DNS at these test DNS over TLS servers, then they get to see my DNS traffic. It seems like the lesser of two evils to send your DNS to someone else's DNS servers rather than to your ISP. If you use your ISP's DNS servers, they know the exact name attached to the IP address they assigned you and the customer that is making the query. I have been holding off telling you the bad news: https SNI will still give up the domain names you visit.

Through all of this, even if you point your DNS traffic at a DNS over TLS server, your ISP can still see many of the sites you go to. This is thanks to something in https called Server Name Indication (SNI). When you make a connection to an https enabled website there is a process called a handshake, the exchange of information before the encryption starts. During this unencrypted handshake (ClientHello), one of the things sent by you is the remote host name. This allows the server on the other end to choose the appropriate certificate based on the requested host name, which matters when multiple virtual hosts reside on the same server, a very common setup. Unfortunately, your ISP can see this, slurp it up, and log it to your account/profile. So now what?

Would a VPN help? Yes, but remember that now your DNS queries go to your VPN provider. What is nice is that your ISP will not see any of your traffic anymore. That pesky SNI issue mentioned above goes away when using a VPN. But now your trusted endpoint is your VPN provider, and they can log all the sites you go to. So choose wisely when picking a VPN provider. Read their policy on saving logs, and choose one that will allow you to pay with Bitcoin so you will be as anonymous as possible. With a VPN provider you also have to be careful about DNS leaking. If your VPN client is not configured right, or you forget to turn it on, or any of the myriad other ways a VPN can fail, your traffic will go right back to your ISP.

Even VPN's don't make you anonymous

So you have encrypted your DNS and web traffic with TLS and you're using a VPN. Good for you, now your privacy is a bit better, but not your anonymity. You're still being tracked, this time by the ad networks and services you use. I'm not going to go into this as many other people have written on this topic. Just know that you're being tracked one way or another.

I know this all seems hopeless, but securing the web's infrastructure bit by bit helps improve privacy just a little more. DNS, like http, is unencrypted. There was a big push to get websites to encrypt their data; now the same attention needs to be given to DNS.

by Pantz.org at April 22, 2017 09:59 PM

April 19, 2017

That grumpy BSD guy

Forcing the password gropers through a smaller hole with OpenBSD's PF queues

While preparing material for the upcoming BSDCan PF and networking tutorial, I realized that the pop3 gropers were actually not much fun to watch anymore. So I used the traffic shaping features of my OpenBSD firewall to let the miscreants inflict some pain on themselves. Watching logs became fun again.

Yes, in between a number of other things I am currently in the process of creating material for a new and hopefully better PF and networking session.

I've been fishing for suggestions for topics to include in the tutorials on relevant mailing lists, and one suggestion that keeps coming up (even though it's actually covered in the existing slides as well as The Book of PF) is using traffic shaping features to punish undesirable activity, such as the website-abuse scenario Dan suggested.


What Dan had in mind here may very well end up in the new slides, but in the meantime I will show you how to punish abusers of essentially any service with the tools at hand in your OpenBSD firewall.

Regular readers will know that I'm responsible for maintaining a set of mail services including a pop3 service, and that our site sees pretty much round-the-clock attempts at logging on to that service with user names that come mainly from the local part of the spamtrap addresses that are part of the system to produce our hourly list of greytrapped IP addresses.

But do not let yourself be distracted by this bizarre collection of items that I've maintained and described in earlier columns. The actual useful parts of this article follow - take this as a walkthrough of how to mitigate a wide range of threats and annoyances.

First, analyze the behavior that you want to defend against. In our case that's fairly obvious: we have a service that's getting a volume of unwanted traffic, and looking at our logs the attempts come fairly quickly, with a number of repeated attempts from each source address. This is similar enough to both the traditional ssh bruteforce attacks and, for that matter, to Dan's website scenario that we can reuse some of the same techniques in all of the configurations.

I've written about the rapid-fire ssh bruteforce attacks and their mitigation before (and of course it's in The Book of PF) as well as the slower kind where those techniques actually come up short. The traditional approach to ssh bruteforcers has been to simply block their traffic, and the state-tracking features of PF let you set up overload criteria that add the source addresses to the table that holds the addresses you want to block.

I have rules much like the ones in the example in place wherever I have an SSH service running, and those bruteforce tables are never totally empty.

For the system that runs our pop3 service, we also have a PF ruleset in place with queues for traffic shaping. For some odd reason that ruleset is fairly close to the HFSC traffic shaper example in The Book of PF, and it contains a queue that I set up mainly as an experiment to annoy spammers (as in, the ones that are already for one reason or another blacklisted by our spamd).

The queue is defined like this:

   queue spamd parent rootq bandwidth 1K min 0K max 1K qlimit 300

Yes, that's right. A queue with a maximum throughput of 1 kilobit per second. I have been warned that this is small enough that the code may be unable to strictly enforce that limit due to the timer resolution in the HFSC code. But that didn't keep me from trying.

And now that I had another group of hosts that I wanted to just be a little evil to, why not let the password gropers and the spammers share the same small patch of bandwidth?

Now a few small additions to the ruleset are needed for the good to put the evil to the task. We start with a table to hold the addresses we want to mess with. Actually, I'll add two, for reasons that will become clear later:

table <longterm> persist counters
table <popflooders> persist counters 

 
The rules that use those tables are:

block drop log (all) quick from <longterm> 


pass in quick log (all) on egress proto tcp from <popflooders> to port pop3 flags S/SA keep state \ 
(max-src-conn 1, max-src-conn-rate 1/1, overload <longterm> flush global, pflow) set queue spamd 

pass in log (all) on egress proto tcp to port pop3 flags S/SA keep state \ 
(max-src-conn 5, max-src-conn-rate 6/3, overload <popflooders> flush global, pflow) 
 
The last one lets anybody connect to the pop3 service, but any one source address can have only five simultaneous connections open, at a rate of at most six over three seconds.

Any source that trips up one of these restrictions is overloaded into the popflooders table, the flush global part means any existing connections that source has are terminated, and when they get to try again, they will instead match the quick rule that assigns the new traffic to the 1 kilobit queue.

The quick rule here has even stricter limits on the number of allowed simultaneous connections, and this time any breach will lead to membership of the longterm table and the block drop treatment.

For the longterm table I already had a four week expiry in place (see man pfctl for details on how to do that), and I haven't gotten around to deciding what, if any, expiry I will set up for the popflooders.
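
For reference, one way to set up such an expiry is a periodic pfctl run, for example from cron, that drops entries older than four weeks (2419200 seconds):

# pfctl -t longterm -T expire 2419200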

The results were immediately visible. Monitoring the queues using pfctl -vvsq shows the tiny queue works as expected:

 queue spamd parent rootq bandwidth 1K, max 1K qlimit 300
  [ pkts:     196136  bytes:   12157940  dropped pkts: 398350 bytes: 24692564 ]
  [ qlength: 300/300 ]
  [ measured:     2.0 packets/s, 999.13 b/s ]


and looking at the pop3 daemon's log entries, a typical encounter looks like this:

Apr 19 22:39:33 skapet spop3d[44875]: connect from 111.181.52.216
Apr 19 22:39:33 skapet spop3d[75112]: connect from 111.181.52.216
Apr 19 22:39:34 skapet spop3d[57116]: connect from 111.181.52.216
Apr 19 22:39:34 skapet spop3d[65982]: connect from 111.181.52.216
Apr 19 22:39:34 skapet spop3d[58964]: connect from 111.181.52.216
Apr 19 22:40:34 skapet spop3d[12410]: autologout time elapsed - 111.181.52.216
Apr 19 22:40:34 skapet spop3d[63573]: autologout time elapsed - 111.181.52.216
Apr 19 22:40:34 skapet spop3d[76113]: autologout time elapsed - 111.181.52.216
Apr 19 22:40:34 skapet spop3d[23524]: autologout time elapsed - 111.181.52.216
Apr 19 22:40:34 skapet spop3d[16916]: autologout time elapsed - 111.181.52.216


Here the miscreant comes in way too fast and only manages to get five connections going before they're shunted to the tiny queue to fight it out with known spammers for a share of bandwidth.

I've been running with this particular setup since Monday evening around 20:00 CEST, and by late Wednesday evening the number of entries in the popflooders table had reached approximately 300.

I will decide on an expiry policy at some point, I promise. In fact, I welcome your input on what the expiry period should be.

One important takeaway from this, and possibly the most important point of this article, is that it does not take a lot of imagination to retool this setup to watch for and protect against undesirable activity directed at essentially any network service.

You pick the service and the ports it uses, then figure out which parameters determine acceptable behavior. Once you have those parameters defined, you can choose to assign offenders to a minimal queue like in this example, block them outright, redirect them to something unpleasant, or even pass their traffic with a low probability.
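
That last option is a single keyword in pf.conf. Purely as a sketch (not part of my running ruleset, and how the non-matching attempts are treated depends on the rest of your rules), a rule that would let roughly one in a hundred connection attempts from the popflooders through could look like:

pass in quick on egress proto tcp from <popflooders> to port pop3 probability 1%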

All of those possibilities are part of the normal pf.conf toolset on your OpenBSD system. If you want, you can supplement these mechanisms with a bit of log file parsing that produces output suitable for feeding to pfctl to add to the table of miscreants. The only limits are, as always, the limits of your imagination (and possibly your programming abilities). If you're wondering why I like OpenBSD so much, you can find at least a partial answer in my OpenBSD and you presentation.

FreeBSD users will be pleased to know that something similar is possible on their systems too, only substituting the legacy ALTQ traffic shaping with its somewhat arcane syntax for the modern queues rules in this article.

Will you be attending our PF and networking session in Ottawa, or will you want to attend one elsewhere later? Please let us know at the email address in the tutorial description.



Update 2017-04-23: A truly unexpiring table, and downloadable datasets made available

Soon after publishing this article I realized that what I had written could easily be taken as a promise to keep a collection of POP3 gropers' IP addresses around indefinitely, in a table where the entries never expire.

Table entries do not expire unless you use a pfctl(8) command like the ones mentioned in the book and other resources I referenced earlier in the article, but on the other hand table entries will not survive a reboot either unless you arrange to have table contents stored to somewhere more permanent and restored from there. Fortunately our favorite toolset has a feature that implements at least the restoring part.

Changing the table definition quoted earlier to read

 table <popflooders> persist counters file "/var/tmp/popflooders"

takes care of the restoring part; the backing up is a matter of setting up a cron(8) job to dump the current contents of the table to the file that will be loaded into the table at ruleset load.
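
A sketch of such a cron(8) entry for root, dumping the table five minutes past every hour to the file named in the table definition:

5 * * * * /sbin/pfctl -t popflooders -T show > /var/tmp/popflooders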

Then today I made another tiny change and made the data available for download. The popflooders table is dumped at five past every full hour to pop3gropers.txt, a file designed to be read by anything that takes a list of IP addresses and ignores lines starting with the # comment character. I am sure you can think of suitable applications.

In addition, the same script does a verbose dump, including table statistics for each entry, to pop3gropers_full.txt for readers who are interested in such things as when an entry was created and how much traffic those hosts produced, keeping in mind that those hosts are not actually blocked here, only subjected to a tiny amount of bandwidth.

As it says in the comment at the top of both files, you may use the data as you please for your own purposes, for any re-publishing or integration into other data sets please contact me via the means listed in the bsdly.net whois record.

As usual I will answer any reasonable requests for further data such as log files, but do not expect prompt service and keep in mind that I am usually in the Central European time zone (CEST at the moment).

I suppose we should see this as a tiny, incremental evolution of the "Cybercrime Robot Torture As A Service" (CRTAAS) concept.

Update 2017-04-29: While the world was not looking, I supplemented the IP address dumps with a geoip-annotated version and a per country summary based on the geoip location data.

Spending a few minutes with an IP address dump like the one described here and whois data is a useful exercise for anyone investigating incidents of this type. This .csv file is based on the 2017-04-29T1105 dump (preserved for reference), and reveals not only that the majority of attempts come from one country, but also that a very limited number of organizations within that country are responsible for the most active networks.

The spammer blacklist (see this post for background) was of course ripe for the same treatment, so now in addition to the familiar blacklist, that too comes with a geoiplocation annotated version and a per country summary.

Note that all of those files except the .csv file with whois data are products of automatic processes. Please contact me (the email address in the files works) if you have any questions or concerns.

Update 2017-05-17: After running with the autofilling tables for approximately a month, and, I must confess, extracting bad login attempts that didn't actually trigger the overload at semi-random but roughly daily intervals, I thought I'd check a few things about the catch. I already knew roughly how many hosts there were in total, but how many were contacting us via IPv6? Let's see:

[Wed May 17 19:38:02] peter@skapet:~$ doas pfctl -t popflooders -T show | wc -l
    5239
[Wed May 17 19:38:42] peter@skapet:~$ doas pfctl -t popflooders -T show | grep -c \:
77

Meaning that of a total of 5239 miscreants trapped, only 77, or just short of 1.5 per cent, tried contacting us via IPv6. The cybercriminals, or at least the literal bottom feeders like the pop3 password gropers, are still behind the times in a number of ways.

Update 2017-06-13: BSDCan 2017 is now past, and the PF and networking tutorial with OpenBSD session had 19 people signed up for it. We made the slides available on the net here during the presentation and announced them on Twitter and elsewhere just after the session concluded. The revised tutorial was fairly well received, and it is likely that we will be offering roughly equivalent but not identical sessions at future BSD events or other occasions as demand dictates.

Update 2017-07-05: Updated the overload criteria for the longterm table to what I've had running for a while: max-src-conn 1, max-src-conn-rate 1/1. 

by Peter N. M. Hansteen (noreply@blogger.com) at April 19, 2017 09:45 PM

April 15, 2017

[bc-log]

Using stunnel and TinyProxy to obfuscate HTTP traffic

Recently there has been a lot of coverage in both tech and non-tech news outlets about internet privacy and how to prevent snooping both from service providers and governments. In this article I am going to show one method of anonymizing internet traffic: using a TLS-enabled HTTP/HTTPS proxy.

In this article we will walk through using stunnel to create a TLS tunnel with an instance of TinyProxy on the other side.

How does this help anonymize internet traffic

TinyProxy is an HTTP & HTTPS proxy server. By setting up a TinyProxy instance on a remote server and configuring our HTTP client to use this proxy, we can route all of our HTTP & HTTPS traffic through that remote server. This is a useful technique for getting around network restrictions that might be imposed by ISPs or governments.

However, it's not enough to simply route HTTP/HTTPS traffic to a remote server. This in itself does not add any additional protection to the traffic. In fact, with an out of the box TinyProxy setup, all of the HTTP traffic to TinyProxy would still be unencrypted, leaving it open to packet capture and inspection. This is where stunnel comes into play.

I've featured it in earlier articles but for those who are new to stunnel, stunnel is a proxy that allows you to create a TLS tunnel between two or more systems. In this article we will use stunnel to create a TLS tunnel between the HTTP client system and TinyProxy.

By using a TLS tunnel between the HTTP client and TinyProxy our HTTP traffic will be encrypted between the local system and the proxy server. This means anyone trying to inspect HTTP traffic will be unable to see the contents of our HTTP traffic.

This technique is also useful for reducing the chances of a man-in-the-middle attack against HTTPS sites.

I say reducing because one of the caveats of this article is that, while routing our HTTP & HTTPS traffic through a TLS-tunneled HTTP proxy will help obfuscate and anonymize our traffic, the system running TinyProxy is still susceptible to man-in-the-middle attacks and HTTP traffic snooping.

Essentially, with this article, we are not focused on solving the problem of unencrypted traffic; we are simply moving our problem to a network where no one is looking. This is essentially the same approach VPN service providers take; the advantage of running your own proxy is that you control the proxy.

Now that we understand the end goal, let's go ahead and get started with the installation of TinyProxy.

Installing TinyProxy

The installation of TinyProxy is fairly easy and can be accomplished using the apt-get command on Ubuntu systems. Let's go ahead and install TinyProxy on our future proxy server.

server: $ sudo apt-get install tinyproxy

Once the apt-get command finishes, we can move to configuring TinyProxy.

Configuring TinyProxy

By default TinyProxy starts up listening on all interfaces for a connection to port 8888. Since we don't want to leave our proxy open to the world, let's change this by configuring TinyProxy to listen on the localhost interface only. We can do this by modifying the Listen parameter within the /etc/tinyproxy.conf file.

Find:

#Listen 192.168.0.1

Replace With:

Listen 127.0.0.1

Once complete, we will need to restart the TinyProxy service in order for our change to take effect. We can do this using the systemctl command.

server: $ sudo systemctl restart tinyproxy

After systemctl completes, we can validate that our change is in place by checking whether port 8888 is bound correctly using the netstat command.

server: $ netstat -na
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0      0 127.0.0.1:8888          0.0.0.0:*               LISTEN

Based on the netstat output it appears TinyProxy is set up correctly. With that done, let's go ahead and set up stunnel.

Installing stunnel

Just like TinyProxy, the installation of stunnel is as easy as executing the apt-get command.

server: $ sudo apt-get install stunnel

Once apt-get has finished we will need to enable stunnel by editing the /etc/default/stunnel4 configuration file.

Find:

# Change to one to enable stunnel automatic startup
ENABLED=0

Replace:

# Change to one to enable stunnel automatic startup
ENABLED=1

By default on Ubuntu, stunnel is installed in a disabled mode. By changing the ENABLED flag from 0 to 1 within /etc/default/stunnel4, we are enabling stunnel to start automatically. However, our configuration of stunnel does not stop there.

Our next step with stunnel will involve defining our TLS tunnel.

TLS Tunnel Configuration (Server)

By default stunnel will look in /etc/stunnel for any files that end in .conf. In order to configure our TLS tunnel we will be creating a file named /etc/stunnel/stunnel.conf. Once created, we will insert the following content.

[tinyproxy]
accept = 0.0.0.0:3128
connect = 127.0.0.1:8888
cert = /etc/ssl/cert.pem
key = /etc/ssl/key.pem

The contents of this configuration file are fairly straightforward, but let's go ahead and break down what each of these items means. We will start with the accept option.

The accept option is similar to the listen option from TinyProxy. This setting will define what interface and port stunnel will listen to for incoming connections. By setting this to 0.0.0.0:3128 we are defining that stunnel should listen on all interfaces on port 3128.

The connect option is used to tell stunnel what IP and port to connect to. In our case this needs to be the IP and port that TinyProxy is listening on; 127.0.0.1:8888.

An easy way to remember how accept and connect should be configured is that accept is where incoming connections should come from, and connect is where they should go to.

Our next two configuration options are closely related, cert & key. The cert option is used to define the location of an SSL certificate that will be used to establish our TLS session. The key option is used to define the location of the key used to create the SSL certificate.

We will set these to be located in /etc/ssl and in the next step, we will go ahead and create both the key and certificate.

Creating a self-signed certificate

The first step in creating a self-signed certificate is to create a private key. To do this we will use the following openssl command.

server: $ sudo openssl genrsa -out /etc/ssl/key.pem 4096

The above will create a 4096 bit RSA key. From this key, we will create a public certificate using another openssl command. During the execution of this command there will be a series of questions. These questions are used to populate the certificate with organization information. Since we will be using this for our own purposes we will not worry too much about the answers to these questions.

server: $ sudo openssl req -new -x509 -key /etc/ssl/key.pem -out /etc/ssl/cert.pem -days 1826
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]:US
State or Province Name (full name) [Some-State]:Arizona
Locality Name (eg, city) []:Phoenix
Organization Name (eg, company) [Internet Widgits Pty Ltd]:Example.com
Organizational Unit Name (eg, section) []:
Common Name (e.g. server FQDN or YOUR name) []:proxy.example.com
Email Address []:proxy@example.com

Once the questions have been answered the openssl command will create our certificate file.

After both the certificate and key have been created, we will need to restart stunnel in order for our changes to take effect.

server: $ sudo systemctl restart stunnel4
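
Just as we did for TinyProxy, we can verify that stunnel is now listening for incoming connections on port 3128; the netstat output should contain a line along these lines:

server: $ netstat -na | grep 3128
tcp        0      0 0.0.0.0:3128            0.0.0.0:*               LISTEN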

At this point we have finished configuring the proxy server. We now need to setup our client.

Setting up stunnel (client)

Like the proxy server, our first step in setting up stunnel on our client is installing it with the apt-get command.

client: $ sudo apt-get install stunnel

We will also once again need to enable stunnel within the /etc/default/stunnel4 configuration file.

Find:

# Change to one to enable stunnel automatic startup
ENABLED=0

Replace:

# Change to one to enable stunnel automatic startup
ENABLED=1

After enabling stunnel we will need to restart the service with the systemctl command.

client: $ sudo systemctl restart stunnel4

From here we can move on to configuring the stunnel client.

TLS Tunnel Configuration (Client)

The configuration of stunnel in "client mode" is a little different from the "server mode" configuration we set up earlier. Let's go ahead and add our client configuration into the /etc/stunnel/stunnel.conf file.

client = yes

[tinyproxy]
accept = 127.0.0.1:3128
connect = 192.168.33.10:3128
verify = 4
CAFile = /etc/ssl/cert.pem

As we did before, let's break down the configuration options shown above.

The first option is client. This option is simple: it defines whether stunnel should operate in client or server mode. By setting this to yes, we are telling stunnel to run in client mode.

We covered accept and connect before, and if we go back to our description above, we can see that stunnel will accept connections on 127.0.0.1:3128 and then tunnel them to 192.168.33.10:3128, which is the IP and port our stunnel proxy server is listening on.

The verify option is used to define what level of certificate validation should be performed. A value of 4 will cause stunnel to verify the remote certificate against a local certificate defined with the CAFile option. In the above example, I copied the /etc/ssl/cert.pem we generated on the server to the client and set it as the CAFile.
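
Any method of getting the certificate onto the client will do; one simple option, assuming you have SSH access to the proxy server, is scp. The username below is only a placeholder for an account on the server.

client: $ sudo scp username@192.168.33.10:/etc/ssl/cert.pem /etc/ssl/cert.pem  # replace "username" with your user on the proxy server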

These last two options are important: without verify and CAFile, stunnel will open a TLS connection without necessarily checking the validity of the certificate. By setting verify to 4 and pointing CAFile at the same cert.pem we generated earlier, we give stunnel a way to validate the identity of our proxy server, which helps prevent our client from falling victim to a man-in-the-middle attack.
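
If you want to see this validation in action before relying on it, you can perform roughly the same check by hand with openssl s_client. With the copied cert.pem supplied as the CA file, the handshake against our stunnel server should finish with a verify return code of 0 (ok).

client: $ openssl s_client -connect 192.168.33.10:3128 -CAfile /etc/ssl/cert.pem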

Once again, let's restart stunnel to make our configurations take effect.

client: $ sudo systemctl restart stunnel4

With our configurations complete, let's go ahead and test our proxy.

Testing our TLS tunneled HTTP Proxy

In order to test our proxy settings we will use the curl command. While we are using a command-line web client in this article, it is possible to use this same type of configuration with GUI-based browsers such as Chrome or Firefox.

Before testing, however, I will need to set the http_proxy and https_proxy environment variables. These tell curl to route its requests through our proxy server.

client: $ export http_proxy="http://localhost:3128"
client: $ export https_proxy="https://localhost:3128"

With our proxy server settings in place, let's go ahead and execute our curl command.

client: $ curl -v https://google.com
* Rebuilt URL to: https://google.com/
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 3128 (#0)
* Establish HTTP proxy tunnel to google.com:443
> CONNECT google.com:443 HTTP/1.1
> Host: google.com:443
> User-Agent: curl/7.47.0
> Proxy-Connection: Keep-Alive
>
< HTTP/1.0 200 Connection established
< Proxy-agent: tinyproxy/1.8.3
<
* Proxy replied OK to CONNECT request
* found 173 certificates in /etc/ssl/certs/ca-certificates.crt
* found 692 certificates in /etc/ssl/certs
* ALPN, offering http/1.1
* SSL connection using TLS1.2 / ECDHE_ECDSA_AES_128_GCM_SHA256
*    server certificate verification OK
*    server certificate status verification SKIPPED
*    common name: *.google.com (matched)
*    server certificate expiration date OK
*    server certificate activation date OK
*    certificate public key: EC
*    certificate version: #3
*    subject: C=US,ST=California,L=Mountain View,O=Google Inc,CN=*.google.com
*    start date: Wed, 05 Apr 2017 17:47:49 GMT
*    expire date: Wed, 28 Jun 2017 16:57:00 GMT
*    issuer: C=US,O=Google Inc,CN=Google Internet Authority G2
*    compression: NULL
* ALPN, server accepted to use http/1.1
> GET / HTTP/1.1
> Host: google.com
> User-Agent: curl/7.47.0
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< Location: https://www.google.com/
< Content-Type: text/html; charset=UTF-8
< Date: Fri, 14 Apr 2017 22:37:01 GMT
< Expires: Sun, 14 May 2017 22:37:01 GMT
< Cache-Control: public, max-age=2592000
< Server: gws
< Content-Length: 220
< X-XSS-Protection: 1; mode=block
< X-Frame-Options: SAMEORIGIN
< Alt-Svc: quic=":443"; ma=2592000; v="37,36,35"
<
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="https://www.google.com/">here</A>.
</BODY></HTML>
* Connection #0 to host localhost left intact

From the above output we can see that our connection was routed through TinyProxy.

< Proxy-agent: tinyproxy/1.8.3

And we were able to connect to Google.

<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="https://www.google.com/">here</A>.
</BODY></HTML>

Given these results, it seems we have a working TLS-based HTTP/HTTPS proxy. Since this proxy is exposed on the internet, what would happen if it were found by someone simply scanning subnets for nefarious purposes?

Securing our tunnel further with PreShared Keys

As it stands today, they could in theory use our proxy for their own purposes. This means we need some way to ensure that only our client can use this proxy; enter PreShared Keys.

Much like an API token, stunnel supports an authentication method called PSK, or PreShared Keys. This is essentially what it sounds like: a token that has been shared between the client and the server in advance and is used for authentication. To enable PSK authentication we simply need to add the following two lines to the /etc/stunnel/stunnel.conf file on both the client and the server.

ciphers = PSK
PSKsecrets = /etc/stunnel/secrets

By setting ciphers to PSK we are telling stunnel to use PSK-based authentication. The PSKsecrets option provides stunnel with a file that contains the secrets in a clientname:token format.

In the above we specified the /etc/stunnel/secrets file. Let's go ahead and create that file and enter a sample PreShared Key.

client1:SjolX5zBNedxvhj+cQUjfZX2RVgy7ZXGtk9SEgH6Vai3b8xiDL0ujg8mVI2aGNCz
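
The token above is just an example value. To generate your own random token, something like the following openssl command should produce a suitable string (48 random bytes, base64 encoded).

$ openssl rand -base64 48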

Since the /etc/stunnel/secrets file contains sensitive information, let's ensure that the permissions on the file are set appropriately.

$ sudo chmod 600 /etc/stunnel/secrets

By setting the permissions to 600 we are ensuring only the root user (the owner of the file) can read this file. These permissions will prevent other users from accessing the file and stumbling across our authentication credentials.
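
Depending on how the file was created, you may also want to double-check that it is actually owned by root; if it is not, a quick chown will fix that.

$ sudo chown root:root /etc/stunnel/secrets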

After setting permissions we will need to once again restart the stunnel service.

$ sudo systemctl restart stunnel4

With our settings complete on both the client and server, we now have a reasonably secure TLS proxy service.

Summary

In this article we covered setting up both a TLS tunnel and an HTTP proxy. As I mentioned before, this setup can be used to help hide HTTP and HTTPS traffic on a given network. However, it is important to remember that while this setup makes the client system a much trickier target, the proxy server itself could still be targeted for packet sniffing and man-in-the-middle attacks.

The key is to choose wisely where the proxy server is hosted.

Have feedback on the article? Want to add your 2 cents? Shoot a tweet out about it and tag @madflojo.


Posted by Benjamin Cane

April 15, 2017 07:30 AM