Security - Benjamin Shelton's Musings

VPNs are No Panacea

Monday April 29th, 2019

I sometimes encounter the question “should I use a VPN?” with the inevitable shower of comments along the lines of “yes, it’ll make you more secure!” or “it’ll protect your privacy!” Occasionally, I see VPNs recommended as a solution against doxxing, such as when someone comments about their profession or business, competitors, or potential employers. Perhaps someone in industry has quipped about sending “anonymous” emails criticizing a particular organization or offered unsavory political opinions that would otherwise get them fired.

First, I should state that I am no security expert. I just happen to write software, and I have a vague curiosity into the world of information security. I enjoy reading the opinions of individuals who are considered experts in the field, and they almost uniformly warn of the same folly: VPNs are no panacea!

I think this advice is offered as therapeutic more than curative. In particular, it seems plausible users attracted to VPNs may place unwarranted trust in the software and provider, engaging in activities that suggest a degree of carelessness. Caution is nevertheless a desirable trait even under the warm embrace of cryptography. I’ll explain why.

A VPN may be useful to disguise your activity if you’re posting on Twitter, and you wish to avoid the danger of clicking on links that may be able to collate information about your activities or track usage behavior. VPNs may also provide some limited protection if you’re prone to torrenting your entertainment (up to a certain dollar amount, after which legal recourse against you becomes economically viable–maybe even necessary). However, even in the latter case, use of VPNs is of dubious utility, and they may not always keep you anonymous. Just this month (April 2019), NordVPN has been the subject of increased scrutiny over sending information to a series of unusual domains (billed as an anti-censorship strategy). Three months ago, NordVPN was also accused of tracking its users. None of this is surprising. Usage of a VPN is surrendering your privacy to a single firm (in most cases) in the hopes they will protect you from others doing naughty things with your browsing habits while simultaneously doing nothing of the sort themselves.

Nota bene: This behavior isn’t limited to NordVPN. They’re simply one of the most popular providers and therefore examined by more people with an equivalent increase in negative press. Regardless of intent, I find I can’t fault them for running analytic tracking on their user base: There are cases where traffic analysis (latency, throughput, timeouts, etc) may be useful for providing better quality-of-service and improving customer experience. In the event of an endpoint failure, I’m sure such analysis can be incredibly helpful re-routing packets toward other endpoints within a margin of acceptable latency. If the unusual domains their applications have directed traffic to this month is an anti-censorship countermeasure, I have to commend them for a bold strategy, even if it makes a few users nervous. To be clear: I neither endorse NordVPN nor am I overly critical of their decisions. I don’t care either way. I don’t use VPNs.

Now we can get to the meat of this discussion.

I believe the most important consideration as a user of a VPN service is to quantify your threat model. To illustrate, let’s take an example from earlier: For most people, doxxing isn’t a significant threat. Those at greatest risk often draw attention to themselves, either deliberately through their actions (whistle-blowers), or through online interactions (gaming, comments, etc.) that turn sour. Some may be victims of cyberstalking. In these cases, a VPN may be useless, because the victims often post sufficient information online to identify who they are, where they live, and numerous other details about their lives that a determined third party can piece together. Simply put: VPNs aren’t magic and they cannot protect you from yourself.

For most of us, our opinions and online interactions aren’t important or interesting enough to attract attention. If you think your opinions are interesting enough to be the target of a harassment campaign, then perhaps a VPN may be useful, but it isn’t the only tool you should rely on. To put it another way, if you’re afraid you might be identified online, you must firewall everything about your life that may be exposed through writing, your interactions with other people, and the media you post.

Ask yourself this question: What’s your threat model?

Most of the people I’ve spoken with who espouse the use of a VPN do so because they’re concerned about their identity being leaked, they may worry about employers identifying them online, they don’t want to become targets of harassment, or they simply wish to share politically unpopular opinions online that might draw the ire of one group or another (this may cross over into any of the prior points). As a free speech advocate, I can sympathize with their desire for further anonymity; losing your job because you’re the subject of a targeted harassment campaign is the antithesis of a free society. Neither should people be subject to hecklers or harassment, especially of the sort that crosses over from the online world to their front doorstep. Unfortunately, this is the world we live in.

A VPN isn’t going to provide unlimited protection against adversaries, and neither will a VPN protect users from disseminating information about themselves to interested but malevolent third parties. They can provide an additional layer of security when using Internet connections in a public location (airports, hotels, coffee shops, etc.), and they may be able to circumvent regional restrictions on entertainment or information (Google, YouTube, Netflix) by the state or licensing institutions. You should not expect a VPN to keep you completely anonymous, but they may be useful as part of a defense-in-depth strategy.

However, cautious use of the Internet can bring you 80% of the way toward a safer online presence. In particular: Don’t click links you don’t trust; avoid sharing secure information with services that are not offered via TLS (HTTPS); if you reply to an unknown third party via email, be cautious of using SMTP with your provider (this may divulge your client IP) and stick with the web or mobile interfaces; and don’t post information about yourself you don’t want publicly accessible. You may not have a choice in some cases, depending on your line of work, so this advice may not be applicable. I do believe it is broadly useful for the majority of people. Take heed.

There are limitations to VPNs that less technologically-inclined consumers may not be aware of. Key to understanding this is to understand the technology behind VPNs (typically IPsec with some authentication layer) and their history as a tool to extend company or school network boundaries off-premise, providing employees and students a means of connecting to internal services. It was never designed as a mechanism preventing the controlling institution (in this case the VPN provider) from classifying or logging traffic. Partial anonymity is a useful side effect but it wasn’t the design goal. Neither was complete security.

VPNs can have surprising utility if your adversary is intermediate tracking, or you don’t trust your ISP. Providers like Comcast have demonstrated this by injecting advertisements into sites their users visit, and others have been accused of using traffic analysis to track user behavior, possibly for targeted advertising. VPNs can protect against this threat by acting as a secure tunnel between your computer and your VPN provider’s endpoint.

Before I conclude this post, I should leave my readers with some particularly interesting tidbits of research that may be helpful in deciding whether your use case justifies paying for a commercial VPN. There was a paper written in 2016 titled “Characterization of Encrypted and VPN Traffic using Time-related Features.” This paper discusses techniques in traffic analysis to determine the protocol and type of traffic transmitted over encrypted connections, including VPNs, and could differentiate between VoIP, browsing, or streaming behaviors. There are other related papers including “Realtime Classification for Encrypted Traffic” (2010) and “Analyzing HTTPS Encrypted Traffic to Identify User’s Operating System, Browser, and Application” (2017); the latter describes attacks capable of defeating countermeasures intended to obfuscate payloads in transit. Although I cannot find it at this time, I also recall reading a paper that presented deep packet analysis techniques to defeat random noise injected into streams, successfully categorizing the encrypted traffic despite efforts to thwart would-be adversaries. This is an area of active research, and I expect with advancements in deep learning and greater access to GPUs capable of training neural networks tuned toward traffic analysis, VPNs may not present significant defense against adversaries that couldn’t already be achieved via other forms of encryption, e.g. TLS. Yes, I am aware of SNI-related information leaks due to how TLS presently works.

To put it more succinctly: You have to decide on your threat model.

No comments.

***

Musing about… Prepared Statements

Sunday October 7th, 2012

A spurt of curiosity this evening–more specifically, one of those circumstances we each have from time to time wherein a handful of unrelated thoughts flutter about the conscious mind like a pair of butterflies flitting from flower to flower–consumed me sufficiently that I decided to do a brief Google search on prepared statements. I’m unsure where such a motive originated, but I’m fairly convinced that it was at least tangentially related to some misinformation I’ve heard of late related to web programming advice and also possibly due to my surprise that few commercial PHP bulletin board packages actually use prepared statements.

Before I begin, let’s consider for a moment that last and most disconcerting statement: Few commercial PHP forums use prepared statements. To the uninitiated, this might seem to be a matter of nick-picking unimportant to the real world. To the rest of you, it may come as a sad commentary on the state of modern programming and commercial software (perhaps, fittingly, as a commentary on the average run-of-the-mill PHP programmer). Prepared statements certainly aren’t new, and while they’ve been a part of PHP for a number of years now, it’s infuriating that they hardly see common use.

PHP first shipped PDO with PHP 5.1 (available as a PECL extension for PHP 5.0, circa 2004-2005). Intriguingly, for systems that don’t provide PDO support (or the appropriate drivers for PDO), the MySQLi and PostgreSQL functions and classes have provided prepared statements for quite some time, and the SQLite 3 drivers have provided a prepare() method since PHP 5.3. Commercial bulletin boards, like vBulletin and IPB, have seen many revisions in the years since, and several free/open source packages including phpBB have been part of similarly major overhauls. Yet the overwhelming majority of them still make no use of prepared statements. Humorously, as of this writing, vBulletin does provide a misleadingly-named sql_prepare method in its database class, but it doesn’t emulate prepared statements–it simply provides an escape wrapper with data type introspection and casting.

PDO has been available for nearly 8 years and many RDBMS drivers for PHP have offered prepared statements for at least that long (longer in the case of PostgreSQL if memory serves correctly). Yet every year or two, new major versions of popular PHP message boards are released, and every major release sees the same legacy database code under the hood. Perhaps it’s intentional. Perhaps the developers still want to support PHP 4.x in spite of the fact that it went EOL in 2008. Perhaps they just don’t know any better. Who knows!

Why Prepared Statements?

A prepared statement or parameterized statement, as it is occasionally known by in DBA parlance, is a specially-formatted SQL string that utilizes placeholders, either question marks (?), special named parameters (such as “:name”), or other database-specific strings, to indicate to the database or the driver where data is to be inserted. This has the benefit that, in theory at least, any data managed by a prepared statement is unlikely to serve as a vector for SQL injection attacks. The reason this works is because most drivers dispatch the prepared statement and its data separately on the wire and process them independently providing a certain degree of isolation. But wait, there’s more! Because of the implementation nature of prepared statements on most platforms, the query planner can often optimize and partially compile the statement such that, if it runs again, much of the legwork has already been completed and the query can run faster. Software like forums or blogs often execute the same query multiple times–with different data–so one might think it would be a natural fit. If it’s such a good thing, why do so many popular packages forgo such a benefit?

While I can’t answer for many developers, I think I know what at least part of the answer might be. First, for enormous code bases like vBulletin (and phpBB to a lesser extent), virtually no effort is made to separate the application logic from the underlying model. I’ll be fair in my distinction: The presentation layer is thankfully separated from the mess in the form of templates, but the remaining code is a bowl of spaghetti not unlike that of many of the very first PHP applications (and Perl!) that first graced the Internet over a decade ago. Because the model (and, by extension, the SQL) is so deeply entrenched in the functional logic of the application, reworking it to use prepared statements–and consider, also, that many of these queries are generated programmatically–would be a tremendous undertaking of many man-hours. Cleaning up the code properly such that it is more of a structurally sound framework (think MVC) is most certainly out of the picture. It isn’t impossible, of course, but when you consider that some functions in many of these software packages have persisted since the dawn of time, such refactoring becomes the thing of fairy tales.

To illustrate some of my displeasure, vBulletin version 4.2 still provides an iif function which is little more than a wrapper for the ternary operator (?:) in PHP. The ternary operator has been around since at least PHP4, yet there it is, in all its glory, a legacy function still available from the early days of PHP3 when such a beast didn’t exist!

One might think that it would simply be a matter of adding some logging code to old function calls, tracing the source that called them, and then reworking the culprit code to use built in language features. It might even take less than an afternoon.

The Drawbacks Programmer Mistakes

While prepared statements (parameterized queries for those of you who are embarrassingly excited by more elaborate verbiage) aren’t a panacea (I did it again) for all things SQL injection-like, they’re a good mitigation strategy, but it’s important to use them with caution. As Jason Lam states on the ISC Diary, “I still remember 4-5 years ago when SQL injection just started to become popular, the common mitigation suggested [was] to use prepared statement [sic] as if [they were] a magic bullet. As we [now] understand the SQL injection problem better, we realize that even prepared statement can be vulnerable to SQL injection as well.”

Well, yeah. This is where I smack my forehead. Maybe I’m being overly critical as I re-read a post from 5 years ago, because I’ve had the tremendously good fortune of witnessing some magnificently terrible code in my time as a web application developer.

Mr. Lam goes on to explain the insertion of unchecked user input, but I can’t shake the feeling that there is an implicit overtone in the article that it is somehow the fault of prepared statements. Perhaps more accurately, the article is faulting most of us for having championed prepared statements as a welcome solution to a very common and widespread problem. Realistically, though, it’s not an issue with prepared statements–they work just fine. It’s an issue with developers inappropriately using the tools at their disposal and doing so in a manner that simply transfers the vulnerability from query() to prepare() by forgetting to properly manage incoming data. Though, I should say that I’m inclined to suggest that programmatically assembling a prepared statement is somewhat counter-productive. More on this later.

Ironically, while doing some research for this article, I ran across a couple of posts on Stack Overflow that presented this problem of unchecked user input as one of the primary drawbacks of prepared statements. Really? Drawbacks? If you’re not using named parameters or placeholders for your query data, you’re probably not using prepared statements correctly! But drawbacks? Gee, maybe we were a little too vigilant in telling people to use prepared statements–so much so that they did a find/replace for query and swapped it with prepare. (I’m being facetious; so, to head off any comments to the contrary, it’s not at all possible to simply swap some text, because prepared statements do require a little more work.)

The problem I have with labeling unchecked user input as a drawback of prepared statements is that it is no longer a “real” prepared statement whenever such data is concatenated with the resulting query. Yes, it is still a prepared statement, insofar as calling prepare() on the driver’s end, but it’s no longer being used like a prepared statement. Here’s a hint to new developers, particularly PHP developers since a huge percentage of them are guilty of doing stupid things like this: Never concatenate unchecked input in any query–prepared or otherwise. If you’re using a prepared statement, use it like a damned prepared statement. The moment you start piping data into the query string itself, it’s no longer going to have the benefits of a prepared statement. (I’ll give you a special exception if you’re using LIMIT and concatenating integers with your queries since not all of you may be running MySQL 5.0.7 or later.)

Will the Real Prepared Statement Please Step Forward?

In my mind, and trust me, it’s a very strange place in here, a prepared statement is one that may contain parameters and is “prepared” ahead of time for reuse (that is, compiled) by the driver or the RDBMS (usually the RDBMS). Nothing more, nothing less. The instant some unfiltered data is slapped on to the end of the query, it’s no longer a pure prepared statement; instead, it becomes a mistake. Again: Prepared statements are parameterized queries that are usually compiled by the backend for a little extra speed. A query can contain anything else that the programmer adds into it, but fundamentally, a prepared statement is something that dictates a very specific structure. It certainly cannot overcome the mistakes of a naive developer who, believing that a prepared statement will magically fix all of their (singular they, sorry linguistic purists) security-related woes, use such a tool in addition to dangerous techniques like concatenating unchecked input. Another way to look at it is thus: If prepared statements are prepared (that is, compiled) by the database for reuse, and the developer is concatenating a dynamic value to the statement, the entire benefit of preparing (compiling) that statement is immediately lost, because the RDBMS has to re-compile the statement every single time it’s sent along the wire. Please, don’t do this.

Of course, there may be reasons not to use prepared statements all the time. For one, prepared statements in MySQL versions prior to 5.1 can no longer be managed by the query cache which may impact performance (High Performance MySQL, 2nd ed., p229). DBMSs that don’t support prepared statements can, in PDO at least, can have them emulated by the PDO driver at the cost of some pre-processing performance, and using older PHP functions like the popular-but-now-deprecated mysql_* ones just outright don’t support anything but basic queries (they also don’t use the binary interface, making them somewhat slower). If you’re using only a single query with absolutely no intention of reusing it, prepared statements may incur some overhead since the query must be compiled. Furthermore, for MySQL at least, if you’re not using stored procedures, the database has no way to share compiled prepared statements among multiple connections. Yet, while a prepared statement is no substitute for caution–particularly with programmatically-generated queries–it is a useful tool in the developer’s arsenal to protect against attacks like SQL injection. Plus, if you make it a habit to use PDO (you should), not only do you get emulated prepared statements for databases that don’t support them, you also get to use the modern MySQL APIs under the hood and some consistency, which says a lot in the PHP world.

Tangentially, this is also why it boggles my mind that many sites (banks, news agencies, airlines, and even some game companies) limit what characters the user can enter for their password, and how so many companies with an online presence often have draconian limits of less than 16 characters, inclusive. Seriously: If you’re correctly storing a secure hash of the password (HMAC, bcrypt, scrypt, or at least SHA256 or similar), you don’t need to store the password directly, nor does it matter if the password is 5 or 500 characters. It’s going to be comprised of a fixed length of some limited set of ASCII characters representing hexadecimal numbers which can be stored without much fuss. The 1990s were over two decades away. I think it’s time we stopped writing code like Y2K is still a viable problem.

Also, let’s start using prepared statements a little more often.

1 comment.

***

Insecurity is Your Fault

Sunday August 29th, 2010

Earlier this month, I stumbled across this article on Slashdot and my comments inflamed a few posters with an uncomfortable truth:

Insecurity is your fault.

No, really, it is. Now that the storm has mostly blown over and much of this has been forgotten (plus I’ve gotten some time to sit down and write!), I’d like to explain why.

First, though, a brief recap for those of you who may have missed out on all the excitement.

Memcached: Security Hole or Convenient Misconfiguration?

I won’t explain what Memcached is. If you’re curious, the Wikipedia article details its usage model, behavior, and other interesting tidbits for the curious. However, the important part is that Memcached provides absolutely no method or mechanism by which to authenticate attached clients. Neither does Memcached provide an interface to protect data integrity–that’s your job. The whole Slashdot meltdown occurred when some security researchers discovered that many well-known sites failed to protect their Memcached instances, exposing sensitive data to the outside world including passwords, password hashes, e-mail addresses, and other assorted data typically stored in a Memcached instance. The surprising discovery here isn’t so much that the problem existed but that it afflicted the sites of several well known names including PBS! Really, the system administrators of the sites showcased in this study should be ashamed of themselves. (Maybe; I’m sure many of them were too busy to notice the err of their ways. I’m also sure that a sizable chunk of them felt a little more uncomfortable puckering of anatomical places better left to the imagination than is otherwise healthy.)

I also hate to point out the obvious here, but a simple firewall would have circumvented the issue entirely. The authors of the article in question emphasized this point by appending in their conclusion several strings containing “FW”–and then some. It’s not like setting up a firewall is difficult either, so I’m not sure if this is equal parts ignorance, oversight, laziness, economics, and harsh schedules for the IT staff involved. I don’t think any of these would make for a particularly good excuse. After all, it doesn’t take long to set up ipfw, iptables, and kin. There’s also plenty of documentation and quick-start guides available just a Google search away.

Unfortunately, the Slashdot article was slightly sensationalist and came close to leading readers into believing that running Memcached was a security hole. If you were to take a few minutes (do so now) to browse through the comments, you’ll find that they fall into roughly two categories:

Security is the responsibility of the system administrator, network administrator, or whomever happens to be in charge of handling the servers, their configuration, and/or network topography.
Security is the responsibility of the developer to ensure that their application doesn’t ship in a potentially hazardous state such that the exposure of critical data could be made accidentally.

Which statement do you agree with?

Take your time. I’ll be here all day.

It’s Your Fault

If you think it’s the developer’s responsibility to ensure their software doesn’t ship in a potentially hazardous state, you’re wrong, and Memcached is a perfect example why the responsibility of service configuration rests squarely on your shoulders. Of course, there are many reasons why the developers are absolved of responsibility, and I’ll attempt to outline some of them here. Essentially, though, if you’re blaming Memcached for this particular deficit, you’re demonstrating great ignorance of the F/OSS ecosystem.

How dare I point out the obvious!

Package Maintainers Know Everything

Most open source projects, particularly those centered around popular operating system distributions like FreeBSD, Gentoo, or Debian often have specific individuals who, largely on a volunteer basis, maintain a specific software package or packages. In many projects, these maintainers operate rather loosely as a committee whose primary objectives are to: 1) make sure the software installs and runs correctly on their target platform(s) and 2) configure the package to ensure that it suits the specific goals of their distribution (or not, sometimes inaction is as much of a goal as action). This is why package maintainers know everything, and I certainly don’t mean it in a derogatory sense: These individuals, typically picked by the community or by merit, understand both the software they’re maintaining as well as the environment they’re maintaining it for. Package maintainers are an integral part of a particular distribution’s microculture, and they understand what defaults are likely to work best and what won’t. There’s also a certain degree of personal preferences involved but package inclusions and behaviors are almost exclusively influenced by the community as a whole.

When it comes to Memcached, Debian-based distributions tend to configure it to listen exclusively on localhost. FreeBSD and Gentoo both leave Memcached as it ships–not configured and not turned on by default. (Ubuntu enables Memcached as soon as you install it unless you’re running Ubuntu server, in which case it remains disabled.) Since the default–not configured–instructs Memcached to listen on all available interfaces, those distributions that happen to ship Memcached in its default state are therefore the same ones that will expose it to the Big Bad Interwebs the very instant it happens to be enabled. Sort of.

When Configurations… are Not

This is a minor point of contention, but I feel it’s worthwhile to mention because it bears some relevance to the topic of Memcached security. Plus, since several comments on Slashdot made reference to some magical Memcached configuration floating around somewhere in the sky, I confess that I feel this overwhelming desire to point out the obvious. It’s a lot like honking at the motorist in front of you whose blissful ignorance of the surrounding world is responsible for backing up traffic for miles while the unyielding stoplight changes from green to yellow to red and back again–you just can’t help but lay it on thick. I’m also convinced that some of you share this blissful dream-like trance with our hypothetical motorist, because I otherwise cannot fathom why you’d think we software developers are out to get you. (Maybe we are, but that’s another story.)

Okay, here’s the earth-shattering news: Memcached has no configuration file. That’s it. That’s the anticlimactic reality my last paragraph was building up to. I don’t believe I can hear you honking yet, so let me say that again.

Memcached has no configuration file. If it has one, it’s a lie. It’s a lot like that cake. The configuration file is a lie.

Several OS vendors (FreeBSD isn’t one of them–same for Gentoo) happen to ship with their particular Memcached package a file conveniently named “memcached.conf.” To most of us servermonkeys, it would appear that this file has something to do with Memcached–and we’d be right. It’s amazing how that works. Ah, but such assumptions only go so far. You see, “memcached.conf” is just a convenience wrapper. Memcached itself has no integral facilities to read external configuration files and relies instead on command line parameters. In fact, I would argue that vendors who provide a separate configuration file of the sort may be misleading impressionable individuals into believing that Memcached ships with its own configuration file, and that those someones who got caught with their firewall down happened to get the short end of the stick by starting up the service without actually checking that file.

Admittedly, it’s easy to be mislead. Here’s a sample of the Ubuntu (server) Memcached configuration file:

# memcached default config file # 2003 - Jay Bonci

But, if you were to continue reading…

# This configuration file is read by the start-memcached script provided as # part of the Debian GNU/Linux distribution.

Oh! Memcached doesn’t read it after all! Good thing I read passed the top two lines, otherwise I would’ve been convinced it shipped with Memcached!

Fortunately, both FreeBSD and Gentoo configure Memcached through their RC system and make it fairly clear that any configuration is done directly to the Memcached daemon. FreeBSD configures Memcached via /etc/rc.conf. Gentoo configures Memcached via /etc/conf.d/memcached. Both of these configurations therefore operate in Memcached’s default mode of listening on all available interfaces.

The question, then, becomes a matter of whether Memcached is insecure by default.

If a Cache Server Listens on all Interfaces in the Forest… is it really There?

Memcached gained its reputation as a light weight, fast, minimalistic memory-based caching server partially by not integrating high-latency features like authentication, data integrity checking, and encryption. If the end user’s environment requires any one of these features, it’s up to the end user to implement them on top of Memcached–and most likely, other solutions. (You can run Memcached over an encrypted connection using a variety of techniques, for example, but they require some effort on your behalf to implement.) Failure to secure a network-accessible service isn’t the fault of the developers of that package. After all, if the server were properly firewalled, it wouldn’t be network-accessible outside the network for which it was intended.

(This, of course, assumes that the attackers don’t have access to your internal network. Granted, if this were the case, I suspect Memcached would be the least of your concerns.)

The obvious point of this entire debate–specifically, the one on Slashdot–is that Memcached shouldn’t be shipped to listen on all available interfaces. Certainly there are great points on either side of this issue, but the fundamental truth is that Memcached’s default behavior isn’t up to us to decide. The developers decide what goes, and I’m sure they have a good reason for attaching to every interface under the sun. In this case, the obvious solution isn’t to argue with me about why it wasn’t your fault you just exposed 15,000 passwords to every 16 year old pimply hacker with an Internet connection. Nay, you should argue with the developers of Memcached and explain to them why their software should listen on localhost by default. Maybe they’ll even agree with you and release a new version that does exactly that. On the other hand, they may just give a hearty laugh in your direction and explain that such a drastic change in default behavior would very likely impact the bazillion or so installations of Memcached that exist out there on the Interwebs the next time some system administrator goes to upgrade.

Generally speaking, it’s better to piss off the people who haven’t done enough research to know how to actually use the software than it is to squirt a healthy dose of Tabasco sauce into the shorts of the very people who probably have several dozen large-scale installations each and enough disposable income to donate generously to your project.

Not surprisingly, that brings me to my next point.

If You can’t Configure that Package Securely… It’s Your Fault

I really hate to harp on this point, but since it seems that there’s so many individuals out there bellyaching over the simple fact that Memcached doesn’t listen exclusively on localhost by default, I really have to hammer it until I drive my point home (or break my hammer).

It’s your fault.

Don’t like it? Tough. Write that sentence 300 times on some college ruled notebook paper and get back to me in the morning. No cheating! I’ll be counting every single one of ’em, too.

The implication of my attitude–or attitude “problem,” depending on how you take this–is that configuring any software you install is your business. If you configure it incorrectly and lose your company a million dollars, it’s your fault. If you configure it incorrectly and drive traffic to your site, earning your company a million dollars, it’s your fault. Conversely, if I write a software package, hypothetically, that advertises itself as some must-have network-connected service that’ll make your servers hum, your teeth whiter, and maybe cure world hunger and you install it without bothering to look at the manual to find out that, holy bat nuggets it really is network-connected, you won’t find a lot of sympathy from me when the entire Internet discovers a few dozen things better left secret just because you couldn’t read a two paragraph instruction manual or secure your server. It’s not my fault that my hypothetical software might be used in an inadequately protected environment anymore than it is the Memcached developers’ fault that their server can be accessed remotely without authentication.

Now that We’ve Established Blame, What Next?

If you happened to read the article(s) the Slashdot summary linked to and the first thing that came to mind was “firewall!”, consider yourself dismissed. You’re free to go home, and you’re getting an A for the course. You needn’t trouble yourself with reading through my inane ramblings, because you probably know a lot more than I do. However, if you’re still fuming out the ears blaming the Memcached developers for defaults that clearly should have been better chosen, you’re staying after class.

I torqued off a number of posters on Slashdot when I uttered these words, so I’ll utter them again: It is your responsibility as a systems administrator to understand the security implications of installing any given software package. If it’s installed on your server–doubly so if you’re the one installing it–you damn well better know what it’s going to do when it starts running. If you don’t have a clue what might happen or how to configure it, then shame on you when someone sniffs critical data from a service you didn’t configure correctly. System administration isn’t intended to be easy. System administration isn’t opening a couple of menu-driven wizards, clicking “next,” and hoping for the best. System administration is about best, secure practices; vigilance; researching and reading; understanding, mitigating, and managing risks; understanding the software you’re using; understanding the operating system you’re using; understanding your network environment and topography; awareness; and always, always, always keeping in mind that anything connected to the Internet is at risk. Sounds like a lot of work, doesn’t it?

“But,” I hear you protest, “system administration is about taking care of the users, resetting their passwords when they forget, and making sure everything is hunky-dory when they pull up their TPS reports! You’re describing a consultant’s job.”

Nope, son, that’s the help desk. They’re the colleged-aged kids up front who daydream of murderous intentions every time someone calls up thinking capslock is cruise control for cool and just can’t seem to get their password right. Consultants are the guys you hire to design (or build) the system. System administrators (probably you) are the guys who are hired to understand the system, fix it when it breaks, and be on call when Godzilla comes gnawing through the company’s upstream. If you think system administration should be easy, there are places you can go to for help.

Why Does it Matter?

Part of the reason I like the minimalistic approach FreeBSD and Gentoo both take to configuring (and I use that term loosely) Memcached is because they make no assumptions–other than the defaults, of course. It doesn’t matter whether you’re running a Memcached instance for a farm of 50 webservers and need to set it up to listen on all interfaces–uh oh! defaults!–since the machine is only attached to a private network. It doesn’t matter whether you’re going to be setting Memcached to use 64 megs (the default!) of RAM or 16 gigs. What does matter is that you’ve looked at the configuration your package maintainer–the guy who knows everything–shipped with the software you installed to figure out what it’s going to do. Better yet, it helps to read the manual page. The world isn’t full of pop-out coloring books, and if you’re going to maintain systems for a living, you’re going to have to do some reading. Manpages are a fantastic place to start.

I’ve read the excuse on Slashdot more often than I care to admit that the reason Memcached should listen by default on localhost is because there are far more instances installed by mom and pop web sites that just expect to install it and have it working than there are large scale installations that listen on multiple network interfaces (well, 0.0.0.0:11211 or :::11211). Is that true? I don’t know. Maybe there’s a significant number of webserver admins out there who have heard about this Memcached business and know only enough to know they need to install it. Maybe there’s a lot of people who just run apt-get install memcached, emerge memcached, or whatever install script happens to float their collective fleet of rubber rafts because they like to watch a bunch of text fly across their screen. I have no idea.

I do know that there’s a huge problem with the assumption that most administrators installing Memcached aren’t aware of the security implications. Why? Simple:

Memcached is a semi-niche market solution for a large scale problem. By the time you start looking around for a solution to offloading some of your server load elsewhere, you’ve probably done enough research to know what Memcached is and why you need it.
If you’re administering a large site that needs Memcached (see previous point), you’re probably well aware of network security. You wouldn’t let the world connect to every port on your home computer just for fun, so why would you let them do that to your web server?
Most system administrators are competent, careful, methodical individuals who consider their server farm something like their own children. If we lesser beings so much as tried to touch their machines with our grubby mitts, they’d chase us out of the data center with a shotgun. This whole problem, and the article linked to by Slashdot, doesn’t mention these guys. Those same guys who spend sleepless nights making sure db-140.cluster.lax.internal-net.megacorp.com is running smoothly are the same guys who firewalled their Memcached servers and were not noticed in the report. They couldn’t be, because the report was about incorrectly configured servers. The world is full of surprises, and it may surprise you to learn that there are competent individuals out there.

The other excuse that drove me to the verge of insanity was the proposal that Memcached is a great object store that can increase the performance of a mom and pop webserver, so it’s just as likely to be installed on bobs-fancy-widgets.com as it is on a massive load-balanced site. Thus, since this target market likely doesn’t know the first thing about installing a webserver, Memcached should be set up safely by default. Fine–maybe this is true–but it also assumes that bobs-fancy-widgets.com isn’t firewalled. It should be, and if it’s not, then I’m afraid there’s not much any of us can do for him. Maybe Bob doesn’t know anything about firewalls. Maybe Bob is also receiving a gazillion SSH scans from China, too. Just wait ’til they find out his password is “password”.

I don’t have the data available that these security researchers do. Therefore, I have to base my assumptions on the sample data they released. The companies affected by this security oversight (it’s not a hole) weren’t small fly-by-night organizations. These were just about everything from trendy start-ups to major television network webservers! My guess? It was probably an honest mistake more often than not. Someone forgot to enable the firewall or configure Memcached. They got burned. It sucks, but it happens often enough to serve as a reminder to the rest of us.

Lesson Learned

There’s a valuable lesson to take away from this, and while I realize this essay is reaching almost epic proportions, it’s important enough to warrant just a few more sentences. Never make assumptions about your software’s configuration. If you’re running a business or being paid by a business to run their machines, you should always double-check (or triple-check) the configurations of software you’ve installed to ensure they’re going to run as you expect them to. Make certain to read the appropriate manpages for your software, too, because they often have valuable insight into what the software does or how to configure it. Search the Internet. Oh, and never forget–if you’re using a *nix-based system, there are tools to help you understand what’s running and what’s not. Use utilities like ps and netstat until you can read the output of these programs by heart. netstat is of particular importance; with it, you can understand at a glance what your system is reporting as listening on network interfaces. (-p helps, too, as does understanding similar non-Linux tools like FreeBSD’s sockstat.)

Remember: If some 16 year old hacker discovers the CEO’s password and sends out “Your Fired!” e-mails to the company, wipes the database server, and turns your web server into an archive that’d make the Pirate Bay blush, you’re the one who’s going to get that uncomfortable call at 3:00AM. The Memcached devs? Well, they’ll be sound asleep while you’re sweating bullets. After all, that’s what you’re paid for, right?

No comments.

***