nginx - Benjamin Shelton's Musings

To 404 or not to 404?

Wednesday July 9th, 2014

If you’ve been following my (admittedly) rare posts over the course of the last few years, you’re likely to have noticed a growing aggression toward PHP as a language. It’s not that I hate the language per se (although I sometimes do), it’s that there’s so much crap written in PHP that it’s almost impossible to find something well-written that’s pleasant to work with (major props to Fabien Potencier of SensioLabs for writing some of the best PHP I’ve ever had the pleasure to work with–see, I can be even-handed!). There are some positive developments, particularly in commercial PHP, but all things considered, we’ve a long way to go.

XenForo (as an example) is leaps and bounds better than vBulletin in terms of actual design. They use a well-tested framework (Zend), mostly keep to the philosophy of separation of concerns, and isolate components from each other without polluting the global namespace (hi, vBulletin!). However, (you were waiting for this, weren’t you?) the coding style still remains abysmal (what’s wrong with K&R style, guys? seriously!) and it’s almost infuriating how much magic is used to create classes on the fly for various tasks. Need an example? Just look through some of the deferred tasks: Nearly everything is generated by factory classes that accept the name of a class to instantiate as a string argument, then return the instantiated object. This absolutely screws with a developer’s ability to reason about the types being passed around, and if it weren’t so infuriating, I think most of us would break down in tears. Do you want to see a grown man cry?

Yes, I can understand why the XenForo developers have made their choices (no need for a long switch statement, object selection for instantiation becomes the concern of the calling code, etc.), but it’s absolutely a tremendous pain in the arse when one happens to be trawling through the sources to figure out why something isn’t working. Let’s not even get into the whole bazillion singleton methods splattered everywhere like the ground zero of a misplaced spittoon. How’s that for a visual? I think I just gave myself a fist-bump.

But alas, I distract myself with unnecessary things. This post is supposed to be about 404 errors, isn’t it? And here I was about to launch into an argument about how there are so many libraries out there that support dependency injection that you’d have to be insane to use a hundred different singleton classes.

Today I encountered a relatively curious thing with XenForo. John (of Forum Foundry, Inc.) was having some difficulty with the nginx configuration on one of his sites and had convinced himself it was his fault because of something he had changed (it wasn’t). Something, somewhere was causing a 404 page to crop up every time users would attempt to reset their passwords. It was bugging him, and he asked if I’d take a look. So I did.

After examining his nginx configuration, I was convinced he did nothing wrong. But disconcertingly, I grew increasingly convinced that it wasn’t nginx’s fault, either. Strangely, though, the 404 page was being spat out by nginx–not by XenForo–so it seemed impossible for the fault to lie elsewhere. It had to be nginx. Yet, I found myself battling with a sort of cognitive dissonance: Posting (using curl or similar) to the lost-password URL returned a 200 OK. Using GET (or other methods) returned a 405 Method Not Allowed (as expected). If it were the fault of nginx, the 404 error should have been returned regardless of the method (GET, POST, or otherwise).

We tried various things, mostly involving the nginx debug log, but couldn’t quite get a handle on the source of the problem. I was pretty sure PHP was somehow at fault, but I couldn’t reproduce a 404 error bubbling up from PHP to nginx. Puzzled, I sat back and thought about the problem, and on a whim decided to do some digging around. It wasn’t long before I stumbled on the documentation for nginx’s fastcgi_intercept_errors directive, which looked as if it might cause errors from application code to propagate upward. From the documentation:

[fastcgi_intercept_errors] [d]etermines whether FastCGI server responses with codes greater than or equal to 300 should be passed to a client or be redirected to nginx for processing with the error_page directive.

What does this mean? It means that, if fastcgi_intercept_errors is enabled, HTTP response codes with a value of 300 or greater (of which 404 is one) will trickle up and be intercepted by nginx regardless of the application (PHP) layer’s intent. Thus, even if the application displays something useful with the 404 message, nginx will intercept the response and display its own spartan 404 page.
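
As a rough illustration, a configuration along these lines should reproduce the behavior. The server name, document root, socket path, and /404.html page are all assumptions made up for the sketch; none of them come from John’s server. My reading of the documentation is that the interception applies to status codes covered by an error_page directive, so the sketch includes one.

```nginx
# Minimal sketch; server name, paths, and the upstream socket are hypothetical.
server {
    listen 80;
    server_name forum.example.com;
    root /srv/www/forum;

    # nginx serves this page for any 404 it handles itself,
    # including FastCGI responses it has intercepted.
    error_page 404 /404.html;

    location ~ \.php$ {
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass unix:/run/php-fpm/php-fpm.sock;

        # With this on, the body of a 404 generated by the PHP
        # application is dropped and the error_page above is sent instead.
        fastcgi_intercept_errors on;
    }
}
```

Switching the last directive to off (its default) should let the application’s own 404 body through to the client.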

Then a lightbulb came on.

Around line 63 or so of the XenForo lost-password handler, a 404 response is generated if a user’s name or email address doesn’t exist when a request is submitted to reset their password. This means that, in all likelihood, if someone mistypes their username (or email), they’ll receive a 404 message from nginx and no error message from XenForo. Rather than attempting to hit the back button, change their submission, and try again, the affected user is likely to believe the site is broken and they are therefore unable to reset their password. This couldn’t be further from the truth.

But it gets better. I didn’t stumble on fastcgi_intercept_errors initially, because I couldn’t get the 404 responses to bubble up to nginx for interception on my system (Arch Linux). The reason for this is that the /etc/nginx/fastcgi_params that ships with Arch is fairly sparse (the intent being that if you need more specific parameters, you add them yourself). On this server, that wasn’t the case: its fastcgi_params had fastcgi_intercept_errors enabled. Moreover, tagging along with it were a number of other parameters (I initially thought it was the fault of Ubuntu), but it became clear that these were not the defaults that shipped with nginx (it doesn’t ship with fastcgi_intercept_errors enabled, for instance) or Ubuntu. Thus, any time a 404 error was dispatched from the application code, nginx intercepted it and displayed its 404 error page instead.

How many others might be affected by this? Who knows. If there’s a tutorial out there that recommends adding fastcgi_intercept_errors to /etc/nginx/fastcgi_params (it turns out there might be–read the update below), it goes without saying that the suggestion is a terrible idea if you expect your applications to generate and handle their own 404 pages (or any other response with a status of 300 or greater). Of course, for high performance configurations, you probably want nginx to do as much of the heavy lifting as possible, but in this particular circumstance, we encountered an issue where using certain configuration options can yield unexpected results. In other words, don’t blindly copy and paste “suggested performance tweaks” from random Internet blogs without first consulting the documentation. The results may surprise you. (I don’t know if that’s what happened in this case, but I certainly can’t rule it out. For all I know, it may have been–and probably was–an accident.)

Is this the fault of XenForo? I’d argue no. Sending a 404 in response to a password reset request for a member that doesn’t exist is probably intended to be RESTful (not a member? respond with a 404). That it ends up doing something unexpected (an nginx 404 page instead of XenForo’s error message) is not a bug per se. However, it’s a potential privacy concern: Anyone who has access to a large corpus of email addresses can deduce, based on the response returned by XenForo, whether or not those email addresses are registered with the server. By doing so, a potential attacker can then target those accounts that are known to exist on the server for nefarious deeds (or violate their privacy).

But here is where we diverge into a matter of competing concerns: Do you provide users (some of whom might be terrible typists) with immediate feedback, indicating that the information they submitted was wrong, thereby increasing the usability of the site; or do you protect users’ privacy by generating the same response whether or not the information was submitted correctly? There is no right or wrong answer to this question, and generally it boils down to how much usability you’re willing to trade away. Do you value ease of use or privacy? You can’t have both. Therefore, the answer rests on the shoulders of the site operator to decide. For sites that value users’ privacy, perhaps it’s better to sacrifice some usability for security.

For everyone else, it’s probably best to leave /etc/nginx/fastcgi_params at its default unless you absolutely must change the defaults. Better: Place the tweaks you need in a separate file and include that file separately.
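
Here’s a sketch of what that separation might look like. The file name fastcgi_local.conf, the two tuning values in the comment, and the socket path are hypothetical; the point is only that the stock fastcgi_params stays untouched and the local additions live somewhere easy to audit.

```nginx
# /etc/nginx/fastcgi_params stays exactly as the package shipped it.
# /etc/nginx/fastcgi_local.conf (hypothetical file name) holds local tweaks,
# for example:
#
#     fastcgi_read_timeout 120s;
#     fastcgi_buffers 16 16k;
#
# This location block belongs inside a server block.
location ~ \.php$ {
    include fastcgi_params;        # distribution defaults, untouched
    include fastcgi_local.conf;    # local additions, easy to review or remove
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass unix:/run/php-fpm/php-fpm.sock;

    # Let the application's own error pages reach the client.
    # "off" is already the default; stating it makes the intent explicit.
    fastcgi_intercept_errors off;
}
```

If something later misbehaves, commenting out the single include line for the local file is a quick way to find out whether one of your own tweaks is to blame.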

Update: I think I may have discovered one of the culprits for fastcgi_intercept_errors here, listed as the “Ultimate Speed Guide for WordPress on NGINX.” Hint: Don’t blindly copy everything someone on the Internet tells you is a good idea. If you don’t know what something does, always read the documentation. If you don’t, it will bite you.


***

nginx v1.4.5 and IPv6

Saturday February 15th, 2014

I recently updated the VPS this blog is sitting on. Coincidentally, this also updated nginx to the latest version and broke everything. I didn’t think much of it at the time, but when I linked a friend to this post over on my fun blog, he was delivered to the default nginx page. Puzzled, I poked around for a while, mostly examining DNS records and server configurations. I couldn’t find anything wrong.

Then I had a eureka moment.

I’m on IPv6 at home. I have this site (and others) configured to use IPv6. It hadn’t occurred to me until then that it might be protocol related. Using curl (curl -4 and curl -6), I confirmed my suspicions. Although the server was listening on TCP and TCP6, it was only serving up the vhosts on IPv6 and not IPv4. IPv4 was receiving the standard welcome page.

I knew that I had configured the server appropriately for both stacks. I’d read through the docs. I’d combed through dozens of blog posts documenting the process. I was convinced the server was correctly configured. I must’ve fiddled with it for a good hour or so, reviewing documentation and the like to no avail.

Infuriating.

Since nginx 1.2 or 1.3 (I can’t remember precisely), it’s been necessary to add ipv6only=off to the listen directives in order to support a dual stack environment. It’s my understanding this trick doesn’t work on some BSDs, but I know for a fact it worked fine under Linux. Or so I thought. I had tried it under Arch and Ubuntu with identical, successful results. What I neglected to recall was one minor detail: My Arch install updated to nginx 1.4-something well after I had configured my desktop for developing on a dual IPv4/IPv6 stack. I suspect it’s probably broken in the same manner. But, I use it strictly for development, so I’m not particularly concerned whether or not it works on IPv4. I don’t use the protocol much within my network, so why worry, right?

To continue: I decided to take another stab at it and discovered something curious. Previously, all that was required to enable dual-stack support in nginx was to add the following to whatever was configured as the default host:

    listen [::]:80 ipv6only=off default_server;

And then all subsequent vhosts simply required:

    listen [::]:80;

That’s all. It used to work–like magic. But, sadly, magic eventually runs out. This is why electronics stop working once you let all the “magic smoke” escape. Sorry, it’s an old electrical engineering joke my father has oft repeated. I guess it’s rubbed off on me.

Anyway, here’s the solution. You might find it contrary to some of the antiquated information out there lurking on various blogs dating back from 2011 through the middle of 2013. It works for nginx 1.4.5 (and possibly earlier versions), but the trick is to add this to the default vhost configuration:

    listen [::]:80 ipv6only=on default_server;
    listen 80 default_server;

And for all subsequent vhosts:

    listen [::]:80;
    listen 80;

I should note it works fine without adding the ipv6only=on parameter, just like the generic vhost config (above). I believe I’ve read that this is because the default behavior enables ipv6only automatically. However, if you’re running a slightly older version, you might need to keep it. That’s why I’m not going to remove it from my examples. Better safe than sorry, right?

default_server is (hopefully) obvious, but only required if you want to provide a default site (or page) for users hitting your web server’s IP. Or for ancient browsers that haven’t been taught how to use the Host header. Are there any of those left?

So, the trick is that you need two listen directives. Period. Yes, even for TLS/SSL. If you leave one of them out of any vhost, that vhost simply won’t be bound for the missing protocol. I suspect this is probably documented somewhere. The problem though is that there are literally dozens of blogs pointing to the old instructions that used to work. These are now deprecated. Following them will only lead to sadness.
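
For the TLS case, a vhost following the same pattern might look something like the sketch below. The server name, certificate paths, and document root are hypothetical placeholders; only the pair of listen directives is the point.

```nginx
# Sketch of a TLS vhost; the server name and file paths are hypothetical.
server {
    listen [::]:443 ssl;   # IPv6
    listen 443 ssl;        # IPv4
    server_name example.com;

    ssl_certificate     /etc/ssl/certs/example.com.crt;
    ssl_certificate_key /etc/ssl/private/example.com.key;

    root /srv/www/example.com;
}
```

Dropping either listen line here should have the same effect as on port 80: requests arriving over the missing protocol fall through to whatever the default server is.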

Initial frustration aside, I find this meshes well with my preferences. It’s more explicit and there’s no question which protocols nginx will use when binding to the configured port or ports. However, it will cause headaches for IPv6-enabled sites migrating from nginx 1.2. So, if you’re running Ubuntu and have decided to update in order to gain access to newer features (websocket support, SPDY, et al), expect breakage. More importantly, be absolutely certain you’ve independently tested all of your deployed sites using IPv4 and IPv6. Make liberal use of the -4 and -6 switches for curl. It’ll save you from unpleasant surprises.


***