jQuery MSIE7 Attribute Traversal/Clone Crash Bug

I’ve been spending some time writing a client-side JavaScript library (more on this in a later post) that does a fair amount of DOM manipulation and has had its share of cross-browser, erm, idiosyncrasies. The foundation of the library depends rather heavily on jQuery for traversal, DOM insertion and removal, and the like. It’s also fully unit-tested, so it’s relatively trivial to wire in another browser, run the unit tests, and see what works (or doesn’t). Unsurprisingly, every browser except most flavors of MSIE has worked without issue, and most of the MSIE problems I encountered needed only minor workarounds for missing features or misbehavior that’s more annoying than frustrating.

That is, until I discovered it’s possible to crash MSIE7 with a mix of traversal, detaching, and cloning. Needless to say, the unit tests barely finished, and I was left surprised. I’ve since fixed the problem, but it’s nevertheless been a curio in need of a simplified test case.

First, I want to point out that I suspect this may be tangentially related to this bug, although I’m reluctant to call it a jQuery “bug” considering that the same code works fine in every other version of MSIE (including 6) and every other browser I’ve tested. Furthermore, since jQuery 2.x drops support for MSIE 6, 7, and 8, addressing this “bug” is largely a moot point. Most applications, particularly single-page client-side applications that perform a great deal of work directly on the DOM, target newer browsers anyway. Only those with a large or predominantly non-technical audience ought to worry, and many of those sites likely don’t make use of much JavaScript outside advertising and analytics.

The bug manifests itself whenever you touch .data() on an element and subsequently detach and clone that element. Actually, that’s a lie: the detaching and the .data() manipulation can occur in either order. The important bit is that touching .data() on any element that has been (or will be) detached and then cloned will reproduce the crash. Here’s a short example.

Given the HTML:

<div class="crash-me">Loop over this node's data, detach it, and clone it
to crash MSIE7.</div>

And the JavaScript (using jQuery):

var $element = $(".crash-me");
 
// Looping over data elements. The crash will occur even if they're empty.
$.each($($element[0]).data(), function(key, value){
  $("#values").append(key+": "+value+"<br>");
});
 
// Now to detach.
$element.detach();
 
// Crash.
for (var i = 0; i < 5; i++) {
  // You won't see the clones in MSIE7 'cause it's dead.
  // Technically, you only need the $element.clone() call.
  $("#nodes").append($element.clone());
}

You can also replace the $.each with a for-loop to the same effect:

var attributes = $element[0].attributes;
$($element[0]).data("test", 1);
for (var i = 0; i < attributes.length; i++) {
  // You only need to access nodeName or nodeValue to do this horrible deed.
  if (attributes[i].nodeName) {
    $("#values").append(attributes[i].nodeName+": "+attributes[i].nodeValue+"<br>");
  }
}

The difference, however, is that the $.each loop will trigger the crash whether or not the inner function does anything (since the node’s attributes have already been accessed); the for-loop requires that you actually do something with the attributes, such as fetching nodeName or nodeValue. This is essentially what $.each does anyway, so the two methods accomplish roughly the same thing.

The workaround is simply to avoid touching .data() on any node that is (or will be) detached from the DOM and to operate only on the clones. In my case, I have a traverse() method that walks the DOM looking for data-* attributes and then examines each element’s children for more data-* attributes. To avoid this bug, each attribute handler carries a flag indicating whether it detaches its children; if it does, traverse() simply skips that subtree and carries on with its business. It’s up to the handler to call traverse() on the clones of its detached children, never operating directly on the originals. Since every handler that relies on traversal uses the same traverse() method for DOM manipulation, the fix lives in a single location.
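A rough sketch of the arrangement, for illustration only: the handlers registry, handler.run(), and the detachesChildren flag are names I’ve invented for this post, not the library’s actual API.

function traverse($root, handlers) {
  $root.children().each(function() {
    var $node = $(this);
    var skipChildren = false;

    $.each(handlers, function(name, handler) {
      if ($node.is("[data-" + name + "]")) {
        handler.run($node);
        // Handlers that detach (and later clone) their children must call
        // traverse() on the clones themselves; touching .data() on the
        // originals here is exactly what upsets MSIE7.
        if (handler.detachesChildren) {
          skipChildren = true;
        }
      }
    });

    if (!skipChildren) {
      traverse($node, handlers);
    }
  });
}

Starting it off is just traverse($(document.body), handlers); the point is that the detached originals are never revisited.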

Of course, if you have little choice but to examine the data attributes of a detached node before cloning it, you’re out of luck. Fortunately, MSIE7 usage is declining, and if you’re outside the financial sector or government, you should be safe! My condolences otherwise.

***

Updating an ancient Arch Installation

A close friend of mine recently decided it was time to update his Arch Linux installation. It had been over a year since the core OS was last fully updated, and I did everything I could to discourage him from trying the usual pacman -Syu. The problem, for all intents and purposes, was that I knew it wouldn’t work. Much of the file system layout has changed dramatically over the past year (the /lib move and the more recent merging of /bin and /sbin into their /usr counterparts), so a straightforward update was now difficult, though not impossible (more on this in a minute). Specifically, the glibc and filesystem packages have gone through several iterations since last July and would now block each other thanks to those two changes, with no immediately obvious path forward.

I have observed some comments dismissing the update process from a pre-systemd installation as virtually impossible. While I suspect this is because the individual(s) asking for help are seen by others to be insufficiently experienced to do such an update, I’m not a particularly huge fan of the term “impossible” as it pertains to difficult feats–even for the inexperienced. After all, few things are truly impossible; it’s simply a matter of how much time and energy one is willing to invest in working toward a specific goal. Besides, if the recommended solution is to reinstall, why not figure out an optimal way to upgrade an “impossible” system? If it’s going to break either way, we may as well have some fun with it. (You have backups, right?) Even if you don’t have the necessary experience or knowledge relating to the underlying system, there’s no time like the present to begin the learning process.

So, Thursday evening, I set out looking for a solution. Fortuitously, I had an old Arch Linux virtual machine installed on my desktop (which, conveniently, is also Arch Linux) that dated back to approximately the same kernel revision as my friend’s (v3.3.6 for mine; v3.3.4 for his) and had roughly analogous software. Like his system, my VM was a pre-systemd installation, was still running glibc 2.15, and had a very early version of filesystem-2012. In many regards, the VM was more or less in a state identical to that of the machine we needed to fix. However, the problem was compounded by an additional requirement: console access was somewhat tedious for my friend to obtain because of the systems he had to go through, and he was exceptionally busy the afternoon we planned the updates, so I had to keep network access (particularly SSH) up and running as best I could.

Disclaimer

This narrative guide is intended as a general reference for anyone who might be in a similar situation: pre-systemd, pre-filesystem moves, and operating under the requirement that the machine remain (mostly) network-accessible throughout the update. Be aware that nothing in this guide should be considered authoritative. I am a user of Arch Linux. There’s still much about system internals (and Linux in general) that I don’t know or don’t fully understand. Consequently, I’m always learning, and there may be better alternatives to specific problems encountered during this update process. However, I do know enough about Arch to recommend that you never use --force unless you’re specifically instructed to do so. At no point during this update process should you use it. Doing so will assuredly destroy your system, and you’ll be forced to reinstall.

Secondly, this update process is not supported by the Arch Linux developers. If you have failed to maintain your system by keeping it up to date (and keeping your installation updated is one of the core tenets of being an Arch user), you’re unlikely to receive much help, since you’ve already established that your OS has been inadequately maintained. Furthermore, this update process relies on tricks using partial updates, which are also unsupported. The developers and users of Arch who frequent the forums are nice folks who have lives outside of Arch, and given the tremendous task of supporting a relatively large user base, it’s impractical for them to spend a significant chunk of their time helping you with issues that, not to mince words, are the fault of no one but yourself.

Thus, I hope this guide will be useful to those of you who may have a neglected Arch Linux box under your care. Be aware that I have targeted this guide at a system that was last updated somewhere around mid-2012. Although it should work for earlier systems, I’d highly recommend reading through the Arch news archives if you’re updating such a beast. You can usually determine where you need to begin by examining the version of your filesystem package, e.g. pacman -Qs filesystem, as shown below. This guide should be general enough that even if it doesn’t pin down the exact update path for your system, you’ll be able to figure out where to get started.
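For example, on a system in roughly the state described above, the check might look something like this (output trimmed to the relevant package; the exact versions and pkgrels shown here are only illustrative):

$ pacman -Qs filesystem
local/filesystem 2012.6-2
    Base filesystem
$ pacman -Q glibc
glibc 2.15-10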

Also, be sure to read through the guide in its entirety first. Check your installed packages, and make sure you’re not downgrading them. filesystem and glibc should never be downgraded at any point during this update; otherwise you may exacerbate the breakage. This guide illustrates the update process on the x86_64 architecture; if you’re using a 32-bit system, replace all references to x86_64 with i686.

Read more…

***

No Decode Delegates… and Conflicts

It’s not often that we run into unusual conflicts rooted in hard-to-find places, but the few times we do, no matter how rare, can be infuriating. I was recently bitten by this one, demonstrated by a particularly bizarre error when attempting to load an image (of any type) into the php-imagick ImageMagick extension:

$ php -a 
Interactive shell
 
php > $c = new Imagick('kmix.png');
PHP Warning:  Uncaught exception 'ImagickException' with message 'no decode delegate for this image format `kmix.png' @ error/constitute.c/ReadImage/552' in php shell code:1
Stack trace:
#0 php shell code(1): Imagick->__construct('kmix.png')
#1 {main}
  thrown in php shell code on line 1

Yet, paradoxically, convert -list delegate, convert -list configure, and convert -list format all hinted that PNG and JPEG support were both installed. Worse, searching for “php imagick no decode delegate” turned up few useful hits, and most of those pointed to configuration problems, custom builds missing the appropriate libraries, and other dilemmas punctuated by the usual PEBKAC retort (though such accusations are sometimes accurate).
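For reference, the sanity checks in question were along these lines; I’ve omitted the output, which varies from build to build, but on the affected machine each of them listed PNG and JPEG entries despite php-imagick’s complaint:

$ convert -list format | grep -i -E 'png|jpeg'
$ convert -list delegate | grep -i -E 'png|jpeg'
$ convert -list configure | grep -i DELEGATES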

Nothing worked, of course. Swapping in older versions of ImageMagick failed, and I had no intention of trying an earlier version of PHP or php-imagick. At this point, I had no expectation of finding the solution online, and the only thing I surmised might work was to try to duplicate the problem on another machine. So, I set about installing ImageMagick (v6.8.5-10) and php-imagick (v3.0.1 plus PHP 5.4 patches, and v3.1.0RC2) with the latest version of PHP then available in the Arch repositories (v5.4.16) and quickly discovered that everything worked quite well. This was a positive sign: nothing about the latest builds indicated a problem, so it was plausible that the problem lurked elsewhere.

And it did.

I soon noticed that the virtual machine I had booted lacked both php-gmagick and GraphicsMagick, even though ImageMagick and GraphicsMagick coexisted quite happily on my own machine. Thus, I speculated that php-gmagick was somehow conflicting with php-imagick. I commented out the gmagick extension, re-ran some unit tests, and presto! php-imagick was now working perfectly. Uncommenting the gmagick extension brought the problem back, and so I was (and am) convinced that the problem is a conflict between the two extensions. Possibly this is due to gmagick defining symbols or other such nonsense that imagick requires (and is very upset to discover already exist).

Is it possible to get them to play nicely together? Maybe.

On my machine, PHP extensions and their options are loaded from /etc/php/conf.d. These files are, apparently, loaded in alphabetical order, and since “G” comes before “I,” you might imagine (correctly) that gmagick gets loaded before imagick, which is where the magic stops (my apologies; that pun was awful). By renaming the files to something creative, such as “10-imagick.ini” and “20-gmagick.ini”, it’s possible to force imagick to load before gmagick, and as near as I can tell they’ll both work cooperatively. Perhaps gmagick defines (but doesn’t use) some interpreter symbols that imagick actually needs, yet doesn’t care if they’re already defined when it loads second? I don’t know. I can only speculate, because I know next to nothing about PHP internals. (Perhaps now might be a good time to start.)
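Concretely, the rename amounts to something like the following; the original file names are my guess at what a typical package ships, so adjust them to whatever your distribution actually installed:

$ cd /etc/php/conf.d
$ mv imagick.ini 10-imagick.ini
$ mv gmagick.ini 20-gmagick.ini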

Although renaming or otherwise shuffling the extension configs so that imagick.so gets loaded first seems like an adequate solution, the problem is that these two extensions most assuredly do conflict. You have no guarantee that the extension-loading mechanism will remain the same between PHP versions, so renaming the *.ini files may be only an ephemeral fix. Moreover, it’s also possibly an accident that it works on my machine; perhaps the inode order is such that imagick.so now gets loaded before gmagick.so simply by luck. Or maybe I’ve just reduced it to a race condition? The point is that you shouldn’t rely on this in production. Instead, choose between GraphicsMagick and ImageMagick, picking the one that best fits your use case. If you’re using a wrapper library that selects the best available backend, force it to load a specific one in production mode. Such concerns obviously aren’t much of a bother on a development box, but I’d nevertheless recommend a healthy dose of caution.

If you’ve encountered this issue yourself but never found a reasonable solution, let me know. If this article helps in any measurable way, I’d be happy to hear from you. It’s been a frustrating few hours for me, and I hope this will at least spare you from wasting further time on something that is poorly documented but easy to fix.

***