jQuery MSIE7 Attribute Traversal/Clone Crash Bug

I’ve been spending some time writing a client-side JavaScript library (more on this in a later post) that does a fair amount of DOM manipulation and has had its share of cross-browser, erm, idiosyncrasies. The foundation of the library depends rather heavily on jQuery for traversal, DOM insertion and removal, and the likes. It’s also fully unit-tested and therefore it’s relatively trivial to wire-in another browser, run up the unit tests, and see what works (or doesn’t). Unsurprisingly, every browser except most flavors of MSIE have worked without issues, but many of the MSIE issues I encountered have relied on minor workarounds for missing features or misbehavior that’s more noisome than frustrating.

That is to say until I discovered it’s possible to crash MSIE7 using a mix of traversal, detaching, and cloning. Needless to say, the unit tests barely finished, and I was left surprised. I’ve since fixed the problem, but it’s nevertheless been a curio in need of a simplified test case.

First, I want to point out that I suspect this may be tangentially related to this bug, although I’m reluctant to call it a jQuery “bug” considering that it works fine in every other version of MSIE (including 6 and ever other browser I’ve tested). Furthermore, since the release of jQuery 2.x drops support for MSIE 6, 7, and 8, the importance of addressing this “bug” is largely moot. Most applications, particularly single-page client-side applications that perform a great deal of work directly on the DOM, target newer browsers anyway. Only those with a large or predominantly non-technical audience ought to worry, and many of those sites likely don’t make use of much JavaScript outside advertising and analytics.

The bug manifests itself whenever the user manipulates the .data() method on an element and subsequently detaches that element and clones it. Actually, that’s a lie: Detaching and .data() manipulation can occur in any sequence. The important bit to take from this is that any manipulation of .data() on an element that has been or will be detached and further cloned will duplicate this behavior. Here’s a short example.

Given the HTML:

<div class="crash-me">Loop over this node's data, detach it, and clone it
to crash MSIE7.</div>

And the JavaScript (using jQuery):

var $element = $(".crash-me");
 
// Looping over data elements. The crash will occur even if they're empty.
$.each($($element[0]).data(), function(key, value){
  $("#values").append(key+": "+value+"<br>");
});
 
// Now to detach.
$element.detach();
 
// Crash.
for (var i = 0; i < 5; i++) {
  // You won't see the clones in MSIE7 'cause it's dead.
  // Technically, you only need the $element.clone() call.
  $("#nodes").append($element.clone());
}

You can also replace the $.each with a for-loop to the same effect:

var attributes = $element[0].attributes;
$($element[0]).data("test", 1);
for (var i = 0; i < attributes.length; i++) {
  // You only need to access nodeName or nodeValue to do this horrible deed.
  if (attributes[i].nodeName) {
    $("#values").append(attributes[i].nodeName+": "+attributes[i].nodeValue+"<br>");
  }
}

The difference, however, is that the $.each loop will trigger the crash whether or not the inner function does anything (since the nodes’ attributes have already been accessed); the for-loop requires that you do something with attributes, such as fetching the nodeName or nodeValue. This is essentially what $.each does anyway, so the two methods accomplish roughly the same thing.

The workaround for this is to simply avoid touching .data() on any node that is (or will be) detached from the DOM and operate only on the clones. In my case, I have a traverse() method that traverses over all DOM elements, looking for data-* attributes, and then examines their children for more data-* elements. To avoid this bug, I have a flag for the attribute handlers that indicates whether or not it detaches its children; if it does, traverse() simply ignores it and carries on with its business. It’s up to the handler to call traverse() on the clones of its detached children, never operating directly on the original children themselves. Since each handler that relies on traversal uses the same traverse() method for DOM manipulation, it’s easy to manage the fix in a single location.

Of course, if you have little choice but to examine the data attributes of a detached node before cloning them, you’re be out of luck. Fortunately, MSIE7 usage is declining, and if you’re outside the financial sector or government, you should be safe! My condolences otherwise.

No comments.
***

Updating an ancient Arch Installation

A close friend of mine recently decided it was time to update his Arch Linux installation. It had been over a year since the core OS was completely updated, and I did everything I could to discourage him from trying the usual pacman -Syu. The problem, for all intents and purposes, is that I knew it wouldn’t work. Much of the file system structure has changed dramatically over the course of the past year (the /lib move and the more recent merging of /bin and /sbin into their /usr counterparts) and a straightforward update was now difficult (but not impossible–more on this in a minute). Specifically, the glibc and filesystem packages have gone through several iterations even last July and would now be permanently blocking each other thanks to these two updates with no immediately obvious path forward.

I have observed some comments dismissing the update process from a pre-systemd installation as virtually impossible. While I suspect this is because the individual(s) asking for help are seen by others to be insufficiently experienced to do such an update, I’m not a particularly huge fan of the term “impossible” as it pertains to difficult feats–even for the inexperienced. After all, few things are truly impossible; it’s simply a matter of how much time and energy one is willing to invest in working toward a specific goal. Besides, if the recommended solution is to reinstall, why not figure out an optimal way to upgrade an “impossible” system? If it’s going to break either way, we may as well have some fun with it. (You have backups, right?) Even if you don’t have the necessary experience or knowledge relating to the underlying system, there’s no time like the present to begin the learning process.

So, Thursday evening, I set out looking for a solution. Fortuitously, I had an old Arch Linux virtual machine installed on my desktop (which, conveniently, is also Arch Linux) that dated back to approximately the same kernel revision as my friend’s (v3.3.6 for mine; v3.3.4 for his) and roughly analogous software. Like his system, my VM was also a pre-systemd installation, was still running glibc 2.15, and a very early version of filesystem-2012. In many regards, my VM was more or less in a state identical to that of the machine we needed to fix. However, the problem was compounded by an additional requirement. Console access was somewhat tedious for my friend to obtain because of the systems he had to go through, and he was exceptionally busy the afternoon we planned the updates. So, I had to to keep network access (particularly SSH) up and running as best as I could.

Disclaimer

This narrative guide is intended as general reference for anyone who might be in a similar situation: Pre-systemd, pre-filesystem moves, and operating with a requirement that the machine be (mostly) network-accessible throughout the duration of the update. Be aware that nothing in this guide should be considered authoritative. I am a user of Arch Linux. There’s still much about system internals (and Linux in general) that I don’t know or don’t fully understand. Consequently, I’m always learning, and there may be better alternatives to specific problems encountered during this update process. However, I do know enough about Arch to recommend that you never use –force unless you’re specifically instructed to do so. At no point during this update process should you use it. Doing so will assuredly destroy your system, and you’ll be forced to reinstall.

Secondly, this update process is not supported by the Arch Linux developers. If you have failed to maintain your system by keeping it up-to-date (and updating your Arch Linux installation is one of the core tenants of being an Arch user), you’re unlikely to receive much help since you’ve already established that your OS has been inadequately maintained. Furthermore, this update process relies on tricks using partial updates which are also unsupported. The developers and users of Arch who frequently the forums are nice folks who have lives outside of Arch, and due to the tremendous task presented to the community of supporting a relatively large user base, it’s impractical for them to spend a significant chunk of their time helping you with issues that–to avoid mincing words–are the fault of no one but yourself.

Thus, I hope this guide will be useful to those of you who may have a neglected Arch Linux box under your care. Be aware that I have targeted this guide to a system that was last updated somewhere in mid-2012. Although it should work for earlier systems, I’d highly recommend reading through the Arch news archives if you’re updating such a beast. You can usually determine where you need to begin by examining the version of your filesystem package, e.g. pacman -Qs filesystem. This guide should be general enough such that even if it doesn’t help you determine the exact update path for your system, you’ll be able to figure out where to get started.

Also, be sure to read through the guide in its entirety first. Check your installed packages, and make sure you’re not downgrading them. filesystem and glibc should never be downgraded at any point during this update otherwise you may exacerbate system breakage. This guide illustrates the update process with the x86_64 architecture. If you’re using a 32-bit system, you’ll need to replace all references to x86_64 with i686.

Read more…

No comments.
***