The Arch Linux Rant

This post is part one in a two part series (part two can be read here). In this post, I’ll explore some of my thoughts and feelings about Arch from a fairly high level vantage point–consider it a 50,000′ review. In the next post, I’ll explain why I feel comments like this one demonstrate a certain degree of naïvety, although it’s by far one of the more benign ones I’ve encountered in my short time partaking in the Arch community.

For those of you who know me or, at the very least, read my (mostly) pointless and fairly infrequent rants, you’ll probably recall my decision to leave Gentoo. Some eight or nine months later, I changed my mind and stuck with it for a while longer. That’s no longer the case, and as of January this year I permanently switched over to Arch Linux. I’ve been holding off on posting this rant primarily to take a “wait and see” approach to determine how well I’d do without Gentoo.

The short version is thus: I haven’t missed Gentoo one bit. I started about a year ago (maybe less), running Arch in a VM, tweaking settings, playing around with KDE 4.7-something, and spent quite some time familiarizing myself with the system. Coming from Gentoo and FreeBSD, the transition wasn’t that bad–I’m used to taking a greater hands-on approach with my systems than most people might be comfortable with–so ultimately, Arch felt like a natural fit.

Then I took the plunge.

For the interested, I should share a little bit of history. My initial foray into the world of Unix and Unix-like operating systems began around 2000ish (actually earlier, but it wasn’t until then when I started tinkering with them for my own purposes) with OpenBSD. I later transitioned to FreeBSD for a variety of reasons–mostly the improved performance and greater compatibility (at that time) with other software. It wasn’t until quite some time after that when I began using the ports collection, and prior to then I made a habit of configuring and compiling most of the software I used by hand. As my software library began to increase, so too did the amount of time needed to invest into updating it. I eventually began looking for other solutions, although I kept a FreeBSD system at home for a number of years thereafter.

Sometime later, around late 2003 or early 2004 (possibly as late as 2005), I began experimenting with Gentoo. While I tried other Linux distros, I found most of them too alien in contrast with my beloved *BSDs. Their package managers were strange and sometimes convoluted, and I had little idea how to properly install custom configured software as I was prone to doing. The BSD way worked, but I knew it wasn’t optimal since /usr/local seemed to be mostly unused or remained distro-specific as to its preferred usage. Gentoo was a reasonable fit: Its well-documented nature, a ports-like collection (portage), and the tendency to keep mostly out of the user’s way beckoned me to take a closer look. Better yet, though the system was compiled mostly from scratch, configuring individual options and customizing packages was simple and very BSD-like.

I’ll keep this as short as I can, but I’ll just say this: I ran Gentoo on my desktop and as a file server for a number of years, probably from 2005-2006 until just last year (2011). Indeed, in 2006, I made the switch to use Gentoo almost exclusively, even while I was going to university. Later that same year, my FreeBSD install on my file server was gone, victim to the spread of Gentoo. Don’t get me wrong: I still love FreeBSD–and Gentoo, too–but for home use FreeBSD didn’t seem cut out for what I wanted to do. Granted, I’d still run it in circumstances where I needed a stable, long-running platform as a web service or similar, but at that time, Ports often lagged behind Gentoo and sometimes the only way you could install software was to download and compile it yourself. Not that there’s anything wrong with that, of course. (N.B.: In 2010, the situation somewhat reversed: Ports was updated with newer software versions while Portage stagnated. Strange what a few years will do.)

So, the transition to Gentoo was made because everything I wanted was just an emerge away. Things remained like this for a long time, too, because Gentoo was at the bleeding edge, and the convenience of a rolling-release plus fairly simple package management was seductive. I think that’s why I had a hard time adopting other distros. Frequent, new releases of software allowed me to continue sampling what was coming down the turnpike and forced me to stay up to date on current software.

Gentoo was not without its warts, of course. I won’t go into detail here as I’ve linked to my thoughts on the matter earlier in this post–and they should be mostly up to date–but the problems that began as a slow trickle eventually turned into a torrent of disaster. Occasional dependency conflicts requiring manual intervention, lengthy X server compiles (and let’s not get into glibc or a desktop environment like KDE), the lack of package maintainers, and a slow, downward spiral into the abyss lead me to look for alternatives. I didn’t want to part with the rolling release model; indeed, once you’ve sampled such ambrosia, it’s difficult to whet your palate with meager release-based distros. Around this time (late 2010, early 2011), I was participating fairly regularly in various Slashdot discussions and made a similar offhanded remark about my concerns with Gentoo, and someone suggested trying Arch Linux. So I did.

The thing that struck me the most about Arch was its simplicity. There’s very little that it does for you. System configuration using the standard init scripts is exceptionally simple and straightforward (mostly in /etc/rc.conf), and with the exception of the package manager and build system, Arch makes Gentoo look like an impenetrable fortress of automation. Within a few days of using Arch, I fell in love with the brilliantly minimalistic design.

Truth be told, there’s a lot about Arch to love and enjoy. It does take some time to get the system configured to your liking, but the installation process is mostly painless and fairly simple (bugs in the installer notwithstanding). Using an AUR helper like yaourt isn’t necessary but strongly recommended. It also helps to have at least a passing understanding of how makepkg and pacman work and interact. Of course, it isn’t necessary to use extras like the Arch Build System (ABS) and the Arch User Repository (AUR), but to ignore them is to do yourself a great disservice: There’s so much software available on the AUR that it makes Ubuntu’s impressive repositories and widespread support (through .debs) seem almost anemic.

However, I don’t think it’s possible to use Arch for very long until you’re drawn in by the allure of creating your own PKGBUILDs. I think that’s at least part of what I like most about Arch. Unlike Gentoo’s ebuilds, PKGBUILDs are simple. Armed with just a basic understanding of sh-like syntax (e.g. bash), a moderately skilled user otherwise unfamiliar with PKGBUILDs could put together a custom one in under an hour. More advanced users could piece together PKGBUILDs customizing their favorite software in an evening or two (mostly dependent upon how many packages they want). But here’s the trick: If your PKGBUILD fails for whatever reason, it’s unlikely to break your system save for inappropriate dependencies or “conflicts” statements. Since pacman (Arch’s package manager) examines the contents of packages, including those generated by custom PKGBUILDs, and determines where each file in the archive is to be placed, it takes an exceptionally stupid habit (usually using -f for force) to circumvent pacman’s all-seeing eye.

To illustrate: When Arch released GIMP 2.8, I was disappointed by some of the new features. The solution? Create a GIMP 2.6 package! (You can download the PKGBUILD here, but be sure to grab both the gegl and babl PKGBUILDs, too.) Since the Arch project provides all the appropriate PKGBUILDs for the official repos, it’s easy to find the one you want, download it, and then modify it to your liking. You don’t even have to deal with the headaches caused by portage overlays.

The astute reader may have noticed that I haven’t yet addressed the issue of binary package updates with regards to the core system and official repositories. There’s a reason for that. While having the binary packages available is nice, and it’s certainly better than spending two days compiling KDE, it wasn’t my primary reason for switching. Binary releases did impact my decision–and don’t get me wrong, I love being able to download just the updates I need without compiling anything but the few AUR packages I have–but it’s one of those conveniences that’s nice to have. I won’t begrudge compiling the system (or kernel) from scratch, because I spent so many years with stagnant Gentoo systems compiling off and on over a week or two. It would seem a tad bit hypocritical of me to complain about compile times. Besides, if you swing that way, you can do that with Arch, too.

I don’t mean to wax optimistic about the glory of Arch. It’s not without the occasional sharp edge, and there’s a plethora of things that will snag the naive user who hasn’t yet developed the healthy habit of thoroughly reading documentation. I certainly wouldn’t recommend it to someone who’s new to Linux in general (they should use Ubuntu or a similar distribution), but for others who are at least familiar with the shell and want a do-it-yourself distro, Arch is certainly worth a look.

Interestingly, and unexpectedly, Arch has made using Linux fun again. I’m reminded of the days when I first started getting into Gentoo after understanding its quirks and loving every minute of it. The difference, though, is that I spend so much time doing things in Arch and with Arch that I almost don’t do much else (not even games!). My Windows install is probably lonely; it might receive a boot once every three weeks for the occasional update or game that I can’t otherwise get working under Wine, but even then, there’s so much more to explore with Arch.

No comments.
***

Shell Voodoo, Connected IPs, and Counting Total Connections

I’m posting this mostly as a note to myself, but if you, future visitor, stumble upon this post and have improvements or other things you’d like to share, be my guest. Posts that are overly critical of the methodologies provided by others, or those which otherwise add nothing to the discussion will be removed. This is especially true for those espousing beliefs that PowerShell is superior.

I won’t go into the exact details of why we needed to do this, but the general break down is thus:

  • Get a list of connected IP addresses
  • Sort them
  • Count how many connections were made from a single address

Fortunately, the solution turns out to be quite easy. For FreeBSD:

netstat -anfinet | grep -v 127.0.0.1 | awk '{ print $5 }' | \
grep -E '.*([0-9]{1,4}\.)+.*' | sed 's/\(.*\)\..*/\1/' | \
sort -g -k 1 | uniq -c | sort -n -k 1

And for most derivatives of Linux:

netstat -anW --tcp --udp | grep -v 127.0.0.1 | awk '{ print $5 }' | \
grep --color=never -E '.*[0-9]{1,4}(\.|\:).*' | sed 's/\(.*\)\:.*/\1/' | \
sort -g -k 1 | uniq -c | sort -n -k 1

You may need to modprobe sctp to get the --tcp and --udp netstat flags working. Also, both of these should work with IPv6 addresses, too, which is why I’ve tried to keep the sed regex as simple as possible.

What the Eff is This?!

Okay, I agree. I’ve probably made some kind of mistake somewhere; I don’t know awk or sed quite as well as I should (easily fixed, if I ever wanted to spend a weekend learning). That said, here’s my understanding of how this should work. First, we’ll deal with the FreeBSD derivative, line by line:

FreeBSD

Here is a breakdown for the FreeBSD-specific stuff:

netstat -anfinet | grep -v 127.0.0.1 | awk '{ print $5 }' | \

As with all platforms I’m aware, -an shows all connections by their numerical addresses. netstat prefers to perform a reverse lookup on every address, and this can take some time. However, the FreeBSD-specific option -f inet specifies to only show INET (IPv4/IPv6) addresses and eliminates much of the cruft associated with local Unix domain sockets. Likewise, we trim localhost from the list with grep -v, and we fetch the 5th output column using awk

grep -E '.*([0-9]{1,4}\.)+.*' | sed 's/\(.*\)\..*/\1/' | \

Moving on to the next line, we fetch only those lines that contain something that vaguely resembles an IP address with grep -E (I prefer to use -E here since it gives us the extended regex syntax), and we pass the results into sed to strip off the trailing remote host’s port number. Alternatively, you could use something like 's/^\([0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\).*/\1/' instead to filter out IPv4 addresses, but since we already know roughly what to expect from the input we can simplify our regex. Furthermore, we also know that the IP address of the remote host in FreeBSD will always have a dot followed by the port number appended, and we can naively remove this.

sort -g -k 1 | uniq -c | sort -n -k 1

Lastly, we sort (generically, with -gunique addresses in our list including their totals, and we sort numerically by the first column (now containing the count).

Linux

Here is a breakdown for the Linux-specific stuff:

netstat -anW --tcp --udp | grep -v 127.0.0.1 | awk '{ print $5 }' | \

Following in the footsteps of FreeBSD, we use -an to display all connected numeric addresses so we don’t waste time running reverse lookups. However, in most Linux distributions, lengthy columns–and especially IPv6 addresses–will be truncated by netstat’s output. To counter this, we use -W to show the wide listing, and we use --tcp and --udp to filter out only those protocols. You may need to modprobe sctp in order to get this to work; if you can’t, this string of commands might still work. Lastly, we filter connections to localhost with grep -v, and we fetch the 5th column using awk Easy enough, right?

grep --color=never -E '.*[0-9]{1,4}(\.|\:).*' | sed 's/\(.*\)\:.*/\1/' | \

In this next line, we use the extended regex feature of grep -E to filter out lines that look somewhat address-y, and we separate the remote host’s address from its port using sed. In this case, Linux appends port numbers using a colon (:), so we have to deviate slightly from the FreeBSD example. Also, since some distros might alias grep with grep --color=auto|always, we use --color=never to eliminate feeding ANSI control characters to sed.

sort -g -k 1 | uniq -c | sort -n -k 1

Lastly, we sort by the IP address using a generic sort (-g), filter out only those addresses that are unique, count them, and then sort by the count column which is now tacked onto the front.

Now we can get a fancy list of IP addresses, how many connections from them are being made to us, and sort them accordingly! Manipulating grep accordingly can re-introduce localhost or remove specific addresses that might not be of interest.

No comments.
***

The Return of Gentoo

About nine months ago I wrote a post title The End of Gentoo. At the time, the article mostly echoed my growing frustrations with the lack of maintainer support for the vast collection of software in Portage, Gentoo’s repository and package management subsystem. Although the gentoo-server mailing list has all but dried up, gentoo-user has seen a marked increase in activity. Whether seasonal or otherwise, I think it’s a positive sign.

Another positive sign that comes to mind is the increased frequency and speed with which package maintainers have been pushing stable (and sometimes unstable) package versions out the door. For example, I was surprised to discover that MongoDB exists in Gentoo at version 1.8.2 as of this writing, which is conveniently the same version in FreeBSD’s ports collection. Ubuntu is decidedly behind the curve, holding in at around version 1.4.x. Of course, with sufficient digging, you can find prebuilt .debs of 1.8.3, or you can always fall back on building from source. Then again, I’m somewhat torn with regards to this: Sure, it brings back memories of earlier days when I often had to build packages by hand just to apply security fixes or obtain new versions, but I also wonder what the value is to it. After all, if I abandoned Gentoo to avoid the nightmare of compile-wait-restart, what’s the point if I leap over to another distribution that is forcing me to do exactly the same thing (except with less automation)?

Given the nature of work and my current projects, I’ve discovered that Gentoo suits my needs best. I can obtain fairly new versions of packages with some degree of customization without the need to manually run the ./configure && make && make install cycle by hand. Downgrading is also fairly easy, provided it doesn’t affect too many packages. However, I’ve found that eselect for those packages it supports can be an exceedingly welcome tool in the developer’s arsenal. I may not use it with any degree of regularity, but the option of setting the system default of a specific package to one version or another is appealing. I suspect this will be mostly useful for any Python-based tools I write in the near future, particularly given the split that is currently underway between 2.x and 3.x, but eselect also works with a handful of other systems that exhibit some degree of change between versions, including PostgreSQL and Boost.

But, I confess that none of this really influences my motivation for writing this post. Well, with the exception of V8 and MongoDB.

I think that much of my decision revolves around familiarity and maybe, if I were to make something of a stretch, annoyance. Ubuntu on the desktop looks absolutely beautiful. I love it. I really do. But the moment you dare to venture beyond the official packages it shipped with (think instant messengers), you begin to encounter various bits of weirdness that fester into a sore. Ubuntu has a great community of developers and supporters, but sometimes more peculiar problems are harder to find via search simply because of the noise level generated by its popularity. There’s nothing wrong with that–in fact, that’s an excellent problem for a distribution to have–but for unusual issues, it often makes finding the answer an uphill battle that is difficult to win without some patience. Add this to the abomination that is NetworkManager (installed and enabled by default), the excessively annoying network configuration borrowed from Debian, and whatever blasphemous modifications have been made to sysvinit, and one starts to see a pattern that makes this distribution more than a little tiring to those who simply wanted something that Just Worked.

It’s ironic in a way. I read an article a couple of weeks ago praising Linux Mint for many of these same reasons that Ubuntu seems deficient. Perhaps I should give it a try…

Yet time and again, I find myself drawn to Gentoo. It’s a rough distribution to maintain. It has many sharp edges. It’s not exceptionally good for use on a server where security updates may need to be applied from upstream regularly. It’s not even really that great for low powered desktops (try compiling Xorg and the desktop manager of your choice on a Netbook without distcc or cross-compilation on another system and then get back with me). Time and again, Gentoo lures me in. Why? Well, I’m starting to think that the answer is more complicated than simply “familiarity.” Perhaps I should take back what I said earlier.

About 8 or 9 years ago, I started toying around with a handful of Linux distributions. The only *nix-based systems I knew at the time were FreeBSD and OpenBSD; I had no idea what Linux really was, why there was such a significant chasm between the userland and kernel, or even really what the differences were between distributions. Superficially, I just assumed that the init systems were largely identical, and individual distributions simply customized various subsystems here and there. I had no idea that the world of Linux was vastly different from that of FreeBSD. In the latter, kernel and userland development is largely one and the same. FreeBSD is the kernel. It’s also the world. From init to various userland tools (yes, even ls) to device drivers (oh fxp0, how I miss you), development continued as a part of a single cohesive continuum. Little did I know, the Linux world is almost the polar opposite of that.

I was introduced to Gentoo by my friend John G. who suggested it as a more “BSD-like” distribution of Linux. He was right–everything about Gentoo seemed to be a GNU-derived analog of the BSD world with the one exception that it was decidedly Linux-flavored. But the most important lesson I took from Gentoo was that of how an operating system is put together–from scratch, but with training wheels. Sure, I knew all of the basic steps: There’s the file system, the kernel, the userland tools, and then there’s various odds and ends here and there that are glued in place to make life easier (or more miserable). In some ways, it’s almost a surprise any of this actually works as well as it does.

Yet I think it was that experience with Gentoo that won my heart. Not only do you have to partition the file systems yourself, but you have to effectively bootstrap the entire system from a live CD (or other Linux distribution), prepare it, and configure it, but you also have to build the kernel and all of the utilities yourself. To this end, I think Gentoo should be a required topic in any operating system course in every CS program at all universities. It’s like Linux From Scratch set to super-easy-mode. It’s no surprise then that any time I want to learn anything new, the best way for me is to pick it up under Gentoo and play with it.

And let’s be honest, Gentoo probably has one of the very best network configuration systems in the Linux world. It better–because it’s the kindred spirit of FreeBSD’s network configuration via rc.conf, except that it’s not. Well, not completely.

This isn’t to say that Gentoo is all sunshine and roses. It certainly does have more than its fair share of sharp edges. I recently reinstalled it on my desktop (no, I still have my Ubuntu install) only to discover that it still takes the better part of a weekend (and then some) to configure, build, and find everything you want, get things situated exactly right, and then discover that there’s one or two minor annoyances still eating away at you. For me, those annoyances are font-related, but I suppose nothing’s perfect. Ubuntu’s fonts are about as close to perfection as possible in the Linux world. Although, I admit that sound and sound support sucks badly in both. Oh, and don’t get me started on media players. I spent most of my free time this week messing around with the damned things only to discover that nearly every single one available is absolutely terrible. I miss Amarock 1.4. They had a good thing going…

The most important lesson I’ve taken from the time period between now and the time I wrote that fairly anti-Gentoo rant is something worth repeating: Nothing perfect. No distribution is perfect, no one distribution will do everything you want, and compromise is always a necessity. I still like Ubuntu for its aesthetics, but Gentoo is still the most appropriate solution for a general purpose workstation. I guess some things never really do change.

So, lesson learned: Rants are stupid. The future you is always the wisest. Sometimes you look back on what you wrote and wonder what the hell you were thinking. Long live Gentoo!

No comments.
***