Shell Voodoo, Connected IPs, and Counting Total Connections

I’m posting this mostly as a note to myself, but if you, future visitor, stumble upon this post and have improvements or other things you’d like to share, be my guest. Posts that are overly critical of the methodologies provided by others, or those which otherwise add nothing to the discussion will be removed. This is especially true for those espousing beliefs that PowerShell is superior.

I won’t go into the exact details of why we needed to do this, but the general breakdown is thus:

  • Get a list of connected IP addresses
  • Sort them
  • Count how many connections were made from a single address

Fortunately, the solution turns out to be quite easy. For FreeBSD:

netstat -anfinet | grep -v 127.0.0.1 | awk '{ print $5 }' | \
grep -E '.*([0-9]{1,4}\.)+.*' | sed 's/\(.*\)\..*/\1/' | \
sort -g -k 1 | uniq -c | sort -n -k 1

And for most derivatives of Linux:

netstat -anW --tcp --udp | grep -v 127.0.0.1 | awk '{ print $5 }' | \
grep --color=never -E '.*[0-9]{1,4}(\.|\:).*' | sed 's/\(.*\)\:.*/\1/' | \
sort -g -k 1 | uniq -c | sort -n -k 1

You may need to modprobe sctp to get the --tcp and --udp netstat flags working. Also, I’ve tried to keep the sed regex as simple as possible so that IPv6 addresses work, too. One caveat: on FreeBSD, -f inet limits the output to IPv4, so you would need to swap in -f inet6 to see IPv6 connections.

What the Eff is This?!

Okay, I agree. I’ve probably made some kind of mistake somewhere; I don’t know awk or sed quite as well as I should (easily fixed, if I ever wanted to spend a weekend learning). That said, here’s my understanding of how this should work. First, we’ll deal with the FreeBSD derivative, line by line:

FreeBSD

Here is a breakdown for the FreeBSD-specific stuff:

netstat -anfinet | grep -v 127.0.0.1 | awk '{ print $5 }' | \

As with all platforms I’m aware of, -an shows all connections by their numerical addresses. Left to its own devices, netstat performs a reverse lookup on every address, and this can take some time. The FreeBSD-specific option -f inet restricts the output to INET (IPv4) sockets and eliminates much of the cruft associated with local Unix domain sockets; use -f inet6 if you want IPv6 instead. Likewise, we trim localhost from the list with grep -v, and we fetch the 5th output column using awk.

grep -E '.*([0-9]{1,4}\.)+.*' | sed 's/\(.*\)\..*/\1/' | \

Moving on to the next line, we fetch only those lines that contain something that vaguely resembles an IP address with grep -E (I prefer to use -E here since it gives us the extended regex syntax), and we pass the results into sed to strip off the trailing remote host’s port number. Alternatively, you could use something like 's/^\([0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\).*/\1/' instead to filter out IPv4 addresses, but since we already know roughly what to expect from the input we can simplify our regex. Furthermore, we also know that the IP address of the remote host in FreeBSD will always have a dot followed by the port number appended, and we can naively remove this.
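To see what this line does in isolation, here is a small sketch that feeds it a few made-up values shaped like the 5th column of FreeBSD's netstat output. The function name and the sample addresses are mine, purely for illustration.

```shell
# Hypothetical helper wrapping the grep + sed pair from the FreeBSD pipeline.
remote_addresses_bsd() {
    # Keep only lines with at least one "digits followed by a dot" group,
    # then drop everything from the last dot onward (the port number).
    grep -E '.*([0-9]{1,4}\.)+.*' | sed 's/\(.*\)\..*/\1/'
}

# Made-up input: a header fragment, a wildcard entry, and two remote hosts.
printf '%s\n' 'Address' '*.*' '192.168.1.10.52344' '10.0.0.5.80' |
    remote_addresses_bsd
```

The header fragment and the wildcard entry contain no digits, so grep drops them; the two remaining lines come out as bare addresses with the port removed.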

sort -g -k 1 | uniq -c | sort -n -k 1

Lastly, we sort (generically, with -g), use uniq -c to collapse the list down to the unique addresses along with their totals, and we sort numerically by the first column (now containing the count).
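The counting stage is easy to try on its own. The sketch below runs the same three commands over a handful of invented addresses; the function name is hypothetical, and the exact padding in front of each count depends on your uniq implementation.

```shell
# Hypothetical helper wrapping the final stage of the pipeline.
count_addresses() {
    # uniq -c only collapses adjacent duplicates, so sort first;
    # the last sort orders the result by the count column.
    sort -g -k 1 | uniq -c | sort -n -k 1
}

# Made-up list of remote addresses, one per connection.
printf '%s\n' 10.0.0.5 192.168.1.10 10.0.0.5 10.0.0.5 192.168.1.10 172.16.0.2 |
    count_addresses
```

The busiest address ends up on the last line, so piping the result through tail is a quick way to find the top talkers.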

Linux

Here is a breakdown for the Linux-specific stuff:

netstat -anW --tcp --udp | grep -v 127.0.0.1 | awk '{ print $5 }' | \

Following in the footsteps of FreeBSD, we use -an to display all connected numeric addresses so we don’t waste time running reverse lookups. However, in most Linux distributions, lengthy columns–and especially IPv6 addresses–will be truncated by netstat’s output. To counter this, we use -W to show the wide listing, and we use --tcp and --udp to limit the output to only those protocols. You may need to modprobe sctp in order to get this to work; if you can’t, this string of commands might still work. Lastly, we filter connections to localhost with grep -v, and we fetch the 5th column using awk. Easy enough, right?

grep --color=never -E '.*[0-9]{1,4}(\.|\:).*' | sed 's/\(.*\)\:.*/\1/' | \

In this next line, we use the extended regex feature of grep -E to filter out lines that look somewhat address-y, and we separate the remote host's address from its port using sed. In this case, Linux appends port numbers using a colon (:), so we have to deviate slightly from the FreeBSD example. Also, since some distros might alias grep with grep --color=auto|always, we use --color=never to eliminate feeding ANSI control characters to sed.
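Here is the same kind of sketch for the Linux variant, using made-up values shaped like netstat's foreign address column, including one IPv6 entry. The function name is hypothetical. I have left out --color=never because aliases are not expanded inside a script, and I have dropped the backslash before the colon, which should not be needed since a colon is not special in these regexes.

```shell
# Hypothetical helper wrapping the grep + sed pair from the Linux pipeline.
remote_addresses_linux() {
    # Keep lines with digits followed by a dot or colon, then strip
    # everything from the last colon onward (the port number).
    grep -E '.*[0-9]{1,4}(\.|:).*' | sed 's/\(.*\):.*/\1/'
}

# Made-up input: a header fragment, an IPv4 peer, an IPv6 peer,
# and an IPv6 wildcard entry.
printf '%s\n' 'Address' '203.0.113.7:443' '2001:db8::1:52344' ':::*' |
    remote_addresses_linux
```

Because the sed match is greedy, only the final colon-delimited field is removed, which is what keeps the IPv6 address intact.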

sort -g -k 1 | uniq -c | sort -n -k 1

Lastly, we sort by the IP address using a generic sort (-g), filter out only those addresses that are unique, count them, and then sort by the count column which is now tacked onto the front.

Now we can get a fancy list of IP addresses, how many connections from them are being made to us, and sort them accordingly! Manipulating grep accordingly can re-introduce localhost or remove specific addresses that might not be of interest.
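As one possible way of tweaking the grep step, the sketch below drops more than one address at a time. The function name and the addresses are invented for illustration; -F makes grep treat the dots literally instead of as "any character", and -e can be repeated once per address.

```shell
# Hypothetical replacement for the "grep -v 127.0.0.1" step.
drop_uninteresting() {
    grep -v -F -e '127.0.0.1' -e '192.0.2.99'
}

# Made-up netstat-style lines; only the last one should survive.
printf '%s\n' \
    'tcp 0 0 127.0.0.1:5432 127.0.0.1:40112 ESTABLISHED' \
    'tcp 0 0 198.51.100.4:22 192.0.2.99:51515 ESTABLISHED' \
    'tcp 0 0 198.51.100.4:80 203.0.113.7:40000 ESTABLISHED' |
    drop_uninteresting
```

Keep in mind this is still a substring match against the whole line, so a pattern like 10.0.0.5 would also drop 10.0.0.50. To bring localhost back, simply remove its -e entry.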

***

The Return of Gentoo

About nine months ago I wrote a post titled The End of Gentoo. At the time, the article mostly echoed my growing frustrations with the lack of maintainer support for the vast collection of software in Portage, Gentoo’s repository and package management subsystem. Although the gentoo-server mailing list has all but dried up, gentoo-user has seen a marked increase in activity. Whether seasonal or otherwise, I think it’s a positive sign.

Another positive sign that comes to mind is the increased frequency and speed with which package maintainers have been pushing stable (and sometimes unstable) package versions out the door. For example, I was surprised to discover that MongoDB exists in Gentoo at version 1.8.2 as of this writing, which is conveniently the same version in FreeBSD’s ports collection. Ubuntu is decidedly behind the curve, holding in at around version 1.4.x. Of course, with sufficient digging, you can find prebuilt .debs of 1.8.3, or you can always fall back on building from source. Then again, I’m somewhat torn with regards to this: Sure, it brings back memories of earlier days when I often had to build packages by hand just to apply security fixes or obtain new versions, but I also wonder what the value is to it. After all, if I abandoned Gentoo to avoid the nightmare of compile-wait-restart, what’s the point if I leap over to another distribution that is forcing me to do exactly the same thing (except with less automation)?

Given the nature of work and my current projects, I’ve discovered that Gentoo suits my needs best. I can obtain fairly new versions of packages with some degree of customization without the need to run the ./configure && make && make install cycle by hand. Downgrading is also fairly easy, provided it doesn’t affect too many packages. However, I’ve found that eselect for those packages it supports can be an exceedingly welcome tool in the developer’s arsenal. I may not use it with any degree of regularity, but the option of setting the system default of a specific package to one version or another is appealing. I suspect this will be mostly useful for any Python-based tools I write in the near future, particularly given the split that is currently underway between 2.x and 3.x, but eselect also works with a handful of other systems that exhibit some degree of change between versions, including PostgreSQL and Boost.

But, I confess that none of this really influences my motivation for writing this post. Well, with the exception of V8 and MongoDB.

I think that much of my decision revolves around familiarity and maybe, if I were to make something of a stretch, annoyance. Ubuntu on the desktop looks absolutely beautiful. I love it. I really do. But the moment you dare to venture beyond the official packages it shipped with (think instant messengers), you begin to encounter various bits of weirdness that fester into a sore. Ubuntu has a great community of developers and supporters, but sometimes more peculiar problems are harder to find via search simply because of the noise level generated by its popularity. There’s nothing wrong with that–in fact, that’s an excellent problem for a distribution to have–but for unusual issues, it often makes finding the answer an uphill battle that is difficult to win without some patience. Add this to the abomination that is NetworkManager (installed and enabled by default), the excessively annoying network configuration borrowed from Debian, and whatever blasphemous modifications have been made to sysvinit, and one starts to see a pattern that makes this distribution more than a little tiring to those who simply wanted something that Just Worked.

It’s ironic in a way. I read an article a couple of weeks ago praising Linux Mint for many of these same reasons that Ubuntu seems deficient. Perhaps I should give it a try…

Yet time and again, I find myself drawn to Gentoo. It’s a rough distribution to maintain. It has many sharp edges. It’s not exceptionally good for use on a server where security updates may need to be applied from upstream regularly. It’s not even really that great for low powered desktops (try compiling Xorg and the desktop manager of your choice on a Netbook without distcc or cross-compilation on another system and then get back with me). Time and again, Gentoo lures me in. Why? Well, I’m starting to think that the answer is more complicated than simply “familiarity.” Perhaps I should take back what I said earlier.

About 8 or 9 years ago, I started toying around with a handful of Linux distributions. The only *nix-based systems I knew at the time were FreeBSD and OpenBSD; I had no idea what Linux really was, why there was such a significant chasm between the userland and kernel, or even really what the differences were between distributions. Superficially, I just assumed that the init systems were largely identical, and individual distributions simply customized various subsystems here and there. I had no idea that the world of Linux was vastly different from that of FreeBSD. In the latter, kernel and userland development is largely one and the same. FreeBSD is the kernel. It’s also the world. From init to various userland tools (yes, even ls) to device drivers (oh fxp0, how I miss you), development continued as a part of a single cohesive continuum. Little did I know, the Linux world is almost the polar opposite of that.

I was introduced to Gentoo by my friend John G. who suggested it as a more “BSD-like” distribution of Linux. He was right–everything about Gentoo seemed to be a GNU-derived analog of the BSD world with the one exception that it was decidedly Linux-flavored. But the most important lesson I took from Gentoo was that of how an operating system is put together–from scratch, but with training wheels. Sure, I knew all of the basic steps: There’s the file system, the kernel, the userland tools, and then there’s various odds and ends here and there that are glued in place to make life easier (or more miserable). In some ways, it’s almost a surprise any of this actually works as well as it does.

Yet I think it was that experience with Gentoo that won my heart. Not only do you have to partition the file systems yourself, bootstrap the entire system from a live CD (or other Linux distribution), prepare it, and configure it, but you also have to build the kernel and all of the utilities yourself. To this end, I think Gentoo should be a required topic in any operating system course in every CS program at all universities. It’s like Linux From Scratch set to super-easy-mode. It’s no surprise then that any time I want to learn anything new, the best way for me is to pick it up under Gentoo and play with it.

And let’s be honest, Gentoo probably has one of the very best network configuration systems in the Linux world. It better–because it’s the kindred spirit of FreeBSD’s network configuration via rc.conf, except that it’s not. Well, not completely.

This isn’t to say that Gentoo is all sunshine and roses. It certainly does have more than its fair share of sharp edges. I recently reinstalled it on my desktop (no, I still have my Ubuntu install) only to discover that it still takes the better part of a weekend (and then some) to configure, build, and find everything you want, get things situated exactly right, and then discover that there’s one or two minor annoyances still eating away at you. For me, those annoyances are font-related, but I suppose nothing’s perfect. Ubuntu’s fonts are about as close to perfection as possible in the Linux world. Although, I admit that sound and sound support sucks badly in both. Oh, and don’t get me started on media players. I spent most of my free time this week messing around with the damned things only to discover that nearly every single one available is absolutely terrible. I miss Amarok 1.4. They had a good thing going…

The most important lesson I’ve taken from the time period between now and the time I wrote that fairly anti-Gentoo rant is something worth repeating: Nothing is perfect. No distribution is perfect, no one distribution will do everything you want, and compromise is always a necessity. I still like Ubuntu for its aesthetics, but Gentoo is still the most appropriate solution for a general purpose workstation. I guess some things never really do change.

So, lesson learned: Rants are stupid. The future you is always the wisest. Sometimes you look back on what you wrote and wonder what the hell you were thinking. Long live Gentoo!

***

Gnome Sudo in Gentoo

Having gotten used to the sudo-like interface presented in Ubuntu for most user-facing operations that require root access, the sudden lack of such convenience in Gentoo was grating. Fortunately, the solution is fairly easy, even if it isn’t mentioned anywhere that can be found by a cursory Google search. (I’m sure there’s an entry on the Gentoo forums, but let’s face it–for most people, if Google can’t find it, it doesn’t exist.)

The solution: Run gconf-editor, browse to apps/gksu, and then tick sudo-mode. That’s it!

Slightly longer answer: After wasting about 5 minutes searching for the answer, I found the solution in the man page for gksu which pointed to the gconf setting above.
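If you would rather skip the GUI, the same setting can probably be flipped from a terminal. This is a sketch under the assumption that gconftool-2 is installed and that the key lives at /apps/gksu/sudo-mode, which is my reading of the gconf-editor path above; the function name is made up, and it is only defined here, not run.

```shell
# Hypothetical helper; assumes the GConf key is /apps/gksu/sudo-mode.
enable_gksu_sudo_mode() {
    if command -v gconftool-2 >/dev/null 2>&1; then
        # Set the boolean key for the current user.
        gconftool-2 --type bool --set /apps/gksu/sudo-mode true
    else
        echo 'gconftool-2 not found; use gconf-editor instead' >&2
        return 1
    fi
}
```

Run enable_gksu_sudo_mode as your normal desktop user, not as root, since the setting is stored per user.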

You may have also noticed from this post that I’ve mentioned Gentoo for the first time in a while. There are a few reasons for this–which I’ll save for future posts–but it’s largely because of various irritations I’ve found with running Ubuntu for a while. Don’t worry, I’ll make an honest effort to share my rationale with you.

***