Archive for July, 2006
Deep Inside
I must have had my “Grotesque Breast Deformity” badge on today: two completely separate people (each with a backup cohort) decided to regale me with their stories of mammary madness. Don’t get me wrong, I’m all for breasts (usually feminine, but if men can just apply “perky” as an adjective to their moobs, who knows?) but knowing when your nipple went septic or that an old woman who comes in doesn’t wear a bra isn’t exactly my idea of acceptable consumer conversation.
Ordinarily I’m a no-speaking consumer: I grab my goods, give a perfunctory “thanks” and depart without another word. So my shock when the caged till-jockey started her monologue on aged breasts was palpable. I suppose by the law of statistics, going to Tesco for ~3 years straight, I was bound to run into a breast-related soliloquy; infinite monkeys have nothing on these people. I’m questioning whether I would actually be more surprised if Shakespeare was the topic of the day.
The hairdresser’s was next in the day. I visit male-only hairdressers primarily because they have a greater understanding of male conversation when getting their hair cut, namely nothing. Arguably this is what hairdressers usually spew forth, but today was an exception. I was in no mood to play my usual game of Bullshit (“Yes I’m going to Sweden tomorrow, I run a brothel there composed of teenage runaways…”) and in retrospect, I have no idea really how the conversation diverged to hairs on boobs. It perhaps started with removing hairs from in between your toes, not exactly pleasant but it’s one of the perils of being a hairdresser I guess; being a web developer I can certainly relate to having to remove hairs from my toes… From there, of course, where else could the conversation go? It moved to hair in nipples, then said hair going pathogenic, then the hairdresser’s delightful trip to the doctor (a cute doctor no less) to have the offending follicle “lanced” out. By now my haircut had fortunately finished; I couldn’t hand over the money quickly enough.
Vehement Euphoria
You would think that with something as entrenched as Virtual Hosting with Apache 2.0/2.2, the simple situation of storing the Virtual Hosts in a separate database would be well covered, right? Yes and no. The vanilla Virtual Hosting setup makes you specify each virtual host as a block within the config file. Apache’s answer to the increasing size of your config file(s) is to provide a module called mod_vhost_alias which allows you to set a number of standard options (log file storage etc.), then each virtual host is given its own directory. This seems like the answer until you realise that on a small enough host the options provided for breaking down a URL are basic at best.
As an example, assume we have the domain “www.anaddress.com” hosted on our server, and we want “anaddress.com” and perhaps “www.anotheraddress.com” to point to the same hosting.
Given “www.anaddress.com”, you can select the “www”, “anaddress” and “com” or a combination of any of them by referring to them by the position they appear within the URL; the entire URL is %0, “www” is %1, “anaddress” is %2 and “com” is %3. So on your file system you could have /com/www/anaddress (/%3/%1/%2/) or (perhaps more commonly) /www.anaddress.com/ (/%0/). This system breaks down when you want the www-less address to point to the same directory as the www. With “anaddress.com”, %1 is “anaddress”, %2 is “com”. There are various modifiers you can apply to get the last or first parts of the URL, but a brief read of the available options reveals that there is no way of automatically pointing a www and a non-www address to the same filesystem position. The simplest answer to this is to use filesystem links to point to the correct directory e.g. /www.anaddress.com/ is a link to /anaddress.com/, just as /www.anotheraddress.com/ is. This is a workable solution but isn’t exactly clean.
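To make the positional problem concrete, here is a rough sketch of the interpolation in JavaScript. It is a toy model and not Apache's code: it only understands %0 and %1 to %9, the function name is made up for illustration, and it follows the documented behaviour of substituting a single underscore when a part doesn't exist.

```javascript
// Toy model of mod_vhost_alias directory interpolation.
// Assumption: only %0 (whole name) and %1..%9 (Nth dot-separated part)
// are handled; the real module supports more specifiers than this.
function interpolate(pattern, host) {
    var parts = host.split(".");
    return pattern.replace(/%(\d)/g, function (match, digit) {
        var n = parseInt(digit, 10);
        if (n === 0) {
            return host; // %0 is the entire hostname
        }
        // A part that doesn't exist becomes a single underscore
        return n <= parts.length ? parts[n - 1] : "_";
    });
}

var withWww = interpolate("/%3/%1/%2/", "www.anaddress.com");
var withoutWww = interpolate("/%3/%1/%2/", "anaddress.com");
```

With the www the pattern lands on /com/www/anaddress/, without it everything shifts one place to the left and you end up in a different directory entirely, which is exactly why the links are needed.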
Using MySQL for my e-mail (virtual mailboxes ahoy!) and FTP (pam_mysql), it seemed pertinent to try and use this for virtual hosting. After hours of searching with a menagerie of different keywords, a number of modules appear:
Straight off: mod_v2h is a dead project, mod_vhost_mysql is for httpd 1.3, mod_vhs uses a slightly esoteric library dependency which leaves mod_vhost_dbi. This is the one I actually got working and while it relies on mod_dbi_pool for its connection pooling, it does what it claims to without fuss.
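Whichever module does the legwork, the appeal of the database approach is that several hostnames can simply share one row's worth of settings. A minimal sketch of that lookup in JavaScript, assuming a made-up table layout (the column names, paths and function name below are purely illustrative and are not the schema any of these modules actually uses):

```javascript
// Hypothetical rows, as they might come back from a "virtual hosts" table.
// The column names and paths are assumptions for illustration only.
var vhostRows = [
    { host: "anaddress.com", docroot: "/var/www/anaddress.com" },
    { host: "www.anaddress.com", docroot: "/var/www/anaddress.com" },
    { host: "www.anotheraddress.com", docroot: "/var/www/anaddress.com" }
];

// Return the document root for a Host header, or null if it is unknown.
function resolveDocroot(rows, hostHeader) {
    // Hostnames are case-insensitive and may carry a port suffix
    var host = hostHeader.toLowerCase().replace(/:\d+$/, "");
    for (var i = 0; i < rows.length; i++) {
        if (rows[i].host === host) {
            return rows[i].docroot;
        }
    }
    return null;
}
```

All three names resolve to the same directory without a single filesystem link, and adding another alias is one more row rather than another trip into the config file.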
There is really nothing more technically to add to this; if you haven’t built Apache modules before then prepare for an uphill struggle of library dependencies and tool mismatches because none of the modules mentioned above have the best documentation. For me, all of this was compounded by the straight-faced obtuseness of Fedora, I haven’t yet figured out whether this is some elaborate form of mental torture or just an outright “for the sake of it” mentality on the part of Red Hat.
The ideal, the blue-sky, would be to build a module using the shiny new Apache mod_dbd system, an Apache-supplied module for managing connection pooling. Of course, that’s not to say using that would be easy. mod_dbd doesn’t come with MySQL support as standard due to the MySQL and Apache licenses conflicting on the finer details of “no u”. Building in MySQL support is another case of ice-skating uphill if you haven’t built the server from source to begin with. To give you some perspective for Fedora, you would have to rebuild Apache 2.2 from source into a binary RPM with the appropriate modifications.
So after all this bluster and frustration, where does this leave the original question of having Virtual Hosts stored in a database? Realistically I can only recommend mod_vhost_dbi as it’s the only one which actually decided to spin-up and start working; that doesn’t mean, however, I’m particularly happy with the situation. I can find very little hard evidence or benchmarks online with regards to any of the aforementioned solutions, so dropping this sort of thing into a production environment seems tantamount to lunacy without some robust stress and error checking. This is no reflection upon the coding prowess of OutOfOrder.cc; having a sub 1.0 version number has a tendency to unnerve me for production (read: making money) environments.
Critically it all comes down to what your situation is and how much work you’re prepared to put in. Apart from my desire for neatness and updatability, virtual hosts remain within their own config file and a graceful restart of Apache is soup-of-the-day for each subsequent addition or amendment.
Late Goodbye
Running a dedicated server was not originally part of my job description, however neither was designing content management systems or chastising the IT people when they try and relieve us of a large sum of cash for incidents well within their jurisdiction. Managing it has been one big learning curve, and while I’ve run a local test-server before, running something that is used by the public and by clients is a wholly different beast. Aspects that you can gloss over or outright ignore are put under the magnifying glass and an e-mail system fell under that umbrella with me.
Realistically I want to recommend “Linux Email Set up and Run a Small Office Email Server” but it doesn’t feel like a good technical book: it covers all the bases and is eminently practical in how it approaches the subject but ultimately it doesn’t go into enough detail of the individual elements to be useful on its own. Arguably, it never claims to do that but with such a wide-ranging subject, if you run anything slightly altered from their presented configuration the book feels more like a hindrance than a tome of information.
One book I can wholly recommend is “Postfix: The Definitive Guide” by O’Reilly. It explains everything in a straightforward and pragmatic manner and never feels like it’s omitting anything for simplicity. I could well imagine that if you’re an Exim fan, then the sister book “Exim: The Mail Transfer Agent” would be equally useful.
I’m well prepared to extol the numerous virtues of SSH but the number of automated login attempts the server was getting was spiralling out of control. Using a 16-digit password and only allowing a single username to login made me comfortably confident that no one was going to get in, but 17,000 login attempts over a weekend seems excessive and the size of the log file reinforces this. My best analogy would be when you have an immense iron door and someone comes along and starts hitting it with a stick: you know they’re not going to get in, but that doesn’t mean you want them doing it.
Enter SSHBlack, a wonderful little real-time Perl script that will blacklist any IP address that fulfills certain conditions. It does this by monitoring the log file (usually /var/log/secure) and counting the number of failed login attempts by a certain IP. Once an IP crosses a certain threshold (usually 3 or 4 failed attempts), the IP address is automatically added to an iptables chain.
If all that sounds highly complicated, it basically acts as a layer on top of existing and well-proven other programs to do something that seems so intuitive. On a stock Fedora machine the script worked pretty much out-of-the-box (with the mandatory setting of a whitelist) but still has plenty of configurability.
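The counting part is simple enough to sketch. What follows is not SSHBlack's code (that's Perl, and does rather more); it's a rough JavaScript illustration of the idea, assuming the usual sshd “Failed password for … from <IP>” log lines and a made-up function name. The actual blocking, i.e. the iptables bit, is left out entirely.

```javascript
// Rough sketch of threshold-based blacklisting, NOT SSHBlack's actual logic.
// Assumption: failed logins appear as "Failed password for ... from <IPv4>".
function findOffenders(logLines, threshold, whitelist) {
    var counts = {};
    var offenders = [];
    var pattern = /Failed password for .* from (\d{1,3}(?:\.\d{1,3}){3})/;
    for (var i = 0; i < logLines.length; i++) {
        var m = pattern.exec(logLines[i]);
        if (!m) {
            continue; // not a failed login line
        }
        var ip = m[1];
        if (whitelist.indexOf(ip) !== -1) {
            continue; // never blacklist a whitelisted address
        }
        counts[ip] = (counts[ip] || 0) + 1;
        // Report each IP once, at the moment it reaches the threshold
        if (counts[ip] === threshold) {
            offenders.push(ip);
        }
    }
    return offenders;
}
```

Every address returned would then be handed to whatever does the blocking. The whitelist check comes before the count so that fat-fingering your own password three times doesn't lock you out of your own server.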
A butterfly beneath the glass
With a new design and a new resolve, I head out for the brave new world. The design is not the fantastically awe-inspiring concept I frequently quest after, but it sufficiently cleaves this site away from the “Kubrick” standard theme.
To kick things off I’ll start with a tiny bit of [Java|DOM]Script I wrote nigh-on 5 minutes ago for making off-site links have an icon next to them. I have no idea whether I’ll actually keep this condensed piece of awesome, though it may be useful to some people and perhaps small rodents.
First thing is to set up a CSS class for your offsite links:
a.offsite {
    background: url(images/icon-offsite.gif) no-repeat right center;
    padding-right: 16px;
}
Nothing particularly taxing about this. Next is the tiny chunk of script that grabs the page section you want to work with (so you don’t go flagging up menus and whatnot), iterates through all the links in that section and assigns a class to any link which does (or doesn’t in this case) match the specified criteria.
var doOffsiteLinks = function() {
    // Quietly skip older browsers that lack the DOM methods used below
    if(document.getElementById && document.getElementsByTagName) {
        var contentArea = document.getElementById("content");
        if(contentArea) {
            var ahrefs = contentArea.getElementsByTagName("a");
            for(var i = 0; i < ahrefs.length; i++) {
                // Absolute http(s) link that doesn't mention this site
                if(ahrefs[i].href.match(/^https?:\/\//) && !ahrefs[i].href.match(/chaostangent\.com/)) {
                    ahrefs[i].className += " offsite";
                }
            }
        }
    }
};
window.onload = doOffsiteLinks;
So from top to bottom:
- A check for browser compatibility, older browsers are quietly ignored
- The document node with the ID of “content” is grabbed
- Grab and iterate through the list of link (“a”) tags
- Assign the “offsite” class depending on the result of pattern matching
The pattern-matching is where the magic happens. The first pattern matches against “http://” or “https://” at the start of the link while the second (optional) pattern states that if it has “chaostangent.com” in it, treat it as relative. More than likely these two could be merged into one regex but this way is simpler and makes it more readable. The second pattern match could be omitted for people without subdomains but applications like WordPress tend to go a bit crazy with their direct links (rather than relative ones).
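For the curious, here is roughly what a merged version could look like, using a negative lookahead. Treat it as a sketch rather than a drop-in replacement: the function name is made up, and unlike the original second pattern it only looks for “chaostangent.com” in the host part of the link, not anywhere in the URL. One thing worth bearing in mind is that browsers generally resolve the href property to an absolute URL, so relative links will usually arrive here already carrying the site's own domain.

```javascript
// One regex instead of two. Assumption: only the host part is checked
// for chaostangent.com (and its subdomains); paths and query strings
// are ignored, which differs from the two-pattern version.
var OFFSITE = /^https?:\/\/(?!(?:[^\/]*\.)?chaostangent\.com(?:[\/:?#]|$))/;

function isOffsite(href) {
    return OFFSITE.test(href);
}
```

It's a fair bit harder on the eyes than the two separate matches, which is rather the point of keeping them apart.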
With something as simple as this you really can tailor it to your own purposes. So you could match against “mailto:” links with a class (with a cunningly placed envelope icon) or FTP links. For me, this really was just “proof of concept” which is evident due to its brevity.
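As a sketch of that tailoring, the class could be picked by a small helper. The “mail” and “ftp” class names and the function name are invented for the example; you would still need matching CSS rules and icons for them.

```javascript
// Pick a CSS class for a link based on its scheme.
// Assumption: "mail" and "ftp" are hypothetical class names; only
// "offsite" exists in the stylesheet shown earlier.
function classForHref(href) {
    if (/^mailto:/i.test(href)) {
        return "mail";
    }
    if (/^ftp:\/\//i.test(href)) {
        return "ftp";
    }
    // Same two-step test as the original script
    if (/^https?:\/\//.test(href) && !/chaostangent\.com/.test(href)) {
        return "offsite";
    }
    return "";
}
```

Inside the loop you would then append whatever comes back to className whenever it isn't an empty string.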