Recently a reader asked about how to password-protect a directory for every specified IP while allowing open access to everyone else. In my article, Stupid htaccess Tricks, I show how to password-protect a directory for every IP except the one specified, but not for the reverse case. In this article, I will demonstrate this technique along with a wide variety of other useful password-protection tricks, including a few from my Stupid htaccess Tricks article. Before getting into the juicy stuff, we’ll review a few basics of HTAccess password protection.
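As a rough illustration of the reverse case described above, here is a minimal sketch that requires a password only from one specified IP and leaves the directory open to everyone else. It assumes Apache 2.2-style access directives (on Apache 2.4 these need mod_access_compat). The realm name, the `.htpasswd` path, and the IP address are placeholders, not values from the original article.

```apache
# Sketch only: realm name, file path, and IP address are placeholders.
AuthType Basic
AuthName "Restricted Area"
AuthUserFile /full/path/to/.htpasswd
Require valid-user

# Host check: everyone passes except the listed IP.
Order Allow,Deny
Allow from all
Deny from 203.0.113.10

# Passing EITHER the host check or the password check grants access,
# so only the denied IP is ever asked for credentials.
Satisfy Any
```

To protect against several addresses, additional `Deny from` lines can be added, one per IP.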
Continue Reading
It’s been a while since I have taken some time to just blog a little bit about what’s been happening in this crazy little world of mine. Normally, I like to keep my articles focused on web design, social media, and other online topics, but every now and then I like to take some time out and share some personal news. Needless to say, lots of awesome stuff has been happening both online and in my personal life, so here’s a brief summary for the sake of posterity. I’ll start with the personal drama and then hit the online/design and project-related news.
Continue Reading
In my article, Associate Extensionless Files with Notepad, I explain how to navigate the labyrinthine maze of Windows dialogue menus to assign Microsoft’s Notepad text editor as the opening application for files without extensions. In this post, I’ll show you how to associate any file type with any program (including Notepad) in less than five seconds.
Ready? Don’t blink, you’ll miss it.. ;)
Modify any file extension association in five seconds
Open the Windows command prompt and enter the following command:
assoc .extension=fileType
It’s that easy. What is happening here? Let’s break it down, just for fun:
assoc: the Windows file-association command
.extension: the file extension that you would like to associate
fileType: the file type that you would like to associate with that extension
Continue Reading
Let’s face it. There’s just as much scum on the Internet as there is out there in the “real world.” Maybe even more, who knows. From scammers and spammers to scrapers and crackers, the Web is just crawling with all sorts of pathetic scumbags. As predictably random as much of the malicious activity happens to be, it is virtually guaranteed that you will be hounded by at least a few persistent IP addresses that, for whatever reason, have latched on and just won’t let go. Like satanic parasites, they plague you night and day, haunting you and making your online life a living hell. Perhaps they leave endless spam comments; perhaps they are just mindless trolls giving you grief; or perhaps they continue to take flying stabs at the security of your website. Whatever the behavior, once you have determined that you need to block a collection of evil IPs, you have many choices. Here is a simple way to blacklist multiple IP addresses with a little PHP magic..
Continue Reading
I finally broke down and uninstalled Alex King’s once-great “Popularity Contest” plugin for WordPress.
The plugin had been installed here at Perishable Press for over two years, and had provided fairly consistent and apparently accurate statistics.
Unfortunately, the plugin developed serious errors way back during the WordPress 2.3 upgrade, and the plugin author never addressed them. An interim version of the plugin patched the error until an official update could be released, but sadly, almost two years later, that update has not happened. I don’t know about you, but I really don’t like running abandoned plugins on my site.
Continue Reading
Normally, when visitors post a comment to your site, specific types of client data are associated with the request. Commonly, a client will provide a user agent, a referrer, and a host header. When any of these variables is absent, there is good reason to suspect foul play. For example, virtually all browsers provide some sort of user-agent name to identify themselves. Conversely, malicious scripts directly posting spam and other payloads to your site frequently operate without specifying a user agent. In the Ultimate User-Agent Blacklist, we account for the “no-user-agent” case in the very first directive, preventing a host of anonymous visitors from hitting the site.
In addition to empty user-agent strings, malicious requests for site content frequently fail to provide any referrer information. Unless special privacy software is being used, the web page from which a visitor has arrived at your site will be specified in the header information for that request. Likewise, when a visitor posts a comment at your site, the referrer string for that post request will be the URL of that particular page. Thus, as with blank user-agent requests, no-referrer requests are frequently indicative of spam and other malicious behavior.
Another important piece of information provided by all legitimate clients is the host request header. The host header specifies the Internet host and port number of the requested resource. This information is required for all clients making HTTP/1.1 requests. Thus, requiring the host request-header field for all posts to your site is a safe way to keep many illicit requests from hitting your server.
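The three checks described above can be sketched with mod_rewrite. This is only an illustration under assumptions, not the directives from the full article: it denies any POST request that arrives with an empty user agent, an empty referrer, or an empty host header.

```apache
# Sketch only: deny POST requests that lack a user agent, referrer, or host header.
<IfModule mod_rewrite.c>
  RewriteEngine On
  # Apply the checks to POST requests only.
  RewriteCond %{REQUEST_METHOD} ^POST$
  # The [OR] flags group the next three conditions: POST AND (UA empty OR referrer empty OR host empty).
  RewriteCond %{HTTP_USER_AGENT} ^$ [OR]
  RewriteCond %{HTTP_REFERER} ^$ [OR]
  RewriteCond %{HTTP_HOST} ^$
  # Respond with 403 Forbidden.
  RewriteRule .* - [F,L]
</IfModule>
```

Because privacy software can strip the referrer from legitimate requests, the referrer condition is the one most likely to produce false positives and may be worth testing on its own first.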
Continue Reading
Just like last year, this Spring I have been taking some time to do some general maintenance here at Perishable Press. This includes everything from fixing broken links and resolving errors to optimizing scripts and eliminating unnecessary plugins. I’ll admit, this type of work is often quite dull; however, I always enjoy the process of cleaning up my HTAccess files. In this post, I share some of the changes made to my HTAccess files and explain the reasoning behind each modification. Some of the changes may surprise you! ;)
Continue Reading
The other day, my server crashed and Perishable Press was unable to connect to the MySQL database. Normally, when WordPress encounters a database error, it delivers a specific error message similar to the following:
Continue Reading
Given my propensity to discuss matters involving error log data (e.g., monitoring malicious behavior, setting up error logs, and creating extensive blacklists), I am often asked about the best way to go about monitoring 404 and other types of server errors. While I consider myself to be a novice in this arena (there are far brighter people with much greater experience), I do spend a lot of time digging through log entries and analyzing data. So, when asked recently about my error monitoring practices, I decided to share my response here at Perishable Press, and hopefully get some good feedback concerning best practices for error monitoring. Here is my email response to the question:
Continue Reading
You have seen user-agent blacklists, IP blacklists, 4G Blacklists, and everything in between. Now, in this article, for your sheer and utter amusement, I present a collection of over 8000 blacklisted referrers.
For the uninitiated, in teh language of teh Web, a referrer is the online resource from whence a visitor happened to arrive at your site. For example, if Johnny the Wonder Parrot was visiting the Mainstream Media website and happened to follow a link to your site (of all places), you would look at your access logs, notice Johnny’s visit, and speak out loud (slowly): “hmmm.. it looks like the Mainstream Media website referred my good pal Johnny to my Alka-Seltzer sales page.” In such a bizarre case, the Mainstream Media website — or specific page — is referred to as (no pun intended) the referrer.
Continue Reading
In addition to your choice collection of “Share This” links, you may also want to provide visitors with a link that enables them to quickly and easily send the URL permalink of any post to their friends via email. This is a great way to increase your readership and further your influence. Just copy & paste the following code into the desired location in your page template:
<a href="mailto:?subject=Fresh%20Linkage%20@%20Perishable%20Press&body=Check%20out%20<?php the_permalink(); ?>%20from%20Perishable%20Press" title="Send a link to this post via email" rel="nofollow">Share this post via email</a>
Within the code, you will need to edit both instances of the string “Perishable%20Press” to reflect your own site name. Note that the “%20” is the encoded equivalent of a blank space, and is required to ensure validation of parameterized query strings. As is, the code will generate an email that is populated with the following information:
Continue Reading
As discussed in my recent article, Eight Ways to Blacklist with Apache’s mod_rewrite, one method of stopping spammers, scrapers, email harvesters, and malicious bots is to blacklist their associated user agents. Apache enables us to target bad user agents by testing the user-agent string against a predefined blacklist of unwanted visitors. Any bot identifying itself as one of the blacklisted agents is immediately and quietly denied access. While this isn’t the most effective method of securing your site against malicious behavior, it does provide another layer of protection.
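A minimal sketch of this kind of user-agent test might look like the following. The agent names `badbot` and `evilscraper` are hypothetical placeholders, not entries from the actual blacklist.

```apache
# Sketch only: "badbot" and "evilscraper" are hypothetical user-agent names.
<IfModule mod_rewrite.c>
  RewriteEngine On
  # Match requests with no user agent at all...
  RewriteCond %{HTTP_USER_AGENT} ^$ [OR]
  # ...or a user agent containing a blacklisted name (case-insensitive).
  RewriteCond %{HTTP_USER_AGENT} (badbot|evilscraper) [NC]
  # Deny matching requests with 403 Forbidden.
  RewriteRule .* - [F,L]
</IfModule>
```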
Even so, there are several things to consider before choosing to implement an extensive user-agent blacklist on your site. First and most important is the transient nature of the user agent itself. On most systems, the user-agent variable is easy to change, making it possible for bot owners to use any user-agent name they wish. Once a bad bot makes the rounds, becomes known, and is blacklisted, the bot owner need only change its declared user agent to be back in business. User-agent names are constantly invented, spoofed, or otherwise altered in order to operate beneath (or above) the virtual radar. Thus, a user-agent blacklist is a high-maintenance affair, requiring continuous cultivation in order to remain relevant and effective.
Continue Reading
At last! After many months of collecting data, crafting directives, and testing results, I am thrilled to announce the release of the 4G Blacklist! The 4G Blacklist is a next-generation protective firewall that secures your website against a wide range of malicious activity. Like its 3G predecessor, the 4G Blacklist is designed for use on Apache servers and is easily implemented via HTAccess or the httpd.conf configuration file. In order to function properly, the 4G Blacklist requires two specific Apache modules, mod_rewrite and mod_alias. As with the third generation of the blacklist, the 4G Blacklist consists of multiple parts:
Update Feb 22, 2011: The 5G version of the blacklist is available now in beta.
Continue Reading
I really hate bad robots. When a web crawler, spider, bot — or whatever you want to call it — behaves in a way that is contrary to expected and/or accepted protocols, we say that the bot is acting suspiciously, behaving badly, or just acting stupid in general. Unfortunately, there are thousands — if not hundreds of thousands — of nefarious bots violating our websites every minute of the day.
For the most part, there are effective methods available enabling us to protect our sites against the endless hordes of irrelevant and mischievous bots. Such evil is easily blocked with virtually zero side-effects because their presence is simply irrelevant.
But what about bad bots that aren’t exactly irrelevant, such as Yahoo’s mindless Slurp crawler? By failing to obey the robots.txt protocol as promised, Yahoo’s Slurp clearly falls into the “bad-bot” category. Unlike typical “nonsense” bots, Slurp is not exactly irrelevant (yet), so simply blocking it is not a reasonable solution.
Continue Reading
Last year, after much research and discussion, I built a concise, lightweight security strategy for Apache-powered websites. Prior to the development of this strategy, I relied on several extensive blacklists to protect my sites against malicious user agents and IP addresses. Unfortunately, these mega-lists eventually became unmanageable and ineffective. As increasing numbers of attacks hit my server, I began developing new techniques for defending against external threats. This work soon culminated in the release of a “next-generation” blacklist that works by targeting common elements of decentralized server attacks. Consisting of a mere 37 lines, this “2G” Blacklist provided enough protection to enable me to completely eliminate over 350 blacklisting directives from my site’s root htaccess file. This improvement increased site performance and decreased attack rates; however, many bad hits were still getting through. More work was needed..
Continue Reading
In my recent article on blocking proxy servers, I explain how to use HTAccess to deny site access to a wide range of proxy servers. The method works great, but some readers want to know how to allow access for specific proxy servers while denying access to as many other proxies as possible.
Fortunately, the solution is as simple as adding a few lines to my original proxy-blocking method. Specifically, we may allow any requests coming from our whitelist of proxy servers by testing Apache’s HTTP_REFERER variable, like so:
RewriteCond %{HTTP_REFERER} !allowed-proxy-01\.domain\.tld
RewriteCond %{HTTP_REFERER} !allowed-proxy-02\.domain\.tld
RewriteCond %{HTTP_REFERER} !allowed-proxy-03\.domain\.tld
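For context, here is a hedged sketch of how whitelist conditions like these might sit in front of a proxy-blocking rule. It assumes the original method detects proxies by testing proxy-related request headers such as `Via` and `X-Forwarded-For`. That detection step, and the exact headers tested, are assumptions for illustration rather than the published directives.

```apache
# Sketch only: the proxy-detection conditions below are assumed, not the original method.
<IfModule mod_rewrite.c>
  RewriteEngine On
  # Whitelist: the rule is skipped when the referrer matches an allowed proxy.
  RewriteCond %{HTTP_REFERER} !allowed-proxy-01\.domain\.tld [NC]
  RewriteCond %{HTTP_REFERER} !allowed-proxy-02\.domain\.tld [NC]
  # Assumed proxy test: any one of these headers being present marks the request as proxied.
  RewriteCond %{HTTP:Via} !^$ [OR]
  RewriteCond %{HTTP:Forwarded} !^$ [OR]
  RewriteCond %{HTTP:X-Forwarded-For} !^$
  # Deny everything that is proxied and not whitelisted.
  RewriteRule .* - [F]
</IfModule>
```

The whitelist conditions are ANDed together, while the `[OR]` flags group the header tests, so a request is denied only when it matches none of the allowed proxies and carries at least one proxy header.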
Continue Reading
Canonical URLs are important for maintaining consistent linkage, reducing duplicate content issues, and increasing the overall integrity of your site. In addition to cleaning up trailing slashes and removing extraneous index.php and index.html strings, removing the www subdomain prefix is an excellent way to shorten links and deliver consistent, canonical URLs.
Of course, an optimal way of removing (or adding) the www prefix is accomplished via HTAccess canonicalization:
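A typical form of this canonicalization, sketched here with the placeholder domain `example.com` and an assumed `https` scheme, looks something like this:

```apache
# Sketch only: example.com is a placeholder; adjust the scheme to match your site.
<IfModule mod_rewrite.c>
  RewriteEngine On
  # Match requests whose host begins with the www prefix.
  RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
  # Permanently redirect to the same path on the non-www host.
  RewriteRule ^(.*)$ https://example.com/$1 [R=301,L]
</IfModule>
```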
Continue Reading
With the imminent release of the next series of (4G) blacklist articles here at Perishable Press, now is the perfect time to examine eight of the most commonly employed blacklisting methods achieved with Apache’s incredible rewrite module, mod_rewrite. In addition to facilitating site security, the techniques presented in this article will improve your understanding of the different rewrite methods available with mod_rewrite.
Blacklist via Request Method
This first blacklisting method evaluates the client’s request method. Every time a client attempts to connect to your server, it sends a message indicating the type of connection it wishes to make. There are many different types of request methods recognized by Apache. The two most common methods are GET and POST requests, which are required for “getting” and “posting” data to and from the server. In most cases, these are the only request methods required to operate a dynamic website. Allowing more request methods than are necessary increases your site’s vulnerability. Thus, to restrict the types of request methods available to clients, we use this block of Apache directives:
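As one possible illustration (not necessarily the exact block from the full article), the following sketch denies every request method outside a short allowed list. Including `HEAD` alongside `GET` and `POST` is an assumption made here so that ordinary header checks keep working.

```apache
# Sketch only: the allowed-method list is an assumption; adjust it to your site's needs.
<IfModule mod_rewrite.c>
  RewriteEngine On
  # Match any request method other than GET, HEAD, or POST.
  RewriteCond %{REQUEST_METHOD} !^(GET|HEAD|POST)$
  # Deny it with 403 Forbidden.
  RewriteRule .* - [F,L]
</IfModule>
```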
Continue Reading
In my previous article on temporarily redirecting visitors during site updates, I present numerous PHP and HTAccess methods for handling traffic during site maintenance, updates, and other temporary periods of downtime. Each of the PHP methods presented in the article allows access from a single IP while redirecting everyone else. In this article, we modify our previous techniques to allow access for multiple IP addresses while temporarily redirecting everyone else to the page of our choice. Plus, while we’re at it, we’ll explore a few additional ways to adapt and use the general technique.
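For the HTAccess side of the technique, a minimal sketch might look like the following. The two IP addresses and the `/maintenance.html` page are placeholders chosen for illustration.

```apache
# Sketch only: the IP addresses and maintenance page are placeholders.
<IfModule mod_rewrite.c>
  RewriteEngine On
  # Skip the redirect for each allowed IP (one condition per address).
  RewriteCond %{REMOTE_ADDR} !^203\.0\.113\.10$
  RewriteCond %{REMOTE_ADDR} !^198\.51\.100\.20$
  # Avoid a redirect loop on the maintenance page itself.
  RewriteCond %{REQUEST_URI} !^/maintenance\.html$
  # Temporarily (302) redirect everyone else.
  RewriteRule .* /maintenance.html [R=302,L]
</IfModule>
```

Allowing another address is a matter of adding one more `REMOTE_ADDR` condition above the `REQUEST_URI` check.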
Continue Reading
Most of us learned how to use “echo()” in one of our very first PHP tutorials. That was certainly the case for me. As a consequence, I never really had a need to visit PHP’s documentation page for echo(). On a recent visit to Perishable Press, I saw a Tumblr post from Jeff about the use of PHP’s shortcut syntax for echo() but somewhere deep in my memory, there lurked a warning about its use. I decided to investigate.
Continue Reading
Working a great deal with blacklists, I am frequently trying to isolate and identify problematic code. For example, a blacklist implementation may suddenly prevent a certain type of page from loading. In order to resolve the issue, the blacklist is immediately removed and tested for the offending directive(s). This situation is common to other coding languages as well, especially when dealing with CSS. Identifying problem code is more of an art form than a science, but fortunately, there are a few ways to improve your overall code-sleuthing strategy.
Continue Reading
Here’s the scene: you have been noticing a large number of 404 requests coming from a particular domain. You check it out and realize that the domain in question has a number of misdirected links to your site. The links may resemble legitimate URLs, but because of typographical errors, markup errors, or outdated references, they are broken, leading to nowhere on your site and producing a nice 404 error for every request. Ugh. Or, another painful scenario would be a single broken link on a highly popular site. For example, you may have one of your best posts mentioned in the SitePoint forums, but the person leaving the link completely botched the job:
Continue Reading
Time for another FeedBurner redirect tutorial! In our previous FeedBurner-redirect post, I provide an improved HTAccess method for redirecting your site’s main feed and comment feed to their respective FeedBurner URLs. In this tutorial, we are redirecting individual WordPress category feeds to their respective FeedBurner URLs. We will also look at the complete code required to redirect all of the above: the main feed, comments feed, and of course any number of individual category feeds. Let’s jump into it..
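To give a rough idea of the shape of such redirects, here is a hedged sketch for one main feed and one category feed. The feed paths, the `news` category, and the FeedBurner feed names are hypothetical. The user-agent exclusion is an assumption made so that FeedBurner itself can still fetch the original feeds.

```apache
# Sketch only: feed paths, category name, and FeedBurner feed names are hypothetical.
<IfModule mod_rewrite.c>
  RewriteEngine On

  # Main feed: redirect everyone except FeedBurner's own fetcher.
  RewriteCond %{HTTP_USER_AGENT} !(FeedBurner|FeedValidator) [NC]
  RewriteRule ^feed/?$ http://feeds.feedburner.com/example [R=302,L]

  # Category feed: conditions apply only to the next rule, so the test is repeated.
  RewriteCond %{HTTP_USER_AGENT} !(FeedBurner|FeedValidator) [NC]
  RewriteRule ^category/news/feed/?$ http://feeds.feedburner.com/example-news [R=302,L]
</IfModule>
```

In a WordPress root htaccess file, rules like these would generally need to appear before the WordPress permalink block so that they are evaluated first.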
Continue Reading
One of the original purposes of Perishable Press involved serving as a “virtual dumpster” for all of my miscellaneous code snippets. Over time, I continued elaborating to greater degrees on the various code recipes that I was posting, until eventually those brief snippet posts evolved into complete, richly detailed articles (at least from my point of view). Now that I enjoy the luxury of writing for an incredible audience, I try to avoid posting anything that doesn’t include an accompanying explanation. “If it’s worth posting, it’s worth explaining,” I always say. When you have people reading your stuff, there is little room for superfluous nonsense, unexplained code snippets, and long-winded introductions. ;)
Even so, every now and then you need to break the rules, shake up the routine, rock the boat, drop some acid, that kind of thing. Lately, I have been doing some deep archiving and have amassed a considerable collection of completely miscellaneous and unrelated chunks of code. There are too many random snippets to spend time sewing together similar functionality, and I really hate deleting perfectly good code. I also hate keeping misfit code chunks lying around in my otherwise pristine digital archive (joking). Fortunately, this dilemma is easily resolved by loosening up and simply dumping the information right here on the site. After all, that’s what it was originally designed for — in fact, the further you dig back into the archives, the more apparently pointless code snippets you will find. So without further ado, I now present a completely random, unexplained, miscellaneous collection of potentially useful code snippets!
Continue Reading
I recently added OpenSearch functionality to Perishable Press. Now, OpenSearch-enabled browsers such as Firefox and IE 7 alert users with the option to customize their browser’s built-in search feature with an exclusive OpenSearch-powered search option for Perishable Press. The autodiscovery feature of supportive browsers detects the custom search protocol and enables users to easily add it to their collection of readily available site-specific search options. Now, users may search the entire Perishable Press domain with the click of a button.
And you can do it too! Adding customized OpenSearch-powered search functionality to your own site is a great way to foster site awareness and reinforce brand identity, while providing a tool that will benefit your visitors and improve the usability of your site. Even better, implementing OpenSearch functionality is extremely easy, completely free, and requires zero maintenance. In this article, I provide an easy, 3-step tutorial on how to add OpenSearch functionality to your site in less than five minutes. After the tutorial, we will look at the many different ways to customize your OpenSearch implementation, including examples, search options, and much more.
Continue Reading