User-agents come and go, and are easily spoofed, but it’s worth a few lines of htaccess to block the more persistent bots that repeatedly scan your site with malicious requests.
# Nov 2010 User Agents
SetEnvIfNoCase User-Agent "MaMa " keep_out
SetEnvIfNoCase User-Agent "choppy" keep_out
SetEnvIfNoCase User-Agent "heritrix" keep_out
SetEnvIfNoCase User-Agent "Purebot" keep_out
SetEnvIfNoCase User-Agent "PostRank" keep_out
SetEnvIfNoCase User-Agent "archive.org_bot" keep_out
SetEnvIfNoCase User-Agent "msnbot.htm)._" keep_out
<Limit GET POST PUT>
Order Allow,Deny
Allow from all
Deny from env=keep_out
</Limit>
Over the course of each year, I blacklist a considerable number of individual IP addresses. Every day, Perishable Press is hit with countless numbers of spammers, scrapers, crackers and all sorts of other hapless turds. Weekly examinations of my site’s error logs enable me to filter through the chaff and cherry-pick only the most heinous, nefarious attackers for blacklisting. Minor offenses are generally dismissed, but the evil bastards that insist on wasting resources running redundant automated scripts are immediately investigated via IP lookup and denied access via simple htaccess directive:
<Limit GET POST PUT>
Order Allow,Deny
Allow from all
Deny from 123.456.789
</LIMIT>
Although many of the worst attacks happen in randomized, zombie-like fashion, I have found that individual IPs that are not blacklisted will return repeatedly until finally blocked. Yet, despite the short-term success enjoyed by denying access to the most malicious IPs, the long-term futility of such blacklisting reflects the temporary nature of this solution.
In other words, I have found that blocking individual IPs is useful only for limited periods of time. Thus, every year, I gather my code and flush the blacklist of all individually blocked IP addresses. I then start fresh, adding the worst villains to the list, blocking entire IP ranges if necessary, and referring to previous versions of my htaccess files to cross-check suspiciously familiar entities. Eventually, a new blacklist emerges and I share it at Perishable Press. Here is the current version for 2010..
I have been wanting to write about 404 error pages for quite awhile now. They have always been very important to me, with customized error pages playing a integral part of every well-rounded web-design strategy. Rather than try to re-invent the wheel with this, I think I will just go through and discuss some thoughts about 404 error pages, share some useful code snippets, and highlight some suggested resources along the way. In a sense, this post is nothing more than a giant “brain-dump” of all things 404 for future reference. Hopefully you will find it useful in pimping your own 404.
When requested page is not found by server, error message is returned; this is the essence of the 404 — Ancient Chinese proverb
Let’s face it. There’s just as much scum on the Internet as there is out there in the “real world.” Maybe even more, who knows. From scammers and spammers to scrapers and crackers, the Web is just crawling with all sorts of pathetic scumbags. As predictably random as much of the malicious activity happens to be, it is virtually guaranteed that you will be hounded by at least a few persistent IP addresses that, for whatever reason, have latched on and just won’t let go. Like satanic parasites, they plague you night and day, haunting you and making your online life a living hell. Perhaps they leave endless spam comments; perhaps they are just mindless trolls giving you grief; or perhaps they continue to take flying stabs at the security of your website. Whatever the behavior, once you have determined that you need to block a collection of evil IPs, you have manychoices. Here is a simple way to blacklist multiple IP addresses with a little PHP magic..
Just like lastyear, this Spring I have been taking some time to do some general maintenance here at Perishable Press. This includes everything from fixing broken links and resolving errors to optimizing scripts and eliminating unnecessary plugins. I’ll admit, this type of work is often quite dull, however I always enjoy the process of cleaning up my HTAccess files. In this post, I share some of the changes made to my HTAccess files and explain the reasoning behind each modification. Some of the changes may surprise you! ;)
Given my propensity to discuss matters involving error log data (e.g., monitoring malicious behavior, setting uperror logs, and creating extensive blacklists), I am often asked about the best way to go about monitoring 404 and other types of server errors. While I consider myself to be a novice in this arena (there are far brighter people with much greater experience), I do spend a lot of time digging through log entries and analyzing data. So, when asked recently about my error monitoring practices, I decided to share my response here at Perishable Press, and hopefully get some good feedback concerning best practices for error monitoring. Here is my email response to the question:
At last! After many months of collecting data, crafting directives, and testing results, I am thrilled to announce the release of the 4G Blacklist! The 4G Blacklist is a next-generation protective firewall that secures your website against a wide range of malicious activity. Like its 3G predecessor, the 4G Blacklist is designed for use on Apache servers and is easily implemented via HTAccess or the httpd.conf configuration file. In order to function properly, the 4G Blacklist requires two specific Apache modules, mod_rewrite and mod_alias. As with the third generation of the blacklist, the 4G Blacklist consists of multiple parts:
Update Feb 22, 2011: The 5G version of the blacklist is available now in beta.
Last year, after much research and discussion, I built a concise, lightweight security strategy for Apache-powered websites. Prior to the development of this strategy, I relied on several extensive blacklists to protect my sites against malicious user agents and IP addresses. Unfortunately, these mega-lists eventually became unmanageable and ineffective. As increasing numbers of attacks hit my server, I began developing new techniques for defending against external threats. This work soon culminated in the release of a “next-generation” blacklist that works by targeting common elements of decentralized server attacks. Consisting of a mere 37 lines, this “2G” Blacklist provided enough protection to enable me to completely eliminate over 350 blacklisting directives from my site’s root htaccess file. This improvement increased site performance and decreased attack rates, however many bad hits were still getting through. More work was needed..
Welcome to the Perishable Press “Blacklist Candidate” series. In this post, we continue our new tradition of exposing, humiliating and banishing spammers, crackers and other worthless scumbags..
From time to time on the show, a contestant places a bid that is so absurd and so asinine that you literally laugh out loud, point at the monitor, and openly ridicule the pathetic loser. On such occasions, even the host of the show will laugh and mock the idiocy. Of course, this same situation happens frequently here at Perishable Press, where the scumbags that manage to escape the 3G Blacklist are proving themselves to be increasingly desperate and pathetic. Such is the case with this month’s official Blacklist Candidate Number 2008-10-19:
Come on down — you’re the next cracker whore to get banished from the site!
An ongoing series of articles on the fine art of malicious exploit detection and prevention. Learn about preventing the sneaky mischievous and deceptive practices of some of the worst spammers, scrapers, crackers, and other scumbags on the Internet.
Even so, I have to admit that I am very happy with my latest strategy against crackers, spammers, and other scumbags, namely, the 3G Blacklist. Since implementing this effective HTAccess security method, I have seen a dramatic decrease in the overall volume of malicious activity recorded in my Apache, PHP, and 404 error logs. Thankfully, the 3G Blacklist has kept things pretty quiet around here, at least as far as malicious activity is concerned. In fact, it has been awhile since I’ve seen anything worthy of sharing with you — until now.
It’s been awhile since my last personal news post, and I figure that enough has been happening to warrant yet another exciting news update. Yay! ;)
So let’s see, first on my mind is the recent launch of the new design for Monzilla Media, the official site for my personal website and graphic design business. The first two versions of the site were single-page brochure sites, but this new version is fully loaded, featuring tons of portfolio content, business news, and service information. If you’ve got a minute, I would love to hear your feedback!
Welcome to the Perishable Press “Blacklist Candidate” series. In this post, we continue our new tradition of exposing, humiliating and banishing spammers, crackers and other worthless scumbags..
Just under the wire! Even so, this month’s official Blacklist-Candidate article may be the last monthly installment of the series. Although additional BC articles may appear in the future, it is unlikely that they will continue as a regular monthly feature. Oh sure, I see the tears streaming down your face, but think about it: this is actually good news. After implementing the 3G Blacklist, finding decent blacklist candidates is becoming increasingly difficult. This is a good thing because it indicates that the new blacklist strategy is working. Nonetheless, every now and then, some brainless bozo will magically appear and poke around where it shouldn’t be poking. Such is the case with this month’s Blacklist Candidate Number 2008-05-31:
Come on down! You’re the next worthless turd to get banished from the site!
In the now-complete series, Building the 3G Blacklist, I share insights and discoveries concerning website security and protection against malicious attacks. Each article in the series focuses on unique blacklist strategies designed to protect sites transparently, effectively, and efficiently. The five articles culminate in the release of the next generation 3G Blacklist.
For the record, here is a quick summary of the entire Building the 3G Blacklist series:
After much research and discussion, I have developed a concise, lightweight security strategy for Apache-powered websites. Prior to the development of this strategy, I relied on several extensive blacklists to protect my sites against malicious user agents and IP addresses. Over time, these mega-lists became unmanageable and ineffective. As increasing numbers of attacks hit my server, I began developing new techniques for defending against external threats. This work soon culminated in the release of a “next-generation” blacklist that works by targeting common elements of decentralized server attacks. Consisting of a mere 37 lines, this “2G” Blacklist provided enough protection to enable me to completely eliminate over 350 blacklisting directives from my site’s root htaccess file. This improvement increased site performance and decreased attack rates, however many bad hits were still getting through. More work was needed..
The 3G Blacklist
Work on the 3G Blacklist required several weeks of research, testing, and analysis. During the development process, five major improvements were discovered, documented, and implemented. Using pattern recognition, access immunization, and multiple layers of protection, the 3G Blacklist serves as an extremely effective security strategy for preventing a vast majority of common exploits. The list consists of four distinct parts, providing multiple layers of protection while synergizing into a comprehensive defense mechanism. Further, as discussed in previous articles, the 3G Blacklist is designed to be as lightweight and flexible as possible, thereby facilitating periodic cultivation and maintenance. Sound good? Here it is:
In this continuingfive-article series, I share insights and discoveries concerning website security and protecting against malicious attacks. Wrapping up the series with this article, I provide the final key to our comprehensive blacklist strategy: selectively blocking individual IPs. Previousarticles also focus on key blacklist strategies designed to protect your site transparently, effectively, and efficiently. In the next article, these five articles will culminate in the release of the next generation 3G Blacklist.
Improving Site Security by Selectively Blocking Individual IPs
The final component of the 3G Blacklist establishes a vehicle through which individual IPs may be blocked. As previously discussed, the goal of the 3G Blacklist involves moving away from extensive lists of blacklisted user agents, IP addresses, and other identifying attributes. Nonetheless, a comprehensive security strategy benefits from the ability to quickly and easily neutralize threats by directly blocking individual IP addresses.
In my experience, an effective IP blacklist should be kept as current as possible. Unfortunately, once lists grow to over 100 entries, they become quite unmanageable. Periodic verification of blocked IPs becomes increasingly difficult as lists grow in size. Uncultivated blacklists that continue to grow in size will eventually do more harm than good. Beyond the burden placed on server resources, constantly changing IP ownership inevitably results in unintentional blocking of legitimate traffic.
Just as with individual blocking of user agents, blocking individual (or ranges of) IP addresses provides a secondary layer of protection against direct attacks, corrupt networks, and other case-specific situations. This level of granular control enables defensive fine-tuning for targeting and blocking specific machines. Known repeat offenders should remain permanently blacklisted, whereas short-term offenders should be blocked temporarily. Examples of known repeat offenders typically include:
In this series of five articles, I share insights and discoveries concerning website security and protecting against malicious attacks. In this first article of the series, I examine the process of identifying attack trends and using them to immunize against future attacks. Subsequent articles will focus on key blacklist strategies designed to protect your site transparently, effectively, and efficiently. At the conclusion of the series, the five articles will culminate in the release of the next generation 3G Blacklist.
Improving Site Security by Recognizing and Exploiting Server Attack Patterns
Crackers, spammers, scrapers, and other attackers are getting smarter and more creative with their methods. It is becoming increasingly difficult to identify elements of an attack that may be effectively manipulated to prevent further attempts. For example, attackers almost exclusively employ decentralized botnets to execute their attacks. Such decentralized networks consist of large numbers of compromised computer systems that are remotely controlled to post data, run scripts, and request resources. Each compromised computer used in an attack is associated with a unique IP, thus making it impossible to prevent future attempts by blocking one or more addresses:
Welcome to the Perishable Press “Blacklist Candidate” series. In this post, we continue our new tradition of exposing, humiliating and banishing spammers, crackers and other worthless scumbags..
Since the implementation of my 2G Blacklist, I have enjoyed a significant decrease in the overall number and variety of site attacks. In fact, I had to time-travel back to March 1st just to find a candidate worthy of this month’s blacklist spotlight. I felt like Rod Roddy looking over the Price-is-Right audience to announce the next name only to discover a quiet, empty room. And then like Bob gets pissed that nobody showed up and begins to bark and snarl at Rod to go across the street to the clam store and find some damn contestants. Or, ..um, something like that. Needless to say, this month’s data isn’t as fresh as I would have liked it, but I think you’ll find the information fascinating nonetheless. So let’s get on with it then:
Blacklist Candidate number 2008-04-27, come on down! You’re the next clam-store loser to get blacklisted from the site!
Welcome to the Perishable Press “Blacklist Candidate” series. In this post, we continue our new tradition of exposing, humiliating and banishing spammers, crackers and other worthless scumbags..
Imagine, if you will, an overly caffeinated Bob Barker, hunched over his favorite laptop, feverishly scanning his server access files. Like some underpaid factory worker pruning defective bobble heads from a Taiwanese assembly line, Bob rapidly identifies and isolates suspicious log entries with laser focus. Upon further investigation, affirmed spammers, scrapers and crackers are swiftly blacklisted from future access. For the most heinous offenders, we suddenly hear Rod Roddy’s guzzling voice echo throughout the room:
“Candidate number 2008-03-09, COME ON DOWN!! — you’re the next contestant to get blacklisted from the site!”
Every week, I dig through my access and errorlogs to learn from the spammers, scrapers, and other cracker whores. Typically, attempts to exploit potential security vulnerabilities demonstrate the following characteristics:
Since posting the Ultimate htaccess Blacklist and then the Ultimate htaccess Blacklist 2, I find myself dealing with a new breed of maliciousattacks. It is no longer useful to simply block nefarious user agents because they are frequently faked. Likewise, blocking individual IP addresses is generally a waste of time because the attacks are coming from a decentralized network of zombie machines. Watching my error and access logs very closely, I have observed the following trends in current attacks:
User agents are faked, typically using something generic like “Mozilla/5.0”
Each attack may involve hundreds of compromised IP addresses
Attacks generally target a large number of indexed (i.e., known) pages, posts, etc.
Frequently, attacks utilize query strings appended to variously named PHP files
The target URLs often include a secondary URL appended to the end of a permalink
An increasing number of attacks employ random character strings to probe for holes
Yet despite the apparent complexity of such attacks, they tend to look remarkably similar. Specifically, notice the trends in the following examples of (nonexistent) target URLs, or “attack strings,” as I like to call them:
Update 2010/07/07: Please visit the 2010 IP Blacklist for more current information.
Over the course of each year, I blacklist a considerable number of individual IP addresses. Every day, Perishable Press is hit with countless numbers of spammers, scrapers, crackers and all sorts of other hapless turds. Weekly examinations of my site’s error logs enable me to filter through the chaff and cherry-pick only the most heinous, nefarious attackers for blacklisting. Minor offenses are generally dismissed, but the evil bastards that insist on wasting resources running redundant automated scripts are immediately investigated via IP lookup and denied access via simple htaccess directive:
<Limit GET POST PUT>
order allow,deny
allow from all
deny from 123.456.789
</LIMIT>
Although many of the worst attacks happen in randomized, zombie-like fashion, I have found that individual IPs that are not blacklisted will return repeatedly until finally blocked. Yet, despite the short-term success enjoyed by denying access to the most malicious IPs, the long-term futility of such blacklisting reflects the temporary nature of this solution. In other words, I have found that blocking individual IPs is useful only for limited periods of time. Thus, every year, I gather my code and flush the blacklist of all individually blocked IP addresses. I then start fresh, adding the worst villains to the list, blocking entire IP ranges if necessary, and referring to previous versions of my htaccess files to cross-check suspiciously familiar entities. It is within this context, then, that I present the following manually assembled collection of over 150 of the worst spammers, scrapers, and crackers to hit my site in 2007.
Continuing my quest to stop comment spamwithout using plugins, I have decided to disable comments on “old” posts. In my experience, over 90% of comment, trackback and pingback spam occurs on posts that have been online for over a month or so, just long enough to be indexed by the search engines and picked up by spammers. Especially for older posts that have managed to acquire a little page rank, the frequency of spam attempts is far greater than it is for fresher content. Throw dofollowcommentstatus into the mix, and say “hello” to a hellish number of spam attempts on established pages. Thus, my evolving anti-spam strategy now includes discussion management, which involves periodic closing of feedback on older posts. In this article, we will examine currently available methods of managing comments, and then proceed with a versatile toolbox of SQL queries for complete discussion management.
Just a note to announce a site upgrade to WordPress 2.3.3. The upgrade went well, but overall server performance continues to suffer. I am aware that some people are experiencing difficulties leaving comments and even accessing the site in general. Rest assured, I am working with my hosting company, A Small Orange, to get everything back on track and running smooth. In the meantime, I appreciate your patience as we work to resolve the issues, restore full functionality, and return to reliable performance.
Please share any helpful observations regarding the site here. — Thanks!
Welcome to the Perishable Press “Blacklist Candidate” series. In this post, we continue our new tradition of exposing, humiliating and banishing spammers, crackers and other worthless scumbags..
Scumbag number 2008-02-10, “COME ON DOWN!!” — you’re the next baboon to get banished from the site!
Like many bloggers, I like to spend a little quality time each week examining my site’s error logs. The data contained in Apache, 404, and even PHP error logs is always enlightening. In addition to suspicious behavior, spam nonsense, and cracker mischief, this site frequently endures automated and even manual attacks targeting various XSS exploits, WordPress vulnerabilities, and other potential security holes. Although the number of successful attacks remains relatively small, the very nature of some of the attacks serves to threaten site performance, security and stability. Such is the case of blacklist candidate number 2008-02-10: IP address 128.111.48.138.
Come one, come all — today we officially begin a new series of posts here at Perishable Press: the public exposure, humiliation, and banishment of spammers, crackers, and other site attackers. Kicking things off for 2008: blacklist candidate number 2008-01-02!
Every Wednesday, I take a little time to investigate my 404errorlogs. In addition to spam, crack attacks, and other deliberate mischief, the 404 logs for Perishable Press contain errors due to missing resources, mistyped URLs, and the occasional bizarre or even suspiciousbehavior of the search-engine robots. Whenever possible, I attempt to resolve a majority of the “fixable” errors, either by restoring missing resources, adding an htaccess redirect, or by any other means available.
Having exercised this rigorous maintenance practice for well over a year now, my 404 error logs are almost completely devoid of all “fixable” 404 errors, and are filled almost exclusively with spam attacks, XSS attempts, and other miscellaneous cracker nonsense. Fortunately, my site has only fallen victim to such espionage on one occasion, and on a different server.