Search Results for “htaccess”

Notes on Switching Servers

Posted on January 14, 2011 in Websites by Jeff Starr

Switching servers & migrating sites can be a HUGE deal (or not), depending on things like:

  • Number of sites to transfer
  • Size and complexity of sites
  • Who is hosting your sites
  • Experience

I recently did this, switching from a 3-year run at ASO to my new home at Media Temple. Total of 24 properties, with WordPress running on around 10 sites. Past experience with VPS servers really had me paranoid about running out of memory. A few years ago, Perishable Press alone gobbled up 256MB of RAM at WiredTree, so add another 23 sites on top of that and needless to say I was extremely concerned about the migration from a shared-hosting environment to a (dv) Base account limited to 512MB RAM.

Continue Reading

Canonical URLs and Subdomains with Plesk

Posted on January 6, 2011 in Websites by Jeff Starr

I am in the process of migrating my sites from A Small Orange to Media Temple. Part of that process involves canonicalizing domain URLs to help maximize SEO strategy. At ASO, URL canonicalization required just a few htaccess directives:

# enforce no www prefix
<IfModule mod_rewrite.c>
 RewriteCond %{HTTP_HOST} !^domain\.tld$ [NC]
 RewriteRule ^(.*)$ http://domain.tld/$1 [R=301,L]
</IfModule>

When placed in the web-accessible root directory’s htaccess file, that snippet will ensure that all requests for your site are not prefixed with www. There’s also a force-www technique if that’s how you roll. Either way, the point is that on most shared hosting, URL canonicalization is simple.

Continue Reading

Latest Blacklist Entries

Posted on November 9, 2010 in Websites by Jeff Starr

Recently cleared several megabytes of log files, detecting patterns, recording anomalies, and blacklisting gross offenders. Gonna break it down into three sections:

User Agents

User-agents come and go, and are easily spoofed, but it’s worth a few lines of htaccess to block the more persistent bots that repeatedly scan your site with malicious requests.

# Nov 2010 User Agents
SetEnvIfNoCase User-Agent "MaMa " keep_out
SetEnvIfNoCase User-Agent "choppy" keep_out
SetEnvIfNoCase User-Agent "heritrix" keep_out
SetEnvIfNoCase User-Agent "Purebot" keep_out
SetEnvIfNoCase User-Agent "PostRank" keep_out
SetEnvIfNoCase User-Agent "archive.org_bot" keep_out
SetEnvIfNoCase User-Agent "msnbot.htm)._" keep_out

<Limit GET POST PUT>
 Order Allow,Deny
 Allow from all
 Deny from env=keep_out
</Limit>

Continue Reading

How to Deal with Content Scrapers

Posted on September 24, 2010 in Websites by Jeff Starr

Chris Coyier of CSS-Tricks recently declared that people should do “nothing” in response to other sites scraping their content. I totally get what Chris is saying here. He is basically saying that the original source of content is better than scrapers because:

  • it’s on a domain with more trust.
  • you published that article first.
  • it’s coded better for SEO than theirs.
  • it’s better designed than theirs.
  • it isn’t at risk for serious penalization from search engines.

If these things are all true, then I agree, you have nothing to worry about. Unfortunately, that’s a tall order for many sites on the Web today. Although most scraping sites are pure and utter crap, the software available for automating the production of decent-quality websites is getting better everyday. More and more I’ve been seeing scraper sites that look and feel authentic because they are using some sweet WordPress theme and a few magical plugins. In the past, it was easy to spot a scraper site, but these days it’s getting harder to distinguish between scraped and original content. Not just for visitors, but for search engines too.

Continue Reading

Lessons Learned after 5 Years of Blogging

Posted on August 30, 2010 in Blogging by Jeff Starr

This Fall, I celebrate five years of blogging. I have written tons of web development stuff at Perishable Press, lots of helpful WordPress stuff at Digging into WordPress, some philosophical stuff at mindfeed.org, creative/artistic stuff at Dead Letter Art, jQuery stuff at jQuery Mix, and some business-related web-design stuff at Monzilla Media. Plus a bunch of interviews, guest posts, and other blogging projects. So yeah, lots of blogging and writing during the past five years. And they just flew by.

Despite what the haters may say, there are some tangible benefits to blogging. As I write, I continue to learn a great deal – not just about the fine art of writing, but also about the nature of the audience, social media, and the Web in general. There’s a lot to it, more than you may realize. Looking back during my recent hiatus, I enjoyed the opportunity to reflect on the past and contemplate lessons learned, future goals, and what it all means. Here are some of my thoughts, strategies, and lessons learned after five years of blogging..

Continue Reading

2010 User-Agent Blacklist

Posted on August 9, 2010 in Websites by Jeff Starr

[ 2010 User-Agent Blacklist ] The 2010 User-Agent Blacklist blocks hundreds of bad bots while ensuring open-access for the major search engines: Google, Bing, Ask, Yahoo, et al. Blocking bad user-agents is an effective addition to any security strategy. It works like this: your site is getting hammered by rogue bots that waste valuable server resources and bandwidth. So you grab a copy of the 2010 UA Blacklist from Perishable Press, include it in your site’s root .htaccess file, and enjoy a more secure and better performing website. It’s that easy.

Proven Security

The 2010 UA Blacklist has been carefully constructed based on rigorous server-log analyses. Obsessive daily log monitoring reveals bad bots scanning for exploits, spamming resources, and wasting bandwidth. While analyzing malicious behavior, evil bots are identified and added to the UA Blacklist. Blocked user-agents are denied access to your site, increasing efficiency and providing safety for your visitors.

Continue Reading

Best Method for Email Obfuscation?

Posted on August 1, 2010 in Accessibility, Function by Jeff Starr

Awhile ago, Silvan Mühlemann conducted a 1.5 year experiment whereby different approaches to email obfuscation were tested for effectiveness. Nine different methods were implemented, with each test account receiving anywhere from 1800 to zero spam emails. Here is an excerpt from the article:

When displaying an e-mail address on a website you obviously want to obfuscate it to avoid it getting harvested by spammers. But which obfuscation method is the best one? I drove a test to find out.

After reading through the article and its many findings, here are what seem to be the best methods for obfuscating email addresses displayed publicly on web pages..

Continue Reading

Protect Your Site with a Blackhole for Bad Bots

Posted on July 14, 2010 in Websites by Jeff Starr

[ Black Hole ] One of my favorite security measures here at Perishable Press is the site’s virtual Blackhole trap for bad bots. The concept is simple: include a hidden link to a robots.txt-forbidden directory somewhere on your pages. Bots that ignore or disobey your robots rules will crawl the link and fall into the trap, which then performs a WHOIS Lookup and records the event in the blackhole data file. Once added to the blacklist data file, bad bots immediately are denied access to your site. I call it the “one-strike” rule: bots have one chance to follow the robots.txt protocol, check the site’s robots.txt file, and obey its directives. Failure to comply results in immediate banishment. The best part is that the Blackhole only affects bad bots: normal users never see the hidden link, and good bots obey the robots rules in the first place.

In five easy steps, you can set up your own Blackhole to trap bad bots and protect your site from evil scripts, bandwidth thieves, content scrapers, spammers, and other malicious behavior.

[ Blackhole Directory with Files ] The Blackhole is built with PHP, and uses a bit of .htaccess to protect the blackhole directory. The blackhole script combines heavily modified versions of the Kloth.net script (for the bot trap) and the Network Query Tool (for the whois lookups). Refined over the years and completely revamped for this tutorial, the Blackhole consists of a single plug-&-play directory that contains the following four files:

Continue Reading

htaccess Code for WordPress Multisite

Posted on July 7, 2010 in WordPress by Jeff Starr

For the upcoming Digging into WordPress update for WordPress 3.0, I have been working with WordPress’ multisite functionality. Prior to version 3.0, WordPress came in two flavors: “original” and “multisite” (MU). Most designers probably work with regular, one-blog installations of “regular” WordPress. The htaccess rules for all single-blog installations of WordPress haven’t changed. They are the same for WordPress 3.0 as they are for all previous versions.

But now that multisite has merged with regular-flavored WordPress, we can stick with single-blog installs (which is how things are setup by default), or we can activate multisite functionality and create an unlimited network of sites. The process is still new and there are bugs that need to be worked out, but eventually it will be a widely used WordPress feature. That said, the htaccess rules used for WordPress Multisite may change as the software continues to evolve.

Continue Reading

2010 IP Blacklist

Posted on July 6, 2010 in Websites by Jeff Starr

Over the course of each year, I blacklist a considerable number of individual IP addresses. Every day, Perishable Press is hit with countless numbers of spammers, scrapers, crackers and all sorts of other hapless turds. Weekly examinations of my site’s error logs enable me to filter through the chaff and cherry-pick only the most heinous, nefarious attackers for blacklisting. Minor offenses are generally dismissed, but the evil bastards that insist on wasting resources running redundant automated scripts are immediately investigated via IP lookup and denied access via simple htaccess directive:

<Limit GET POST PUT>
 Order Allow,Deny
 Allow from all
 Deny from 123.456.789
</LIMIT>

Although many of the worst attacks happen in randomized, zombie-like fashion, I have found that individual IPs that are not blacklisted will return repeatedly until finally blocked. Yet, despite the short-term success enjoyed by denying access to the most malicious IPs, the long-term futility of such blacklisting reflects the temporary nature of this solution.

In other words, I have found that blocking individual IPs is useful only for limited periods of time. Thus, every year, I gather my code and flush the blacklist of all individually blocked IP addresses. I then start fresh, adding the worst villains to the list, blocking entire IP ranges if necessary, and referring to previous versions of my htaccess files to cross-check suspiciously familiar entities. Eventually, a new blacklist emerges and I share it at Perishable Press. Here is the current version for 2010..

Continue Reading

HTML5 Table Template

Posted on June 8, 2010 in Structure by Jeff Starr

A good designer knows that tables should not be used for layout, but rather for displaying columns and rows of data. HTML enables the creation of well-structured, well-formatted tables, but they’re used infrequently enough to make remembering all of the different elements and attributes rather time-consuming and tedious. So to make things easier, here is a clean HTML5 template to speed-up development for your next project:

<table dir="ltr" width="500" border="1" 
			summary="purpose/structure for speech output">
	<caption>Here we assign header information to cells 
			by setting the scope attribute.
	</caption>
	<colgroup width="50%" />
	<colgroup id="colgroup" class="colgroup" align="center" 
			valign="middle" title="title" width="1*" 
			span="2" style="background:#ddd;" />
	<thead>
		<tr>
			<th scope="col">Name</th>
			<th scope="col">Side</th>
			<th scope="col">Role</th>
		</tr>
	</thead>
	<tfoot>
		<tr>
			<td>Darth Vader</td>
			<td>Dark</td>
			<td>Sith</td>
		</tr>
	</tfoot>
	<tbody>
		<tr>
			<td>Obi Wan Kenobi</td>
			<td>Light</td>
			<td>Jedi</td>
		</tr>
		<tr>
			<td>Greedo</td>
			<td>South</td>
			<td>Scumbag</td>
		</tr>
	</tbody>
</table>

Ahh, gives me goose bumps just looking at it. You can use this code with any markup language, whether HTML5, HTML 4.1, or any flavor of XHTML. Here is how this basic template looks in the current web page 1 (using different data and with styles removed):

Continue Reading

htaccess Redirect to Maintenance Page

Posted on May 19, 2010 in Function by Jeff Starr

Redirecting visitors to a maintenance page or other temporary page is an essential tool to have in your tool belt. Using HTAccess, redirecting visitors to a temporary maintenance page is simple and effective. All you need to redirect your visitors is the following code placed in your site’s root HTAccess:

# MAINTENANCE-PAGE REDIRECT
<IfModule mod_rewrite.c>
 RewriteEngine on
 RewriteCond %{REMOTE_ADDR} !^123\.456\.789\.000
 RewriteCond %{REQUEST_URI} !/maintenance.html$ [NC]
 RewriteCond %{REQUEST_URI} !\.(jpe?g?|png|gif) [NC]
 RewriteRule .* /maintenance.html [R=302,L]
</IfModule>

That is the official copy-&-paste goodness right there. Just grab, gulp and go. Of course, there are a few more details for those who may be unfamiliar with the process. Let’s look at all the gory details..

Continue Reading

Stop 404 Requests for Mobile Versions of Your Site

Posted on April 26, 2010 in Function by Jeff Starr

If you’ve been keeping an eye on your 404 errors recently, you will have noticed an increase in requests for nonexistent mobile files and directories, especially over the past year or so. The scripts and bots requesting these files from your server seem to be looking for a mobile version of your site. Unfortunately, they are wasting bandwidth and resources in the process. It has become common to see the following 404 errors constantly repeated in your log files:

  • http://domain.tld/apple-touch-icon.png
  • http://domain.tld/iphone
  • http://domain.tld/mobile
  • http://domain.tld/mobi
  • http://domain.tld/m

So some bot comes along, assumes that your site includes a mobile version, and then tries its hand at guessing the location. In the common request-set listed above, we see the bot looking first for an “apple-touch icon,” and then for mobile content in various directories. If this only happens once in awhile, it’s no big deal. But these days I’ve been seeing many different bots requesting these nonexistent resources.

Even worse, these mobile-hungry bots can’t seem to remember where they’ve been – they typically request the same resources repeatedly, and in multiple locations within the directory structure. I frequently see hundreds of these types of requests in my weekly error-log analyses. Needless to say, this is an incredible waste of time, bandwidth, and server resources.

Continue Reading

Fixing WordPress Infinite Duplicate Content Issue

Posted on April 6, 2010 in WordPress by Jeff Starr

Jeff Morris recently demonstrated a potential issue with the way WordPress handles multipaged posts and comments. The issue involves WordPress’ inability to discern between multipaged posts and comments that actually exist and those that do not. By redirecting requests for nonexistent numbered pages to the original post, WordPress creates an infinite amount of duplicate content for your site. In this article, we explain the issue, discuss the implications, and provide an easy, working solution.

Understanding the “infinite duplicate content” issue

Using the <!--nextpage--> tag, WordPress makes it easy to split your post content into multiple pages, and also makes it easy to paginate the display of your comment threads. For both paged posts and paged comments, WordPress appends the page number to the permalink. So for example, if we have a post split into 3 pages, WordPress will generate the following set of completely valid permalinks (based on name-only permalink structure):

Continue Reading

Is it Secret? Is it Safe?

Posted on March 17, 2010 in Function by Jeff Starr

[ Enjoying the Evening ] Whenever I find myself working with PHP or messing around with server settings, I nearly always create a phpinfo.php file and place it in the root directory of whatever domain I happen to be working on. These types of informational files employ PHP’s handy phpinfo() function to display a concise summary of all of your server’s variables, which may then be referenced for debugging purposes, bragging rights, and so on.

While this sort of thing is normally okay, I frequently forget to remove the file and just leave it sitting there for the entire world to look at. This of course is a big “no-no” for site security, because the phpinfo.php file contains a hefty amount of information about my server, including stuff like:

  • The web server version
  • The IP address of the host
  • The version of the operating system
  • The root directory of the web server
  • Configuration information about the remote PHP installation
  • The username of the user who installed php and if they are a SUDO user

Continue Reading

Better Image Preloading with CSS3

Posted on January 4, 2010 in Function, Presentation by Jeff Starr

I recently added to my growing library of image-preloading methods with a few new-&-improved techniques. After posting that recent preloading article, an even better way of preloading images using pure CSS3 hit me:

.preload-images {
	background: url(image-01.png) no-repeat -9999px -9999px;
	background: url(image-01.png) no-repeat -9999px -9999px,
		    url(image-02.png) no-repeat -9999px -9999px,
		    url(image-03.png) no-repeat -9999px -9999px,
		    url(image-04.png) no-repeat -9999px -9999px,
		    url(image-05.png) no-repeat -9999px -9999px;
	}

Using CSS3’s new support for multiple background images, we can use a single, existing element to preload all of the required images. Compare this method with the old way of using CSS to preload images:

Continue Reading

Protect WordPress Against Malicious URL Requests

Posted on December 22, 2009 in WordPress by Jeff Starr

A few months ago, many WordPress sites were attacked with some extremely malicious code. While searching for a good solution, I discovered the following gem of a plugin in the pastebin repository:

<?php /* Plugin Name: Block Bad Queries */

if (strlen($_SERVER['REQUEST_URI']) > 255 || 
	strpos($_SERVER['REQUEST_URI'], "eval(") || 
	strpos($_SERVER['REQUEST_URI'], "base64")) {
		@header("HTTP/1.1 414 Request-URI Too Long");
		@header("Status: 414 Request-URI Too Long");
		@header("Connection: Close");
		@exit;
} ?>

This script checks for excessively long request strings (i.e., greater than 255 characters), as well as the presence of either “eval(” or “base64” in the request URI. These sorts of nefarious requests were implicated in the September 2009 WordPress attacks.

Continue Reading

Stupid WordPress Tricks

Posted on December 1, 2009 in WordPress by Jeff Starr

[ WordPress ] One of the most popular articles here at Perishable Press is my January 2005 post, Stupid htaccess Tricks. In that article, I bring together an extensive collection of awesome copy-&-paste HTAccess code snippets. Four years later, people continue to tell me how much they enjoy and use the content as a bookmarked reference for many of their HTAccess needs. The article was even published in a book on Joomla! Security.

This is very inspiring to me, so I have decided to create a similar post for all of the useful WordPress code snippets, tips and tricks that I have collected while working on Digging into WordPress, the new book by co-author Chris Coyier and myself that really “digs in” to all of the awesome ways to get the most out of WordPress. While writing the DiW book, I collected hundreds of incredibly useful WordPress tips and tricks. After packing the book with as many of these techniques as possible, I decided to share the “best of the rest” here at Perishable Press.

If you are one of the millions of people who use WordPress, this article will help you improve the appearance, functionality, and performance of your WordPress-powered websites. Each of these “stupid WordPress tricks” is presented as clearly and succinctly as possible, including as many notes, instructions, and pointers as needed for successful implementation. Of course, keep in mind that we are only scratching the surface here. For a much more complete resource that is packed with tons of tasty techniques, you need to get Digging into WordPress.

Continue Reading

Pimp Your 404: Presentation and Functionality

Posted on November 2, 2009 in Function, Presentation by Jeff Starr

I have been wanting to write about 404 error pages for quite awhile now. They have always been very important to me, with customized error pages playing a integral part of every well-rounded web-design strategy. Rather than try to re-invent the wheel with this, I think I will just go through and discuss some thoughts about 404 error pages, share some useful code snippets, and highlight some suggested resources along the way. In a sense, this post is nothing more than a giant “brain-dump” of all things 404 for future reference. Hopefully you will find it useful in pimping your own 404.

When requested page is not found by server, error message is returned; this is the essence of the 404 — Ancient Chinese proverb

Continue Reading

4

Posted on October 25, 2009 in Perishable by Jeff Starr

Time flies! Perishable Press celebrates its fourth year online during this Fall season. Not really sure what that means at this point, other than a lot of hard work, plenty of great conversation, and a ton of design-related content. How did I get here? Let’s take a brisk stroll down memory lane..

During the site’s first year, I remember being too excited for my own good. WordPress was relatively new and I was completely inspired by the amazing things I could do with it. I spent most of my time building all sorts of different themes, studying web design, and posting useful code snippets. During this first year, Perishable Press was virtually unknown. Late night hours were spent drenched in CSS, (X)HTML, and PHP.

Continue Reading

Stupid Twitter Tricks

Posted on October 18, 2009 in Blogging by Jeff Starr

[ Twitter ] Might as well face it, Twitter is here to stay. Not that it’s all that bad, just used to be a lot more laid-back and enjoyable. These days it seems to have been taken over by the lowest common-denominator, mostly high-school twits or useless commercial propaganda. Even so, I still enjoy tweeting the occasional profound thought once in awhile, and even like to play around with various types of “advanced” Twitter functionality. You know, cool stuff like including “Tweet This!” links with short URLs, showing off my number of Twitter followers, displaying the number of tweets for each post, and even backing up my marvelous tweets quickly and easily. As you might have guessed, these are the kind of stupid Twitter tricks that you will find in this article. So kick back, relax, and enjoy these twitterific techniques that will make you go tweet.

“Tweet This!” short links for posts

I like to include an easy way for readers to “tweet” my posts, but the URLs are generally way over Twitter’s 140-character limit. Fortunately, we can use PHP’s handy file_get_contents function to grab short versions of our URLs from a free minifying service like TinyURL.

Continue Reading

HTAccess Privacy for Specific IPs

Posted on October 12, 2009 in Function by Jeff Starr

Running a private site is all about preventing unwanted visitors. Here is a quick and easy way to allow access to multiple IP addresses while redirecting everyone else to a custom message page.

To do this, all you need is an HTAccess file and a list of IPs for which you would like to allow access.

Edit the following code according to the proceeding instructions and place into the root HTAccess file of your domain:

# ALLOW ONLY MULTIPLE IPs
<Limit GET POST PUT>
 Order Deny,Allow
 Deny from all
 Allow from 123.456.789
 Allow from 456.789.123
 Allow from 789.123.456
</Limit>
ErrorDocument 403 path/custom-message.html
<Files path/custom-message.html>
 Order Allow,Deny
 Allow from all
</Files>

To prepare this code for use on your site, do these three things:

  1. Edit the three IP addresses to suit your needs. Feel free to add more IPs or remove any that aren’t needed.
  2. Edit both instances of “path/custom-message.html” to match the path and file name of the file that will contain your custom message. This may be anything, anywhere, with any functionality you desire.
  3. That’s it. Copy/paste into your site’s root htaccess file, upload, test, and get out!

Continue Reading

How to Protect Your Site Against Content Thieves

Posted on September 23, 2009 in Blogging by Jeff Starr

Stolen content is the bane of every blogger who provides a publicly available RSS feed. By delivering your content via feed, you make it easy for scrapers to assimilate and re-purpose your material on their crap Adsense sites. It’s bad enough that someone would re-post your entire feed without credit, but to use it for cheap money-making schemes is about as pathetic as it gets. If you’re lucky, the bastards may leave all the links intact, so at least you will get a few back-links (if you have been linking internally) and get notified of the stolen content as well (via pingback or Google Alert). Lately, however, many of the scraper sites that I have seen are completely removing all links within the stolen content. Incidentally, there are some tell-tale signs that the site you are visiting is a scraper site:

  • No RSS feed available
  • Many quality posts that contain no links
  • Many quality posts but very low subscriber count
  • Great content but with zero comments on any posts
  • Lots of good content but with lots of Adsense or other ads
  • No “About” page or business information
  • And the number one brain-dead giveaway: no contact form or email address

If you pay attention as you surf around, you may want to keep an eye out for some of these dead giveaways. If it looks like the site is profiting from stolen content, it is advisable to leave immediately and locate an original source of information (you could even be cool and report the scraper site to the original author). I.e., help strengthen the legit blogging community and don’t support scrapers in any way. But avoiding scraper sites is merely an afterthought. The real challenge is to have a solid strategy in place that will help you identify, eliminate and prevent stolen content. Unfortunately, there is no “magic cure” that will stop the scrapers from stealing your hard work — apart from running a private site or not providing a feed — but there are many great tools that have proven quite effective in fighting the war against stolen content. While not completely exhaustive, here are some powerful tips and tricks that have served me well over the years:

Continue Reading

Disable Trace and Track for Better Security

Posted on September 6, 2009 in Function by Jeff Starr

The shared server on which I host Perishable Press was recently scanned by security software that revealed a significant security risk. Namely, the HTTP request methods TRACE and TRACK were found to be enabled on my webserver. The TRACE and TRACK protocols are HTTP methods used in the debugging of webserver connections.

Although these methods are useful for legitimate purposes, they may compromise the security of your server by enabling cross-site scripting attacks (XST). By exploiting certain browser vulnerabilities, an attacker may manipulate the TRACE and TRACK methods to intercept your visitors’ sensitive data. The solution, of course, is disable these methods on your webserver.

Continue Reading