Introduction To URL Rewriting

Many Web companies spend hours and hours agonizing over the best domain names for their clients. They try to find a domain name that is relevant and appropriate, sounds professional yet is distinctive, is easy to spell and remember and read over the phone, looks good on business cards and is available as a dot-com.

Or else they spend thousands of dollars to purchase the one they really want, which just happened to be registered by a forward-thinking and hard-to-find squatter in 1998.

They go through all that trouble with the domain name but neglect the rest of the URL, the part after the domain name. It, too, should be relevant, appropriate, professional, memorable, easy to spell and readable. And for the same reasons: to attract customers and improve search rankings.

Fortunately, there is a technique called URL rewriting that can turn unsightly URLs into nice ones — with a lot less agony and expense than picking a good domain name. It enables you to fill out your URLs with friendly, readable keywords without affecting the underlying structure of your pages.

This article covers the following:

  1. What is URL rewriting?
  2. How can URL rewriting help your search rankings?
  3. Examples of URL rewriting, including regular expressions, flags and conditionals;
  4. URL rewriting in the wild, such as on Wikipedia, WordPress and shopping websites;
  5. Creating friendly URLs;
  6. Changing page names and URLs;
  7. Checklist and troubleshooting.

What Is URL Rewriting?

If you were writing a letter to your bank, you would probably open your word processor and create a file named something like lettertobank.doc. The file might sit in your Documents directory, with a full path like C:\Windows\users\julie\Documents\lettertobank.doc. One file path = one document.

Similarly, if you were creating a banking website, you might create a page named page1.html, upload it, and then point your browser to http://www.mybanksite.com/page1.html. One URL = one resource. In this case, the resource is a physical Web page, but it could be a page or product drawn from a CMS.

URL rewriting changes all that. It allows you to completely separate the URL from the resource. With URL rewriting, you could have http://www.mybanksite.com/aboutus.html taking the user to …/page1.html or to …/about-us/ or to …/about-this-website-and-me/ or to …/youll-never-find-out-about-me-hahaha-Xy2834/. Or to all of these. It’s a bit like shortcuts or symbolic links on your hard drive. One URL = one way to find a resource.

With URL rewriting, the URL and the resource that it leads to can be completely independent of each other. In practice, they’re usually not wholly independent: the URL usually contains some code or number or name that enables the CMS to look up the resource. But in theory, this is what URL rewriting provides: a complete separation.

How Does URL Rewriting Help?

Can you guess what this Web page sells?

http://www.diy.com/diy/jsp/bq/nav.jsp?action=detail&fh_secondid=11577676

B&Q went to all the trouble and expense of acquiring diy.com and implementing a stock-controlled e-commerce website, but left its URLs indecipherable. If you guessed “brown guttering,” you might want to consider playing the lottery.

Even when you search directly for “miniflow gutter brown” on Google UK, B&Q’s page comes up only seventh in the organic search results, below much smaller companies, such as a building supplier with a single outlet in Stirlingshire. B&Q has 300+ branches and so is probably much bigger in budget, size and exposure, so why is it not doing as well for this search term? Perhaps because the other search results have URLs like http://www.prof…co.uk/products/brown-miniflo-gutter-148/; that is, the URL itself contains the words in the search term.

screenshot

Almost all of these results on Google have the search term in their URLs (highlighted in green). The one at the bottom does not.

Looking at the URL from B&Q, you would (probably correctly) assume that a file named nav.jsp within the directory /diy/jsp/bq/ is used to display products when given their ID number, 11577676 in this case. That is the resource intimately tied to this URL.

So, how would B&Q go about turning this into something more recognizable, like http://www.diy.com/products/miniflow-gutter-brown/11577676, without restructuring its whole website? The answer is URL rewriting.

Another way to look at URL rewriting is like a thin layer that sits on top of a website, translating human- and search-engine-friendly URLs into actual URLs. Doing it is easy because it requires hardly any changes to the website’s underlying structure — no moving files around or renaming things.

URL rewriting basically tells the Web server that
/products/miniflow-gutter-brown/11577676 should show the Web page at: /diy/jsp/bq/nav.jsp?action=detail&fh_secondid=11577676,
without the customer or search engine knowing about it.

Many factors (or “signals”), of course, determine the search ranking for a particular term: over 200 of them, according to Google [1]. But friendly and readable URLs are consistently ranked as one of the most important [2] of those factors. They also help humans to quickly figure out what a page is about.

The next section describes how this is done.

How To Rewrite URLs

Whether you can implement URL rewriting on a website depends on the Web server. Apache usually comes with the URL rewriting module, mod_rewrite, already installed. This setup is very common and is the basis for all of the examples in this article. ISAPI Rewrite [3] is a similar third-party module for Windows IIS but requires payment (about $100 US) and installation; IIS 7 and later can instead use Microsoft's free URL Rewrite module.

The Simplest Case

The simplest case of URL rewriting is to rename a single static Web page, and this is far easier than the B&Q example above. To use Apache’s URL rewriting function, you will need to create or edit the .htaccess file in your website’s document root (or, less commonly, in a subdirectory).

For instance, if you have a Web page about horses named Xu8JuefAtua.htm, you could add these lines to .htaccess:

RewriteEngine On
RewriteRule   horses.htm   Xu8JuefAtua.htm

Now, if you visit http://www.mywebsite.com/horses.htm, you’ll actually be shown the Web page Xu8JuefAtua.htm. Furthermore, your browser will remain at horses.htm, so visitors and search engines will never know that you originally gave the page such a cryptic name.

Introducing Regular Expressions

In URL rewriting, you need only match the path of the URL, not including the domain name or the first slash. The rule above essentially tells Apache that if the path contains horses.htm, then show the Web page Xu8JuefAtua.htm. This is slightly problematic, because you could also visit http://www.mywebsite.com/reallyfasthorses.html, and it would still work. So, what we really need is this:

RewriteEngine On
RewriteRule   ^horses\.htm$   Xu8JuefAtua.htm

The ^horses\.htm$ is not just a search string, but a regular expression [4], in which special characters (such as ^ . + * ? | ( ) [ ] { } \ and $) have extra significance. The ^ matches the beginning of the URL’s path, and the $ matches the end. The backslash before the dot makes it match a literal dot; an unescaped dot would match any character. This says that the path must be exactly horses.htm. So, only horses.htm will work, and not reallyfasthorses.htm or horses.html. This is important for search engines like Google, which can penalize what they view as duplicate content [5]: identical pages that can be reached via multiple URLs.
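
To see the difference the anchors make, the two patterns can be tried against a few paths. The sketch below uses PHP's preg_match, which, like mod_rewrite, works with Perl-compatible regular expressions. Note that rule_matches is a hypothetical helper written for this illustration; it is not part of Apache or PHP.

```php
<?php
// Hypothetical helper: does a mod_rewrite-style pattern match a URL path?
// '~' is used as the delimiter, so the pattern itself must not contain '~'.
function rule_matches(string $pattern, string $path): bool {
    return preg_match('~' . $pattern . '~', $path) === 1;
}

$paths = ['horses.htm', 'reallyfasthorses.html', 'horses.html'];
foreach (['horses.htm', '^horses\.htm$'] as $pattern) {
    foreach ($paths as $path) {
        $result = rule_matches($pattern, $path) ? 'match' : 'no match';
        echo "$pattern vs $path: $result\n";
    }
}
```

The unanchored pattern matches all three paths, while the anchored one matches only horses.htm.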

Without File Endings

You can make this even better by ditching the file ending altogether, so that you can visit either http://www.mywebsite.com/horses or http://www.mywebsite.com/horses/:

RewriteEngine On
RewriteRule   ^horses/?$   Xu8JuefAtua.htm  [NC]

The ? indicates that the preceding character is optional. So, in this case, the URL would work with or without the slash at the end. These would not be considered duplicate URLs by a search engine, but would help prevent confusion if people (or link checkers) accidentally added a slash. The stuff in brackets at the end of the rule gives Apache some further pointers. [NC] is a flag that means that the rule is case insensitive, so http://www.mywebsite.com/HoRsEs would also work.

Wikipedia Example

We can now look at a real-world example. Wikipedia appears to use URL rewriting, passing the title of the page to a PHP file. For instance…

http://en.wikipedia.org/wiki/Barack_obama

… is rewritten to:

http://en.wikipedia.org/w/index.php?title=Barack_obama

This could well be implemented with an .htaccess file, like so:

RewriteEngine On
#Look for the word "wiki" followed by a slash, and then the article title
RewriteRule   ^wiki/(.+)$   w/index.php?title=$1   [L]

The previous rule had /?, which meant zero or one slashes. If it had said /+, it would have meant one or more slashes, so even http://www.mywebsite.com/horses//// would have worked. In this rule, the dot (.) matches any character, so .+ matches one or more of any character — that is, essentially anything. And the parentheses — ( ) — ask Apache to remember what the .+ is. The rule above, then, tells Apache to look for wiki/ followed by one or more of any character and to remember what it is. This is remembered and then rewritten as $1. So, when the rewriting is finished, wiki/Barack_obama becomes w/index.php?title=Barack_obama

Thus, the page w/index.php is called, passing Barack_obama as a parameter. The w/index.php is probably a PHP page that runs a database lookup — like SELECT * FROM articles WHERE title='Barack obama' — and then outputs the HTML.
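
The capture-and-substitute step can be tried outside Apache. The following is only a rough PHP equivalent of the rule above, using preg_replace; rewrite_wiki_path is a hypothetical name, and this is not how Wikipedia or mod_rewrite is actually implemented.

```php
<?php
// Rough PHP equivalent of: RewriteRule ^wiki/(.+)$ w/index.php?title=$1
// rewrite_wiki_path is a hypothetical name used only for this illustration.
function rewrite_wiki_path(string $path): string {
    // (.+) captures the article title; $1 re-inserts it into the target.
    return preg_replace('~^wiki/(.+)$~', 'w/index.php?title=$1', $path);
}

echo rewrite_wiki_path('wiki/Barack_obama'), "\n";  // w/index.php?title=Barack_obama
echo rewrite_wiki_path('about/Barack_obama'), "\n"; // unchanged: the pattern does not match
```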

screenshot

You can also view Wikipedia entries directly, without the URL rewriting.

Comments and Flags

The example above also introduced comments. Anything after a # is ignored by Apache, so it’s a good idea to explain your rewriting rules so that future generations can understand them. The [L] flag means that if this rule matches, Apache can stop now. Otherwise, Apache would continue applying subsequent rules, which is a powerful feature but unnecessary for all but the most complex rule sets.

Implementing the B&Q Example

The recommendation for B&Q above could be implemented with an .htaccess file, like so:

RewriteEngine On
#Look for the word "products" followed by slash, product title, slash, id number
RewriteRule  ^products/.*/([0-9]+)$   diy/jsp/bq/nav.jsp?action=detail&fh_secondid=$1 [NC,L]

Here, the .* matches zero or more of any character, so nothing or anything. And the [0-9] matches a single numerical digit, so [0-9]+ matches one or more digits.

The next section covers a couple of more complex conditional examples. You can also read the Apache rewriting guide [6] for much more information on all that URL rewriting has to offer.

Conditional Rewriting

URL rewriting can also include conditions and make use of environment variables. These two features make for an easy way to redirect requests from one domain alias to another. This is especially useful if a website changes its domain, from mywebsite.co.uk to mywebsite.com for example.

Domain Forwarding

Most domain registrars allow for domain forwarding, which redirects all requests from one domain to another domain, but which might send requests for www.mywebsite.co.uk/horses to the home page at www.mywebsite.com and not to www.mywebsite.com/horses. You can achieve this with URL rewriting instead:

RewriteEngine On
RewriteCond   %{HTTP_HOST}   !^www\.mywebsite\.com$       [NC]
RewriteRule   (.*)           http://www.mywebsite.com/$1  [L,R=301]

The second line in this example is a RewriteCond, rather than a RewriteRule. It is used to compare an Apache environment variable on the left (such as the host name in this case) with a regular expression on the right. Only if this condition is true will the rule on the next line be considered.

In this case, %{HTTP_HOST} represents www.mywebsite.co.uk, the host (i.e. domain) that the browser is trying to visit. The ! means “not.” This tells Apache, if the host does not begin and end with www.mywebsite.com, then remember and rewrite zero or more of any character to www.mywebsite.com/$1. This converts www.mywebsite.co.uk/anything-at-all to www.mywebsite.com/anything-at-all. And it will work for all other aliases as well, like www.mywebsite.biz/anything-at-all and mywebsite.com/anything-at-all.

The flag [R=301] is very important. It tells Apache to do a 301 (i.e. permanent) redirect. Apache will send the new URL back to the browser or search engine, and the browser or search engine will have to request it again. Unlike all of the examples above, the new URL will now appear in the browser’s location bar. And search engines will take note of the new URL and update their databases. [R] by itself is the same as [R=302] and signifies a temporary redirect.
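
The decision that the condition and the rule make together can be sketched in PHP. This is a minimal illustration, assuming www.mywebsite.com is the preferred host; canonical_redirect is a hypothetical function, not something Apache or PHP provides.

```php
<?php
// Hypothetical sketch of the logic behind the RewriteCond/RewriteRule pair:
// if the requested host is not the preferred one, build the redirect target.
function canonical_redirect(
    string $host,
    string $requestUri,
    string $canonicalHost = 'www.mywebsite.com'
): ?string {
    // Case-insensitive comparison, like the [NC] flag on the condition.
    if (strcasecmp($host, $canonicalHost) === 0) {
        return null; // already on the preferred host, nothing to do
    }
    // Keep the path so /horses on an alias lands on /horses, not the home page.
    return 'http://' . $canonicalHost . $requestUri;
}

$target = canonical_redirect('www.mywebsite.co.uk', '/horses');
echo $target, "\n"; // http://www.mywebsite.com/horses
```

A real page would follow a non-null result with a 301 Location header, which is what [R=301] does for you in Apache.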

File Existence and WordPress

Smashing Magazine runs on the popular blogging software WordPress. WordPress enables the author to choose their own URL, called a “slug.” Then, it automatically prepends the date, such as http://coding.smashingmagazine.com/2011/09/05/getting-started-with-the-paypal-api/. In your pre-URL rewriting days, you might have assumed that Smashing Magazine’s Web server was actually serving up a file located at …/2011/09/05/getting-started-with-the-paypal-api/index.html. In fact, WordPress uses URL rewriting extensively.

screenshot

WordPress enables the author to choose their own URL for an article.

WordPress’ .htaccess file looks like this:

RewriteEngine On
RewriteBase /  
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]

The -f means “this is a file” and -d means “this is a directory.” This tells Apache, if the requested file name is not a file, and the requested file name is not a directory, then rewrite everything (i.e. any path containing any character) to the page index.php. If you are requesting an existing image or the log-in page wp-login.php, then the rule is not triggered. But if you request anything else, like /2011/09/05/getting-started-with-the-paypal-api/, then the file index.php jumps into action.

Internally, index.php (probably) looks at the server variable $_SERVER['REQUEST_URI'] and extracts the information that it needs to find out what it is looking for. This gives it even more flexibility than Apache’s rewrite rules and enables WordPress to mimic some very sophisticated URL rewriting rules. In fact, when administering a WordPress blog, you can go to Settings → Permalinks on the left side, and choose the type of URL rewriting that you would like to mimic.

screenshot

WordPress’ permalink settings, letting you choose the type of URL rewriting that you would like to mimic.
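
To give a feel for what such an index.php might do with the request URI, here is a minimal sketch that picks apart a date-based permalink. It is not WordPress's actual code; parse_permalink and the array keys it returns are assumptions made for this example.

```php
<?php
// Hypothetical front-controller helper: extract date and slug from a URL
// such as /2011/09/05/getting-started-with-the-paypal-api/
function parse_permalink(string $requestUri): ?array {
    // Drop the query string and keep only the path.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }
    $pattern = '~^/(\d{4})/(\d{2})/(\d{2})/([a-z0-9-]+)/?$~';
    if (preg_match($pattern, $path, $m) !== 1) {
        return null; // not a date-based permalink
    }
    return [
        'year'  => (int) $m[1],
        'month' => (int) $m[2],
        'day'   => (int) $m[3],
        'slug'  => $m[4],
    ];
}

print_r(parse_permalink('/2011/09/05/getting-started-with-the-paypal-api/'));
```

The slug would then typically be used to look up the article in the database.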

Rewriting Query Strings

If you are hired to recreate an existing website from scratch, you might use URL rewriting to redirect the 20 most popular URLs on the old website to the locations on the new website. This could involve redirecting things like prod.php?id=20 to products/great-product/2342, which itself gets rewritten to the actual product page.

Apache’s RewriteRule applies only to the path in the URL, not to parameters like id=20. To do this type of rewriting, you will need to refer to the Apache environment variable %{QUERY_STRING}. This can be accomplished like so:

RewriteEngine On
RewriteCond   %{QUERY_STRING}           ^id=20$
RewriteRule   ^prod\.php$               /products/great-product/2342?   [L,R=301]
RewriteRule   ^products/(.*)/([0-9]+)$  productview.php?id=$2           [L]

In this example, the first RewriteRule triggers a permanent redirect from the old website’s URL to the new website’s URL; the trailing ? in its target discards the old id=20 query string instead of appending it to the new URL. The second rule rewrites the new URL to the actual PHP page that displays the product, passing the ID number captured by the second set of parentheses ($2).
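
With 20 old URLs to redirect, writing a condition and rule for each one gets repetitive. One alternative, sketched here under the assumption that the old prod.php still exists and can run PHP, is a small lookup table. LEGACY_REDIRECTS and legacy_redirect_target are hypothetical names, and the single entry simply mirrors the example above.

```php
<?php
// Hypothetical lookup table: old product ID => new friendly URL.
const LEGACY_REDIRECTS = [
    20 => '/products/great-product/2342',
];

// Return the new URL for an old-style request such as /prod.php?id=20,
// or null if the request is not a known legacy URL.
function legacy_redirect_target(string $requestUri): ?string {
    $path  = parse_url($requestUri, PHP_URL_PATH);
    $query = parse_url($requestUri, PHP_URL_QUERY);
    if ($path !== '/prod.php' || !is_string($query)) {
        return null;
    }
    parse_str($query, $params);
    $id = $params['id'] ?? null;
    // Accept only plain digit strings as IDs.
    if (!is_string($id) || preg_match('/^[0-9]+$/', $id) !== 1) {
        return null;
    }
    return LEGACY_REDIRECTS[(int) $id] ?? null;
}

echo legacy_redirect_target('/prod.php?id=20'), "\n"; // /products/great-product/2342
```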

Examples Of URL Rewriting On Shopping Websites

For complex content-managed websites, there is still the issue of how to map friendly URLs to underlying resources. The simple examples above did that mapping by hand, manually associating a URL like horses.htm with the file or resource Xu8JuefAtua.htm. Wikipedia looks up the resource based on the title, and WordPress applies some complex internal rule sets. But what if your data is more complex, with thousands of products in hundreds of categories? This section shows the approach that Amazon and many other shopping websites take.

If you’ve ever come across a URL like this on Amazon, http://www.amazon.co.uk/High-Voltage-AC-DC/dp/B00008AJL3, you might have assumed that Amazon’s website has a subdirectory named /High-Voltage-AC-DC/dp/ that contains a file named B00008AJL3.

This is very unlikely. You could try changing the name of the top-level “directory” and you would still arrive on the same page, http://www.amazon.co.uk/Test-Voltage-AC-DC/dp/B00008AJL3.

The bit at the end is what really matters. Looking down the page, you’ll see that B00008AJL3 is this AC/DC album’s ASIN (Amazon Standard Identification Number). If you change that, you’ll get a “Page not found” or an entirely different product: http://www.amazon.co.uk/High-Voltage-AC-DC/dp/B003BEZ7HI.

The /dp/ also matters. Changing this leads to a “Page not found.” So, the B00008AJL3 probably tells Amazon what to display, and the dp tells the website how to display it. This is URL rewriting in action, with the original URL possibly ending up getting rewritten to something like:
http://www.amazon.co.uk/displayproduct.php?asin=B00008AJL3.

Features of an Amazon URL

This introduces some important features of Amazon’s URLs that can be applied to any website with a complex set of resources. It shows that the URL can be automatically generated and can include up to three parts:

  1. The words
    In this case, the words are based on the album and artist, and all non-alphanumeric characters are replaced. So, the slash in AC/DC becomes a hyphen. This is the bit that helps humans and search engines.
  2. An ID number
    Or something that tells the website what to look up, such as B00008AJL3.
  3. An identifier
    Or something that tells the website where to look for it and how to display it. If dp tells Amazon to look for a product, then somewhere along the line, it probably triggers a database statement such as SELECT * FROM products WHERE id='B00008AJL3'.
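
The three parts listed above can be separated with a single regular expression. This is a sketch of the general idea only; Amazon's real implementation is not public, and parse_shop_url and its array keys are made up for this example.

```php
<?php
// Hypothetical parser for URLs shaped like /High-Voltage-AC-DC/dp/B00008AJL3
// Returns the words, the identifier and the ID, or null if the shape differs.
function parse_shop_url(string $path): ?array {
    if (preg_match('~^/([^/]+)/([a-z]+)/([A-Z0-9]+)$~', $path, $m) !== 1) {
        return null;
    }
    return [
        'words'      => $m[1], // for humans and search engines only
        'identifier' => $m[2], // what kind of thing to display
        'id'         => $m[3], // which one to look up
    ];
}

$a = parse_shop_url('/High-Voltage-AC-DC/dp/B00008AJL3');
$b = parse_shop_url('/Test-Voltage-AC-DC/dp/B00008AJL3');
// Different words, same product: only the identifier and ID drive the lookup.
var_dump($a['id'] === $b['id']); // bool(true)
```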

Other Shopping Examples

Many other shopping websites have URLs like this. Each URL in the list below contains an ID number and a (suspected) identifier alongside the descriptive words:

  • http://www.ebay.co.uk/itm/Ian-Rankin-Set-Darkness-Rebus-Novel-/140604842997
  • http://www.kelkoo.com/c-138201-lighting/brand/caravan
  • http://www.ciao.co.uk/Fridge_Freezers_5266430_3
  • http://www.gumtree.com/p/for-sale/boys-bmx-bronx-blaze/97669042
  • http://www.comet.co.uk/c/Televisions/LCD-Plasma-LED-TVs/1844

A significant benefit of this type of URL is that the actual words can be changed, as shown below. As long as the ID number stays the same, the URL will still work. So products can be renamed without breaking old links. More sophisticated websites (like Ciao above) will redirect the changed URL back to the real one and thus avoid creating the appearance of duplicate content (see below for more on this topic).

screenshot

Websites that use URL rewriting are more flexible with their URLs — the words can change but the page will still be found.

Friendly URLs

Now you know how to map nice friendly URLs to their underlying Web pages, but how should you create those friendly URLs in the first place?

If we followed the current advice, we would separate words with hyphens rather than underscores [7] and capitalize consistently [8]. Lowercase might be preferable because most people search in lowercase. Punctuation such as dots and commas should also be turned into hyphens, otherwise they would get turned into things like %2C, which look ugly and might break the URL when copied and pasted. You might want to remove apostrophes and parentheses entirely for the same reason.

Whether to replace accented characters is debatable. URLs with accents (or any non-Roman characters) might look bad or break when rendered in a different character format. But replacing them with their non-accented equivalents might make the URLs harder for search engines to find (and even harder if replaced with hyphens). If your website is for a predominately French audience, then perhaps leave the French accents in. But substitute them if the French words are few and far between on a mainly English website.

This PHP function succinctly handles all of the above suggestions:

function GenerateUrl ($s) {
  //Convert accented characters, and remove parentheses and apostrophes
  $from = explode (',', "ç,æ,œ,á,é,í,ó,ú,à,è,ì,ò,ù,ä,ë,ï,ö,ü,ÿ,â,ê,î,ô,û,å,e,i,ø,u,(,),[,],'");
  $to = explode (',', 'c,ae,oe,a,e,i,o,u,a,e,i,o,u,a,e,i,o,u,y,a,e,i,o,u,a,e,i,o,u,,,,,,');
  //Do the replacements, and convert runs of other non-alphanumeric characters to hyphens
  $s = preg_replace ('~[^\w\d]+~', '-', str_replace ($from, $to, trim ($s)));
  //Remove hyphens at the beginning or end and make lowercase
  return strtolower (trim ($s, '-'));
}

This would generate URLs like this:

echo GenerateUrl ("Pâtisserie (Always FRESH!)"); //prints "patisserie-always-fresh"

Or, if you wanted a link to a $product variable to be pulled from a database:

$product = array ('title'=>'Great product', 'id'=>100);
echo '<a href="' . GenerateUrl ($product['title']) . '/' . $product['id'] . '">';
echo $product['title'] . '</a>';

Changing Page Names

Search engines generally ignore duplicate content [5] (i.e. multiple pages with the same information). But if they think they are being manipulated, search engines will actively penalize the website, so avoid this where possible. Google recommends using 301 redirects to send users from old pages to new ones.

When a URL-rewritten page is renamed, the old URL and new URL should both still work. Furthermore, to avoid any risk of duplication, the old URL should automatically redirect to the new one, as WordPress does.

Doing this in PHP is relatively easy. The following function looks at the current URL, and if it’s not the same as the desired URL, it redirects the user:

function CheckUrl ($s) {
  // Get the current URL path without the query string, with the initial slash
  $myurl = preg_replace ('/\?.*$/', '', $_SERVER['REQUEST_URI']);
  // If it is not the same as the desired URL, then send a permanent redirect and stop
  if ($myurl != "/$s") {header ("Location: /$s", true, 301); exit;}
}

This would be used like so:

$producturl = GenerateUrl ($product['title']) . '/' . $product['id'];
CheckUrl ($producturl); //redirects the user if they are at the wrong place

If you would like to use this function, be sure to test it in your environment first and with your rewrite rules, to make sure that it does not cause any infinite redirects. This is what that would look like:

screenshot

This is what happens when Google Chrome visits a page that redirects to itself.
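
One way to check for loops before going live is to separate the decision from the header call, so that the decision can be tested on its own. The sketch below does this; redirect_target is a hypothetical variant of CheckUrl written for this illustration, not a replacement for testing against your real rewrite rules.

```php
<?php
// Hypothetical, side-effect-free variant of CheckUrl: returns the URL to
// redirect to, or null if the visitor is already at the desired URL.
function redirect_target(string $requestUri, string $desired): ?string {
    // Current path without the query string.
    $current = preg_replace('/\?.*$/', '', $requestUri);
    $wanted  = '/' . ltrim($desired, '/');
    return $current === $wanted ? null : $wanted;
}

// An old product name redirects once...
$first = redirect_target('/old-name/100', 'great-product/100');
echo $first, "\n"; // /great-product/100

// ...and following that redirect must not trigger another one.
var_dump(redirect_target($first, 'great-product/100')); // NULL
```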

Checklist And Troubleshooting

Use the following checklist to implement URL rewriting.

1. Check That It’s Supported

Not all Web servers support URL rewriting. If you put up your .htaccess file on one that doesn’t, it will be ignored or will throw up a “500 Internal Server Error.”

2. Plan Your Approach

Figure out what will get mapped to what, and how the correct information will still get found. Perhaps you want to introduce new URLs, like my-great-product/p/123, to replace your current product URLs, like product.php?id=123, and to substitute new-category/c/12 for category.php?id=12.

3. Create Your Rewrite Rules

Create an .htaccess file for your new rules. You can initially do this in a /testing/ subdirectory and using the [R] flag, so that you can see where things go:

RewriteEngine On
RewriteRule   ^.+/p/([0-9]+)$   product.php?id=$1     [NC,L,R]
RewriteRule   ^.+/c/([0-9]+)$   category.php?id=$1    [NC,L,R]

Now, if you visit www.mywebsite.com/testing/my-great-product/p/123, you should be sent to www.mywebsite.com/testing/product.php?id=123. You’ll get a “Page not found” because product.php is not in your /testing/ subdirectory, but at least you’ll know that your rules work. Once you’re satisfied, move the .htaccess file to your document root and remove the [R] flag. Now www.mywebsite.com/my-great-product/p/123 should work.
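
Before uploading anything, the patterns themselves can be sanity-checked in PHP. The sketch below mimics the two rules with preg_replace and stops at the first match, roughly as the [L] flag does; apply_rules is a hypothetical helper, and it only approximates how Apache processes rules.

```php
<?php
// Hypothetical simulation of the two rewrite rules. The 'i' modifier plays
// the role of [NC]; returning on the first match plays the role of [L].
function apply_rules(string $path): ?string {
    $rules = [
        '~^.+/p/([0-9]+)$~i' => 'product.php?id=$1',
        '~^.+/c/([0-9]+)$~i' => 'category.php?id=$1',
    ];
    foreach ($rules as $pattern => $target) {
        if (preg_match($pattern, $path) === 1) {
            return preg_replace($pattern, $target, $path);
        }
    }
    return null; // no rule matched; Apache would serve the path as-is
}

echo apply_rules('my-great-product/p/123'), "\n"; // product.php?id=123
echo apply_rules('new-category/c/12'), "\n";      // category.php?id=12
```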

4. Check Your Pages

Test that your new URLs bring in all the correct images, CSS and JavaScript files. For example, the Web browser now believes that your Web page is named 123 in a directory named my-great-product/p/. If the HTML refers to a file named images/logo.jpg, then the Web browser would request the image from www.mywebsite.com/my-great-product/p/images/logo.jpg and would come up with a “File not found.”

You would need to also rewrite the image locations or make the references absolute (like <img src="/images/logo.jpg"/>) or put a base href at the top of the <head> of the page (<base href="/product.php"/>). But if you do that, you would need to fully specify any internal links that begin with # or ? because they would now go to something like product.php#details.

5. Change Your URLs

Now find all references to your old URLs, and replace them with your new URLs, using a function such as GenerateUrl to consistently create the new URLs. This is the only step that might require looking deep into the underlying code of your website.

6. Automatically Redirect Your Old URLs

Now that the URL rewriting is in place, you probably want Google to forget about your old URLs and start using the new ones. That is, when a search result brings up product.php?id=123, you’d want the user to be visibly redirected to my-great-product/p/123, which would then be internally rewritten back to product.php?id=123.

This is the reverse of what your URL rewriting already does. In fact, you could add another rule to .htaccess to achieve this, but if you get the rules in the wrong order, then the browser would go into a redirect loop.

Another approach is to do the first redirect in PHP, using something like the CheckUrl function above. This has the added advantage that if you rename the product, the old URL will immediately become invalid and redirect to the newest one.

7. Update and Resubmit Your Site Map

Make sure to carry through your new URLs to your site map, your product feeds and everywhere else they appear.

Conclusion

URL rewriting is a relatively quick and easy way to improve your website’s appeal to customers and search engines. We’ve tried to explain some real examples of URL rewriting and to provide the technical details for implementing it on your own website. Please leave any comments or suggestions below.

Footnotes

  1. http://www.google.com/about/corporate/company/tech.html
  2. http://searchengineland.com/21-essential-seo-tips-techniques-11580
  3. http://www.isapirewrite.com/
  4. http://httpd.apache.org/docs/current/rewrite/intro.html#regex
  5. http://www.google.com/support/webmasters/bin/answer.py?answer=66359&&hl=en
  6. http://httpd.apache.org/docs/current/rewrite/intro.html
  7. http://www.mattcutts.com/blog/dashes-vs-underscores/
  8. http://www.leadqual.com/blog/searchengineoptimization-seo/capitalization-and-case-sensitivity-in-urls-matters-for-seo
  9. http://www.google.com/support/webmasters/bin/answer.py?answer=66359&&hl=en

Paul Tero is an experienced PHP programmer and server administrator. He developed the Stockashop ecommerce system in 2005 for Sensable Media. He now works part-time maintaining and developing Stockashop, and the rest of the time freelancing from a corner of his living room, and sleeping, eating, having fun, etc. He has also written numerous other open sourcish scripts and programs.

  1. 1

    Great!! This is exactly what I was looking for yesterday for a new project.

    I’m gonna read it very carefully…

    Thanks

  2. 2

    Excellent! This couldn’t have come at a better time.

  3. 3

    Wow. This post seems like a book for me, full of knowledge. I was really amazed. Great post.

  4. 4

    Awesome :)
    It is a short, complete guide to URL rewriting for *nix server environment.
    Thanks Paul.

  5. 5

    this is really an awesome post ……thanks for sharing

  6. 6

    Great time for an article like this. Thanks for the indepth look.

  7. 7

    awesome! thank you!

  8. 8

    Just a note for those of us who don’t live in a LAMP world…

    IIS 7 (since Win Server 2008) has had a free URLRewrite module that I use quite successfully.

    http://www.iis.net/download/urlrewrite

    ISAPI Rewrite is third party and intended for IIS6 (think Windows 2003)

  9. 9

    “File Existence and WordPress: Smashing Magazine runs on the popular blogging software WordPress.” …FYI, WordPress is a CMS, not just blogging software. Thanks for the great article!

  10. 10

    Thank you so much – this is probably the best written article I’ve come across on URL rewriting. Definitely gonna bookmark it for further study.

    Give that man a Bell’s !

  11. 11

    One thing to also remember is to set canonical tags on a page. For example, the Barack Obama page has a tag that reads:

    This tells search engines that while the page can be accessed via http://en.wikipedia.org/wiki/Barack_obama and http://en.wikipedia.org/w/index.php?title=Barack_obama, the two pages are actually the same page and are not duplicates made for SEO purposes which may result in penalties from the search engines.

    • 12

      Agustin Giannastasio

      November 4, 2011 8:23 pm

      Hi John, this SEO issue (duplicated content) is avoided by this line:

      It tells which URL is preferred, so Google will takes only this URL to rank on search results ;)

  12. 13

    Excellent tutorial I’ve been rewriting urls for a few years now my way isn’t quite as elegant as the GenerateUrl function, well not anymore :)

    I prefer to add the converted url to a database then it can be used across the project.

    • 14

      I found this all confusing until you explained everything and made it seem so easy. Thanks. This is a great article and will help a ton!

  13. 15

    Great!!!!!

    //Convert accented characters, and remove parentheses and apostrophes
    $from = explode (‘,’, “ç,æ,œ,á,é,í,ó,ú,à,è,ì,ò,ù,ä,ë,ï,ö,ü,ÿ,â,ê,î,ô,û,å,e,i,ø,u,(,),[,],'”);
    $to = explode (‘,’, ‘c,ae,oe,a,e,i,o,u,a,e,i,o,u,a,e,i,o,u,y,a,e,i,o,u,a,e,i,o,u,,,,,,’);

  14. 16

    oh man, I’ve been striving all day long with htaccess at work today, read this too late!
    tomorrow I’ll finish it up nicely thank to you ^^

  15. 17

    Two little tips.

    If the data is not changing, it's better to make an array of it only once, instead of exploding something that basically never changes.

    In the regex, '.*' is too greedy; it's better to make it more explicit, like '[^/]+'.
    Otherwise the '/' may get captured too.

    This too can be improved:
    return strtolower (preg_replace ('/^-/', '', preg_replace ('/-$/', '', $s)));

    Into:

    return strtolower(preg_replace ('/^-|-$/', '', $s ));

    Or even better, use strtolower(trim($s, '-')), which is much faster than a regex.

    • 18

      Very good point – thanks for that. Using trim is much better. And thanks to everybody for the very positive comments. I’m glad you all liked the article.

  16. 19

    Nice article but you should definitely mention the QSA flag. Many people don’t know how to pass query strings to the rewritten URL. This might as well break google adwords auto tagging and you wouldn’t be able to see adwords stats in analytics.

    See http://httpd.apache.org/docs/2.3/rewrite/flags.html#flag_qsa

  17. 20

    A very informative post. Thanks!

  18. 21

    Great post! I’ve been using using .htaccess for a couple years, but I’ve never seen a tutorial that explains the meaning of all the special characters used in .htaccess as nice as you did. It was very helpful.

  19. 22

    Wow, just wow!
    Just another very good reason why Smashing is a top notch resource.

  20. 23

    This is by far one of the best articles I’ve ever come across. Thank you so much for this outstanding information…

  21. 24

    Excellent article! Thanks :)

  22. 25

    If you’re not using Apache I have found ISAPI rewrite excellent on IIS, follows similar syntax to the Apache rewrite and same functionality. Been using it for about 4 years now with no issues, except a massive txt file of rewrite rules.

  23. 26

    Amazing information!
    My only problem: shall I bookmark this, or save the entire page for future reference? I’m sure that I’ll need this information again, and I’m also sure that by then I will have forgotten most of it…

  24. 27

    If you have a website with the navigation set up (whateversitename.index.html) and a page with collie puppies to sell, couldn’t you just name and link the page whateversitename.com/beautiful-collie-puppies.html without having to go through all the other .htaccess stuff?

    I mean, isn’t it just as effective and seo friendly?

  25. 28

    Nice this is the only step that requires looking deeply into the underlying code on your website.

    nice work thank you

  26. 29

    Excellent! Will have a look to apply this as soon as possible. Thanks!

  27. 30

    You should be careful with rules like
    RewriteRule ^products/.*/([0-9]+)$ diy/jsp/bq/nav.jsp?action=detail&fh_secondid=$1 [NC,L]

    While it’s nice that you have URL example.com/products/lawnmower/123 this also allows someone to use example.com/products/we-eat-babies/123 and it will show the exact same page. You don’t want that URL to show up in your Google results. There are real-life examples of online magazine articles with “funny” URLs.

    The article suggests the checkUrl() function for redirects. You can use that same function to show 404 error on invalid URLs.
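
    A rough sketch of that idea follows. The function name resolveProductRequest, its parameters and the /products/ URL layout are assumptions made for illustration; the article’s checkUrl() may be structured differently.

```php
<?php
// Hypothetical helper: decides how to answer a request for /products/<slug>/<id>.
// $canonicalSlug is the slug stored for that product id, or null if the id is unknown.
function resolveProductRequest(string $requestedSlug, int $id, ?string $canonicalSlug): array
{
    if ($canonicalSlug === null) {
        // No such product: answer with a 404.
        return ['status' => 404, 'location' => null];
    }
    if ($requestedSlug !== $canonicalSlug) {
        // Right id, wrong words in the URL: send the visitor to the real URL.
        // Returning a 404 here instead is the stricter alternative.
        return ['status' => 301, 'location' => "/products/$canonicalSlug/$id"];
    }
    return ['status' => 200, 'location' => null];
}

$result = resolveProductRequest('we-eat-babies', 123, 'lawnmower');
```

    Here $result['status'] is 301 and $result['location'] is /products/lawnmower/123, so the made-up URL never gets a page of its own. The caller would still have to look up the stored slug and send the matching header.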

  28. 31

    Would also be cool to replace the ampersand with an ‘and’ string

  29. 32

    Several examples seem to forget that the pattern in a RewriteRule is a regex. For example, ‘RewriteRule ^horses.htm$ Xu8JuefAtua.htm’ will also rewrite ‘horses$htm’ and ‘horses=htm’ to ‘Xu8JuefAtua.htm’, because an unescaped dot matches any character.
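
    The effect is easy to check outside Apache. mod_rewrite patterns are Perl-compatible regular expressions, so PHP’s preg_match gives a reasonable approximation; the sample paths below are made up for illustration.

```php
<?php
// Unescaped: "." matches any single character.
$loose = '~^horses.htm$~';
// Escaped: "\." matches only a literal dot.
$strict = '~^horses\.htm$~';

foreach (['horses.htm', 'horses$htm', 'horsesXhtm'] as $path) {
    printf(
        "%-12s loose=%d strict=%d\n",
        $path,
        preg_match($loose, $path),
        preg_match($strict, $path)
    );
}
```

    Only the first path matches the escaped pattern, while all three match the unescaped one. In the .htaccess rule the equivalent fix would be to write the pattern as ^horses\.htm$.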

    I personally dislike URIs along the lines of ‘http://www.amazon.co.uk/High-Voltage-AC-DC/dp/B00008AJL3’. There’s too much opaque cruft in there. IMO, it would be better to have ‘http://www.amazon.co.uk/product/B00008AJL3’ point at that page and ‘http://www.amazon.co.uk/product/High-Voltage-AC-DC/’ do a search for the product with the given name (also searching older names), redirecting automatically if only one match is found.

    By the way, you might want to turn off the syntax highlighting on the .htaccess examples, since it seems to be meant for a different language.

    Aside from those things, great article! This really is an excellent introduction to and overview of URL rewriting.

  30. 33

    Great post, URL rewriting can be one of the best and quickest ways to improve the usability and search friendliness of your site.

    Morgan Todd Memphis, TN

  31. 34

    Well explained, very clear and nice article! We can get the ideas in a single glance itself!

    I was really confused of all these things, and now I get a clear idea of URL rewriting.

    Thanks again!

  32. 35

    Superb article. Answers all my questions!

  33. 36

    Eduardo Hernández Villa

    November 3, 2011 8:29 pm

    Hey guys when you gonna do the Joomla! section! or some book about it.

  34. 37

    Note: If you can’t use the Apache rewriting module (“mod_rewrite” in combination with “.htaccess”), you might consider using the PATH_INFO variable.

    Back in the days of PHP 4 and complicated, unreadable Apache manuals, this proved to be the quick’n’grimy way to still get some partially good-looking URLs.

    I was reminded of this technique when reading your WordPress example; just recently I had a few cases in which regular server-based rewriting was not available, or to be exact, not reliable, because the actual deployment of those projects was carried out by the customer’s own system administration team. So I had no direct access to their servers and thus also would not know if the supplied .htaccess chain would properly – or at all (!) – work.

    These projects were thus set up with a different permalink rewriting (WP > Admin > Settings > Permalink(s)), by selecting the wanted rewrite and THEN adding “/index.php” to the mix. And that does the magic ;)

    Of course, you can use this with any PHP-based project, and a quick look-up on the net told me that this method is available for lots of other programming languages as well (eg. Perl =>
    “CGI Environmental Variables”, Ruby, Python or C).
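
    For a plain PHP project, the idea might look like the sketch below. The helper name pathSegments and the /index.php/news/45 URL are assumptions for illustration, and whether PATH_INFO is populated depends on the server configuration.

```php
<?php
// Hypothetical helper: splits a PATH_INFO value such as "/news/45" into segments.
function pathSegments(string $pathInfo): array
{
    // Drop the empty segments produced by leading, trailing or doubled slashes.
    return array_values(array_filter(explode('/', $pathInfo), 'strlen'));
}

// For a request to /index.php/news/45, the server typically sets PATH_INFO
// to "/news/45"; it is missing when nothing follows the script name.
$segments = pathSegments($_SERVER['PATH_INFO'] ?? '');
```

    A front controller could then route on $segments[0] and treat the remaining entries as parameters, with no rewrite rules involved.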

    cu, w0lf.

  35. 38

    Gyanomtech Studios

    November 4, 2011 3:27 am

    This is really awesome stuff. Lucidly and to the point, yet explains all the finer details to a great extent.
    Many Thanks!!

  36. 39

    Jonathan Goldford

    November 4, 2011 1:08 pm

    Excellent article. Very thorough.

  37. 40

    Alessandro Martins

    November 5, 2011 7:49 am

    There are better ways to do this, and you are assuming that everyone will use Apache and have mod_rewrite enabled. Before offering this feature to your customer, make sure the minimum requirements are met.

    About the transliteration (it is more than a simple character conversion, see: http://bit.ly/viPftQ), the best way is to use iconv (http://bit.ly/rLxnA2) if available, especially if the system needs to support more than one language (i18n, http://bit.ly/ruc3vj).
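
    A hedged sketch of the iconv approach: the function name slugifyTranslit is made up for illustration, and what //TRANSLIT produces depends on the iconv implementation and the active locale, so the result for accented input is not guaranteed to be the same on every server.

```php
<?php
// Hypothetical helper: transliterates with iconv when available, then builds a slug.
function slugifyTranslit(string $title): string
{
    $ascii = false;
    if (function_exists('iconv')) {
        // //TRANSLIT asks iconv to approximate characters that ASCII lacks.
        // The @ suppresses the notice iconv raises when conversion fails.
        $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $title);
    }
    if ($ascii === false) {
        // iconv missing or failed: fall back to the raw input.
        $ascii = $title;
    }
    // Anything still non-alphanumeric becomes a hyphen.
    $slug = preg_replace('~[^A-Za-z0-9]+~', '-', $ascii);
    return strtolower(trim($slug, '-'));
}

echo slugifyTranslit('Hello, World!'), "\n";
```

    Whatever iconv returns, the final regex keeps the slug to lowercase letters, digits and hyphens, so an unexpected transliteration degrades the wording rather than breaking the URL.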

    • 41

      Thanks for the comment – I didn’t know about the iconv function – that sounds like a better way of doing the transliteration. Thanks for everybody’s comments.

  38. 42

    Yet Another PHP article.

  39. 43

    Very nicely explained. Good Stuff.

  40. 44

    good timing thanks

  41. 45

    webdevelopergeeks

    November 8, 2011 9:18 pm

    one of the most elegant post i have read for url rewriting. Just yesterday night i was searching for wordpress url rewriting and today i got yours. Thanks a lot…

  42. 46

    I tried to redirect users from a dead url
    RewriteEngine On
    RewriteCond %{HTTP_HOST} !^vcet.collegeczar.com$ [NC]
    RewriteRule (.*) http://www.com/$1 [L,R=301]

    I have placed the .htaccess file in the vcet folder.
    It didn’t work.

    • 47

      Alessandro Martins

      November 20, 2011 3:04 pm

      Make sure you have the “mod_rewrite” module for Apache enabled, and that AllowOverride lets your .htaccess file override the Apache configuration.

  43. 48

    Thanks a lot Paul for the in-depth post on URL rewriting. This obviously took some time, nice! Thanks for sharing!
    Jim
    @SEO_Web_Design

  44. 49

    “Convert accented characters, and remove parentheses and apostrophes”
    They are called diacritics, and there are many, MANY, M A N Y more than the ones listed. Really, there are hundreds. And that’s only counting latin(-derived) languages.

    But beware, in some languages, removing diacritics can actually change the meaning of a word quite dramatically. In those languages, people are very much used to typing them, so removing them will only be counterproductive.

    Example: http://en.wikipedia.org/wiki/Danish_and_Norwegian_alphabet#Norwegian

  45. 51

    Perfectly organized article…!!! Thanks a lot..!!!

  46. 52

    Thanks this was really interesting and had to read the whole article! I loved it, very very useful.

  47. 53

    Thank you for this article. Was very helpful for me.

  48. 54

    Thank you for this article. Was very helpful for me. very very tanks

  49. 55

    Get free Wallpapers

    January 17, 2012 5:53 am

    Thank you, I have recently been searching for info approximately this topic for a while and yours is the best I’ve discovered so far. However, what in regards to the bottom line? Are you sure about the supply?|What i do not understood is actually how you are no longer really much more smartly-liked than you may be right now. You’re very intelligent.

  50. 56

    I’ve read this and other articles you wrote here on SmashingMag. I have to say they’re all easy to read and also filled with nuggets of valuable information. I’ve always struggled to find good explanations on this, invaluable addition to my bookmarks!

  51. 57

    This is a really well structured post. Sadly, the central example is incorrect.

    If you take a look at the source of the page used as an example, you’ll see it’s canonicalised to:

    “/nav/fix/plumbing/guttering/mini_line_guttering/-specificproducttype-gutter/FloPlast-2Mtr-x-76mm-Miniflo-Gutter-Brown-11577676″

    In other words, they already use a URL that includes the product name. (albeit not the prettiest in the world)

    dan

  52. 58

    hi everyone

    please i need your help
    I have a URL like http://www.eng-usalah.com/news45.php and I want to rewrite it as http://www.eng-usalah.com/news/45. How can I do it? Please help.

    I tried in .htaccess but I couldn’t. What should I write in the .htaccess file? And what must I do in the place that links to this page?

  53. 59

    Neville Phillips

    April 21, 2012 8:45 am

    Problemo! John Folley above wrote, “One thing to also remember is to set canonical tags on a page. For example, the Barack Obama page has a tag that reads:”

    … but the line following that didn’t render on this page. What was it?

    Similarly, below that, Agustin Giannastasio wrote, “Hi John, this SEO issue (duplicated content) is avoided by this line::

    .. but that following line wasn’t rendered either!

    Do we know what these posters actually wrote?

  54. 60

    cant get it to work.

    RewriteEngine On

    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . index.php
    RewriteRule ^MYSTUFF/?$ categories/MYSTUFF/ [NC]

    SetEnv SEO_SUPPORT 1

    Any advice, please? I want to lose the “categories” part.

  55. 61

    Thank You, good

    I have one question: do rewritten pages get listed in search engines?

    Regards,
    Pushpanjali

  56. 62

    Thank you.

    Nice tutorial. Keep it up.

  57. 63

    Does anyone know how to redirect links that contain an apostrophe ( ‘ )?

  58. 64

    Raja Narayanasamy

    October 19, 2012 12:10 pm

    Fantastic!!! Great Article!!! My Big Thumbs Up!!
    By the way can i use generateurl function inside .htaccess?

  59. 65

    Great article, but it doesn’t solve my problem: my website (a book store) has series names and book titles that sometimes contain more than two characters that need to be converted for a vanity URL. On top of that, my website uses the URL name to find the item in the database:
    example: book with the series name “I.R.$.”
    /shop/series/i_r_ /
    is not working, my database can accept however using select like
    /shop/series/i_r__/

    The problem arises because there is a dot and a dollar sign next to each other.
    Note that I cannot use an ID code, since a series name stands for several books.

    There is a real working example of my problem on http://www.lambiek.net. I have temporarily solved this particular problem by changing the series name to IR$.

  60. 66

    Sorry, I should probably have mentioned that the desired URL would be: st-pauls-cathedral-london

    However, it turns out that the original function code you gave DOES work. But it only seems to do so if I apply the function to the name live on the page, i.e.

    But what I am currently doing is applying the function within the index.php file when data is entered into the website. I suspect the problem is coming from the fact that I am applying the function to a value which has already had the below function applied to it (to deal with magic quotes):-

    $attraction_name = mysqli_real_escape_string($link, $_POST['attraction_name']);
    $attraction_url = generateurl($attraction_name);

    I reckon I’ve got to shift some coding around to generate the URL from the attraction_name before it is affected by mysqli_real_escape_string. I’d best go and have a play…

  61. 67

    Yes, it turns out that ‘mysqli_real_escape_string’ was the cause of the problem. A bit of reordering of the code seems to have sorted it:-

    $attraction_url = generateurl($_POST['attraction_name']);
    $attraction_name = mysqli_real_escape_string($link, $_POST['attraction_name']);

    Thanks!

    Andy

  62. 68

    I always spent my half an hour to read this webpage’s articles everyday along with a mug of coffee.Robes de Soirée Boutique

  63. 69

    I really like this article because it’s full of great information. It would be even better though if the rewrite rule examples were visible. Paul, any chance of fixing that?

  64. 70

    I was expecting an article to talk about how to use WordPress $WP_Rewrite and their hooks for actions and filters so I could apply it to a plugin without needing to edit my htaccess files, do you have something like this? Please let me know, thanks!

  65. 71

    Can I ask a question? Is WordPress URL rewriting handled by the “/wp-includes/rewrite.php” file?
    How does it work?

  66. 72

    Thank you so much. Great explanation and examples. So many other sites are way too generic and assume that you know way more than is actually the case.

    Steve

  67. 73

    This is what I was looking For

    Thanks for such a article

  68. 74

    Hello
    I need help please. I have my website:

    http://website.com/badkeyword

    and I am trying to change it with .htaccess into

    http://website.com/goodkeyword

    Is it possible?
    Thanks for help.

  69. 75

    Very nice article.
    Thanks a lot !!

  70. 76

    very good post …

  71. 77

    I made a javascript version of the character replacement.

    It’s written to be a method within an object, but could easily be turned into a standalone function.

    https://gist.github.com/rememberlenny/6721025

  72. 78

    how about if i wanna change my url from index.htm to index.xxx ?

    • 79

      Try this

      RewriteEngine on

      RewriteBase /

      RewriteRule ^index.xxx$ index.htm [NC,L]

      So if you type example.com/index.xxx in your browser, the index.htm page will be loaded, but the URL shown will still be index.xxx.

  73. 80

    Bless you for this tutorial. It helped a lot.

  74. 81

    This is the best tutorial on rewrite I have seen! Thank you very much!

  75. 82

    Well, you can use Long Path Tool for such issues, it works good I will say.

  76. 83

    Oh my God, what kept you writing, my head is spinning.

    I am a very loyal PHP writer and a hate .htaccess with a passion but you make it so easy to understand, bless you. I am still a beginner ( hence my question on http://stackoverflow.com/questions/20842764/can-someone-prompt-me-in-the-right-direction-to-use-mod-rewrite ) but i think i got the gist.

    God bless you.

  77. 84

    Thank you Alot. Very helpful tutorial, learned alot :)

  78. 85

    Lalit Narayan Jadhav

    February 12, 2014 3:11 am

    very good explained.
    cleared all my doubts about this topic

  79. 86

    Thanks for such a lovely tutorial…. loved it… and enjoyed it too…

  80. 87

    This is the one I was looking for on the web. I searched many times, but this is the best one Google gave me. Thanks to the author! You are doing great work. Good luck! It would be better to add a link to the comment form at the top of the comments. Anyway, thanks!

  81. 88

    Sajadullah Safi

    April 3, 2014 3:38 am

    I used this code
    RewriteEngine On
    RewriteRule ^.+/p/([0-9]+) showarticle.php?article=$1 [NC,L,R]
    RewriteRule ^.+/c/([0-9]+) showarticles.php?cat=$1 [NC,L,R]

    but not work
    my site is : sorface.com
    plz help
    i want change
    http://sorface.com/showarticle.php?article=3554
    to
    http://sorface.com/Bad_Luck_Or_Blessing_In_Disguise?

  82. 89

    Thanks for the info but am still facing a problem.
    var/www/abc/xyz is the folder where a website xyz.com is located and var/www/abc being the document root. Similarly many sites are there.

    What I am trying to do is on going to xyz.com it will point to /var/www/abc but will open xyz folder. how can i implement this and across all such xyz,pqr etc folders inside abc.

  83. 90

    RewriteEngine on
    RewriteCond %{HTTP_HOST} ^(www.)?xyz.com$ [NC]
    RewriteCond %{REQUEST_URI} !^/xyz/
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^(.*)$ /xyz/$1
    RewriteCond %{HTTP_HOST} ^(www.)?xyz.com$ [NC]
    RewriteRule ^(/)?$ xyz/index.php [L]

    this gave me what i wanted. but how to implement it across multiple sites in the same document root?

  84. 91

    This is a great article!
    Thanks for describing everything in detail and also considering every error that could happen :D

    Kind regards

  85. 92

    Concerning the GenerateURL function, the preg_replace first parameter is wrong and doesn’t work at all : it’s not a valid regex. Here the correction :

    function GenerateUrl ($s) {
    //Convert accented characters, and remove parentheses and apostrophes
    $from = explode (',', "ç,æ,œ,á,é,í,ó,ú,à,è,ì,ò,ù,ä,ë,ï,ö,ü,ÿ,â,ê,î,ô,û,å,e,i,ø,u,(,),[,],'");
    $to = explode (',', 'c,ae,oe,a,e,i,o,u,a,e,i,o,u,a,e,i,o,u,y,a,e,i,o,u,a,e,i,o,u,,,,,,');
    //Do the replacements, and convert all other non-alphanumeric characters to hyphens
    $s = preg_replace ('~[^\w\d]+~', '-', str_replace ($from, $to, trim ($s)));
    //Remove a - at the beginning or end and make lowercase
    return strtolower (preg_replace ('/^-/', '', preg_replace ('/-$/', '', $s)));
    }

  86. 93

    This is exactly what i needed. Thanx Bro!!!!

  87. 94

    Hmm it looks like your site ate my first comment (it was extremely long) so I guess I’ll just sum it up what I had written and say, I’m thoroughly enjoying your blog. I as well am an aspiring blog blogger but I’m still new to the whole thing. Do you have any points for newbie blog writers? I’d definitely appreciate it.

  88. 95

    This article changed my view of how complicated URL rewriting is…

    Thanks a lot
