Using Mod Rewrite to Mask Variables
This post assumes that you are familiar with server side scripting, if/then statements, and basic website development.
When I first started working with web development, I assumed that there was a one-to-one relationship between a URL and a file on the server. That is, every time a user clicks on a new link, they are taken to a different file and/or script on the server. This creates several problems for a web developer, most importantly maintaining a consistent look across a wide range of pages, which is usually addressed with a content management system or template product. If I want to update part of a website, I would rather update a single file than a multitude of pages to keep the main parts of the site (header, footer, navigation) the same.
It took me a while to understand how to use $_GET variables to simplify a website. Originally used for HTML forms, these variables are passed in the URL's query string from one page to the next, and PHP exposes them in the $_GET array. The best example of this is Google. When you visit http://www.google.com/ and search for 'Oak Table', you see the search results after pressing Submit. The address bar now has a bunch of gunk at the end of it, http://www.google.com/search?hl=en&q=Oak+Table&btnG=Google+Search&cts=1240330684752&aq=f&oq=, and that gunk is the set of $_GET variables.
Starting with a question mark, there is a series of variables separated by & symbols. Each of these variables has a name and a value; q=Oak+Table holds the original search, with + standing in for a space. If you change this to, say, q=Wooden+Table, you are taken to a page with results for Wooden Tables instead of Oak Tables.
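To make the name/value structure concrete, here is a minimal sketch that splits the Google URL above into its variables. It uses Python's standard library purely for illustration; in PHP the same values would simply show up in $_GET.

```python
from urllib.parse import urlsplit, parse_qs

url = ("http://www.google.com/search?hl=en&q=Oak+Table"
       "&btnG=Google+Search&cts=1240330684752&aq=f&oq=")

# Everything after the question mark is the query string.
query = urlsplit(url).query

# parse_qs splits on "&", then on "=", and decodes "+" as a space.
# keep_blank_values=True keeps empty variables such as "oq=".
params = parse_qs(query, keep_blank_values=True)

# Each name maps to a list, because a name may appear more than once.
print(params["q"])   # ['Oak Table']
print(params["hl"])  # ['en']
```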
This becomes powerful if you use server side scripts that read the $_GET variables to display different information. In the Google example, they probably search their database for the $_GET['q'] value and return links and information about the results. The same idea works on a small site: if you had a single page, like a blog, and wanted to display posts from the month of January 2009, you could have a link to http://www.yoursitehere.com/blog?month=2009+Jan. From there, a simple query or if/then statement in your server side script of choice displays the appropriate blog posts.
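The if/then step might look roughly like the sketch below. The POSTS list and the posts_for function are hypothetical stand-ins for a real database and script, and Python is used only as an example of a server side language.

```python
from urllib.parse import parse_qs

# Hypothetical stand-in for a database table of blog posts.
POSTS = [
    {"title": "New Year Plans", "month": "2009 Jan"},
    {"title": "Winter Projects", "month": "2009 Jan"},
    {"title": "Spring Cleaning", "month": "2009 Apr"},
]

def posts_for(query_string):
    """Return the titles of posts matching the 'month' variable, if any."""
    params = parse_qs(query_string)
    month = params.get("month", [None])[0]
    if month is None:
        # No variable supplied: show every post.
        return [p["title"] for p in POSTS]
    else:
        # Variable supplied: show only the matching month.
        return [p["title"] for p in POSTS if p["month"] == month]

print(posts_for("month=2009+Jan"))  # ['New Year Plans', 'Winter Projects']
```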
There is a nice way to mask these variables. In the Google example, you can edit the URL directly to bypass their search box. By exposing the $_GET variables in the URL, you also expose part of the structure behind your website and the variables used to manipulate the code. Masking them makes that structure less obvious, but it is not a security measure on its own: the values still come from the user and must be validated in your script. If you have an Apache server, you can mask $_GET variables and clean up the messy URL using a module called mod_rewrite.
First, you have to create a file in your root directory titled .htaccess. Turn on the rewriter with this line.
RewriteEngine on
...and start rewriting using regular expressions. If you don't know what regular expressions are, a simple Google search will turn up plenty of cheat sheets and tutorials. Here's an example of rewriting a date $_GET variable for a blog.
# In .htaccess the leading slash is stripped before matching, so the pattern starts at "blog/".
RewriteRule ^blog/([0-9]{4}-[0-9]{2}-[0-9]{2})/$ blog.php?date=$1
And another example for a members page.
# Same here: no leading slash in the pattern when the rule lives in .htaccess.
RewriteRule ^members/([a-z_]+)/$ members.php?member=$1
What these two examples do is rewrite the URL on the server end so you can use it easily in server side scripts. The address in the browser does not change; only the server sees the rewritten form. The user visits http://www.yoursitehere.com/blog/2009-04-01/, but in your blog.php script, you can use $_GET['date'] to pull the date 2009-04-01. The same goes for the second example: the user visits http://www.yoursitehere.com/members/joe_black/ and you can use the variable $_GET['member'] to see that the user wants information about joe_black.
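If the regular expressions are hard to read, it can help to try them outside Apache first. The sketch below imitates the two rules with Python's re module. It is only an approximation of what mod_rewrite does, assuming the patterns are matched against the path with its leading slash removed, as happens in .htaccess context. The rewrite function and RULES list are made-up names for illustration.

```python
import re

# (pattern, target) pairs mirroring the two RewriteRule lines.
# Assumption: the path is matched without its leading slash.
RULES = [
    (re.compile(r"^blog/([0-9]{4}-[0-9]{2}-[0-9]{2})/$"), r"blog.php?date=\1"),
    (re.compile(r"^members/([a-z_]+)/$"), r"members.php?member=\1"),
]

def rewrite(path):
    """Return the internal target for a requested path, or None if no rule matches."""
    relative = path.lstrip("/")
    for pattern, target in RULES:
        match = pattern.match(relative)
        if match:
            # \1 in the target plays the role of $1 in the RewriteRule.
            return match.expand(target)
    return None

print(rewrite("/blog/2009-04-01/"))     # blog.php?date=2009-04-01
print(rewrite("/members/joe_black/"))   # members.php?member=joe_black
print(rewrite("/members/Joe/"))         # None (uppercase is not in [a-z_])
```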
Since the user will see different pages based on the last part of the URL, you can extend this even further and have a small set of scripts create an entire website. With a series of if/then statements, a single script can parse out the $_GET variable and display a page's worth of information. This becomes much more powerful combined with a database full of unique information: based on the variables, you can pull dozens, if not hundreds, of pages worth of unique data from a single script.
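As a rough sketch of the single-script idea, the function below builds every page of a tiny site from one variable. The PAGES dictionary stands in for a database, and the "page" variable name and render function are hypothetical. Python is used here only as an example language.

```python
# Hypothetical content store standing in for a database.
PAGES = {
    "home": ("Welcome", "Latest news goes here."),
    "about": ("About Us", "Who we are."),
    "contact": ("Contact", "How to reach us."),
}

def render(get):
    """Build a full page from a dict of request variables (like $_GET)."""
    page = get.get("page", "home")
    if page in PAGES:
        title, body = PAGES[page]
    else:
        title, body = "Not Found", "No such page."
    # Header and footer live in one place, so a change here updates every page.
    return f"[header] {title}\n{body}\n[footer]"

print(render({"page": "about"}))
```

Adding a page then means adding a row of data rather than a new file, which is what keeps the header, footer, and navigation consistent.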
There are a few pitfalls with this approach. First of all, a user can type in a URL that matches no rule, or one that carries a value your script has no data for, so the script needs to handle the empty case (for example, with a "not found" page). Also, the values you capture have to stick to characters that are allowed in a URL path; some characters (? and = are two) are reserved for special uses, and RewriteRule patterns are matched against the path only, not the query string.
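One way to guard against the first pitfall is to check the value before using it. A string like 2009-13-45 has the right shape for the date rule above but is not a real date. The sketch below shows the idea in Python; parse_date and blog_response are hypothetical names, and the response text is made up.

```python
from datetime import datetime

def parse_date(value):
    """Return a date for a 'YYYY-MM-DD' string, or None if it is not a real date."""
    try:
        return datetime.strptime(value, "%Y-%m-%d").date()
    except ValueError:
        return None

def blog_response(get):
    """Return (status, body) for a dict of request variables (like $_GET)."""
    day = parse_date(get.get("date", ""))
    if day is None:
        # Missing or impossible date: answer with a "not found" page.
        return 404, "No posts found for that address."
    return 200, f"Posts for {day.isoformat()}"

print(blog_response({"date": "2009-04-01"}))  # (200, 'Posts for 2009-04-01')
print(blog_response({"date": "2009-13-45"}))  # (404, 'No posts found for that address.')
```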
There are many other uses for the .htaccess file as well, with and without mod_rewrite. These include 301 redirects, keeping/removing the www in front of your domain, and blocking certain IPs or robots from visiting your website. This is just one unique, and very powerful, use of this file combined with Apache.