There's a regular question that comes up on the CF-Talk list every now and again about how to do Search Engine Safe (SES) URLs. This article is designed to examine the issues, show some code that will provide SES functionality, explain the code and answer common questions that may come up. The code assumes IIS 5 (or better) and CFMX 6.1. Minor alterations need to be made for other versions of ColdFusion or other web servers.
The basic idea of SES URLs is to create a URL that is passing variables (variable = value pairs) but does not look like it is. Why? Because search engines do not always pick up links to pages where variables are passed. Before I go into this, let me define some terms:
Static page - a page called from a URL that does not contain variables
http://www.houseoffusion.com
Dynamic page - a page called from a URL that contains variables
http://www.houseoffusion.com/cf_listers/Messages.cfm?threadid=33724&ForumId=4
The general rule that I was told (and have seen) is that a search engine will index a static page without problems. It will also index a dynamic page as long as the link to that dynamic page is coming from a static page (or what looks like a static page). It will not index a dynamic page that is only linked to from another dynamic page.
This means that if you have a large number of pages that are only accessible by passing variables, you may be out of luck. The solution is to hide the variable / value pairs and make the link LOOK like a static one.
The first issue with SES URLs is the question mark (?). This separates the URL from the variables and is the primary marker of a dynamic page. The second issue, which does not occur with all search engines, is the variable / value separator, which is an equals sign (=). The third issue is the linker between multiple variable / value pairs, the ampersand (&). Our goal is to turn a URL that looks like this:
http://www.houseoffusion.com/cf_listers/Messages.cfm?threadid=33724&ForumId=4
into one that looks like this:
http://www.houseoffusion.com/cf_lists/messages.cfm/threadid:33724/ForumID:4
A search engine will see the first URL as a dynamic one. It will see the second as a static one. In order to use this, a function needs to be written on the server side to convert the URL into something that is usable. The function below will take the static looking URL and extract variable / value pairs from it. As we are using ColdFusion, we're writing the function as a ColdFusion User Defined Function (UDF), but the concept can be translated into just about any language.

01: <cfscript>
02: function SES() {
03:   var urlvars = REReplaceNoCase(trim(cgi.path_info), '.+\.cfm/? *', '');
04:   var loopcount = listlen(urlvars, '/');
05:   var potential = "";
06:   if (cgi.script_name eq cgi.path_info)
07:     return 0;
08:   for (i = 1; i lte loopcount; i = i + 1)
09:   {
10:     potential = trim(listgetat(urlvars, i, '/'));
11:     if (REFindNoCase('^[A-Z][A-Z0-9_]*:.*', potential))
12:       setvariable('url.' & listfirst(potential, ':'), listlast(potential, ':'));
13:   }
14:   return 1;
15: }
16: </cfscript>

03: This line is the basic cleaner to remove the template being called, the domain name, etc. and leave us with any potential variable / value combinations. It assumes a slash (/) as the character between the URL and the variable / value pairs (which is common).
04: Assuming a slash (/) is also used to separate between multiple variable / value pairs, get the number of such pairs.
06: If the cgi.script_name is the same as the cgi.path_info, then drop out of the function. There are no variable pairs here. Note that we do this after setting the variables, as the use of var in a CFSCRIPT based UDF requires the vars to be above any other code. Also note that we are scoping the CGI variables. This is not really needed, but is done to save the lookup time needed by CF to see if there's a path_info variable in variables, form, url, etc. before getting to the cgi scope. The savings is almost nil, but it is good practice.
07: We always return something from a UDF, even if that something will never be used. In general, a 0 is passed back as a failure code and a 1 is passed back as a success.
08: Loop over each variable / value pair to turn them into url scope variables.
10: Get the potential variable / value pair. We call this potential as the next line of code will check whether it is actually a legal pair.
11: Check if the potential variable starts with a letter followed by 0 or more letters, numbers or underscores (_). These are the only characters that are allowed in a variable (yes, in CFMX a variable can start with a dollar sign, but we're ignoring that). Note also that we're using a colon (:) as the separator between the variable and the value. To be legal, a colon has to exist. On the other hand, we allow for a variable to be passed without any value.
12: If the potential pair is legal, then separate it into the variable portion, which will have the url scope placed before it, and the value. Note that I'm using the SetVariable() function for this operation. I personally feel that this is the proper way of setting a dynamic variable. The code will work just as well if done using the quote pound sign method.
14: Again, we return something at the end of the function just to be 'proper'.
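To make the check on line 11 more concrete, here is a minimal sketch that runs the same pattern against a few segments. The sample values are made up for illustration; only the regular expression comes from the function above.

```cfml
<cfscript>
// Hypothetical sample segments, as they might appear between slashes in path_info
samples = "threadid:33724,ForumID:4,flag:,4:33409,threadid=33724";
for (i = 1; i lte listlen(samples); i = i + 1) {
    segment = listgetat(samples, i);
    // Same pattern as line 11 of the UDF: letter first, then letters,
    // numbers or underscores, then a mandatory colon
    if (REFindNoCase('^[A-Z][A-Z0-9_]*:.*', segment))
        writeoutput(segment & " is accepted<br>");
    else
        writeoutput(segment & " is skipped<br>");
}
</cfscript>
```

The first three samples should be accepted. The fourth is skipped because it starts with a number, and the fifth because it has no colon.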
To make use of this function, just call it on any page you expect to have SES URLs on. It works like a charm and is simple to put into effect. Of course, this is the simple, standard method. If you look at a URL such as this http://www.houseoffusion.com/lists.cfm/link=m:4:33409:167612, you can see I'm doing some more interesting things here. This serves to make the URL shorter, hides some of what's going on and requires more exact code. But that's for another time. :)
Before I end, let me take a few questions.
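As a rough sketch of both ends of the simple method, the example below builds an SES link on one page and reads the variables back on the target page. The SESLink() helper, the link text and the cfparam defaults are assumptions made for illustration and are not part of the original code. The sketch also assumes the SES() UDF has already been defined or included on the target page.

```cfml
<cfscript>
// Hypothetical helper (not from the article): "a=1&b=2" becomes "a:1/b:2"
function SESLink(template, querystring) {
    var path = replace(querystring, "=", ":", "all");
    path = replace(path, "&", "/", "all");
    return template & "/" & path;
}
</cfscript>

<!--- On the linking page: write the link in SES form --->
<cfoutput><a href="#SESLink('/cf_lists/messages.cfm', 'threadid=33724&ForumID=4')#">View thread</a></cfoutput>

<!--- On messages.cfm: convert the path back into url scope variables --->
<cfset found = SES()>
<cfparam name="url.threadid" default="0">
<cfparam name="url.ForumID" default="0">
<cfoutput>Thread #url.threadid# in forum #url.ForumID#</cfoutput>
```

The helper does nothing more than swap the separators, so a value that itself contains an equals sign, ampersand, slash or colon would need extra handling.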
1. Why are you using a colon (:) to separate between variables and values? Other systems use slashes (/) to separate everything. There are two reasons to use a colon (:). The first is style. There should be a clear separation between variables and values. This is both for debugging purposes and to keep a clean style. The second reason is a more rational one. My system allows for a variable with a NULL value. Using slashes only does not.
2. This doesn't seem to work on my system! Why? I've heard people say that in order for this to work on IIS, there's a special setting that has to be turned off. In your IIS admin, go to Home Directory, Configuration and edit the extension you wish SES to work with (.cfm in our case). At the bottom of the edit screen is a checkbox that says "Check that file exists". Make sure this is in the off position. I personally have never had a problem with this, but I'm trusting those who have reported the error.
3. When using SES URLs, my relative paths are messed up. How do I fix this? Because we're hiding the variables behind slashes, any relative link will be looking to start from a directory that basically does not exist. For this reason, when using SES URLs, you have to make all references to web content absolute. This means that ./Images/logo.gif has to be /Images/logo.gif. An annoyance, but a price most are willing to pay for the results.
4. Passing dates in a mm/dd/yy format blows up. Is there a fix? Because we're using a slash (/) as a delimiter, a date passed in that format is seen as just more variable / value pairs (and unbalanced ones at that). To fix this, you have to translate the date you're sending on the URL into one that uses a different delimiter. Dashes (-) are a good alternative. Remember not to use colons (:) as they're used by the code for variable / value separators.
I'm sure there are more questions and problems and I will address them as they come up. I strongly feel that the use of this code or other code of the sort is a must to get your dynamic site indexed by the major search engines. House of Fusion is heavily indexed by just about every search engine due to this, and I don't mind coding a small fix to make sure it works properly.