Channel: MSDN Blogs

Using Classic ASP and URL Rewrite for Dynamic SEO Functionality

I had another interesting situation present itself recently that I thought would make a good blog: how to use Classic ASP with the IIS URL Rewrite module to dynamically generate Robots.txt and Sitemap.xml files.

Overview

Here's the situation: I host a website for one of my family members, and like everyone else on the Internet, he wanted some better SEO rankings. We discussed a few things that he could do to improve his visibility with search engines, and one of the suggestions that I gave him was to keep his Robots.txt and Sitemap.xml files up-to-date. But there was an additional caveat - he uses two separate DNS names for the same website, and that presents a problem for absolute URLs in either of those files. Before anyone points out that it's usually not a good idea to host multiple DNS names on the same content, there are times when this is acceptable; for example, if you are trying to decide which of several DNS names is the best to use, you might want to bind each name to the same IP address and parse your logs to find out which address is getting the most traffic.

In any event, the syntax for both Robots.txt and Sitemap.xml files is pretty easy, so I wrote a couple of simple Classic ASP Robots.asp and Sitemap.asp pages that output the correct syntax and DNS-specific URLs for each domain name, and I wrote some simple URL Rewrite rules that rewrite inbound requests for Robots.txt and Sitemap.xml files to the ASP pages, while blocking direct access to the Classic ASP pages themselves.

All of that being said, there are a couple of quick things that I would like to mention before I get to the code:

  • First of all, I chose Classic ASP for the files because it allows the code to run without having to load any additional framework; I could just as easily have used ASP.NET or PHP, but either of those would require additional overhead that isn't really required.
  • Second, the specific website for which I wrote these specific examples consists of all static content that is updated a few times a month, so I wrote the example to parse the physical directory structure for the website's URLs and specified a weekly interval for search engines to revisit the website. All of these options can easily be changed; for example, I reused this code a little while later for a website where all of the content was created dynamically from a database, and I updated the code in the Sitemap.asp file to create the URLs from the dynamically-generated content. (That's really easy to do, but outside the scope of this blog.)

That being said, let's move on to the actual code.

Creating the Required Files

There are three files that you will need to create for this example:

  1. A Robots.asp file to which URL Rewrite will send requests for Robots.txt
  2. A Sitemap.asp file to which URL Rewrite will send requests for Sitemap.xml
  3. A Web.config file that contains the URL Rewrite rules

Step 1 - Creating the Robots.asp File

You need to save the following code sample as Robots.asp in the root of your website; this page will be executed whenever someone requests the Robots.txt file for your website. This example is very simple: it checks for the requested hostname and uses that to dynamically create the absolute URL for the website's Sitemap.xml file.

<%
    Option Explicit
    On Error Resume Next

    Dim strUrlRoot
    Dim strHttpHost
    Dim strUserAgent

    Response.Clear
    Response.Buffer = True
    Response.ContentType = "text/plain"
    Response.CacheControl = "public"

    Response.Write "# Robots.txt" & vbCrLf
    Response.Write "# For more information on this file see:" & vbCrLf
    Response.Write "# http://www.robotstxt.org/" & vbCrLf & vbCrLf

    strHttpHost = LCase(Request.ServerVariables("HTTP_HOST"))
    strUserAgent = LCase(Request.ServerVariables("HTTP_USER_AGENT"))
    strUrlRoot = "http://" & strHttpHost

    Response.Write "# Define the sitemap path" & vbCrLf
    Response.Write "Sitemap: " & strUrlRoot & "/sitemap.xml" & vbCrLf & vbCrLf

    Response.Write "# Make changes for all web spiders" & vbCrLf
    Response.Write "User-agent: *" & vbCrLf
    Response.Write "Allow: /" & vbCrLf
    Response.Write "Disallow: " & vbCrLf
    Response.End
%>
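
The Robots.asp page hard-codes "http://" when it builds the absolute URL. If the same content is also served over HTTPS, one possible adjustment is to derive the scheme from the request. The following is only a sketch and is not part of the original example: the GetUrlRoot() helper name is hypothetical, and it assumes that IIS reports "on" in the HTTPS server variable for secure requests.

```vbscript
<%
    Option Explicit

    ' Hypothetical helper (not in the original Robots.asp):
    ' builds the URL root from the scheme and host name of the current request.
    Function GetUrlRoot()
        Dim strScheme
        ' The HTTPS server variable is "on" for secure requests and "off" otherwise.
        If LCase(Request.ServerVariables("HTTPS")) = "on" Then
            strScheme = "https://"
        Else
            strScheme = "http://"
        End If
        GetUrlRoot = strScheme & LCase(Request.ServerVariables("HTTP_HOST"))
    End Function

    ' Emit only the Sitemap directive to show the helper in use.
    Response.ContentType = "text/plain"
    Response.Write "Sitemap: " & GetUrlRoot() & "/sitemap.xml" & vbCrLf
%>
```

In the real page you would assign the result to strUrlRoot and leave the rest of the output unchanged.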

Step 2 - Creating the Sitemap.asp File

The following example file is also pretty simple, and you would save this code as Sitemap.asp in the root of your website. There is a section in the code where it loops through the file system looking for files with the *.html file extension and only creates URLs for those files. If you want other files included in your results, or you want to change the code from static to dynamic content, this is where you would need to update the file accordingly.

<%
    Option Explicit
    On Error Resume Next

    Response.Clear
    Response.Buffer = True
    Response.AddHeader "Connection", "Keep-Alive"
    Response.CacheControl = "public"

    Dim strFolderArray, lngFolderArray
    Dim strUrlRoot, strPhysicalRoot, strFormat
    Dim strUrlRelative, strExt
    Dim objFSO, objFolder, objFile

    strPhysicalRoot = Server.MapPath("/")
    Set objFSO = Server.CreateObject("Scripting.Filesystemobject")
    strUrlRoot = "http://" & Request.ServerVariables("HTTP_HOST")

    ' Check for XML or TXT format.
    If UCase(Trim(Request("format"))) = "XML" Then
        strFormat = "XML"
        Response.ContentType = "text/xml"
    Else
        strFormat = "TXT"
        Response.ContentType = "text/plain"
    End If

    ' Add the UTF-8 Byte Order Mark.
    Response.Write Chr(CByte("&hEF"))
    Response.Write Chr(CByte("&hBB"))
    Response.Write Chr(CByte("&hBF"))

    If strFormat = "XML" Then
        Response.Write "<?xml version=""1.0"" encoding=""UTF-8""?>" & vbCrLf
        Response.Write "<urlset xmlns=""http://www.sitemaps.org/schemas/sitemap/0.9"">" & vbCrLf
    End If

    ' Always output the root of the website.
    Call WriteUrl(strUrlRoot, Now, "weekly", strFormat)

    ' --------------------------------------------------
    ' This following section contains the logic to parse
    ' The directory tree and return URLs based on the
    ' static *.html files that it locates. This is where
    ' you would change the code for dynamic content.
    ' --------------------------------------------------
    strFolderArray = GetFolderTree(strPhysicalRoot)

    For lngFolderArray = 1 to UBound(strFolderArray)
        strUrlRelative = Replace(Mid(strFolderArray(lngFolderArray), Len(strPhysicalRoot) + 1), "\", "/")
        Set objFolder = objFSO.GetFolder(Server.MapPath("." & strUrlRelative))
        For Each objFile in objFolder.Files
            strExt = objFSO.GetExtensionName(objFile.Name)
            If StrComp(strExt, "html", vbTextCompare) = 0 Then
                If StrComp(Left(objFile.Name, 6), "google", vbTextCompare) <> 0 Then
                    Call WriteUrl(strUrlRoot & strUrlRelative & "/" & objFile.Name, objFile.DateLastModified, "weekly", strFormat)
                End If
            End If
        Next
    Next

    ' --------------------------------------------------
    ' End of file system loop.
    ' --------------------------------------------------

    If strFormat = "XML" Then
        Response.Write "</urlset>"
    End If
    Response.End

    ' ======================================================================
    '
    ' Outputs a sitemap URL to the client in XML or TXT format.
    '
    ' tmpStrFreq = always|hourly|daily|weekly|monthly|yearly|never
    ' tmpStrFormat = TXT|XML
    '
    ' ======================================================================
    Sub WriteUrl(tmpStrUrl, tmpLastModified, tmpStrFreq, tmpStrFormat)
        On Error Resume Next
        Dim tmpDate : tmpDate = CDate(tmpLastModified)
        ' Check if the request is for XML or TXT and return the appropriate syntax.
        If tmpStrFormat = "XML" Then
            Response.Write " <url>" & vbCrLf
            Response.Write " <loc>" & Server.HtmlEncode(tmpStrUrl) & "</loc>" & vbCrLf
            Response.Write " <lastmod>" & Year(tmpLastModified) & "-" & Right("0" & Month(tmpLastModified), 2) & "-" & Right("0" & Day(tmpLastModified), 2) & "</lastmod>" & vbCrLf
            Response.Write " <changefreq>" & tmpStrFreq & "</changefreq>" & vbCrLf
            Response.Write " </url>" & vbCrLf
        Else
            Response.Write tmpStrUrl & vbCrLf
        End If
    End Sub

    ' ======================================================================
    '
    ' Returns a string array of folders under a root path
    '
    ' ======================================================================
    Function GetFolderTree(strBaseFolder)
        Dim tmpFolderCount, tmpBaseCount
        Dim tmpFolders()
        Dim tmpFSO, tmpFolder, tmpSubFolder

        ' Define the initial values for the folder counters.
        tmpFolderCount = 1
        tmpBaseCount = 0

        ' Dimension an array to hold the folder names.
        ReDim tmpFolders(1)

        ' Store the root folder in the array.
        tmpFolders(tmpFolderCount) = strBaseFolder

        ' Create file system object.
        Set tmpFSO = Server.CreateObject("Scripting.Filesystemobject")

        ' Loop while we still have folders to process.
        While tmpFolderCount <> tmpBaseCount
            ' Set up a folder object to a base folder.
            Set tmpFolder = tmpFSO.GetFolder(tmpFolders(tmpBaseCount + 1))
            ' Loop through the collection of subfolders for the base folder.
            For Each tmpSubFolder In tmpFolder.SubFolders
                ' Increment the folder count.
                tmpFolderCount = tmpFolderCount + 1
                ' Increase the array size
                ReDim Preserve tmpFolders(tmpFolderCount)
                ' Store the folder name in the array.
                tmpFolders(tmpFolderCount) = tmpSubFolder.Path
            Next
            ' Increment the base folder counter.
            tmpBaseCount = tmpBaseCount + 1
        Wend
        GetFolderTree = tmpFolders
    End Function
%>
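
If you want the sitemap to include more than *.html files, the extension check inside the file system loop is the place to widen. The sketch below shows one way that might look. It is not part of the original page: the IsIndexableExtension() name and the list of extensions are hypothetical and only there for illustration.

```vbscript
<%
    Option Explicit

    ' Hypothetical helper (not in the original Sitemap.asp):
    ' returns True when the extension is in a small allow-list.
    Function IsIndexableExtension(strExt)
        Dim arrAllowed, i
        ' Example allow-list; adjust it to the file types on your own site.
        arrAllowed = Array("html", "htm", "pdf")
        IsIndexableExtension = False
        For i = 0 To UBound(arrAllowed)
            ' Case-insensitive comparison, as in the original extension check.
            If StrComp(strExt, arrAllowed(i), vbTextCompare) = 0 Then
                IsIndexableExtension = True
                Exit Function
            End If
        Next
    End Function
%>
```

With a helper like this on the page, the loop's StrComp(strExt,"html",vbTextCompare)=0 test could be replaced with a call to IsIndexableExtension(strExt).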

Note: There are two helper methods in the preceding example that I should call out:

  • The GetFolderTree() function returns a string array of all the folders that are located under a root folder; you could remove that function if you were generating all of your URLs dynamically.
  • The WriteUrl() function outputs an entry for the sitemap file in either XML or TXT format, depending on the format that was requested. It also allows you to specify how frequently the specific URL is expected to change (always, hourly, daily, weekly, monthly, yearly, or never), which search engines can use as a hint for how often to revisit it.
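
To make the WriteUrl() behavior a little more concrete, here is a small usage sketch. It assumes that the WriteUrl() subroutine from Sitemap.asp is defined on the same page; the www.example.com host name and the date are made-up values for illustration only.

```vbscript
<%
    ' Assumes WriteUrl() from Sitemap.asp is available on this page.
    ' The host name and date below are illustration values only.

    ' XML format writes a <url> element:
    '  <url>
    '  <loc>http://www.example.com/about.html</loc>
    '  <lastmod>2013-05-07</lastmod>
    '  <changefreq>weekly</changefreq>
    '  </url>
    Call WriteUrl("http://www.example.com/about.html", DateSerial(2013, 5, 7), "weekly", "XML")

    ' TXT format writes only the URL on its own line:
    ' http://www.example.com/about.html
    Call WriteUrl("http://www.example.com/about.html", DateSerial(2013, 5, 7), "weekly", "TXT")
%>
```

Note that the text format ignores both the date and the frequency; only the XML format carries that extra information.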

Step 3 - Creating the Web.config File

The last step is to add the URL Rewrite rules to the Web.config file in the root of your website. The following example is a complete Web.config file, but you could merge the rules into your existing Web.config file if you have already created one for your website. These rules are pretty simple: they rewrite all inbound requests for Robots.txt to Robots.asp, and they rewrite all requests for Sitemap.xml to Sitemap.asp?format=XML and requests for Sitemap.txt to Sitemap.asp?format=TXT. This allows requests for both the XML-based and text-based sitemaps to work, even though the Robots.txt file contains the path to the XML file. The second rule returns HTTP 404 errors if anyone tries to send direct requests for either the Robots.asp or Sitemap.asp files; this isn't absolutely necessary, but I like to mask what I'm doing from prying eyes. (I'm kind of geeky that way.) The final rule simply stops rule processing for requests that map to physical files.

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <system.webServer>
    <rewrite>
      <rewriteMaps>
        <clear />
        <rewriteMap name="Static URL Rewrites">
          <add key="/robots.txt" value="/robots.asp" />
          <add key="/sitemap.xml" value="/sitemap.asp?format=XML" />
          <add key="/sitemap.txt" value="/sitemap.asp?format=TXT" />
        </rewriteMap>
        <rewriteMap name="Static URL Failures">
          <add key="/robots.asp" value="/" />
          <add key="/sitemap.asp" value="/" />
        </rewriteMap>
      </rewriteMaps>
      <rules>
        <clear />
        <rule name="Static URL Rewrites" patternSyntax="ECMAScript" stopProcessing="true">
          <match url=".*" ignoreCase="true" negate="false" />
          <conditions>
            <add input="{Static URL Rewrites:{REQUEST_URI}}" pattern="(.+)" />
          </conditions>
          <action type="Rewrite" url="{C:1}" appendQueryString="false" redirectType="Temporary" />
        </rule>
        <rule name="Static URL Failures" patternSyntax="ECMAScript" stopProcessing="true">
          <match url=".*" ignoreCase="true" negate="false" />
          <conditions>
            <add input="{Static URL Failures:{REQUEST_URI}}" pattern="(.+)" />
          </conditions>
          <action type="CustomResponse" statusCode="404" subStatusCode="0" />
        </rule>
        <rule name="Prevent rewriting for static files" patternSyntax="Wildcard" stopProcessing="true">
          <match url="*" />
          <conditions>
            <add input="{REQUEST_FILENAME}" matchType="IsFile" />
          </conditions>
          <action type="None" />
        </rule>
      </rules>
    </rewrite>
  </system.webServer>
</configuration>

Summary

That sums it up for this blog; I hope that you get some good ideas from it.

For more information about the syntax in Robots.txt and Sitemap.xml files, see the following URLs:

  • http://www.robotstxt.org/
  • http://www.sitemaps.org/