Brettb.Com

Creating a JavaScript Search Engine for your Website

A guide to using The Website Utility to create a JavaScript Search Engine for your website.

Why Websites need Search Facilities

Once a website grows beyond half a dozen pages, it can be difficult to create a navigation scheme that lets users quickly find what they're looking for. One way to improve navigation is to add a search facility, which makes information easier to find and gives visitors an additional way of moving around the site. Search facilities are generally well used, and the search page will frequently appear within the top ten most requested pages on a website.

Search Engines Allow Visitors to Search your Content

One of the easiest ways to add a facility for searching the pages in your website is to link to search results for your website from one of the major search engines. Google and other major search engines allow you to do this. However, using this method it can be difficult to integrate the search results with the design of your website. It also carries the obvious risk of a website visitor leaving your website and not returning! Even worse, your website visitors may see an advert for a competitor on the search results page, and so go and do their business elsewhere!

Building Your Own Search Engine

There are a number of software solutions that allow you to put your own search engine on your website. These include server-side search solutions such as Microsoft's Index Server or ht://Dig. Although they allow sophisticated search facilities to be created, they generally require a high level of technical knowledge to install and configure. To create a search page with these solutions, you will usually also need programming knowledge of a server-side scripting language such as Active Server Pages (ASP) or PHP, or you will need to employ somebody to create the code for you. To complicate matters, not many web hosting companies support these search solutions, and those that do often charge additional hosting fees.

An alternative is a purely client-side solution written in a browser scripting language such as JavaScript. This has no additional software requirements on the server, and will work regardless of whether the website is hosted on Windows, Unix or Linux servers. Indeed, with a bit of work it is also possible to make a JavaScript search facility work on disk-based HTML content such as a website on a CD-ROM or DVD.

The Website Utility Builds JavaScript Search Engines

The Website Utility is able to create a client-side JavaScript search facility for a website. The walkthrough below shows the steps involved:

Configuring The Website Utility to Produce JavaScript Search Engines

The Website Utility is configured using a small Windows application. There is a Create ASP/JavaScript Search Facility checkbox in the Report Settings part of the window that needs to be ticked in order for the JavaScript search facility to be created:

[Screenshot: The Website Utility's graphical user interface, showing the options used to create a JavaScript Search Engine for a website]

Note that if your website uses query strings then it is a good idea to tick the checkbox called Use URL Query Strings. This ensures that URLs which differ only in their query strings are treated as separate pages in the search results. So for example www.mywebsite.com/news.php?ID=12 will link to a different news article from www.mywebsite.com/news.php?ID=21, and so The Website Utility will ensure they are indexed separately.
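
The effect of this setting can be illustrated with a short sketch. The function name indexKey and its second parameter are hypothetical, chosen only for this example; how The Website Utility actually stores its page keys is not described here.

```javascript
// Hypothetical helper (not part of The Website Utility): derive the key
// under which a crawled page would be indexed.
function indexKey(pageUrl, useQueryStrings) {
  const url = new URL(pageUrl);
  url.hash = ""; // a fragment never identifies a separate page
  if (!useQueryStrings) {
    url.search = ""; // collapse every ?ID=... variant into a single entry
  }
  return url.toString();
}

const a = "http://www.mywebsite.com/news.php?ID=12";
const b = "http://www.mywebsite.com/news.php?ID=21";

console.log(indexKey(a, true) === indexKey(b, true));   // false: indexed separately
console.log(indexKey(a, false) === indexKey(b, false)); // true: treated as one page
```

With the option switched off, both news articles would share one entry, so only one of them could ever appear in the search results.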

Running The Website Utility

Clicking on the Run button will start The Website Utility's web robot. This web robot starts at a user-specified page in the website and will automatically crawl all of the pages in that website. The Website Utility extracts all of the words from these pages, and finds the most relevant pages in the website for each word. Common English words (e.g. got, like, then) are removed, as are words of one or two characters. Word rankings depend on many factors, including their distribution through the entire website and their distribution in the content of a specific page.
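
The word filtering described above might look something like the following. This is only a sketch: the stop-word list and the function name extractIndexWords are assumptions made for illustration, and the real list used by The Website Utility is not given in this article.

```javascript
// Illustrative stop-word list only; the tool's real list is not published here.
const STOP_WORDS = new Set(["got", "like", "then", "the", "and"]);

// Hypothetical helper: reduce page text to the words worth indexing.
function extractIndexWords(text) {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)                 // split on anything that is not a letter or digit
    .filter((w) => w.length > 2)         // drop words of one or two characters
    .filter((w) => !STOP_WORDS.has(w));  // drop common English words
}

console.log(extractIndexWords("I got a tropical fishtank, then it grew"));
// → [ 'tropical', 'fishtank', 'grew' ]
```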

Incorporating the JavaScript Search Facility into any Website

The Website Utility creates two JavaScript files that can be used on the website's search results page:

  • A Search Data JavaScript File contains the rankings for each word and the most relevant pages for that word.
  • A Search Code JavaScript File contains the code required to parse the user's search query and find the most relevant pages for that query.

Pages are sorted in search results according to their ranking for the particular word or words being searched for. The ranking scale goes from 0 to 99. Rank is higher for pages that most closely match the search term. In general, searching for words that are common on the site will produce search results with a lower rank than very specific words that occur on only one or two pages.
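
As a rough illustration of sorting by rank, the sketch below combines per-word ranks for a multi-word query. The index layout, the sample URLs and ranks, the function name rankPages and the averaging rule are all assumptions made for this example; the actual format of the Search Data file and the real ranking formula are not documented here.

```javascript
// Hypothetical index shape: word -> list of { url, rank }, rank in 0..99.
const sampleIndex = new Map([
  ["javascript", [{ url: "/js/search.htm", rank: 62 }, { url: "/js/adrotator.htm", rank: 40 }]],
  ["rotator", [{ url: "/js/adrotator.htm", rank: 94 }]],
]);

// Hypothetical helper: score each page for a list of query words.
function rankPages(index, words) {
  const totals = new Map();
  for (const word of words) {
    for (const { url, rank } of index.get(word) || []) {
      totals.set(url, (totals.get(url) || 0) + rank);
    }
  }
  // Average over the number of query words so the score stays on the 0..99 scale,
  // then list the best matches first.
  return [...totals]
    .map(([url, total]) => ({ url, rank: Math.round(total / words.length) }))
    .sort((p, q) => q.rank - p.rank);
}

console.log(rankPages(sampleIndex, ["javascript", "rotator"]));
// → [ { url: '/js/adrotator.htm', rank: 67 }, { url: '/js/search.htm', rank: 31 } ]
```

In this made-up data the word "rotator" occurs on a single page and so carries a high rank, which pushes that page to the top when it is combined with the more common word "javascript".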

The search facility also requires a search form and a search results page. The search form can either be put on a separate search page on the site, or the search form could be added to all of the pages in a website (e.g. in the top right hand corner of the website's navigation). The HTML code for a typical search form is shown below. The search form needs a text box called TWUQuery. The form should use the GET method to submit to the search results page.

<html>
<head>
<title>JavaScript Search for http://www.brettb.com/</title>
</head>
<body>
<h1>Search http://www.brettb.com/</h1>
<form name="frmSearch" method="GET" action="searchresults.htm">

Search for: <input type="text" name="TWUQuery" maxlength="50">
<input type="submit" name="submitbutton" value="Submit">
</form>

</body>
</html>
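
Because the form uses the GET method, the search terms arrive on the results page in the URL's query string (for example searchresults.htm?TWUQuery=tropical+fish). The generated Search Code file handles this for you; the sketch below only illustrates the idea, using the browser's standard URLSearchParams API and a hypothetical function name readSearchQuery. It is not the code that The Website Utility produces.

```javascript
// Sketch only: how a results page could read the submitted TWUQuery value.
function readSearchQuery(queryString) {
  const params = new URLSearchParams(queryString);
  // get() returns null when the parameter is missing, so fall back to "".
  return (params.get("TWUQuery") || "").trim();
}

// In a browser this would be called as readSearchQuery(window.location.search).
console.log(readSearchQuery("?TWUQuery=tropical+fish&submitbutton=Submit"));
// → tropical fish
```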

The search results page needs to include references to the two JavaScript files created by The Website Utility (TWUSearchData.js and TWUSearchCode.js):

<html>
<head>
<title>Search Results</title>
<script language="JavaScript" src="TWUSearchData.js"></script>
<script language="JavaScript" src="TWUSearchCode.js"></script>
</head>

<body>

<script language="JavaScript">
var TWU_MaximumSearchResults = 50;
var TWU_DisplayPageTitles = true;
var TWU_DebugMode = false;
</script>


<h1>Search Results for <script language="JavaScript">document.write(TWU_OriginalSearchQuery);</script></h1>

<script language="JavaScript">TWU_DisplaySearchResults(TWU_SearchQuery);</script>

</body>
</html>

This page can of course be customised to fit in with the existing design of your website. If you want to display the search terms the user was searching for, then use this JavaScript code: <script language="JavaScript">document.write(TWU_OriginalSearchQuery);</script>. To display the search results, place this JavaScript code where you want the search results to appear: <script language="JavaScript">TWU_DisplaySearchResults(TWU_SearchQuery);</script>.
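
If you customise the way the search terms are written into the page, bear in mind that text typed by a visitor can contain HTML characters. This article does not say whether TWU_OriginalSearchQuery is already escaped, so the helper below is offered only as a cautious sketch; escapeHtml is a hypothetical name and is not part of the generated files.

```javascript
// Hypothetical helper: make user-typed text safe to write into an HTML page.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")   // must come first so later entities are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

console.log(escapeHtml('<b>"fish" & chips</b>'));
// → &lt;b&gt;&quot;fish&quot; &amp; chips&lt;/b&gt;
```

On the results page it could then be used as document.write(escapeHtml(TWU_OriginalSearchQuery)), assuming the variable holds the raw text.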

The search results page defines three JavaScript variables that can be used to change the output:

  • TWU_MaximumSearchResults Controls the maximum number of pages that will be listed in the search results. This stops users getting confused by seeing large numbers of pages in the search results.
  • TWU_DisplayPageTitles If set to true then the pages displayed in the search results will show their HTML titles as clickable links. If set to false then the URL is displayed instead. URLs are also shown if a page does not have a title. If the website does not contain accurate page titles you might have to turn this feature off.
  • TWU_DebugMode If set to true then debugging information is displayed (you should not need to use this).
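
To make the effect of the first two settings concrete, here is a small sketch of how a results list could be built from them. The function renderResults and the shape of the results array are assumptions for illustration only; the real logic lives in the Search Code JavaScript file and may differ.

```javascript
// Hypothetical renderer showing the effect of the two display settings.
// For brevity it does not HTML-escape titles or URLs.
function renderResults(results, maximumSearchResults, displayPageTitles) {
  return results.slice(0, maximumSearchResults).map((page) => {
    // Fall back to the URL when titles are switched off or the page has none.
    const label = displayPageTitles && page.title ? page.title : page.url;
    return '<a href="' + page.url + '">' + label + "</a>";
  });
}

const results = [
  { url: "/a.htm", title: "Page A" },
  { url: "/b.htm", title: "" },
  { url: "/c.htm", title: "Page C" },
];

console.log(renderResults(results, 2, true));
// → [ '<a href="/a.htm">Page A</a>', '<a href="/b.htm">/b.htm</a>' ]
```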

If you have a basic knowledge of JavaScript, it is also possible to change the display of the search results. This will involve editing the Search Code JavaScript file. Here is an example website search facility that has been customised:

  • A JavaScript search facility created by The Website Utility then customised: Search Brettb.com.

Performance Issues

A client-side JavaScript search engine is obviously going to have a performance overhead on the client web browser. The size of the TWUSearchData.js JavaScript include file will depend on the number of pages in the website indexed, and also the amount of content on each page in the website. It is also dependent upon the nature of the website itself - websites with pages about similar subjects will tend to require a smaller file than a website with pages about different subjects.

For this reason, The Website Utility is also able to generate a search engine that makes use of server-side ASP. This search facility requires no client-side JavaScript, so it can be used to build search facilities for websites that are not able to use client-side scripting (e.g. because of client accessibility requirements).

Download the Evaluation Version

The evaluation version of The Website Utility will allow you to determine whether the JavaScript (and ASP) Search Engines it creates are suitable for use on your own websites:

Purchase The Website Utility

All content is © 1995 - 2012