Sunday, January 17, 2010

Apache mod_rewrite (.htaccess) - Rewrite or Redirect Request URLs if page not found, or no www

Apache mod_rewrite (.htaccess) - Request rewrites & redirects to site root if page or directory not found

There are many scenarios where mod_rewrite can be used on a web server to improve usability, optimise for search engines, perform redirects based on specific criteria and much more. The following examples show how to configure .htaccess files on an Apache web server to alter requests and the actions taken by the web server. They include redirecting to the site root (or another specified page) if a page or directory is not found, and converting components of an HTTP request into query string parameters that are passed to a specific page/script for processing and display (without altering the original address entered into the browser). Other examples demonstrate common uses of mod_rewrite, such as configuring conditions and rules so that the rewrite engine only affects requests that meet specific criteria, including handling multiple domain names using .htaccess and mod_rewrite. This allows you to configure redirects or rewrites based on the domain or subdomain entered, such as forwarding "host.com" to "www.host.com" if the www was not included in the request. The special characters used to build the expressions in conditions and rules are also explained.

Apache mod_rewrite Examples:


  • Redirect to site root if page not found
  • Convert part of a request to Query String Parameters to pass to a different page on the server
  • Redirect to include www (HTTP/1.1 301 Moved Permanently)

mod_rewrite - Background & General Information

Operators:
< : is lexically lower
> : is lexically greater
= : is lexically equal
! : not

CondPatterns
-d : is a Directory
-f : is a regular file
-s : is a regular file with size
-l : is a symbolic link
-F : is existing file via subrequest
-U : is existing URL via subrequest

eg.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

The above conditions will be true if the requested filename does not exist as a file or directory on the web server. The RewriteRule that follows the conditions will be executed only in that case. If the file or directory is found, the conditions prevent the rule from being processed, so the request remains unchanged and the page loads as normal.
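For readers more familiar with Php than with Apache directives, the pair of conditions above is roughly equivalent to the check below. This is only an illustrative sketch: the function name requestNeedsRewrite is made up for this example, and mod_rewrite performs its own tests inside Apache rather than calling Php.

```php
<?php
// Hypothetical helper: mirrors "RewriteCond ... !-f" plus "RewriteCond ... !-d".
// Returns true only when $path is neither a regular file nor a directory,
// which is the situation in which the RewriteRule that follows would run.
function requestNeedsRewrite(string $path): bool
{
    return !is_file($path) && !is_dir($path);
}

// Usage: an existing directory is left alone, a missing path qualifies.
var_dump(requestNeedsRewrite(sys_get_temp_dir()));
var_dump(requestNeedsRewrite(sys_get_temp_dir() . '/no-such-page'));
```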

Regular Expressions

Text:
. : any single character
[chars] : one of the chars from the set
[^chars] : not any of the chars from the set
choice1|choice2 : Alternative - choice1 or choice2


Quantifiers:
? : 0 or 1
* : 0 to N (many)
+ : at least 1 to N (many)

Grouping:
(text) - allows a string of characters to be grouped and quantified if required. eg: ^(www)+(.*) requires that the request string starts with "www" repeated one or more times, followed by anything (including nothing).

Anchors:
^ : Start of line
$ : End of line

Escape Special Characters:
\char : Escape special characters for use explicitly in a string.


Regular Expression Examples:

Expression | Input | Result
^blog(.*).com$ | blog.master-sharepoint.com | true
^blog(.*).com$ | blog.master-sharepoint.net | false
^blog(.*).com$ | www.master-sharepoint.com | false
^blog(.*).com$ | blog.master-sharepoint.com/about | false
!^(www.)+master-sharepoint.com(.*) | master-sharepoint.com/about | true
!^(www.)+master-sharepoint.com(.*) | www.master-sharepoint.com/about | false
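The results listed above can be checked outside Apache with Php's preg_match, which also uses Perl-compatible regular expressions. The sketch below is an approximation only: condPatternMatches is a hypothetical helper written for this example, it handles only the leading "!" negation, and it assumes the pattern does not contain the "#" character used as the delimiter.

```php
<?php
// Hypothetical helper that evaluates a mod_rewrite style CondPattern in Php.
// A leading "!" negates the result, as it does in a RewriteCond.
function condPatternMatches(string $condPattern, string $input): bool
{
    $negate = ($condPattern !== '' && $condPattern[0] === '!');
    $regex  = $negate ? substr($condPattern, 1) : $condPattern;

    // "#" is used as the PCRE delimiter so "/" in the input needs no escaping.
    $matched = preg_match('#' . $regex . '#', $input) === 1;

    return $negate ? !$matched : $matched;
}

// The expression/input pairs from the examples above.
$cases = [
    ['^blog(.*).com$', 'blog.master-sharepoint.com'],
    ['^blog(.*).com$', 'blog.master-sharepoint.net'],
    ['^blog(.*).com$', 'www.master-sharepoint.com'],
    ['^blog(.*).com$', 'blog.master-sharepoint.com/about'],
    ['!^(www.)+master-sharepoint.com(.*)', 'master-sharepoint.com/about'],
    ['!^(www.)+master-sharepoint.com(.*)', 'www.master-sharepoint.com/about'],
];

foreach ($cases as [$pattern, $input]) {
    $result = condPatternMatches($pattern, $input) ? 'true' : 'false';
    printf("%-36s %-34s %s\n", $pattern, $input, $result);
}
```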

For more information about mod_rewrite conditions, regular expressions and server variables available for use by the mod_rewrite engine, see Module mod_rewrite URL Rewriting Engine.

Enable the mod_rewrite engine:

RewriteEngine On

Set the base location:
RewriteBase /



Redirect to site root if page not found:

#if page or directory on website is not found, external redirect to site root

RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*) http://%{HTTP_HOST} [R]

!-f : Request filename is not a file
!-d : Request filename is not a directory
[R] = External Redirect (sends HTTP 302 unless a status code is given, eg. [R=301])

Rule Breakdown:

^ - Start of string

(.*) - Group: zero or more of any character (the matched text is available to the substitution as $1)

http://%{HTTP_HOST} - Http response: uses data from the HTTP_HOST request variable to redirect the user to the site root, ignoring the requested page that was not found. To redirect to a specific page, such as a custom not found page, append the required page onto the end of the response address ( http://%{HTTP_HOST}/custom_error.php )

[R] - Tells the server to redirect the user to the address generated using an external redirect (the address bar of the browser will display the generated address after the page has loaded)



Convert part of a request to Query String Parameters to pass to a different page on the server:

#if request not found, rewrite to specific page/script. Convert request details (directory/filename) into query string parameters

RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# capture the requested path (without any trailing slash) as the "page" parameter
RewriteRule ^(.+?)/?$ show-page.php?page=$1 [L]

Example:
Browser Request: "http://host/free-php-scripts/"

Loads the following page on the web server: "http://host/show-page.php?page=free-php-scripts" (but will still display "http://host/free-php-scripts/" in the browser)

The browser will load the show-page.php file with the directory/filename details as the query string parameter for "page". The address bar in the browser will still display "http://host/free-php-scripts/" even though "http://host/show-page.php?page=free-php-scripts" was the request processed by the server to display the page. You will need to make sure that paths (images, urls, stylesheets, JavaScript files, etc.) are relative to the site root and not the current directory. For example, suppose a directory called "images" at the root of the web site is used to store images displayed on the website. A relative address (src) for an image in that directory that works from the "http://host/show-page.php?page=free-php-scripts" page might be "images/logo.jpg". If you use mod_rewrite to access the same page by requesting "http://host/free-php-scripts/", the browser will try to load the image from "http://host/free-php-scripts/images/logo.jpg", which is no longer correct.

One solution when linking images, stylesheets or JavaScript files from a webpage whose address is generated using mod_rewrite rules is to use the full absolute path ( "http://host/images/logo.jpg" ), or to make all paths relative to the site root ( "/images/logo.jpg" ). Another solution may be to determine the depth of a page request within the directory structure of the web server using the request data, then incorporate the path back to the root of the site into page urls dynamically. For example, urls on the "http://host/free-php-scripts/" page pointing to the "images/" directory would include "../" at the beginning, making the full address "../images/" when the html of the webpage is generated. The second method may be useful when directories and files linked to from the web page are relative to the current page and not the site root.
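The second approach could be sketched in Php as follows. The helper name pathToSiteRoot is hypothetical, and the sketch assumes the site is served from the root of the host and that the request path begins with a single "/"; it simply counts directory levels in the requested address.

```php
<?php
// Hypothetical helper: builds the "../" prefix that leads from the
// requested URL back to the site root.
function pathToSiteRoot(string $requestUri): string
{
    // Keep only the path component (drops any query string).
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path) || $path === '') {
        $path = '/';
    }

    // Every "/" after the leading one adds one directory level.
    $depth = max(0, substr_count($path, '/') - 1);

    return str_repeat('../', $depth);
}

// In a real script the argument would typically be $_SERVER['REQUEST_URI'].
echo pathToSiteRoot('/free-php-scripts/') . 'images/logo.jpg', "\n"; // prints "../images/logo.jpg"
```

A request for "/show-page.php?page=free-php-scripts" yields an empty prefix, so the same template produces working image paths for both forms of the address.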



Redirect to include www (HTTP/1.1 301 Moved Permanently):

RewriteBase /
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

This will redirect any request that is missing the www to the same host including www. For example, a request to "http://host.com" will be redirected to "http://www.host.com". If you have a "blog" subdomain ( http://blog.host.com ), to prevent mod_rewrite from redirecting this request to "http://www.blog.host.com/" the following conditions and rules could be used:

RewriteBase /
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteCond %{HTTP_HOST} !^blog\. [NC]
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

This will redirect any request that doesn't start with "blog..." and that is missing the www, to the equivalent request including the www. If you have many subdomains, it may be easier to test the domain name explicitly and redirect to include the www if required:

RewriteBase /
RewriteCond %{HTTP_HOST} ^host\.com$ [NC]
RewriteRule ^(.*)$ http://www.host.com%{REQUEST_URI} [R=301,L]

The condition above tests for the domain explicitly, without the www. The condition is satisfied, allowing the rule to be executed, only when the bare domain is accessed without the www or any other subdomain in the address. When this is the case, the Apache mod_rewrite engine will redirect the request to the host with www included. The original page being requested will be included in the redirect URL, with the www at the start of the host name.
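Where mod_rewrite is not available, the same decision could be made at the top of a Php script. The following is a minimal sketch of that idea rather than a replacement for the rules above; wwwRedirectTarget is a hypothetical function name and "host.com" is the placeholder domain used throughout this post.

```php
<?php
// Hypothetical helper: returns the URL to redirect to, or null when the
// request was not made to the bare domain.
function wwwRedirectTarget(string $host, string $requestUri, string $bareDomain = 'host.com'): ?string
{
    // Host names are case-insensitive, so compare in lower case.
    if (strtolower($host) !== strtolower($bareDomain)) {
        return null; // www, blog or any other subdomain: leave unchanged
    }

    return 'http://www.' . strtolower($bareDomain) . $requestUri;
}

// Typical use near the top of a script (commented out so the sketch has no side effects):
// $target = wwwRedirectTarget($_SERVER['HTTP_HOST'], $_SERVER['REQUEST_URI']);
// if ($target !== null) {
//     header('Location: ' . $target, true, 301);
//     exit;
// }
```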



Mod_Rewrite References:
  • Mod_Rewrite - Apache 1.3 Documentation - This module provides a rule-based rewriting engine to rewrite requested URLs on the fly.
  • Apache Module mod_rewrite - Apache 2.0 Documentation - "This module uses a rule-based rewriting engine (based on a regular-expression parser) to rewrite requested URLs on the fly. It supports an unlimited number of rules and an unlimited number of attached rule conditions for each rule, to provide a really flexible and powerful URL manipulation mechanism. The URL manipulations can depend on various tests, of server variables, environment variables, HTTP headers, or time stamps. Even external database lookups in various formats can be used to achieve highly granular URL matching".
  • URL Rewriting (Ross Shannon) - "The Apache server’s mod_rewrite module gives you the ability to transparently redirect one URL to another, without the user’s knowledge. This opens up all sorts of possibilities, from simply redirecting old URLs to new addresses, to cleaning up the ‘dirty’ URLs coming from a poor publishing system — giving you URLs that are friendlier to both readers and search engines."
  • Learn Apache mod_rewrite: 13 Real-world Examples - "Apache's low-cost, powerful set of features make it the server of choice for organizations around the world. One of its most valuable treasures is the mod_rewrite module, the purpose of which is to rewrite a visitor's request URI in the manner specified by a set of rules."

Saturday, January 9, 2010

Php RSS2Writer (v2.0) - Generate RSS 2.0 Feed - Usage Instructions & Download

Generate RSS 2.0 Compatible feeds in Php, using the Free Php RSS2Writer Class.

The Php RSS2Writer class has been updated. Version 2.0 includes improved usability when using the RSS2Writer class to generate an RSS Feed from website or database content, as well as additional functionality required to generate valid RSS 2.0 Feeds.

Download Php RSS2Writer - Version 2.0
The download page also provides some additional information about the class and sample controller.









Details of the Php RSS 2.0 Writer Package:


RSS2Writer.php
The package includes the RSS2Writer class file that contains the functions and variables required to produce a valid RSS 2.0 compatible feed using Php.

sampleController.php
The sample controller script provides examples for using the class and functions, including additional information about optional elements that can be added to a feed channel or individual item. Basic instructions are provided in the sample controller to help get started using the class.


Php RSS2Writer Usage
The following are instructions with examples for using the Php RSS2Writer class to generate RSS 2.0 compatible feeds using Php. The data added to the feed could be from a website, database, etc. that is accessible using a Php script.

Download the PhpRSS2Writer Package and copy the RSS2Writer.php file to your web server.


In the Php script that will serve the RSS Feed:

1. Include the RSS2Writer Class:

require_once("RSS2Writer.php");
//or

//include("RSS2Writer.php");


2. Instantiate an RSS2Writer Object:

The title, description and link for the Feed channel are passed as parameters when the RSS2Writer Object is constructed.

$rss2_writer = new RSS2Writer('Feed Title', 'Feed Description', 'feed url');


3. Add additional information to the Channel (optional), such as categories, copyright information, or the generator used to generate the RSS feed content.

//Add channel data to the feed
$rss2_writer->addCategory("RSS Feed");
$rss2_writer->addCategory("Free Php Script");
$rss2_writer->addCategory("Php: Generate RSS 2.0");

//Optional Elements
$rss2_writer->addElement('copyright', '(c) Daniel Soutter 2010');
$rss2_writer->addElement('generator', 'Php RSS2Writer by Daniel Soutter');


4. Add Items to the Feed

Add items to the RSS feed using the addItem function, which takes the title, description and link for each item. Once the item has been added, the addCategory or addElement functions can be called to add categories, or other optional elements, to the feed item. These elements store information such as the name and email address of the author, and the date that the item was published.

//Example Item
$rss2_writer->addItem('item title', 'item content/description', 'item url');

//Add categories to the item

$rss2_writer->addCategory("Free Php Script");
$rss2_writer->addCategory("Php: Generate RSS 2.0");
$rss2_writer->addCategory("Php RSS2Writer Usage Instructions");
//Optional Elements
$rss2_writer->addElement('author', 'daniel@webmasterhub.net (Daniel Soutter)');
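The RSS 2.0 specification expects dates such as the publication date of an item to be written in RFC 822 style, which Php can produce with its built-in DATE_RSS format constant. The helper below is a small sketch; rssDate is a hypothetical name, and passing 'pubDate' to addElement is an assumption based on the description above, so check sampleController.php for the element names the class actually supports.

```php
<?php
// Hypothetical helper: formats a Unix timestamp the way RSS 2.0 expects
// dates to be written (RFC 822 style), always expressed in GMT.
function rssDate(int $timestamp): string
{
    return gmdate(DATE_RSS, $timestamp);
}

echo rssDate(0), "\n"; // prints "Thu, 01 Jan 1970 00:00:00 +0000"

// Assumed usage with the class described above (not run here):
// $rss2_writer->addElement('pubDate', rssDate(time()));
```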



5. Output the RSS Feed XML

Output the RSS feed XML for use in a web browser or application that syndicates RSS feeds.

echo $rss2_writer->getXML();


Notes:
The addCategory and addElement functions can be used to add categories or additional elements to both the feed channel and individual feed items. To add additional information to the feed channel, call the functions before adding items to the feed. Once items have been added to the feed, calling the addCategory or addElement functions will add the specified information to the feed item most recently added to the feed.

There are a few functions which can be called at any time to associate an image to the RSS Feed Channel, or include channel Cloud data, for use by RSS Feed readers and other applications when retrieving and presenting data from your feed.

The channelImage() function associates an image with the RSS Feed Channel, passing a title, link, url, and the dimensions of the image (width, height).
$rss2_writer->channelImage($title, $link, $url, $width, $height);

The channelCloud() function adds cloud information to the Feed Channel. The domain, port, path, registerProcedure and protocol are passed to the function. The port, registerProcedure and protocol parameters use '80', 'pingMe' and 'soap' as defaults.

$rss2_writer->channelCloud($domain, $port, $path, $registerProcedure, $protocol);


If you have any queries about, or issues with using the Php RSS2Writer class, please leave a comment on this post. Suggestions for improvement or additional functionality are welcome as well, and will be incorporated into new versions of the class if appropriate.

Wednesday, January 6, 2010

Php Date Time - 7 Methods to Calculate the Difference between 2 dates

Calculate the difference between two Dates (and time) using Php. The following page provides 7 methods for calculating the difference between two dates using Php, as well as some additional related resources to help with Date/Time calculations and manipulation using Php.

Depending on the type of server running Php (Windows/IIS/Apache, Unix/Apache), standard date methods provided by Php can sometimes produce inconsistent or incorrect results. The various methods provided on the page below cater for different scenarios where date/time calculations need to be made using Php to calculate the difference between dates.

See Php Date Time - 7 Methods to Calculate the Difference between 2 dates for details.
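As a quick illustration of the kind of calculation involved, the sketch below uses Php's built-in DateTime::diff() (available from Php 5.3). It is not necessarily one of the seven methods on the linked page; daysBetween is a hypothetical helper, and both dates are interpreted in UTC on purpose so that server time zone settings do not change the answer.

```php
<?php
// Hypothetical helper: whole days between two dates using DateTime::diff().
// Both dates are interpreted in UTC so daylight saving changes on the
// server cannot alter the result.
function daysBetween(string $start, string $end): int
{
    $utc  = new DateTimeZone('UTC');
    $from = new DateTime($start, $utc);
    $to   = new DateTime($end, $utc);

    // DateInterval::$days holds the total number of days between the dates.
    return (int) $from->diff($to)->days;
}

echo daysBetween('2010-01-06', '2010-01-17'), "\n"; // prints 11
```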

Wednesday, December 23, 2009

Improve Performance for Websites with a Database Back-end & Dynamic Content

How to improve the performance of web pages, and the overall performance of a website, by minimising front-end processing and the additional server/client DNS requests required to fully load each page.

  1. Identify the areas/scripts for a website which are frequently used
  2. Identify scripts which take the longest time to process or load
  3. Identify DNS Requests that can be removed from front-end pages
  4. Create scripts to perform background processing, then store website content (HTML) in a database table.
  5. Replace code in scripts to retrieve the pre-formatted content from a database

The techniques described below focus on reducing the processing requirements and DNS requests required to generate and load a page for both the server-side and client-side aspects of a web application or website. The steps explain how to identify, and reduce the amount of processing required by the front-end server of a website by performing background tasks to complete processing which would usually take a long time to complete. This includes identifying scripts/pages which are frequently used, and considering reducing the loading time by retrieving pre-prepared html from a database instead of processing and displaying from the front-end script. The background tasks will periodically execute scripts which will perform the processing required to generate and store the pre-formatted html in a database table ready for retrieval when loading the web page.

The techniques explained below will be most effective when the databases for a website are hosted on the same server or farm.

1. Identify the areas/scripts for a website which are frequently used

Identifying the areas of a website which are most used allows you to prioritise pages in the order in which development time will be most beneficial. Spending a lot of time optimising scripts that are not frequently used on the website is not a good use of your time.

Website Traffic Statistics
There are many common (and popular) methods which can be used to analyse page and visitor statistics for a website, that are also free. Most of these allow you to at least view traffic information to a website in terms of the content (the pages being visited), the visitors (country/region/city, browser, OS, host, etc.) and the source of the traffic (referral, search engine query, direct). From this information, you can see what pages on your site are most used, as well as the location of the users most frequently accessing the pages (See Geographical Targeting below).

Google Analytics
Google Analytics is a free service that tracks information about the content and users of a website in great detail. This information can be viewed using the selection of core reports, but functionality is also provided to track custom data and generate reports using custom criteria.

Site Meter
Site Meter is a free or paid service that tracks statistics for websites. The statistics can be made public if required, allowing anyone to view all or some of the details about your site.

Website Traffic Stats
Website Traffic Stats provide reports similar to the above, but also include reports detailing the navigation structure of a website, which can also come in handy when Optimising a Website for Search Engines (SEO).

2. Identify scripts which take the longest time to process or load

Identifying and optimising specific scripts that take a long time to process is a good method for improving the performance of a website. There are many factors that have an effect on the performance of a website, some of which are out of your control, or not easy to control. A large amount of processing is generally required to generate and display information on a website when there are many complex database queries, calculations, comparisons and graphical representations. In many cases, it will be possible to generate this information and graphics using background scripts executed periodically by the server. A common scenario where this may help is when statistics are being generated and displayed for content on a website, which may require many thousands or millions of database records to be retrieved and processed before the information can be displayed and used. In this scenario, background scripts could process and generate the statistical information, which would then be stored in a separate database or table, so that the website's front-end pages only need to retrieve and display it. Retrieving the pre-processed data from the database and displaying it directly on the page, with minimal processing, will reduce the amount of time required to display the page. When deciding which scripts and pages to optimise to improve the overall performance of a website, you need to find a balance between the frequently used scripts determined in Step 1 and the scripts determined in this step that require a large amount of time to process or load.

There are tools and software available to help identify scripts and pages in websites that take a long time to process and display. Alternatively, you could incorporate into your own scripts the ability to test page load times (get the current time at the beginning and end of processing a script/page, then calculate the duration from the difference between the two values).
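The do-it-yourself timing approach can be sketched with Php's microtime function. The helper name measureSeconds and the usleep call standing in for real page processing are assumptions made for this example.

```php
<?php
// Hypothetical helper: runs $task and returns how long it took in seconds.
function measureSeconds(callable $task): float
{
    $start = microtime(true);   // current time as a float, in seconds
    $task();
    return microtime(true) - $start;
}

$elapsed = measureSeconds(function () {
    usleep(20000); // stand-in for real page processing (about 20 ms)
});

printf("Processing took %.4f seconds\n", $elapsed);
```

Logging this value for each script, together with the script name, gives a simple list of the slowest pages to compare against the traffic statistics from Step 1.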

Web Page Analyzer

'Try our free web site speed test to improve website performance. Enter a URL below to calculate page size, composition, and download time.'


Website Speed Test

'Find out how fast your website loads. Too slow? Perhaps you need to optimize the page or move to a faster server.'


Website speed check

'The website speedtester shows the duration of a given website. This value can be used for showing how long a website take to load and if it is better to optimize the website or change a (slow) ISP.'


3. Identify DNS Requests that can be removed from front-end pages

Reports generated using tools such as the 'Website speed checks' listed above will often tell you how many DNS requests were required to load the page. It is quite common for a web page to require multiple DNS requests (over 3 or 4) in order to retrieve information from external websites and web services. Common scenarios are a page that is served Ads from a public Ad server, a page that displays the content (or titles) of an RSS feed, or a page with content scraped from pages on other websites. Reducing these types of requests can have a significant effect on the performance of a website, as external dependencies will be eliminated or at least reduced.

Cache RSS Feed Data
A method for improving the performance of a page which displays the content of one or more RSS feeds is caching the feed data, and displaying it from a local database instead of directly from the external location. A background script could retrieve and format the feed content, then update/store it in a local database (see Step 4 below for details about running background scripts, or 'cron jobs'). Many feeds remain unchanged for long periods of time, making it unnecessary to re-load a fresh copy from the original (external) source every time a page is loaded on a website. A simple database query, with no additional processing or formatting, will be much more efficient than retrieving and processing the data each time a page on a website is requested.
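The caching idea can be sketched as follows. To keep the example self-contained, a plain Php array stands in for the local database table, and the fetch callback stands in for the request to the external feed; getFeedHtml, the array layout and the example.com address are all assumptions for illustration.

```php
<?php
// Hypothetical cache lookup. $cache stands in for a database table keyed by
// feed URL; each row holds the stored HTML and the time it was fetched.
function getFeedHtml(array &$cache, string $url, int $maxAgeSeconds, callable $fetch, int $now): string
{
    $row = $cache[$url] ?? null;

    // Serve the stored copy while it is still fresh.
    if ($row !== null && ($now - $row['fetched_at']) < $maxAgeSeconds) {
        return $row['html'];
    }

    // Otherwise refresh from the external source and store the result.
    // On a live site this branch would run in the background script.
    $html = $fetch($url);
    $cache[$url] = ['html' => $html, 'fetched_at' => $now];

    return $html;
}

$cache = [];
$calls = 0;
$fetch = function (string $url) use (&$calls): string {
    $calls++; // counts trips to the "external" source
    return "<ul><li>Item from $url</li></ul>";
};

getFeedHtml($cache, 'http://example.com/feed', 3600, $fetch, 1000);
getFeedHtml($cache, 'http://example.com/feed', 3600, $fetch, 1500); // served from the cache
echo $calls, "\n"; // prints 1
```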

If an RSS feed needs to be displayed using JavaScript by the client, you should include the script close to the bottom of the page, so that the majority of the content will have loaded before it tries to retrieve the RSS feed data. This will also help if the external site which the RSS feed comes from is down or running slowly, which would otherwise affect the performance of pages on your website.

Queue Connections to Web Services
It is often a requirement of a website or web page to use web services on an external website or server on the internet. This is usually server-side processing, which can be quite slow when multiple external hosts are being connected to. XML-RPC, a protocol for calling web service methods, is a common example used around the internet. If one or more external hosts are slow or not accessible, the script will take longer to complete or time out, depending on (your) web server configuration.

One possible method to improve performance of pages that connect to web services from multiple external hosts is to store the information for each individual connection and request in a database table, which can then be processed as a queue by a background script/cron job. An example may be a website that allows users to submit details of multiple sites to many directories using XML-RPC web services. A separate connection would be required for each external host, and for each site being submitted, which could reach the thousands quite easily. To manage such a large number of connections to external hosts without making the user wait for each to complete, the details of the site being submitted can be added to a separate database table, which is processed by a script executed by the server. For a service such as this, it is essential that the queue is processed as close to "real-time" as possible, so the interval between executions of the background script would be short. The script would need to be configured to allow multiple instances of the same script to run in parallel without conflicting with each other. This will cater for scripts that take a long time to complete, and will also allow you to execute more than one instance of the script at a time. This would mean that each script would process a small number of rows from the queue table, minimising the chance of one timing out if some external servers were not accessible.

When an end-user executes a transaction that would usually take over 30 seconds for the front-end script to complete, the page can load almost instantly, as it is much quicker to add the details to a local database table than to complete a connection to each individual external host. The background processing required to process the queue will have a minimal effect on the server's processing requirements and, in turn, almost no effect on the end users' experience of the performance of the website.
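The queue processing described above might be organised along these lines. This is a sketch under stated assumptions: a Php array stands in for the queue table, claimBatch and the status values are hypothetical, and the host names are placeholders. With a real database the claim step would need to be atomic (for example a single UPDATE restricted to pending rows, or a transaction) so that two instances cannot claim the same row.

```php
<?php
// Hypothetical queue. $queue stands in for a database table; each row has a
// 'status' of 'pending' or 'claimed' and the worker that claimed it.
function claimBatch(array &$queue, string $workerId, int $batchSize): array
{
    $claimed = [];
    foreach ($queue as $id => $row) {
        if (count($claimed) >= $batchSize) {
            break;
        }
        if ($row['status'] === 'pending') {
            // Mark the row first so another worker will skip it.
            $queue[$id]['status'] = 'claimed';
            $queue[$id]['worker'] = $workerId;
            $claimed[] = $id;
        }
    }
    return $claimed;
}

$queue = [];
foreach (['dir-a.example', 'dir-b.example', 'dir-c.example', 'dir-d.example', 'dir-e.example'] as $host) {
    $queue[] = ['host' => $host, 'status' => 'pending', 'worker' => null];
}

// Two instances of the background script each take a small batch.
$first  = claimBatch($queue, 'worker-1', 2);
$second = claimBatch($queue, 'worker-2', 2);
echo implode(',', $first), ' | ', implode(',', $second), "\n"; // prints "0,1 | 2,3"
```

Keeping the batch size small means that a slow or unreachable external host delays only a few rows before the next instance of the script picks up the rest.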

Script Organisation
If you need to connect to external hosts to get information required to display on a page, you should send the majority of the page content to the browser before attempting to connect to external hosts. This can usually be done by including the script close to the bottom of the page if using JavaScript, or by buffering results for connections to external hosts if processing using a server-side programming or scripting language. This is a perceived improvement in performance, as it is clear to the user that their action is being processed even when the page takes a long time to load fully.

4. Create scripts to perform background processing, then store website content (HTML) in a database table.

Once you have determined which scripts should be focused on to optimise your website's performance, you need to write separate scripts to complete the demanding processing and requests for information from external sources. These scripts will then populate database tables with pre-formatted HTML, which will later be displayed on pages when called from front-end scripts. A common background script may be one which retrieves updated information from multiple feeds, which can then be used to display the feed content on web pages in full, or as line-items (Feed item, or category titles) to be used as shortcuts to the category or individual articles, posts and pages (display Related Posts dynamically on blogs). As the request for the feed data, including the process of adding the feed content to a html template (which usually requires further reads from the file system of the web server), is completed by the background scripts or cron jobs, the pages on the website are able to load very quickly. Only a single query to the database is required, returning a small set of rows containing fields with the pre-generated html. In most cases, this will significantly improve the performance of your website, as you have eliminated the need for additional DNS requests every time a page is loaded.

Cron Jobs
Cron Jobs are scripts or commands that are executed by the server at a specific time, or repeatedly with specified intervals between execution of each script. In this case, we are using cron jobs to perform the in-depth processing and external requests. As long as the Cron Jobs have been configured and are working properly, the pre-formatted html in the database will always be up-to-date. You will need to find a balance between the frequency at which the scripts are run, which may affect the overall performance of the web server, and the importance of having up-to-date information. For example, if an RSS feed is only updated once every few months, there is no need to run a script that updates that particular feed every minute.

Many popular web hosting services, including many that are free, offer a solution which includes the ability to incorporate 'cron jobs' into a web application. Many hosts using Unix based servers provide this feature with their hosting solutions, making it essential to have at least a basic knowledge of Unix commands that can be useful for web and database applications. The hosts below offer free web hosting with many features including the ability to execute cron jobs, or background scripts, on a periodic basis.

Free Web Hosting Allowing Cron Jobs

A comparison of approximately 30 free Web Hosting solutions that allow Cron Jobs.


000WebHost.com - Free Web Hosting With Cron Jobs


5. Replace code in scripts to retrieve the pre-formatted content from a database

The final step once you have completed and tested the Cron Job scripts that run in the background is to modify the front-end scripts or pages on your website to retrieve the pre-formatted html from the database. In some cases, this will require you to modify the HTML, or client-side code, as JavaScript is often used to insert content from an external location during or after the page has loaded. In other cases you will need to modify the server-side (Php, ASPX) scripts to retrieve the pre-formatted html instead of performing calculations and retrieving the information from external sources.

You should write a set of functions that can be used and reused throughout the scripts and pages on your website. Some examples for a feed cache may be getAllFeeds(), getFeed(ID), updateFeedData(ID), updateAllFeeds(), getFeedSummaryHTML(ID), getFeedHTML(ID) etc.

Other Performance Considerations

Relational Database Design
- If not already applied: You can reduce the amount of data retrieved by database queries by separating different categories or types of data into separate tables. For example, a website which started off with a small number of users who can be assigned to a group and have a permission level assigned may have a foreign key in the users table which indicates the group and permission. As usage and the number of members of the website increase, the ability to assign users to multiple groups may be required, as well as needing to manage a larger amount of data/information for each user. To store all user, group and permission information in a single "Users" table would mean that a query to the table which retrieves data from all columns would take a long time to complete, and in most cases would contain data which is irrelevant to the web page or script which required the user data. An alternative approach would be to have a separate table for group information, permissions, basic user information and one for each of the user-group assignments and user-permission assignments. Database queries can then be constructed to retrieve all information if required, but in most cases will retrieve only the data required to complete a specific task. The result generally requires information from specific, but not all, tables and can significantly reduce the amount of data retrieved from the database to complete the processing. Another benefit is that when operations are being performed on the database that require rows or tables to be locked, you can be much more granular with the data which is locked, allowing other scripts to access / modify data in other tables and rows.

Geographical Targeting
If the majority of users are from a specific country or region, you should consider hosting your website on a server in that country, or configuring your DNS to point each user to a mirror hosted in the country closest to them. This will reduce the time pages take to load for most of your users, as the request should not leave the country, or travel far at all.

Images and other Rich Media
Images and other media are often the reason a page takes a long time to load. A page that consists mainly of text will load quickly and much more consistently than a page with lots of images or other media, even on a slower web server or when the web server is managing a large load. If you do have images on your site, you should try to minimise the file size and dimensions of images as much as possible without losing information. There are also many image formats which offer various methods of compression, which are often configurable. You need to use formats that are legal (that you have the rights to use) and that have good compression (small in size) at an acceptable quality level. Where possible, you should use 1xN or Nx1 sized images that can then be stretched horizontally or vertically when used for the background of elements on a web page.

Allow users to choose whether their browser will start downloading or buffering rich media such as streaming video, as this can consume a large amount of the user's bandwidth, which will reduce the performance of your website if they continue browsing without closing or stopping the video/media.



Saturday, December 19, 2009

SEO Tips: Keyword Research, Target Key Phrases, Site Navigation

SEO Tips for Website & Database Application Development

I wanted to share a few of my experiences with developing websites and database applications in relation to optimising the structure, navigation, layout and content of websites to help increase rankings in popular search engines, including Google, Bing and Yahoo. Some key areas covered include designing and developing a website to have a logical page hierarchy and navigation paths. I have developed a range of websites both static and interactive with a database or sometimes file system back-end to enable content to be published and maintained dynamically.

The Search Engine Optimisation techniques described below are based on my own personal experiences with various methods of building a dynamic website. Any concepts or conclusions reached are a result of my own research, experimentation and opinion. As there are numerous factors involved when Search Engines rank a site or page, there may be factors other than the techniques which I perceived to be the reason for an increase in search engine ranking. If you have a different opinion, or would like to discuss the techniques below, feel free to post a reply to the thread.


Keyword Research & Targeting

Optimising the content on your site by targeting specific keywords and key phrases is a common and very effective technique to increase traffic to your site. The following are some things to consider when optimising content and pages to target specific keywords and key phrases.

Target Achievable Key Phrases

Online/Web Marketing is a competitive industry. Generally, the more generic and short a key phrase becomes, the more competition you will face in order to achieve top 10 results in search engines with the key phrase. You are much more likely to achieve top 10 results for targeted key phrases if they are specific to the overall topic of content on a page. You should try to write about something specific that also applies to a general category. When setting internal anchor tags and doing external link building, you should target the more specific key phrase, and include the general category as well. For example, if you were to publish a page explaining how to construct a racing bike from playdough, you should consider or apply the following:

  • Targeting the general keywords ("Racing", "Bike", "Bicycle Racing") will not be as effective as targeting a specific key phrase such as "How to build a racing bike using playdough". You can almost guarantee that a popular search engine will not display a page in the top 10 for a single-word key phrase such as "Racing" if the content is about a more specific topic, or the site is not well established. In some cases this may not be possible at all; where it is possible, it would take time, as well as a large number of descriptive backlinks from pages on external sites that are both highly relevant and highly ranked in search engines. You should not focus on general/popular keywords. Instead, focus on achievable goals such as reaching the top position for a more specific key phrase, and you will find that rankings for more general search terms improve over time. Each time a page is displayed first in search engines for a specific key phrase, the ranking of your site will increase overall (this will vary between search engines depending on the ranking system used). As the overall ranking of your site or domain increases, the chances of achieving top 10 results for general/popular keywords or key phrases will increase.
  • Including the general category/topic when creating deeplinks, page titles and descriptions can help target the more general, but more popular, keywords in the long run. For example, if, after 5 years, many separate published pages relating to a general topic ("Racing") but covering many sub-topics ("Bicycle Racing", "Motor Racing", "Historical Racing Events", etc.) all reach the top 2-3 positions in search results by targeting a specific key phrase relating to their content, you would be more likely to start getting hits from search results for the more general/popular terms "Racing", "Bicycle Racing", etc. In this case, an example of a specific key phrase for one of the pages might be "How to build a racing bike using playdough". The words "Racing" and "Bike" are the general terms which are more relevant in the long term, but are not the focus of the content. The use of "How to build.." and "using playdough" is what distinguishes the specific topic from the general one, and this would need to be the main topic of the content on the page for the key phrase to be targeted successfully.
  • The page title, description and content should include, and complement, the key phrase being targeted. In other words, you should always include the keyword/phrase in the title and description meta tags for the page. You should use the exact key phrase at least a few times within the page content, and use variations and related phrases as well. Anchor text on backlinks pointing to pages on your site should also include the key phrase, or a slight variation. There are mixed opinions on how often a keyword or key phrase should be used in content for it to be targeted successfully in search engines.
Check out Google Trends to see and compare search statistics for specific keywords and key phrases.
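The checklist in the last point above (key phrase in the title, in the description, and a few times in the content) is easy to verify with a small script. The following Python sketch is only an illustration: the function name key_phrase_report and the shape of the returned dictionary are assumptions, and it deliberately counts exact matches only, so variations of the phrase are not detected.

```python
import re


def _normalise(text: str) -> str:
    """Lower-case the text and collapse runs of whitespace into single spaces."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def key_phrase_report(phrase: str, title: str, description: str, body: str) -> dict:
    """Report where the exact key phrase appears on a page."""
    target = _normalise(phrase)
    return {
        "in_title": target in _normalise(title),
        "in_description": target in _normalise(description),
        "body_count": _normalise(body).count(target),
    }


report = key_phrase_report(
    phrase="How to build a racing bike using playdough",
    title="How to Build a Racing Bike Using Playdough",
    description="Step by step: how to build a racing bike using playdough at home.",
    body=(
        "This guide shows how to build a racing bike using playdough. "
        "Bicycle racing fans can finish a model in an afternoon. "
        "Once you know how to build a racing bike using playdough, try other models."
    ),
)
print(report)
```

The script says nothing about how many repetitions are ideal, since opinions on that differ; it only confirms that the phrase is present where you intended it to be.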


Logical Structure and Navigation

As most Search Engines will suggest, you should design the site so that all pages are accessible using a logical navigation sequence or hierarchy. It is important for a site's navigation to be easily understood by the end user, and it is equally important that Search Engine crawlers / bots are able to correctly interpret and utilise the navigation structure. You should keep in mind that although Search Engines are automated, they have been designed and optimised to find and rank information on pages for people, who are ultimately the audience you are trying to bring to your site. Developing a website which can be understood by the end-user is essential. A website with a large amount of information on many pages should provide more general information at a high level, or close to the root of the site. The information should be categorised or grouped into separate areas, which then contain varying levels of sub-categories and groups, depending on the depth required to finely sort a general or high level category/topic. If the high level pages with more generic information are configured properly and also provide links to more specific information, over time you may find yourself a strong competitor for a position in the top 10 results for single and two word keywords and key phrases. The following are some techniques to help when designing the structure, navigation and layout of a website to achieve the above:


Site Templates
Using a template for each page on a site not only helps when developing and maintaining a website, it is a method which will help search engines reach all pages on a site when crawling and understand the information provided, and the context of each page in a hierarchical fashion. Using clear and concise titles for links pointing deeper into the site where the information becomes more specific is important to help search engines categorise the information on the page. This, along with the page titles and keywords determined by crawlers also helps search engines put the page into context with other information on the site and around the internet.

Site Navigation
Navigation should be in a consistent location on the website, before the main content or body of the page. Adding the source code for a left navigation menu close to the opening body tag is recommended, but a top navigation (or another location) should also work if required. Having the menu's source code close to the top of the page will reduce the chance of the crawler associating the links with the content directly above and below them instead of treating them as a consistent menu. This is also essential for Site Links or similar to be generated and to appear in the results of search engines such as Google. When a consistent menu is used properly, a crawler/bot can successfully use links from the menu and sub-menus to put pages into a logical hierarchy and relational context.

Duplicate Information

Duplicate information can be common on websites and web applications which retrieve and display information on pages from one or more databases. This can result in the same information being displayed on more than one page, which is normal in some circumstances, but can be an easy trap to fall into if abused. Basically, providing a duplicate copy of the same information on separate pages of the same website is useless for Search Engines, and in most cases will decrease the ranking of your site. An alternative would be to provide a short summary, or truncated version, of the information with a link to the page containing the complete version. This would help reduce the amount of duplicate content found on a site, but should be avoided unless the summarised versions reside on pages which are relevant to the same topic. Cramming a page full of links is also a waste of time, and has little or no effect on search engine results for the site containing the links page, or for the sites being linked to from it. Any such page would need to be designed and executed carefully to ensure that it creates a network of related pages that people are able to understand and find useful.
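The "short summary with a link to the complete version" approach could be implemented along the following lines. This is a minimal Python sketch; the function names summarise and summary_html, the 200 character default and the "Read more" link text are all assumptions made for the example.

```python
from html import escape


def summarise(text: str, max_chars: int = 200) -> str:
    """Return the text unchanged if it is short, else cut it at a word boundary."""
    text = " ".join(text.split())        # collapse line breaks and repeated spaces
    if len(text) <= max_chars:
        return text
    window = text[:max_chars + 1]        # one extra character to detect a clean break
    last_space = window.rfind(" ")
    cut = window[:last_space] if last_space > 0 else text[:max_chars]
    return cut.rstrip(" ,.;:") + "..."


def summary_html(text: str, url: str, max_chars: int = 200) -> str:
    """Build a teaser paragraph that links to the page holding the full version."""
    return '<p>{} <a href="{}">Read more</a></p>'.format(
        escape(summarise(text, max_chars)), escape(url, quote=True)
    )
```

For example, summary_html(post_body, "/posts/racing-bike") would produce a short teaser for a category page, so that the complete text exists only on the post's own page.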

There are some cases where it may be required, or beneficial to the end-user, to display the same information on multiple pages. A common example is blogging applications and platforms that allow you to view posts by selecting a category or tag; any post with the selected tag will be displayed on the page. The individual pages/posts containing the most specific information should be indexed by Search Engines, and other pages, such as the category pages for a blog, should be blocked from being indexed. Popular blogging applications, including some that are Open Source, will usually be configured to block search engines from indexing pages containing duplicate content using meta tags in the head. This is a good method for instructing crawlers/bots, but I highly recommend that you create and use a robots.txt file at the root of the website. See the link below for a robots.txt generator.
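To check that a robots.txt file blocks the listing pages but not the individual posts, you can test it with the robots.txt parser in Python's standard library. The rules and the /category/, /tag/ and post paths below are assumptions for a hypothetical blog; substitute the paths your own platform uses. Note that robots.txt tells well-behaved crawlers which URLs not to fetch, so it complements the meta tags mentioned above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a blog whose category and tag pages repeat post content.
ROBOTS_TXT = """\
User-agent: *
Disallow: /category/
Disallow: /tag/
"""


def is_crawlable(path: str, user_agent: str = "*") -> bool:
    """Return True if the rules above allow the given crawler to fetch the path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


for path in ("/category/seo", "/tag/php", "/2009/12/seo-tips.html"):
    print(path, "allowed" if is_crawlable(path) else "blocked")
```

Running a check like this before uploading the file helps avoid accidentally blocking the pages you actually want to appear in search results.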


Related:



Wednesday, December 16, 2009

Affiliates

Addon Solutions
Website design and development firm providing website design services and dedicated web developers / programmers for hire, covering PHP, ASP.NET, Ajax, Java, Joomla, Drupal, WordPress, Magento, osCommerce, Zen Cart, mobile, iPhone and Android application development, and SEO/SEM.


WebmasterHub.net
Tools and resources for web development / programming, graphic and web design, web marketing (SEO) and SharePoint. WebmasterHub.net provides a free Bulk Ping service to submit multiple sites/blogs to over 100 directories. Join the WebmasterHub.net community for free to use the Bulk Pinger and the WebmasterHub.net Forums, or to submit a new resource to the resource directory.


Web-Resource.org
SharePoint & InfoPath, Web Development & Programming and SEO (Search Engine Optimisation) / Web Marketing resources. SharePoint forums and popular blogs, programming forums and blogs, SEO forums and communities. Free PHP scripts and free SharePoint Web Parts.


SharePoint Development & Administration (Blog)
SharePoint & InfoPath Form development, integration and management tips and tutorials, SharePoint Administration tutorials, SharePoint Designer Tutorials, Workflow development tips and tutorials. Learn how to develop and customise SharePoint environments.

Web Programming, SEO & SharePoint Resources

Web-Resource.org - SharePoint, Web Programming & SEO Resources

Web-resource.org provide a range of resources from around the web for Web Development & Programming, SharePoint & InfoPath Development and Search Engine Optimisation. Resources have been hand selected by the team at Web-Resource.org, and suggested by users.


SharePoint & InfoPath Resources

A range of tools and resources are provided by Web-Resource.org for Administering, Designing, Customising and Developing a SharePoint environment. Resources include a range of articles with tips, how to's, tutorials, third party utilities for administering and developing SharePoint. Tutorials are also available to help with InfoPath form development and integration with SharePoint.
SharePoint Design & Customisation resources include tutorials for creating and customising Data View Web Parts to display data from SharePoint lists, formatted based on the metadata values of each list item. This includes, for example, the ability to format the rows of a task list so that the background or text colour of each task represents its priority or the amount of time until it is due. The articles specifically describe how to format tasks based on the due date and other fields. An overdue item can be formatted with a red background and bold text. Items which are due within three days are formatted orange, within 1 week yellow, and within 1 month green. When the conditional formatting is combined with sorting and grouping, a Data View can display a large amount of information in a logical and readable format.

SharePoint Administration resources include tips and techniques for configuring SharePoint environments of various sizes, migrating content and configuration databases, tips and techniques to help when troubleshooting errors and issues with a SharePoint environment and much more. Other administrative tips and resources include configuring profile imports for multiple Active Directory domains, installing and registering custom web parts assemblies to be used on Web Part pages.
InfoPath form development resources include tutorials for creating and using Data Connections to lists on SharePoint sites from an InfoPath form. Tips and tutorials to help with the management of form templates and content types in SharePoint are available, including troubleshooting outcomes and information relating to using InfoPath forms as the interface for SharePoint systems which consist of many lists, libraries and workflows.
Web-resource.org also provides a section for SharePoint & InfoPath Forums, Blogs and Communities, which contain many resources to help you learn SharePoint and InfoPath.


Web Development & Programming Resources

The Web Development & Programming resources section contains useful tools and resources for developing websites and web applications, and using various programming languages to customise and integrate different systems and software. Programming languages include PHP, Web Application Development, HTML, CSS, JavaScript, ASP.NET, XML, VB, .NET Framework and more.
There is also a section for free PHP scripts provided by Web-Resource.org and from around the web. The PHP scripts provided by Web-Resource.org include a free Php RSS 2 Writer script, which can be used to generate an RSS feed from website or database content, so you can create your own RSS feeds for your site with PHP.
Popular Web Development & Programming Forums, Blogs and Communities are also listed under the Web Development & Programming Resources section of Web-Resource.org.




Search Engine Optimisation (SEO) Resources

Increase traffic to your site by optimising for Search Engines. Web-Resource.org provides a range of tips, tools and resources for getting started with optimising websites for Search Engines. Boost traffic to your website using various link building techniques and tools, which are sometimes known as offsite optimisation. Information about onsite optimisation tools and techniques for web & search engine marketing is also provided, including SEO & Search Marketing Tips to help improve your site ranking in Search Engines.

Resources include an SEO Forums and Communities section, which provides a list of some of the popular networking sites and tools to help with learning and applying Search Engine Optimisation techniques to a website.

Other sites providing SEO Tools & Resources:
Free SEO - Bulk Ping
Web Marketing Tips
SEO Tips for Blogger