SEO Tip – Hyperlinks

Hyperlinks are the nervous system of a crawler. Crawlers follow these links to determine which pages of your site should be crawled. If your hyperlinks are broken or unusable, this will prevent your pages from being crawled.

  • Keep the links on a given page to a reasonable number (fewer than 100).
  • JavaScript for navigation is bad, as I described in the JavaScript section. Dynamic links created by JavaScript cannot be followed by a crawler.
  • Image maps for navigation are also bad. Since the navigation is based on an image, crawlers may not be able to process the links properly, and will not follow them to the pages they point to.
  • Broken links are obviously bad as well. If a link is broken, that page will not be crawled or indexed, and will not be found by your search users. Check your site with a link checker to prevent this from happening.
  • The search engines basically figure that whatever you are linking to from your page is likely to be closely related to the content of your page. For that reason, some of the engines actually look for keywords in the hyperlinks and in any text immediately surrounding them. What this means to you is that, where you can, you should include your most important keyword phrases in the link itself and possibly in the surrounding text.
  • The text of your link should be natural content text, not “click here.” Try to incorporate your keywords in your hyperlinks, but do it without being artificial.
  • Changing the style of your links with a cascading style sheet will not affect crawling or Page Rank; however, blue underlined text is a usability convention, and abandoning it may confuse your users.
  • Just as the robots tags can prevent links from being followed, each individual link can do the same, using the rel="nofollow" attribute on the anchor tag (see the example after this list).
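
As a minimal illustration (the URLs and keyword phrase below are made up), a descriptive, keyword-bearing link and a link marked nofollow might look like this:

    <!-- descriptive anchor text carrying the page's keyword phrase -->
    <a href="http://www.example.com/organic-dog-food/">organic dog food reviews</a>

    <!-- a link the crawlers are asked not to follow -->
    <a href="http://www.example.com/login/" rel="nofollow">Log in</a>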

SEO Tip – Querystrings

The treatment of querystrings is a controversial topic amongst SEO experts. This should add a bit of insight into how querystrings really affect SEO.

  • If Google and other search engines couldn’t traverse dynamic sites, then huge swaths of the Internet such as online databases, blogs, threaded discussion forums, and e-commerce sites (to name a few) would go unlisted.
  • Querystrings were initially frowned upon by search engines because a querystring can define an effectively infinite number of pages and choke the crawler. Google has never had this problem, and querystrings, if managed correctly, can be used successfully on any site.
  • Google’s Webmaster Guidelines say not to use “&id=” as a parameter in your URLs, because Google doesn’t include those pages in its index.
  • According to Matt Cutts, the number of querystring parameters should be limited to one or two (see the example below).
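
As a rough illustration (these URLs are hypothetical), a crawler-friendly dynamic URL keeps the parameter count low and avoids the “&id=” parameter:

    Friendlier:  http://www.example.com/products.aspx?category=shoes
    Riskier:     http://www.example.com/products.aspx?cat=12&color=3&size=9&sort=1&id=4567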

SEO Tip – File Names, Directory Names, and Directory Depth

The physical directory structure and the names of your files can have an impact on your Page Rank and keyword rankings.

  • Whether or not it helps to put your keywords in your URLs is a debatable point. If you’re building a logically organized, SEO-oriented site, each page is going to be associated with a main key phrase describing what that page is about. It makes sense to carry that phrase over to the URL, since it makes it much easier to mentally associate the content with the URL and to find whatever page you need to deal with.
  • Google wants to follow natural, human patterns when showing us results, so it is only logical to make sure that your file names and directories make sense as a way of organizing the material in your site.
  • Conversely, artificially stuffing your file names with keywords is not a good idea (it isn’t natural) and might harm your rankings.
  • Always use hyphens and not underscores in your URLs, including file names and directories. The reason is that Google disregards underscores (_) but interprets hyphens (-) as a space (see the example after this list).
  • Depth of Directories – very deep pages tend to rank less well, or get less traffic. This is generally because:
    • All other things being equal, a web page in the root of your web site is seen as more important than a web page nested one or more folders deep.
    • Deep pages are more specific and therefore tend to capture only very specific subsets of larger searches.
    • Deep pages tend to receive fewer links, less link popularity, and less Page Rank (in Google’s case).
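
As an illustration (the site and keyword phrase are hypothetical), a shallow, hyphenated URL is preferable to a deep, underscored one:

    Preferred:  http://www.example.com/dog-food/organic-dog-food.html
    Avoid:      http://www.example.com/products/pets/food/dog/organic/organic_dog_food.html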

SEO Tip – Well Formed HTML

  • Well-formed HTML is easier for the crawlers to read. They are computer programs, after all, and expect HTML to be in a specific format. If the crawlers see no errors, your site is indexed more accurately.
  • Well-formed HTML is HTML that conforms to a specific HTML standard.
  • Your code should be W3C compliant – go to http://validator.w3.org and verify that your HTML code is valid.
  • The most common standards for HTML are HTML 4.01 Transitional and XHTML 1.0 Transitional (a minimal example follows this list).
  • Verbose, duplicative, bloated, or error-prone HTML makes it more difficult to get to the content and decreases the content-to-code ratio.
  • Use a text browser such as Lynx to examine your site. Most search engine spiders see your site much as Lynx would. If fancy features such as JavaScript, cookies, session IDs, frames, DHTML, or Flash keep you from seeing the entire page content in a text browser, then search engine spiders may have trouble indexing your site.
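
As a minimal sketch of a page written to one of these standards (the title and content are placeholders), an XHTML 1.0 Transitional skeleton that should pass the W3C validator looks like this:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
      <head>
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
        <title>Organic Dog Food Reviews</title>
      </head>
      <body>
        <h1>Organic Dog Food Reviews</h1>
        <p>Well-formed, validated markup is easier for crawlers to parse.</p>
      </body>
    </html>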

SEO Tip – Cascading Style Sheets

Cascading Style Sheets, or CSS for short, are extremely powerful development tools. They allow you to define global styles for any page element. These styles are reusable, making pages more consistent looking and easier to maintain. CSS lets developers manage not only font, color, and size, but also position and visibility.

Here are some great SEO hints about CSS:

  • Storing CSS in separate files keeps content closer to the top of the page
  • Development techniques, like a table-less layout, that leverage CSS and put all of the CSS into an external file have many benefits (a minimal sketch follows this list):
    • Pages load faster, since the browser does not need to render nested tables
    • The content-to-code ratio increases
    • The content sits closer to the top of the page
    • The content is grouped closer together
    • And thus the content is easier for a spider to index
    • Code complexity drops, decreasing the chance of improperly formed HTML and other page errors
  • CSS also gives the designers of the page the ability to reorder content and list content blocks in order of importance. For example, if the header should appear first on the page to the user but is the least relevant for search engines, it can appear lower within the code, with more SEO-relevant content (such as H1 tags) closer to the top.
  • There are lots of other benefits to using CSS that are not as much related to SEO as they are good design practices:
    • The site becomes easier to maintain, as the decoration is independent of the actual content copy
    • Better browser caching and lower server load
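
A minimal sketch of this approach, assuming a hypothetical external file named styles.css:

    <!-- in the page head: decoration moves out of the markup -->
    <link rel="stylesheet" type="text/css" href="/css/styles.css" />

    /* in styles.css: a simple table-less, two-column layout */
    #content { float: left;  width: 75%; }
    #sidebar { float: right; width: 25%; }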

SEO Tip – JavaScript

JavaScript is a very powerful tool for designers and developers alike. It offers a lot of functionality and lets you do a lot of great things in the browser on the client side. But if it is not used carefully and consciously, you can end up sacrificing your site’s search engine optimization.

  • Storing JavaScript in separate files keeps content closer to the top of the page
  • JavaScript-based links or navigation schemes cannot be followed by crawlers. Be sure all links are standard HREFs in anchor tags.
  • If JavaScript is used to show or hide different sections of content, there is a good chance that the crawlers may not be able to see it and index it properly
  • If JavaScript navigation is absolutely required, noscript tags should be used to provide content both for users without JavaScript capability and for the search engine crawlers. A good idea is to put a link to a static sitemap page inside the noscript tag, ensuring that your site will be fully indexed (see the sketch after this list). Content within this tag should be used very sparingly, as it has been abused in the past, and overuse could hurt your page rank instead of helping it.
  • There are lots of other benefits to keeping JavaScript in separate files that are not as much related to SEO as they are good design practices:
    • The site becomes easier to maintain, as the behavior is independent of the actual content copy
    • Better browser caching and lower server load
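
As a sketch (the URLs and the openProducts function are hypothetical), a crawlable link that is enhanced with JavaScript, plus a noscript fallback pointing at a static sitemap, might look like this:

    <!-- standard href a crawler can follow; JavaScript only enhances it -->
    <a href="/products/" onclick="openProducts(); return false;">Products</a>

    <!-- fallback for crawlers and users without JavaScript -->
    <noscript>
      <a href="/sitemap.html">Browse the site map</a>
    </noscript>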

New Book – Peopleware : Productive Projects and Teams, 2nd Ed.

Based on the review from amazon.com, it looks like it would be a great read for an airplane ride:

Peopleware asserts that most software development projects fail because of failures within the team running them. This strikingly clear, direct book is written for software development-team leaders and managers, but it’s filled with enough commonsense wisdom to appeal to anyone working in technology. Authors Tom DeMarco and Timothy Lister include plenty of illustrative, often amusing anecdotes; their writing is light, conversational, and filled with equal portions of humor and wisdom, and there is a refreshing absence of “new age” terms and multistep programs. The advice is presented straightforwardly and ranges from simple issues of prioritization to complex ways of engendering harmony and productivity in your team. Peopleware is a short read that delivers more than many books on the subject twice its size.

Link to Amazon – Peopleware : Productive Projects and Teams, 2nd Ed.

Great articles – In-House SEO for Large Companies

This is a great two part series called Laying the Foundation for In-House SEO Success in Large Organizations. The simple answer is to create a matrix team across all the organizations involved. The complex answer is in these two articles.

Laying the Foundation for In-House SEO Success in Large Organizations: Part I

Laying the Foundation for In-House SEO Success in Large Organizations: Part II

Mix 07 Session 9 & Session 10

Session 9 was a WPF fundamentals session. Not very much to say about this. Good session; it covered Blend, XAML, and the API. I have a few ideas of some fun apps to try this out on…

Session 10 was a preview of a new Commerce Foundation that is being developed. Microsoft is looking to develop a platform based approach to e-commerce sites, similar to storefronts offered by Amazon and Yahoo. It is obviously based on WPF and XAML for quick and easy customization. Looks really great, but it is too early to tell, and it will probably not be as industrial strength and as customizable as we will need.