Category Archives: SEO

Dynamic sitemap.xml Files in ASP.Net

I know this is not a new topic. It is not even a new topic for me. I have posted on defining what a sitemap.xml file is for, and on dynamic sitemap.xml files in C#. But my team is finally ready to start implementing this as part of our custom development platform for the external brand sites.

When you search Google for dynamic sitemap.xml generators, you get a plethora of results back. Some are code, some are online tools. Since we are looking to create our file dynamically, from within the site, on demand, that helps narrow down the search. I have found a small number of code sources we can use to start with.

There is still the HTTP Handler from my original post. This project, ASP.Net Google Sitemap Provider by Bruce Chapman, is available on CodeProject. You can also read about it in a blog post on his iFinity site. It still looks like the most flexible solution.

There is a great looking solution on the ASP.Net site by Bertrand Le Roy called Google Sitemaps for ASP.NET 2.0. It has been ported over into the ASP.Net Futures July 2007 package. This solution is an HTTP Handler that uses the Web.sitemap file to generate a sitemap.xml file on the fly.
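
To make the handler-based approach concrete, here is a minimal sketch of what such an HTTP Handler might look like: it walks the default Web.sitemap provider and writes a sitemap.xml document on the fly. This is my own illustration, not code from either project; the class name, the hard-coded changefreq value, and the web.config registration mentioned in the comments are all placeholders.

```csharp
using System;
using System.Web;
using System.Xml;

// Minimal sketch of a sitemap.xml HTTP handler. It walks the site's
// Web.sitemap (via the default SiteMap provider) and renders the nodes
// as a sitemaps.org urlset. Register it in web.config against a path
// such as "sitemap.xml" to serve the file on demand.
public class SitemapXmlHandler : IHttpHandler
{
    private const string SitemapNamespace = "http://www.sitemaps.org/schemas/sitemap/0.9";

    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        context.Response.ContentType = "text/xml";

        using (XmlTextWriter writer = new XmlTextWriter(context.Response.Output))
        {
            writer.Formatting = Formatting.Indented;
            writer.WriteStartDocument();
            writer.WriteStartElement("urlset", SitemapNamespace);

            // Walk the Web.sitemap tree, starting at the root node.
            WriteNode(writer, SiteMap.RootNode, context);

            writer.WriteEndElement();
            writer.WriteEndDocument();
        }
    }

    private static void WriteNode(XmlWriter writer, SiteMapNode node, HttpContext context)
    {
        if (!string.IsNullOrEmpty(node.Url))
        {
            // Resolve app-relative URLs ("~/...") and make them absolute.
            string path = node.Url.StartsWith("~") ? VirtualPathUtility.ToAbsolute(node.Url) : node.Url;
            string absoluteUrl = new Uri(context.Request.Url, path).AbsoluteUri;

            writer.WriteStartElement("url", SitemapNamespace);
            writer.WriteElementString("loc", SitemapNamespace, absoluteUrl);
            writer.WriteElementString("changefreq", SitemapNamespace, "weekly"); // illustrative default
            writer.WriteEndElement();
        }

        foreach (SiteMapNode child in node.ChildNodes)
        {
            WriteNode(writer, child, context);
        }
    }
}
```

Both Bruce Chapman's provider-based project and the ASP.Net Futures handler are more flexible than this, but the shape is the same: a handler intercepts the sitemap.xml request and builds the XML from whatever data source you point it at.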

Another interesting idea I found in my searches was some code that shows a site map when a user gets a 404 error. This solution is also implemented as an HTTP Handler, but is only for 404 Page Not Found server errors. This code is also available on CodeProject in an article called Generate a Google Site Map Using the HTTP 404 Handler.

There are several other sites of note with similar solutions to the ones above, and it is always a good idea to see what other people have come up with.

If anyone has any additional resources, ideas, or suggestions, please leave me a comment and let me know what you think.

Netscape is Dead, Long Live Netscape!

Well, it is official.  The once-popular browser, from Mosaic through Netscape Navigator and all of its Mozilla variants, fought in the Browser Wars from 1994 through 2008, and is now throwing in the towel.  My once-favorite browser has finally fallen under the weight of Internet Explorer (and Firefox, too, I suppose). 

AOL announced on December 28, 2007 that as of February 1, 2008 they will no longer be providing Netscape Navigator.  Oh, how the mighty have fallen.  I found out about this on Engadget – Netscape finally bows out, browsers no longer supported, but you can read about it on the Netscape Blog – End of Support for Netscape web browsers.  The Browser Wars will continue, but without one of their original participants.

20 Bad Ideas – Black Hat SEO Practices

I have spent a lot of time outlining the right things to do for SEO – things that are typically called White Hat SEO. Some of the practices I have written about would even lean a bit towards the Gray Hat arena. I have even discussed the 3 different hats of a search engine optimizer. But I have never outlined Black Hat SEO Practices. These are content practices, techniques, or methodologies that are sure to get your blog or web site banned from one, or all, of the major search engines. I list these here to help draw the line between what is acceptable to the search engines and what is not. I do not list these techniques to advocate Black Hat practices. Use These Techniques At Your Own Risk!

1. Astroturfing

This is when a false public relations campaign or fake social media buzz in the blogosphere generates increased attention for a site, blog, or wiki.
* Livingston Buzz – Astroturfing on the Dark Side of the Moon

2. Buying Expired Domains

Domains that have expired can still carry a high PageRank. By purchasing such a domain, throwing up repetitive content, and linking to your other web sites and domains, you can use link juice to distribute that PageRank to those other sites.

3. Cloaking

Cloaking is when a site is designed to show one set of content to your users, while showing a completely different set of content to crawlers, robots, and spiders. This is considered misrepresenting your content.

4. Comment Spamming

This method is implemented by leaving comments on sites with high PageRank. These comments can be in the form of blog comments, guestbook entries, forum submissions, wiki pages, etc. The comments are filled with high-density keywords and have links back to the spamming site.

5. Doorway Pages

A doorway page is a “fake” page that the user will never see. It exists purely for search engine spiders and attempts to trick them into ranking the site higher. This method depends on user-agent sniffing.

6. Fake CEO / Celebrity Avatars

This is when a blogger or forum user registers as if they were a person of significance, i.e. a CEO or celebrity. These fake accounts leave damaging messages that can sway users in a specific direction about a product or service. This can also swing the other way: a celebrity or high-level executive can act as an anonymous user to leave disparaging remarks about another person, company, or product, drive traffic to their own site, and ultimately increase sales.
* CopyWrite, Ink – Silencing Crisis: Whole Foods Market, Inc.

7. Google Bombing

This is accomplished by creating links on multiple sites linking to the same page with the same text. The text link may not necessarily be relevant to the linked site, thus creating the Google Bomb. The most common Google Bomb can be seen by searching “miserable failure” and seeing sites for George Bush appear at the top of the results page.

8. Google Bowling

Google is penalizing (or even banning) sites that purchase site-wide links. A site-wide link is a link that appears on every page of a site. Google Bowling is buying site-wide links that point to a competitor's site in order to get that competitor penalized or banned.
* Web Pro News – Google Bowling: How Competitors Can Sabotage You; What Google Should Do About It

9. Invisible Text or Hidden Text

This Black Hat method manifests itself in many forms. One method is to put lists of keywords in white text on a white background in hopes of attracting more search engine spiders. Another is to embed and overload keywords in places the user never sees but crawlers still read – alt attributes, comments, JavaScript tags, noframe tags, hidden div tags, etc. – which will get you banned as well.

10. Interlinking

This is when multiple web sites are built by the same person or company, with similar content and with links pointing back and forth between them, in an attempt to increase each other's PageRank.

11. Keyword Stuffing

Filling your page with long lists of keywords in an attempt to rank higher for those words. You would not view this as high-quality content, and neither will Google. This method is typically accompanied by the Hidden Text and Redirecting black hat methods.

12. Link Farming

Another name for a link farm is a free-for-all site. The objective of these sites is strictly to generate inbound links to your site at any cost. This will typically work in the short term, but hurt your site (or get it banned) in the long term. These kinds of sites are also known as mutual admiration societies.

13. Redirecting

Redirects are commonly used along with doorway pages or spam pages filled with advertising. They are designed to take users to a page that they did not want to go to. These can be either server-side or client-side redirects. Vicious redirect pages often trap the user in an infinite loop that is difficult to break out of.

14. Scraper Sites

Also known as Made-for-AdSense Sites, these pages are similar to spam pages, except that they are designed to scrape search engine results and dynamically “create” content pages. These are also used in conjunction with doorway pages.

15. Selling PageRank

Sites can explicitly sell “advertising” (read: inbound links) to your site. This essentially passes some of the selling site's PageRank on to the newly linked site, improving its position in search engine results pages. This has been in the news a lot lately. Google has dropped the PageRank of anyone caught doing this; both the buyer and the seller of the link are penalized.

16. Shill Blogs, Spam Blogs, or Splogs

A shill blog or spam blog is when a person is paid to blog as a fan of those who hired them. Generating a steady source of positive feedback and link sharing will increase inbound traffic and PageRank. These methods are similar in effect to a link farm.
* Business Week – Wal-Mart’s Jim and Laura: The Real Story

17. Spam Pages

Spam Pages are web pages that rank well for specific keywords, but actually hold no content. They are typically full of advertisements, listings to other sites, or are part of a pay-per-click scam.

18. Sybil Attacks

This is when a single user creates multiple identities to generate additional traffic. It could take the form of multiple web sites with similar, if not identical, content, or of multiple social bookmarking accounts, multiple comments, etc.

19. Wiki Spam

Wikis, just like blogs, are intended to be an easy way for non-developers (read: anyone) to create and organize content. But the distributed and open editability of wikis makes them susceptible to spamming. By placing links in wikis back to the spam site, you hijack the link juice of the wiki, pass its PageRank on, and increase how often the spam site shows up in results. The subject of the wiki page is typically irrelevant. This is why large wikis like Wikipedia have added the nofollow attribute to their external links.

20. Resources

Are there any other Black Hat SEO techniques that you know of? Any other Black Hat resources that you know of? What do you think of Black Hat SEO? Let me know what you think by leaving me a comment.

50 Easy Tips to Keep your Blog Search Engine Optimized

Blogging is one of the easiest ways to get content published to the Internet.  Everyone, from the average Joe to the Corporate Communications Specialist, wants to see their blog and their most recent posts on the Google results page.  But, just like SEO for any other web site, it takes time, effort, and patience.  Here is a collection of tips gathered from around the blogosphere on how to optimize your blog for search engines.

Content

  • Content is always king.  Make sure your content is new, fresh, engaging, and relevant.
  • Update your blog frequently.  The more it is updated, the more your content will be indexed.
  • Stick with your blog – don’t get discouraged!
  • Use an interesting title for your blog and each of your blog posts. 
  • Limit each of your posts to one topic, keeping your pages focused.
  • Keep your posts neither too short nor too long.  This keeps your readers interested and returning.
  • Provide a list of your top 10 blog posts on your site.
  • Make sure your tags, categories, labels, etc. also make good keywords.
  • Use your keywords as often as possible, but only in a natural context.
  • Use a blog service like WordPress, Blogger, etc.  These sites already have high content churn, and attract frequent indexing.
  • Make sure that anonymous users can leave comments.  You will get more feedback that way.

Linking

  • Increase your inbound links from other sites.
  • Link to your own posts that have a similar topic.
  • Outbound links to high quality sites help your page rank.

Markup

  • Make sure your blog’s HTML is W3C Compliant so that search engines can spider your blog easily.
  • Make sure your post titles are live links.
  • If your blog supports it, don’t forget to use meta tags in your blog template.
  • Use your primary keyword in strategic locations:
    • Your blog domain
    • In the title of your posts
    • In the anchor text of links
    • In the alt tags of your images
    • In Header tags – H1, H2, H3, etc.
    • In bold tags

Your RSS Feed

  • Be sure that RSS auto-discovery tags are placed in the header of every page, one for each RSS feed (see the sketch after this list).
  • Your RSS feeds should provide full text for each post.
  • Maximize the number of blog posts provided in your blog feed.  The typical default is 10; 20 or more is better.
  • Provide a feed for every category your blog offers.
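
To illustrate the auto-discovery tip above, here is a small code-behind sketch for an ASP.NET page that injects the link tag into a <head runat="server"> element. The feed path and title are placeholders for your own blog's values.

```csharp
using System;
using System.Web.UI;
using System.Web.UI.HtmlControls;

// Sketch: emit an RSS auto-discovery <link> tag into the page header so
// browsers and crawlers can find the feed automatically. Requires a
// <head runat="server"> in the page or master page.
public partial class BlogPage : Page
{
    protected override void OnLoad(EventArgs e)
    {
        base.OnLoad(e);

        HtmlLink rssLink = new HtmlLink();
        rssLink.Href = ResolveUrl("~/rss.aspx");              // placeholder feed URL
        rssLink.Attributes["rel"] = "alternate";
        rssLink.Attributes["type"] = "application/rss+xml";
        rssLink.Attributes["title"] = "My Blog - All Posts";  // placeholder title
        Page.Header.Controls.Add(rssLink);
    }
}
```

If you expose a feed for every category, add one HtmlLink per feed so each gets its own auto-discovery tag.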

Do you have other tips or resources for bloggers in optimizing their sites for search engines?  Leave me some feedback and let me know.

 

The Truth About High Content to Markup Ratio and SEO

A common SEO tip for web developers is to keep your content-to-markup ratio high.  This is supposed to make the crawling of your site easier, more efficient, and faster.  It is possible that modern crawlers like Google's already ignore the markup, since they behave much like a text-based browser anyway.  However, implementing this as an SEO strategy cannot hurt.  In fact, it has benefits not only for SEO, but for general web development practices as well:

  • Keeping the amount of code on the page low helps overall site maintenance and usability, keeping the focus on your content.
  • Including CSS and JavaScript in external files increases the content-to-code ratio as well as making your site easier to maintain.
  • Having validated HTML code will help the crawlers understand your site, as well as make your code compliant for maximum browser readability and performance.

Generating sitemap.xml files in C#

One of the things that I would like to tackle before the end of the year is including sitemap.xml files with all of the new sites that my team develops. We could generate these files manually, but that would be tedious at best. There are a number of tools that will generate these files for you. Some of them are stand-alone desktop applications (Sitemap XML). Some are web-based tools built in PHP (AutoSitemap), Perl (TM Google Sitemap Generator), Python (Google WebMaster Tools), etc., that you can use in your own local environments. There is even a web site to which you can submit your URL and it will generate the sitemap.xml file for you (XML Sitemaps).

My requirements for this feature are pretty simple:

  • Something built in C#
  • Something we can include in our projects
  • Something that can be run as part of our build process
  • Something that can be completely hands-free

So far, the only thing that I have found is the GoogleSiteMapProvider on the CodeProject web site. This project:

  • [Is] instantly useable with the majority of ASP.NET applications
  • [Is] a full ‘binary’ solution – no integration of code or compiling – just drop in a binary, modify the web.config and go
  • [Is] extendable so that more complicated ASP.NET applications could redefine the provider without restriction

It seems like this is a great fit for our architecture. The solution consists of a single assembly with three main types:

  1. An HTTP Handler which would return the XML on request (called GoogleSiteMapHandler)
  2. A Provider Type (called GoogleSiteMapProvider)
  3. A Controller class to glue the Handler and Provider together

This is a great place for us to start. The source code is available, it seems to fit my needs, and is simple to use.
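
To show the shape of that design (this is my own rough sketch with placeholder names, not the actual CodeProject code), the provider piece boils down to an abstract type that yields URLs and a handler that delegates to whatever provider is configured:

```csharp
using System;
using System.Collections.Generic;
using System.Web;
using System.Xml;

// Rough sketch of the handler-plus-provider shape. All type and member
// names here are placeholders, not the CodeProject API.

// The provider: anything that can enumerate the URLs the site wants indexed.
public abstract class SitemapEntryProvider
{
    public abstract IEnumerable<string> GetUrls(HttpContext context);
}

// A trivial provider; a real one might read Web.sitemap, a database of
// products, or a CMS content tree. Swapping this out is the extension point.
public class StaticListProvider : SitemapEntryProvider
{
    public override IEnumerable<string> GetUrls(HttpContext context)
    {
        yield return new Uri(context.Request.Url, "/").AbsoluteUri;
        yield return new Uri(context.Request.Url, "/about.aspx").AbsoluteUri;   // placeholder page
        yield return new Uri(context.Request.Url, "/contact.aspx").AbsoluteUri; // placeholder page
    }
}

// The handler: serves sitemap.xml by asking the provider for entries.
public class SitemapHandler : IHttpHandler
{
    private const string SitemapNamespace = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // The controller/glue piece would normally resolve this from web.config,
    // which is what makes the "drop in a binary and go" approach possible.
    private readonly SitemapEntryProvider provider = new StaticListProvider();

    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        context.Response.ContentType = "text/xml";

        using (XmlTextWriter writer = new XmlTextWriter(context.Response.Output))
        {
            writer.Formatting = Formatting.Indented;
            writer.WriteStartDocument();
            writer.WriteStartElement("urlset", SitemapNamespace);

            foreach (string url in provider.GetUrls(context))
            {
                writer.WriteStartElement("url", SitemapNamespace);
                writer.WriteElementString("loc", SitemapNamespace, url);
                writer.WriteEndElement();
            }

            writer.WriteEndElement();
            writer.WriteEndDocument();
        }
    }
}
```

Because only the provider needs to change from site to site, the same assembly could be dropped into each of our brand sites with little more than a web.config change.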

Anyone else using something different? Do you have any other ideas? Have you found any other tools that might be useful for this? Leave a comment and let me know.

Windows Live Search Gets an Upgrade

On the Live Search Blog this week, Microsoft’s Search Team announced that it has released an upgrade to its Live Search engine.  The enhancements to the search engine include:

  • Improved Core Relevance – Improved search results for the searches you do day in and day out
  • Reduced Spam – Constantly improving to stay ahead of the curve to filter out sites that use illegitimate or malicious SEO techniques
  • Dramatically Improved “Snippets” – The site summaries have been improved and expanded in the search results page
  • Much Bigger Index – The index has grown to 20 billion pages, 4 times larger than before
  • Smart Logic to Interpret Some Keywords – for example, NW and Northwest are treated the same

Take a look and see if you notice the difference.  Leave feedback about your Live Search experiences here.

TouchGraph Google Browser

Pandia reviewed a new tool called TouchGraph Google Browser.  This tool allows you to visualize the connections between sites.  You can read the initial review on the Pandia web site, and you can check out the TouchGraph Google Browser on the TouchGraph site.  The TouchGraph team is a group of interface designers who are exploring better ways to visualize information.  In addition to the TouchGraph Google Browser, they have also developed TouchGraph Amazon Browser and the TouchGraph Facebook Browser using the same visualization technology. 

This has some really interesting implications for Search Engine Optimization, Web Analytics, and Web Site Development.  A site optimized so that each of its pages appears in a graph like this could have its site map generated dynamically.  And a Web Analytics Dashboard where you could click on each page, or series of pages, and view the data relationships between them would be very powerful. 

Check it out and let me know what you think. 

The 6 Test Styles of Google Website Optimizer

So I attended the Google Website Optimizer webinar this Tuesday afternoon.   I did not know too much about the feature set of this particular tool, so I thought the webinar would be a good way for me to find out more. 

The class was moderated by ROI Revolution.  They are a Google Analytics Authorized Consultant and AdWords Qualified Company, and offer webinars and training classes for Google products.  You can find more information about them on their web site. 

Essentially, Google Website Optimizer is a tool designed to track the results of content changes to your web site before you commit to them.  It works in a similar way to Google Analytics – you tag your pages, your content blocks, your action items, and your goal pages.  Google Website Optimizer will then randomize your content or your page to test it however you choose. 

There are 6 different types of tests that you can use:

  1. A/B Testing – this is essentially a test to determine if one page layout is more effective than another (a generic sketch of the idea follows this list)
  2. Multivariate – this tests if different content blocks (copy blocks, headers, images, etc.) are more effective than others
  3. Split Path – this will test if content changes will affect the navigation through your site
  4. Multipage Multivariate – this test will measure if content changes on one page will affect navigation on other pages, and if there are any other cross-page interactions that change
  5. Linger – this test is good for sites that have no clear conversion, and will measure time on the page instead of number of conversions
  6. Anything – an open ended type of test, particularly if your site has multiple conversion points
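
As a generic illustration of the first test style (this is not Google Website Optimizer's tagging script, just a sketch of the underlying idea with made-up numbers), an A/B test deterministically assigns each visitor to one of two variants and compares conversion counts:

```csharp
using System;

// Generic sketch of the A/B testing idea: assign each visitor to one of two
// page variants based on a stable visitor id (e.g. a cookie value), then
// compare conversion counts per variant. The conversion rates below are
// invented purely for the demo.
public static class AbTestDemo
{
    public static string AssignVariant(string visitorId)
    {
        // Stable assignment: the same visitor always sees the same variant.
        int bucket = (visitorId.GetHashCode() & 0x7FFFFFFF) % 2;
        return bucket == 0 ? "A" : "B";
    }

    public static void Main()
    {
        int[] visits = new int[2];
        int[] conversions = new int[2];
        Random random = new Random(42);

        for (int i = 0; i < 1000; i++)
        {
            string visitorId = Guid.NewGuid().ToString();
            int variant = AssignVariant(visitorId) == "A" ? 0 : 1;
            visits[variant]++;

            // Simulated behavior: variant B converts slightly better.
            double conversionRate = variant == 0 ? 0.05 : 0.08;
            if (random.NextDouble() < conversionRate)
            {
                conversions[variant]++;
            }
        }

        Console.WriteLine("Variant A: {0} conversions out of {1} visits", conversions[0], visits[0]);
        Console.WriteLine("Variant B: {0} conversions out of {1} visits", conversions[1], visits[1]);
    }
}
```

Google Website Optimizer handles the assignment and measurement for you once the pages are tagged; the sketch just shows what is happening conceptually underneath an A/B test.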

There was also a brief demo on how you can intertwine Google Website Optimizer, Google Analytics, and Google AdWords to measure how changes in your page affect your AdWords advertising campaigns.

Did anyone else attend the session?  Has anyone used Google Website Optimizer?  Is this a tool that you would think is useful?

The Truth About Plurals in Keywords

So I was reading through some forum postings on SEO and came across a question on highrankings.com about plurals in keywords.  Everyone agreed in the forum posts that including plurals in your keywords will give more accurate search results. 

They recommend testing it yourself.  If you search for “search engine” and then search for “search engines”, you will get completely different results. 

I poked around some more, and found this great article on searchengineguide.com about plural vs. singular keywords.  Sumantra Roy outlined how each search engine handles the difference between singular and plural keywords.  And all twelve search engines reviewed have different results for singular and plural keyword searches. 

So there you have it.  When building a list of keywords, include singular and plural versions of keywords.