- Robots.txt files tell search engine crawlers what should and should not be crawled
- NOTE: This is very different from the robots meta tag, which sits in a page's <head>. The crawler reads robots.txt before it requests any page, so its rules take effect first and effectively override the robots meta tags on the pages.
- The robots.txt file must be stored in the root directory of the site (e.g. https://www.example.com/robots.txt), because crawlers only look for it there
- Remember, the point of the robots.txt file is to exclude pages from being crawled. If a page or directory is disallowed, the crawler never requests those pages, so it never sees their code, and nothing in that code can change the bot's behavior or get the pages re-indexed. That is why robots.txt takes precedence over the meta robots tags on the page.
- More information regarding robots.txt can be found at http://www.robotstxt.org
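- To illustrate the point above, here is a minimal Python sketch (using the standard library's urllib.robotparser; the site www.example.com and the user-agent name "mybot" are placeholders) of how a well-behaved crawler consults robots.txt before requesting any page:

import urllib.robotparser

rp = urllib.robotparser.RobotFileParser()
rp.set_url("https://www.example.com/robots.txt")
rp.read()  # fetch and parse the site's robots.txt before anything else

for url in ("https://www.example.com/",
            "https://www.example.com/seo/page.html"):
    if rp.can_fetch("mybot", url):
        # Only now would the crawler request the page itself, which is
        # the first time it could see any meta robots tag in the HTML.
        print("crawl:", url)
    else:
        # A disallowed page is never requested, so its meta tags are never read.
        print("skip (disallowed by robots.txt):", url)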
- Sample robots.txt file to allow all pages to be crawled:
User-agent: *
Disallow:
- With one minor adjustment, you can block all robots from crawling your entire site:
User-agent: *
Disallow: /
- Here is a sample that blocks the Googlebot crawler from crawling a specific directory:
User-agent: googlebot
Disallow: /seo/
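- As a quick sanity check, here is a sketch using Python's standard urllib.robotparser to test these rules (the www.example.com URLs and the bingbot comparison are just illustrative):

import urllib.robotparser

sample = """\
User-agent: googlebot
Disallow: /seo/
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(sample.splitlines())  # parse the rules directly, no fetching needed

# Googlebot is blocked from the /seo/ directory but not from the rest of the site.
print(rp.can_fetch("googlebot", "https://www.example.com/seo/page.html"))   # False
print(rp.can_fetch("googlebot", "https://www.example.com/blog/page.html"))  # True
# Other crawlers are unaffected, since the record only names googlebot.
print(rp.can_fetch("bingbot", "https://www.example.com/seo/page.html"))     # True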