Thursday, May 31, 2012

Create a robots.txt file

The robots.txt file tells search engine spiders which pages of your site may be crawled and indexed. Most websites contain files and folders that are of no interest to search engines (such as images or admin files), so a robots.txt file can actually improve how your site gets crawled.

Robots.txt is a plain text file that you can create in Notepad. If you run WordPress, a typical robots.txt file looks like this:

         User-agent: *
         Disallow: /wp-
         Disallow: /feed/
         Disallow: /trackback/

"User-agent: *"? means that all search engines (Google, Yahoo, MSN, etc.) You should use this guide to crawl a website. If the site is complex, and no need to specify other instructions to the spider.

"Disallow: / wp-" to ensure that search engines do not index the files in WordPress. This line includes all the files and Folder starting with "wp-"? Indexing and avoid duplicate content and admin files.

If you are not on WordPress, simply replace the WordPress lines with whatever files or folders should not be indexed, for example:

         User-agent: *
         Disallow: /images/
         Disallow: /cgi-bin/
         Disallow: /another-folder/

Once you have created your robots.txt file, just upload it to your site's root directory, so that it is reachable at yourdomain.com/robots.txt, and you're done!
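As a finishing touch, many webmasters also point spiders to their XML sitemap from robots.txt; a minimal sketch, assuming your sitemap lives at the hypothetical URL below:

         Sitemap: http://www.example.com/sitemap.xml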
