Blocking 'Bots

Until a website is ready for production, you must keep it from being found in search engines. We will do this by blocking 'bots (search engine robots, also called crawlers).

HOW TO BLOCK 'BOTS':

robots.txt is a plain-text file. When placed in the document root directory, it tells search engine robots which parts of the site they may not crawl, and well-behaved robots skip anything the file disallows.
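Robots look for the file at a fixed address directly under the site's hostname, for example (using a placeholder domain):

            https://www.example.com/robots.txt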

STEPS TO MAKE A ROBOTS.TXT FILE:

  1. Open a plain-text editor.
  2. Create a new, empty file.
  3. Type the proper directives from the table below (a complete example follows this list).
  4. Save the file as robots.txt in the document root directory.
  5. In your Administration Panel, turn on the setting that allows the website to be viewed.
  6. Test to see whether robots.txt is blocking robots.
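For a site that is still in development, the finished file is often as short as the following sketch, which asks every robot to stay out of the entire site:

            User-agent: *
            Disallow: /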

HOW TO RUN A ROBOTS.TXT VALIDATOR:

  1. Find a validator that checks whether a robots.txt file conforms to the Robots Exclusion Standard.
  2. Type in the full URL of the site.
  3. Run the validator. The validator will find syntax errors, 404 errors, and misspelled directives, and it will suggest changes.
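As an illustration of what a validator flags, the first directive below is misspelled; the second is the corrected form (lines starting with # are robots.txt comments):

            # Misspelled directive -- a validator would flag this line:
            Dissalow: /cgi-bin/
            # Corrected form:
            Disallow: /cgi-bin/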

Block All Bots

Why: To keep sensitive information, such as a database of credit card numbers, out of search results. To keep expensive research, once it is posted online, from being picked up by search engines.

Use a Wildcard (*)

            User-agent: *
            

Block Pages

Why: To keep robots from accessing duplicate content, i.e., the original page and a printable version of the same page. A robots.txt rule can prevent the print version from being accessed. Note that Disallow: / blocks every page on the site; to block only certain pages, give their path after the slash, as shown in the example below.

Use a Forward Slash (/)

            Disallow: /
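A sketch for the duplicate-content case, assuming the printable versions live in a hypothetical /print/ folder:

            User-agent: *
            Disallow: /print/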
            

Block Folders

Why: To keep robots out of any folder containing important information or functionality, such as an administrative panel.

Use Two Slashes (one on each side of the folder name)

            Disallow: /cgi-bin/
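Putting the pieces together, a sketch of a complete robots.txt that lets robots crawl the public pages but keeps them out of a few folders (only /cgi-bin/ comes from the table above; /admin/ and /print/ are illustrative assumptions):

            User-agent: *
            Disallow: /cgi-bin/
            Disallow: /admin/
            Disallow: /print/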