Robots.txt – Stop Search Engines Like Google From Listing Some Pages of Your Site

What is the Robots.txt file?

A robots.txt file is not unique to Accrisoft Freedom websites. Any website, on any platform, can have one.

The robots.txt file acts like a gatekeeper for search engine bots, also called crawlers or spiders. It is placed in the root directory of a website and tells these bots which parts of the site they may or may not crawl. This is useful for site administrators who want to control which sections of their website appear in search engine results, or who want to reduce the load that crawlers place on their servers.

When a compliant search engine crawler visits a website, it first checks the robots.txt file for permissions. The file can specify which areas of the site may be crawled, such as public blog posts or product pages, and which should be skipped, like admin pages or private directories. This steers crawlers toward the content the site owners want to be publicly searchable, which makes crawling more efficient and search results more relevant.
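Because the file always sits in the root directory, a crawler can work out where to look from any page address. The following is a minimal Python sketch of that lookup, not how any particular search engine implements it; the function name robots_txt_url is hypothetical and the domain is the placeholder used in this article.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, swap in /robots.txt, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://www.yourdomain.com/membership/"))
# prints: https://www.yourdomain.com/robots.txt
```

Every page on the same host maps to the same robots.txt address, which is why one file can cover the whole site.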

However, the directives in a robots.txt file are not enforceable. Compliant bots, like those of the major search engines, respect the rules in the file, but the file provides no security against non-compliant bots or those with malicious intent. Blocking crawling also does not guarantee that a page stays out of search results: a disallowed URL can still be listed if other sites link to it. Sensitive or confidential information should therefore never rely on a robots.txt file for protection, as it is a guideline rather than a secure barrier.

How to edit Robots.txt in Accrisoft Freedom

In the Freedom Tools Module, click on the Search Optimization tab.

Then click on Get Started next to Robots and Spiders.

This is where you can edit your site's robots.txt file.

For example, if you wanted to ask Google and other search engines not to crawl two of your web pages:

  • https://www.yourdomain.com/membership/
  • https://www.yourdomain.com/main/terms-and-conditions/

Your robots.txt file might look something like this:

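# ROBOTS.TXT

# These rules apply to all crawlers
User-agent: *
# Do not crawl these two paths or anything beneath them
Disallow: /membership/
Disallow: /main/terms-and-conditions/
# Wait 5 seconds between requests (not every crawler honors this; Google ignores it)
Crawl-delay: 5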
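
Before saving rules like these, it can help to check how a compliant crawler would read them. The sketch below uses Python's standard urllib.robotparser module to test the example rules above. The bot name ExampleBot and the helper is_allowed are made up for illustration, and real crawlers may interpret some directives differently.

```python
from urllib.robotparser import RobotFileParser

# The example rules from this article, pasted in as text.
RULES = """\
User-agent: *
Disallow: /membership/
Disallow: /main/terms-and-conditions/
Crawl-delay: 5
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def is_allowed(url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if the rules let user_agent crawl url."""
    return parser.can_fetch(user_agent, url)


for url in (
    "https://www.yourdomain.com/",
    "https://www.yourdomain.com/membership/",
    "https://www.yourdomain.com/main/terms-and-conditions/",
):
    print(url, "allowed" if is_allowed(url) else "blocked")

# Delay in seconds that the rules request from this bot.
print("Crawl delay:", parser.crawl_delay("ExampleBot"))
```

With these rules the home page is reported as allowed, the two listed paths as blocked, and the requested delay as 5 seconds.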

For more information on what you can do with a robots.txt file, see this article: https://moz.com/learn/seo/robotstxt
