Generate the robots.txt from Hippo CMS

The robots.txt file is a response from your website that is irrelevant to human visitors but very important to search engine crawlers. That's why we created a Hippo CMS / Hippo Site Toolkit (HST) plugin to manage the robots.txt in the CMS and return the proper output.

The plugin comes with an out-of-the-box document type to manage the parts of the site that search robots are disallowed to crawl. There is usually one configuration for all crawlers, but if you want, you can add separate configurations per crawler.

Screenshot of Robots.txt configuration

In the first screenshot, all crawlers should skip /donotindex/ and /search/, only "Googlebot" should additionally ignore /hide/for/googlebot, and the non-existent "EvilBot" is kindly requested not to index the site at all.

Generating the response is mostly a matter of configuring the HST to handle the request for "robots.txt". The plugin comes with a demo project and documentation on how to configure it for your existing project.
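As a rough sketch of what that configuration involves: in the HST, a URL like "robots.txt" is typically matched by a sitemap item that points at a component rendering the response. The node and property values below are illustrative only; the exact names and paths to use come from the plugin's documentation.

```xml
<!-- Illustrative sketch of an HST sitemap item for "robots.txt"
     (JCR system-view XML). Values are assumptions, not the plugin's
     actual configuration. -->
<sv:node sv:name="robots.txt" xmlns:sv="http://www.jcp.org/jcr/sv/1.0">
  <sv:property sv:name="jcr:primaryType" sv:type="Name">
    <sv:value>hst:sitemapitem</sv:value>
  </sv:property>
  <!-- Component that renders the plain-text robots.txt response -->
  <sv:property sv:name="hst:componentconfigurationid" sv:type="String">
    <sv:value>hst:pages/robotstxt</sv:value>
  </sv:property>
  <!-- Path to the robots.txt configuration document in the CMS -->
  <sv:property sv:name="hst:relativecontentpath" sv:type="String">
    <sv:value>robotstxt</sv:value>
  </sv:property>
</sv:node>
```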

Sample robots.txt output

In the second screenshot you can see the HST's response to the robots.txt request: a plain-text document containing all the fields we configured in the CMS.
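For the configuration described above, the generated plain-text response would look roughly like this (the exact ordering of the records may differ):

```
User-agent: *
Disallow: /donotindex/
Disallow: /search/

User-agent: Googlebot
Disallow: /hide/for/googlebot

User-agent: EvilBot
Disallow: /
```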