Generating robots.txt dynamically

I have a subfolder WP installation that serves multiple domains, all pointing to the same folder. The requested domain is read in wp-config.php (from the $_SERVER variable) and used to define WP_SITEURL, WP_HOME and DOMAIN_CURRENT_SITE, so the same site can be opened from both domain.com and domain.co.uk. I need to add a link to the XML sitemap to robots.txt and, obviously, it should differ depending on the requested domain.

There is the native WP function do_robots(), which generates robots.txt for multisite and allows changing it dynamically through its robots_txt filter from the theme’s functions.php file or from a plugin. However, this does not seem to be the case for a single-site installation.

I can call do_robots() from the theme functions to generate the content and write it to the robots.txt file, but I am not sure where I should hook it.

The question is: how can I have robots.txt generated dynamically, or change its content with hooks from the theme’s functions.php?

1 Answer

I just tested the ‘robots_txt’ filter on a single-site installation to modify the output of the virtual /robots.txt that WordPress serves, and it worked fine for me:

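// Filter the virtual robots.txt output; the callback receives 2 arguments.
add_filter('robots_txt', 'wpse_248124_robots_txt', 10, 2);

function wpse_248124_robots_txt($output, $public) {
  // $output is the default content; $public is the site's visibility setting.
  return 'YOUR DESIRED OUTPUT';
}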
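For the per-domain sitemap link in the question, a minimal sketch could append a Sitemap line built from home_url(), which reflects the WP_HOME value you already define per requested domain. The helper name wpse_248124_append_sitemap and the path /sitemap.xml are assumptions for illustration; use whatever path your sitemap plugin actually generates.

```php
// Hypothetical helper: appends a Sitemap line for the given base URL.
// ASSUMPTION: the sitemap lives at /sitemap.xml; adjust to your setup.
function wpse_248124_append_sitemap($output, $base_url) {
  $sitemap_url = rtrim($base_url, '/') . '/sitemap.xml';

  // Keep the existing rules and add the Sitemap directive on its own line.
  return rtrim($output) . "\n\nSitemap: " . $sitemap_url . "\n";
}

// Register the filter only when WordPress is loaded.
if (function_exists('add_filter')) {
  add_filter('robots_txt', function ($output, $public) {
    // home_url() follows WP_HOME, so it changes with the requested domain.
    return wpse_248124_append_sitemap($output, home_url());
  }, 10, 2);
}
```

Keeping the string building in a separate function means it can be checked without WordPress, e.g. wpse_248124_append_sitemap("User-agent: *\n", 'https://domain.co.uk') ends with a line reading Sitemap: https://domain.co.uk/sitemap.xml.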

What actually happens when you request /robots.txt? Do you get the default robots.txt content or a 404? If you’re getting a 404, you might have Apache or Nginx rules that prevent the /robots.txt request from reaching PHP. It is very common to have something like this in an nginx configuration:

# Don't log access to /robots.txt
location = /robots.txt {
    access_log    off;
    log_not_found off;
}

You should replace it with something like:

# Don't log access to /robots.txt
location = /robots.txt {
    try_files     $uri $uri/ /index.php?$args;
    access_log    off;
    log_not_found off;
}

You should also check that WordPress’s own internal rewrite rules are working correctly, using the Rewrite Rules Inspector plugin (or any other method available), by making sure the following rewrite rule exists:

robots\.txt$ index.php?robots=1
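If you prefer not to install a plugin, one possible way to check is to look at the stored rewrite_rules option yourself. This is only a sketch: the helper name wpse_248124_has_robots_rule is hypothetical, and it assumes the rule is stored under the exact key shown above. The option can be empty, which the helper treats as the rule being missing.

```php
// Hypothetical helper: returns true if the robots.txt rule is present
// in an array of rewrite rules (regex => query string).
function wpse_248124_has_robots_rule($rules) {
  if (!is_array($rules)) {
    return false; // option missing or empty
  }

  $key = 'robots\.txt$';

  return isset($rules[$key]) && strpos($rules[$key], 'robots=1') !== false;
}

// Inside WordPress, log a message when the rule is not found.
if (function_exists('add_action')) {
  add_action('init', function () {
    if (!wpse_248124_has_robots_rule(get_option('rewrite_rules'))) {
      error_log('robots.txt rewrite rule is missing');
    }
  });
}
```

Remove the snippet once you have the answer, since it runs on every request.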

If it doesn’t, try deactivating plugins, activating a default theme and flushing the rewrite rules to see whether the rule comes back. If you have no time for that, just add this rewrite rule to your .htaccess (above the standard WordPress rules, so it is evaluated first):

RewriteRule robots\.txt$ index.php?robots=1
