Modifying robots.txt for individual sites of a multisite install

WordPress generates its robots.txt dynamically. To override it on a regular, non-multisite installation, you can simply upload a static robots.txt to the server. On a multisite install, a static file would replace the robots.txt of every site, which is not always what you want. This post explains how to modify robots.txt for individual sites of a multisite.

WordPress provides the robots_txt filter, which lets you modify the output of the dynamically generated robots.txt. The function get_current_blog_id() returns the ID of the current site in a multisite, so we can check for a particular site and add rules only to its robots.txt. This is how it currently looks for my site:

/**
 * Modify robots.txt for main site and english site
 *
 * @param string $output The robots.txt content generated so far.
 * @param bool   $public Whether the site is visible to search engines.
 *
 * @return string
 */
function fbn_custom_robots( $output, $public ) {
	$site_id = get_current_blog_id();

	if ( 1 === $site_id ) {
		// Main site (German).
		$output .= "Disallow: /agb-und-widerruf/\n";
		$output .= "Disallow: /mein-konto/\n";
		$output .= "Disallow: /warenkorb/\n";
		$output .= "Disallow: /impressum-und-datenschutz/\n";
	} elseif ( 11 === $site_id ) {
		// English site.
		$output .= "Disallow: /account/\n";
		$output .= "Disallow: /cart/\n";
		$output .= "Disallow: /imprint/\n";
		$output .= "Disallow: /terms/\n";
	}

	return $output;
}

add_filter( 'robots_txt', 'fbn_custom_robots', 20, 2 );

Four Disallow rules are added for the site with the ID 1 (florianbrinkmann.com), and likewise for the site with the ID 11 (my English site, florianbrinkmann.com/en).
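If more sites need their own rules, the if/elseif chain can get long. One possible variation, shown here only as a sketch, keeps the paths in an array keyed by site ID. The function name fbn_sketch_append_rules() and the array layout are made up for this example and are not part of WordPress; the filter is only registered when WordPress is loaded.

```php
<?php
/**
 * Sketch: append Disallow rules for one site from a map of site IDs to paths.
 * The function name and the array layout are hypothetical, not a WordPress API.
 *
 * @param string $output  The robots.txt content generated so far.
 * @param int    $site_id ID of the site whose robots.txt is being served.
 * @param array  $rules   Map of site ID => list of paths to disallow.
 *
 * @return string
 */
function fbn_sketch_append_rules( $output, $site_id, array $rules ) {
	if ( empty( $rules[ $site_id ] ) ) {
		return $output; // No extra rules for this site.
	}

	foreach ( $rules[ $site_id ] as $path ) {
		$output .= "Disallow: {$path}\n";
	}

	return $output;
}

// Only hook into WordPress when it is actually loaded.
if ( function_exists( 'add_filter' ) ) {
	add_filter(
		'robots_txt',
		function ( $output, $public ) {
			$rules = array(
				1  => array( '/agb-und-widerruf/', '/mein-konto/', '/warenkorb/', '/impressum-und-datenschutz/' ),
				11 => array( '/account/', '/cart/', '/imprint/', '/terms/' ),
			);

			return fbn_sketch_append_rules( $output, get_current_blog_id(), $rules );
		},
		20,
		2
	);
}
```

Sites without an entry in the array keep the default output unchanged, so adding a site only means adding one more array entry.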

To find out a site’s ID, I temporarily added the following line after $site_id = get_current_blog_id();:

$output .= $site_id;

This way, the ID of the current site is displayed when you visit its robots.txt.
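Appending the bare number puts a line into robots.txt that is not a valid rule. While debugging, a slightly safer option might be to print the ID as a comment, because lines starting with # are treated as comments in robots.txt. The following is only a sketch; the function name fbn_sketch_site_id_comment() is made up for this example, and the filter should be removed again once you know the ID.

```php
<?php
/**
 * Sketch: append the site ID to robots.txt as a comment line.
 * The function name is hypothetical, not a WordPress API.
 *
 * @param string $output  The robots.txt content generated so far.
 * @param int    $site_id ID of the current site.
 *
 * @return string
 */
function fbn_sketch_site_id_comment( $output, $site_id ) {
	return $output . '# site id: ' . (int) $site_id . "\n";
}

// Only hook into WordPress when it is actually loaded.
if ( function_exists( 'add_filter' ) ) {
	add_filter(
		'robots_txt',
		function ( $output, $public ) {
			return fbn_sketch_site_id_comment( $output, get_current_blog_id() );
		},
		20,
		2
	);
}
```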

11 reactions on »Modifying robots.txt for individual sites of a multisite install«

  1. Hi Florian,

    Thank you for this solution. Can you please tell me what this line does? I'm not an expert in PHP.

    add_filter( 'robots_txt', 'fbn_custom_robots', 20, 2 );

    Thank you!
    Matt

      1. Thanks Florian,

        Where by chance would you add this code in a multisite setup? Is there a centralized file that would handle this for each site?

        1. Hi Matt,

          you’re welcome.

          I use it in a custom plugin that is enabled network-wide. I pasted the code into a (very basic) plugin file that you can use as a starting point: https://gist.github.com/florianbrinkmann/9236134e29c07bb10ae1932d93100984

          You could use it as a Must Use plugin (https://codex.wordpress.org/Must_Use_Plugins). For that, just upload it to the wp-content/mu-plugins folder via FTP (you may need to create the mu-plugins directory).

          Hope that helps,
          Florian

  2. Hi!

    $output .= "Disallow: /account/n";

    '/n' does not work for me when I want a new line. '\n' works:

    $output .= "Disallow: /account\n";

  3. This helped, so I just wanted to share a solution that generates this dynamically for all your websites inside a multisite install 🙂

    function my_robots_txt( $output, $public ) {

        # Check if Woocommerce is active on the website for this request
        if ( ! class_exists( 'woocommerce' ) ) {
            return $output;
        }

        # These are the endpoint names used by Woocommerce, not the real slug of the page
        $endpoints = [ 'cart', 'checkout', 'myaccount' ];

        foreach ( $endpoints as $endpoint ) {

            $woo_page_id = wc_get_page_id( $endpoint );
            # wc_get_page_id() returns -1 if no page is assigned, so skip those
            if ( $woo_page_id < 1 ) {
                continue;
            }

            # You need to get the slug => the slug field is "post_name"
            $slug = get_post_field( 'post_name', $woo_page_id );
            # Skip empty slugs, they would produce the rule "Disallow: //"
            if ( '' === $slug ) {
                continue;
            }

            # Add the rule for this page
            $output .= "Disallow: /{$slug}/\n";
        }

        return $output;
    }
    add_filter( 'robots_txt', 'my_robots_txt', 10, 2 );
