How to configure Apache to serve http://svn.example.com/robots.txt?


I access SVN repositories via

  • http://svn.example.com/repo1
  • http://svn.example.com/repo2
  • ...

with the following Apache configuration

LoadModule dav_svn_module     modules/mod_dav_svn.so
LoadModule authz_svn_module   modules/mod_authz_svn.so

<VirtualHost xxx.xxx.xxx.xxx>
    ServerName svn.example.com

    <Location />
        DAV svn
        SVNParentPath /path/to/svn/repositories
        AuthzSVNAccessFile /path/to/svn/conf/auth_policy
        Satisfy Any

        AuthType Basic
        AuthName "Subversion repository"
        AuthUserFile /path/to/svn/conf/passwdfile
        Require valid-user
    </Location>
</VirtualHost>

I would like to prevent web crawlers from indexing the public repositories, but I cannot figure out how to properly set up the configuration to serve robots.txt from http://svn.example.com/robots.txt.

I have found a thread "stopping webcrawlers using robots.txt" from 2006, but it didn't help me solve the problem (Ryan's suggestion for redirection didn't work).

EDIT: I would prefer to keep the repositories at the top level rather than moving them to http://svn.example.com/something/reponame.

There is 1 answer

Answered by David W.

Don't put your Subversion repositories' virtual directory in the root of your server:

Wrong

<Location />
    DAV svn
    SVNParentPath /path/to/svn/repositories
    # remaining directives as in the question
</Location>

Right

<Location /svn>
    DAV svn
    SVNParentPath /path/to/svn/repositories
    # remaining directives as in the question
</Location>

Instead of your repository root being http://svn.example.com, it will be http://svn.example.com/svn, so a repository such as repo1 is reached at http://svn.example.com/svn/repo1. This frees up http://svn.example.com to be a true document root, which means you can add some documentation about your site and place a robots.txt file at http://svn.example.com/robots.txt.

Now a well-behaved robot will read the robots.txt file and, provided the file disallows /svn/, will not index your Subversion repositories.
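Putting the pieces together, the virtual host might look roughly like the sketch below. It is an untested illustration rather than a verified setup: the directory /path/to/svn/htdocs is a hypothetical location for static files, and the Order/Allow lines assume Apache 2.2-style access control (in line with the Satisfy Any directive in the question); on Apache 2.4 without mod_access_compat, Require all granted would be the equivalent.

```apache
<VirtualHost xxx.xxx.xxx.xxx>
    ServerName svn.example.com

    # Hypothetical directory for static files (robots.txt, site documentation).
    DocumentRoot /path/to/svn/htdocs

    <Directory /path/to/svn/htdocs>
        # Apache 2.2-style access control (assumption; see note above).
        Order allow,deny
        Allow from all
    </Directory>

    # Repositories now live under /svn instead of /.
    <Location /svn>
        DAV svn
        SVNParentPath /path/to/svn/repositories
        AuthzSVNAccessFile /path/to/svn/conf/auth_policy
        Satisfy Any

        AuthType Basic
        AuthName "Subversion repository"
        AuthUserFile /path/to/svn/conf/passwdfile
        Require valid-user
    </Location>

    # /path/to/svn/htdocs/robots.txt could then contain, for example:
    #   User-agent: *
    #   Disallow: /svn/
</VirtualHost>
```

With this layout, existing working copies would probably need to be pointed at the new URLs (for example with svn relocate in Subversion 1.7 or later), since every repository URL gains the /svn prefix.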