I access SVN repositories via
- http://svn.example.com/repo1
- http://svn.example.com/repo2
- ...
with the following Apache configuration:
LoadModule dav_svn_module modules/mod_dav_svn.so
LoadModule authz_svn_module modules/mod_authz_svn.so
<VirtualHost xxx.xxx.xxx.xxx>
    ServerName svn.example.com
    <Location />
        DAV svn
        SVNParentPath /path/to/svn/repositories
        AuthzSVNAccessFile /path/to/svn/conf/auth_policy
        Satisfy Any
        AuthType Basic
        AuthName "Subversion repository"
        AuthUserFile /path/to/svn/conf/passwdfile
        Require valid-user
    </Location>
</VirtualHost>
I would like to prevent web crawlers from indexing the public repositories, but I cannot figure out how to set up the configuration so that robots.txt is served from http://svn.example.com/robots.txt.
I have found a thread "stopping webcrawlers using robots.txt" from 2006, but it didn't help me solve the problem (Ryan's suggestion for redirection didn't work).
EDIT: I would prefer to keep the repositories at the top level rather than moving them to http://svn.example.com/something/reponame.
Don't put your Subversion repositories' virtual directory in the root of your server:
Wrong
Right
Instead of your repository root being http://svn.example.com, it will be http://svn.example.com/svn. This frees up http://svn.example.com to be a true document root, which means you can add some documentation about your site and put a robots.txt file at http://svn.example.com/robots.txt.
Now, a well-behaved robot will see the robots.txt file and not index your Subversion repository.
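As a rough sketch of that layout, the virtual host from the question could be rearranged along these lines. The DocumentRoot path is an assumption made up for illustration; every other directive is copied from the question's configuration, with only the Location path changed from / to /svn.

```apache
LoadModule dav_svn_module modules/mod_dav_svn.so
LoadModule authz_svn_module modules/mod_authz_svn.so

<VirtualHost xxx.xxx.xxx.xxx>
    ServerName svn.example.com

    # Assumed path: an ordinary directory that holds robots.txt
    # and any documentation pages for the site.
    DocumentRoot /path/to/svn/htdocs

    # Repositories are now served below /svn,
    # e.g. http://svn.example.com/svn/repo1
    <Location /svn>
        DAV svn
        SVNParentPath /path/to/svn/repositories
        AuthzSVNAccessFile /path/to/svn/conf/auth_policy
        Satisfy Any
        AuthType Basic
        AuthName "Subversion repository"
        AuthUserFile /path/to/svn/conf/passwdfile
        Require valid-user
    </Location>
</VirtualHost>
```

With this in place, a robots.txt file in the assumed document root containing the two lines "User-agent: *" and "Disallow: /svn/" asks compliant crawlers to stay out of the repositories. The document root directory also has to be readable under whatever access rules the server applies to ordinary directories, and existing working copies would need to be pointed at the new /svn URLs.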