In my multisite application, I need to include a robots.txt file for each of the sites. My implementation is as follows:
1- Included a RobotsContent property of type textarea within the Start page.
2- Added a handler as given below, with a web.config entry for the handler (a sketch of that registration follows the code).
using System;
using System.Linq;
using System.Web;
using EPiServer;
using EPiServer.ServiceLocation;
using EPiServer.Web;

public class RobotsTxtHandler : IHttpHandler
{
    // Dependencies resolved via ServiceLocator for brevity; constructor injection also works.
    private readonly ISiteDefinitionRepository _siteDefinitionRepository = ServiceLocator.Current.GetInstance<ISiteDefinitionRepository>();
    private readonly IContentLoader _contentLoader = ServiceLocator.Current.GetInstance<IContentLoader>();

    public bool IsReusable => true;

    public void ProcessRequest(HttpContext context)
    {
        var uri = context.Request.Url;
        // Resolve the current site by matching the request host; host names are case insensitive.
        var currentSite = _siteDefinitionRepository.List().FirstOrDefault(
            siteDefinition => siteDefinition.Hosts.Any(
                hostDefinition => hostDefinition.Authority.Hostname.Equals(uri.Host, StringComparison.OrdinalIgnoreCase)));
        if (currentSite != null)
        {
            var startPage = _contentLoader.Get<StartPage>(currentSite.StartPage);
            var robotsContentProperty = startPage.RobotsContent;
            // Serve the editor-entered robots.txt content as plain text.
            if (!string.IsNullOrEmpty(robotsContentProperty))
            {
                context.Response.StatusCode = 200;
                context.Response.ContentType = "text/plain";
                context.Response.Write(robotsContentProperty);
                context.Response.End();
            }
        }
    }
}
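For reference, the web.config entry mentioned in step 2 is along these lines; the handler name, namespace, and assembly are placeholders for your own:

<system.webServer>
  <handlers>
    <!-- Map requests for /robots.txt to the custom handler (names are placeholders) -->
    <add name="RobotsTxtHandler" verb="GET" path="robots.txt"
         type="MySite.Web.RobotsTxtHandler, MySite.Web" />
  </handlers>
</system.webServer>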
I am aware there are a few NuGet packages available for handling robots.txt, but for various reasons, and because I need more control over this one, I created a custom handler. The above works as expected.
Referring to https://developers.google.com/search/docs/advanced/robots/create-robots-txt: it mentions that the rules are case sensitive, that they come in groups (user-agent, allow, disallow), and that the directives (user-agent, allow, disallow) are required. With all these rules in place, and this being a free textarea, I can add any random stuff to it. So is there any validation that I can apply? There are online validators available, but is there any way I can validate the text when it is being published?
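For example, a minimal file following that page groups rules under a user-agent line (illustrative only):

# Block all crawlers from /private/
User-agent: *
Disallow: /private/

Sitemap: https://www.example.com/sitemap.xml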
You can implement an EPiServer validation attribute and use that on your RobotsContent property. If using an online validator is not an option, this could be handled by, e.g., a regular expression inside the Validate method of the attribute implementation.
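A minimal sketch of that approach, using a standard DataAnnotations ValidationAttribute (which EPiServer evaluates when content is saved or published); the attribute name, the accepted directive list, and the regular expression are illustrative assumptions rather than a complete robots.txt grammar:

using System;
using System.ComponentModel.DataAnnotations;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical attribute: accepts blank lines, # comments, and the common
// directives from Google's documentation. Directive names are matched
// case-insensitively; values are left untouched since paths are case sensitive.
[AttributeUsage(AttributeTargets.Property)]
public class ValidRobotsContentAttribute : ValidationAttribute
{
    private static readonly Regex DirectiveLine = new Regex(
        @"^\s*(user-agent|allow|disallow|sitemap|crawl-delay)\s*:.*$",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    public override bool IsValid(object value)
    {
        var text = value as string;
        if (string.IsNullOrWhiteSpace(text))
        {
            return true; // nothing entered; the handler simply serves nothing
        }
        return text
            .Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .All(line => line.TrimStart().StartsWith("#") || DirectiveLine.IsMatch(line));
    }

    public override string FormatErrorMessage(string name)
    {
        return name + " contains lines that are not valid robots.txt directives.";
    }
}

Decorate the property with [ValidRobotsContent] and publishing will fail with the error message when a line does not parse. If you also want to enforce the grouping rules (e.g. every allow/disallow preceded by a user-agent line), EPiServer's IValidate<T> interface is an alternative that also runs on publish.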