Episerver: Validating the contents of robots.txt


In my multisite application, I need to serve a robots.txt file for each site. The implementation is as follows:

1- Added a RobotsContent property of type textarea to the Start page.

2- Added a handler as given below, with a web.config entry for the handler.

public void ProcessRequest(HttpContext context)
{
    var uri = context.Request.Url;

    // Find the site that has a host binding matching the requested host name.
    // Host names are case-insensitive, so compare them that way.
    var currentSite = _siteDefinitionRepository.List().FirstOrDefault(siteDefinition =>
        siteDefinition.Hosts.Any(hostDefinition =>
            hostDefinition.Authority.Hostname.Equals(uri.Host, StringComparison.OrdinalIgnoreCase)));
    if (currentSite != null)
    {
        var startPage = _contentLoader.Get<StartPage>(currentSite.StartPage);

        var robotsContentProperty = startPage.RobotsContent;

        // Generate robots.txt: set the status code and content type first, then write the body.
        if (!string.IsNullOrEmpty(robotsContentProperty))
        {
            context.Response.StatusCode = 200;
            context.Response.ContentType = "text/plain";
            context.Response.Write(robotsContentProperty);
            context.Response.End();
        }
    }
}

I am aware there are a few NuGet packages available for handling robots.txt, but for a few reasons, including the need for more control over this, I created a custom one. The above works as expected.

Referring to https://developers.google.com/search/docs/advanced/robots/create-robots-txt

It mentions that the rules are case sensitive, come in groups (user-agent, allow, disallow), and that the directives (user-agent, allow, disallow) are required. With all these rules in place and this being a free-text textarea, I can add any random text to it. So is there any validation that I can apply to this? There are online validators available for this, but is there any way I can validate the text when it is being published?


There is 1 answer

3
Marcus Åberg On

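You can implement an EPiServer validator for your start page by implementing IValidate<StartPage> and checking the RobotsContent property inside it. EPiServer discovers IValidate<T> implementations automatically, so no attribute on the property is needed.

using System.Collections.Generic;
using EPiServer.Validation;

// Discovered automatically by EPiServer and run when a StartPage is validated.
public class RobotsTxtValidator : IValidate<StartPage>
{
    public IEnumerable<ValidationError> Validate(StartPage startPage)
    {
        // Validate startPage.RobotsContent here, e.g. by using an HttpClient to call the online validation that you mentioned,
        // and yield a ValidationError for each problem found.
        yield break;
    }
}

// Existing page type, abbreviated to the relevant property.
public class StartPage
{
    public string RobotsContent { get; set; }
}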

If using an online validator is not an option, this could be handled by, for example, a regular expression inside the Validate method.
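As a rough illustration of the regular-expression route, here is a minimal sketch of a line-by-line check in plain C#. The class name RobotsTxtChecker, the list of accepted directives, and the specific rules (every non-comment line must look like "directive: value", allow/disallow must come after a user-agent line, paths should start with "/" or "*") are assumptions made for this example. It is not a complete implementation of the robots.txt rules, and directive names are compared case-insensitively here.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of EPiServer: returns one message per problem found.
public static class RobotsTxtChecker
{
    // Directives accepted by this sketch (an assumption; extend as needed).
    private static readonly HashSet<string> KnownDirectives =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "user-agent", "allow", "disallow", "sitemap", "crawl-delay"
        };

    // Matches "name: value" with an optional trailing "# comment".
    private static readonly Regex DirectiveLine = new Regex(
        @"^\s*(?<name>[A-Za-z-]+)\s*:\s*(?<value>[^#]*?)\s*(#.*)?$",
        RegexOptions.Compiled);

    public static IList<string> Check(string robotsTxt)
    {
        var errors = new List<string>();
        var seenUserAgent = false;
        var lines = (robotsTxt ?? string.Empty).Split('\n');

        for (var i = 0; i < lines.Length; i++)
        {
            var line = lines[i].Trim(); // also removes a trailing '\r'
            if (line.Length == 0 || line[0] == '#')
            {
                continue; // blank lines and comment lines are fine
            }

            var match = DirectiveLine.Match(line);
            if (!match.Success)
            {
                errors.Add($"Line {i + 1}: expected \"directive: value\".");
                continue;
            }

            var name = match.Groups["name"].Value;
            var value = match.Groups["value"].Value;

            if (!KnownDirectives.Contains(name))
            {
                errors.Add($"Line {i + 1}: unknown directive \"{name}\".");
            }
            else if (name.Equals("user-agent", StringComparison.OrdinalIgnoreCase))
            {
                seenUserAgent = true;
                if (value.Length == 0)
                {
                    errors.Add($"Line {i + 1}: user-agent needs a value.");
                }
            }
            else if (name.Equals("allow", StringComparison.OrdinalIgnoreCase) ||
                     name.Equals("disallow", StringComparison.OrdinalIgnoreCase))
            {
                if (!seenUserAgent)
                {
                    errors.Add($"Line {i + 1}: \"{name}\" must come after a user-agent line.");
                }
                if (value.Length > 0 && value[0] != '/' && value[0] != '*')
                {
                    errors.Add($"Line {i + 1}: path should start with '/' or '*'.");
                }
            }
        }

        return errors;
    }
}
```

Inside the Validate method, each returned message could then be turned into a ValidationError (for instance with its ErrorMessage set to the message and its PropertyName pointing at RobotsContent), so that the editor sees the problems when publishing.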