maven-linkcheck-plugin warning during link checking: cookie rejected, illegal domain attribute


I am generating a project site with

mvn site

Linkcheck is activated as a reporting plugin

<reporting>
    <plugins>
        (...)
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-linkcheck-plugin</artifactId>
            <version>1.2</version>
        </plugin>
    </plugins>
</reporting>

The code is in Java.

The comment header of class files contains links to StackOverflow questions, for example

/**
 * Example written by Bruno Lowagie in answer to:
 * https://stackoverflow.com/questions/26853894/continue-field-output-on-second-page-with-itextsharp
 */

During the link checking phase of the site generation, I get warnings like these:

[ WARN] Cookie rejected: "$Version=0; __cfduid=dab443ca4b7fc1de5130856b7401f83cb1455551507; $Path=/; $Domain=.stackoverflow.com". Illegal domain attribute ".stackoverflow.com". Domain of origin: "stackoverflow.com"
[ WARN] Cookie rejected: "$Version=0; logged_in=no; $Path=/; $Domain=.github.com". Illegal domain attribute ".github.com". Domain of origin: "github.com"

I already looked at some other SO questions about "Cookie rejected: Illegal domain attribute". According to this answer, the issue is not at my end: StackOverflow and GitHub are setting a cookie that they aren't allowed to send, and the underlying http library of maven-linkcheck-plugin is telling me that. This behavior is exactly as specified by RFC 2109.
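
To make the rejection concrete, here is a minimal sketch of the domain-match rule from RFC 2109 (section 2), written from the RFC text and not taken from the httpclient sources. The class name and the `domainMatches` helper are made up for illustration, and IP-address hosts are not handled.

```java
import java.util.Locale;

public class Rfc2109DomainMatch {

    // Hypothetical helper (not an httpclient API): true if `host`
    // domain-matches `domain` in the sense of RFC 2109 section 2.
    static boolean domainMatches(String host, String domain) {
        String h = host.toLowerCase(Locale.ROOT);
        String d = domain.toLowerCase(Locale.ROOT);
        if (h.equals(d)) {
            return true; // identical strings always match
        }
        // Otherwise host must have the form N + domain, where N is
        // non-empty and domain starts with a dot.
        return d.startsWith(".") && h.endsWith(d) && h.length() > d.length();
    }

    public static void main(String[] args) {
        // The case from the warning: nothing precedes ".stackoverflow.com".
        System.out.println(domainMatches("stackoverflow.com", ".stackoverflow.com"));
        // A subdomain host does match the same Domain attribute.
        System.out.println(domainMatches("chat.stackoverflow.com", ".stackoverflow.com"));
    }
}
```

Under this rule the bare host `stackoverflow.com` does not match `.stackoverflow.com`, which is consistent with the "Illegal domain attribute" warnings above.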

The linked question gives a workaround: set a cookie policy in httpclient that essentially says, I don't care about bad cookies, gimme gimme gimme. I need tolerance for non-compliant servers, so I need to use the browser compatibility cookie spec as described in the cookie guide.

My question is: since I'm not working with httpclient but with maven, what do I put in my pom.xml to get rid of these cookie warnings? I didn't find anything useful in

There is 1 answer

Greg Chabala

I am also affected by this issue.

The linkcheck documentation you mentioned shows there is an httpClientParameters configuration option.

httpClientParameters:

The extra HttpClient parameters to be used when fetching links. For instance:

<httpClientParameters>
 <property>
  <name>http.protocol.max-redirects</name>
  <value>10</value>
 </property>
</httpClientParameters>

See HttpClient preference page

That link at the bottom shows http.protocol.cookie-policy is one of the HttpClient parameters.

In theory, you can do the following in your pom to configure the underlying httpclient:

<reporting>
    <plugins>
        <plugin>
            <artifactId>maven-linkcheck-plugin</artifactId>
            <version>1.2</version>
            <configuration>
                <httpClientParameters>
                    <property>
                        <name>http.protocol.cookie-policy</name>
                        <value>ignoreCookies</value>
                    </property>
                </httpClientParameters>
            </configuration>
        </plugin>
    </plugins>
</reporting>

Where ignoreCookies is the string value of CookiePolicy.IGNORE_COOKIES.

In practice, this doesn't work. As you mention, there are layers here, maven‑linkcheck‑plugin -> doxia‑linkcheck -> httpclient. Debugging into the report generation, I can see that MLP passes the parameters into doxia-linkcheck but right about here something unfortunate happens.

if ( this.cl == null )
{
    initHttpClient();
}

if ( this.http.getHttpClientParameters() != null )
{
    for ( Map.Entry<Object, Object> entry : this.http.getHttpClientParameters().entrySet() )
    {
        if ( entry.getValue() != null )
        {
            System.setProperty( entry.getKey().toString(), entry.getValue().toString() );
        }
    }
}

cl is the httpclient that will make the request; http is a bean holding the values that were configured in the pom. Instead of applying those values to the client, the code just dumps the bean properties into the system property space, and the httpclient doesn't look there. There's one notable exception: http.protocol.max-redirects gets special handling elsewhere in the code and, coincidentally, is the example used in the maven-linkcheck-plugin documentation. So it looks like this general feature was really targeting something specific and the general use case was ignored.
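
The mismatch can be illustrated with a small self-contained sketch. `FakeClient` is a hypothetical stand-in, not the real httpclient class; the only assumption it encodes is the one stated above, namely that the client reads its own parameter map and never consults system properties.

```java
import java.util.HashMap;
import java.util.Map;

public class SystemPropertyMismatch {

    // Hypothetical stand-in for a client that only reads its own parameters.
    static class FakeClient {
        private final Map<String, Object> params = new HashMap<>();

        void setParameter(String name, Object value) {
            params.put(name, value);
        }

        String cookiePolicy() {
            Object value = params.get("http.protocol.cookie-policy");
            return value != null ? value.toString() : "default";
        }
    }

    public static void main(String[] args) {
        Map<String, String> configured = new HashMap<>();
        configured.put("http.protocol.cookie-policy", "ignoreCookies");

        // What the quoted loop does: the values only reach System properties.
        FakeClient viaSystemProps = new FakeClient();
        for (Map.Entry<String, String> e : configured.entrySet()) {
            System.setProperty(e.getKey(), e.getValue());
        }
        System.out.println(viaSystemProps.cookiePolicy()); // prints "default"

        // What would take effect: handing the values to the client itself.
        FakeClient viaParams = new FakeClient();
        for (Map.Entry<String, String> e : configured.entrySet()) {
            viaParams.setParameter(e.getKey(), e.getValue());
        }
        System.out.println(viaParams.cookiePolicy()); // prints "ignoreCookies"
    }
}
```

In the real code the second loop would presumably go through the client's own parameter object, but that is a guess about a possible fix in doxia-linkcheck, not something the current sources do.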

So, we've exhausted our ability to configure it properly. What's left? We can silence the logger. Maven uses SLF4J with SimpleLogger these days, but doxia-linkcheck is using commons-logging with log4j 1.2.14. Old log4j isn't very easy for us to reconfigure as an external user, but we can shim onto SLF4J by adding maven-linkcheck-plugin to the pluginManagement section (inside build) and adding a bridging dependency:

<build>
    <pluginManagement>
        <plugins>
            <plugin>
                <artifactId>maven-linkcheck-plugin</artifactId>
                <version>1.2</version>
                <dependencies>
                    <dependency>
                        <groupId>org.slf4j</groupId>
                        <artifactId>jcl-over-slf4j</artifactId>
                        <version>1.7.28</version>
                    </dependency>
                </dependencies>
            </plugin>
        </plugins>
    </pluginManagement>
</build>

You can now hide the warnings by increasing the log level for the logger in question:

mvn site -Dorg.slf4j.simpleLogger.log.org.apache.commons.httpclient.HttpMethodBase=error

As a bonus, the remaining logging from the plugin is now using maven's logging system instead of log4j, so you get colored output and consistent formatting.

But, you probably don't want to specify that system property to adjust the log every time you generate the site. I recommend the .mvn/jvm.config file:

mkdir -p .mvn && cd .mvn
echo "-Dorg.slf4j.simpleLogger.log.org.apache.commons.httpclient.HttpMethodBase=error" >>jvm.config

If you really want to keep it all in the pom, there's also properties-maven-plugin.

This is all a hack for something that would be better fixed in doxia-linkcheck, but it has not seen much development lately.