Can a CDN build some kind of statistics by tracking visitors to my site, or do visitors download the needed libraries without sharing the URL of the page they are visiting?
Does a CDN know which website the client is visiting when it fetches jquery.min.js or another resource from the CDN?
Asked by knm. 1.4k views. There are 3 answers.
Yes, they know the URL of the page that requested the resource (e.g. by looking at the Referer header). So they could track which websites requested a certain resource. The only exception is when an HTTPS page requests a resource over a non-secure connection. In that case the Referer won't be set, but the Origin header, if the browser sends one, could still be of some help.
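As a rough illustration of how per-site statistics could be derived from these headers, here is a minimal sketch. It assumes the CDN's access log is available as a list of header dictionaries; the function name, the log format and the example hosts are all hypothetical, not taken from any real CDN.

```python
from collections import Counter
from urllib.parse import urlsplit


def count_referring_sites(requests):
    """Count requests per referring host.

    `requests` is an iterable of dicts mapping header names to values,
    a stand-in for whatever access-log format a CDN really uses.
    """
    sites = Counter()
    for headers in requests:
        # Header names are case-insensitive, so normalise before lookup.
        lowered = {name.lower(): value for name, value in headers.items()}
        # Prefer Referer (full page URL); fall back to Origin (scheme and host only).
        source = lowered.get("referer") or lowered.get("origin")
        if not source:
            sites["(unknown)"] += 1
            continue
        host = urlsplit(source).hostname or "(unknown)"
        sites[host] += 1
    return sites


# Hypothetical log entries for requests to /jquery.min.js
sample_log = [
    {"Referer": "https://shop.example.com/cart"},
    {"Referer": "https://shop.example.com/checkout"},
    {"Origin": "https://blog.example.org"},
    {},  # neither Referer nor Origin was sent
]
stats = count_referring_sites(sample_log)
```

Requests that carry neither header end up in the "(unknown)" bucket, which is one reason such statistics are incomplete.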
Tracking individual users could certainly be done, but it's impractical for a number of reasons:
1. CDN resources are meant to be heavily cached by browsers, so they are requested and downloaded once for many different page views, which makes "passive" stats unreliable.
2. Forcing the user to download the resource on every page visit makes the CDN pointless, slows down navigation for no reason and overloads the CDN's bandwidth. This was the technique used by the long-dead view counters on GeoCities pages from the 90s (sigh).
3. Tracking users requires at least setting an identifying cookie. This adds complexity to the web service (since it can no longer be a simple file server) and latency to the response time, since the UID has to be looked up in some form of DB or newly generated. ETags could be abused as well, with the same issues as cookies.
As an alternative, query string parameters could work, but they require cooperation from the target page, which has to include the UID as a parameter in each request, so the URLs cannot be hard-coded. I guess this is not the case you are talking about.
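The cookie-based identification described above could look roughly like the sketch below. It is only an illustration under stated assumptions: the `uid` cookie name, the `identify_visitor` function and the in-memory set that stands in for the database lookup are all hypothetical.

```python
import uuid
from http.cookies import SimpleCookie


def identify_visitor(request_headers, known_visitors):
    """Return (visitor_id, extra_response_headers) for one request.

    `known_visitors` is a plain set standing in for the database
    lookup mentioned above.
    """
    cookie = SimpleCookie(request_headers.get("Cookie", ""))
    if "uid" in cookie and cookie["uid"].value in known_visitors:
        # Returning visitor: the ID still had to be looked up.
        return cookie["uid"].value, {}

    # New visitor: generate an ID and ask the browser to store it.
    visitor_id = uuid.uuid4().hex
    known_visitors.add(visitor_id)
    response_cookie = SimpleCookie()
    response_cookie["uid"] = visitor_id
    response_cookie["uid"]["path"] = "/"
    return visitor_id, {"Set-Cookie": response_cookie["uid"].OutputString()}


visitors = set()
# First request carries no cookie, so a new ID is issued.
first_id, first_headers = identify_visitor({}, visitors)
# Second request sends the cookie back and is recognised.
second_id, second_headers = identify_visitor({"Cookie": f"uid={first_id}"}, visitors)
```

Even in this toy form, every request goes through a lookup or an ID generation step, which is the extra complexity and latency a plain file server does not have.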
To sum up, a CDN could track your visitors, but the downsides of doing so outweigh the hypothetical gain, assuming that performance and the profitability linked to it are the main goals of running a CDN. If analytics are more valuable than performance or economy of operation, as could be the case for a free CDN, then one could sacrifice performance to gain insights by applying points 2 and 3.
At that point, one would have to demonstrate the soundness of the collected stats in order to sell them for any marketing purpose. Besides, the nature of the files usually served by CDNs makes them quite uninteresting. For instance, I don't see much profit in knowing how many people use a certain version of jQuery out there.
Yes, they can use the Referer header field:
The field is part of the request headers. For example, reloading this page shows a request to googleapis (open the console with F12 and look at the Network tab):
The browser sent this request header:
It reports which site the request came from in the Referer field:
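To make this concrete, the sketch below parses a made-up request and pulls out the Referer value, much as a server on the CDN side could. The host names, path and User-Agent are hypothetical placeholders and not the headers from the original example.

```python
import io
import http.client

# Hypothetical raw request, as a browser might send it to a CDN.
raw_request = (
    b"GET /libs/jquery.min.js HTTP/1.1\r\n"
    b"Host: cdn.example.net\r\n"
    b"Referer: https://www.example.com/some/page\r\n"
    b"User-Agent: ExampleBrowser/1.0\r\n"
    b"\r\n"
)

stream = io.BytesIO(raw_request)
# The first line is the request line; the headers follow it.
request_line = stream.readline().decode("ascii").strip()
headers = http.client.parse_headers(stream)

# Header lookup is case-insensitive.
referer = headers.get("Referer")
print(request_line)
print("Requested by page:", referer)
```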