How do search engines obtain unlinked pages?

214 views Asked by At

I noticed that quite a lot Dropbox pages are indexed by Google, Bing, etc. and was wondering how these search engines obtain for instance links like these:

https://dl.dropboxusercontent.com/s/85cdji4d5pl5qym/37-71.pdf
https://dl.dropboxusercontent.com/u/11421929/larin2014.pdf

Given that there are no links on dl.dropboxusercontent.com to follow and the path structure is not that easy to guess, how is it possible that a search engine obtains such a link?

One solution might be that it was posted on a forum and picked up by the search engine but I looked up quite a lot of the links and checked the backlinks without success. I also noticed that Bing and Yahoo show a considerable amount of more results than Google which would mean that Bing does a better job in picking up these links which seems unlikely to me.

1

There are 1 answers

0
unor On

Even if the document is really unlinked (no link on their site, no link on someone other’s site, no sitemap, no Referer log from a site that gets linked in the document, etc.), it’s still possible for search engines to find the link.

Two ways are:

  • Someone could submit the URL to a search engine (whether via a public tool, or via the site’s webmaster account).

  • The search engine could get all URLs that certain users visit in their browsers. This could, for example, happen when the user has installed a toolbar from that search engine. This is the case with Bing, see my related answer on Webmasters SE:

    Microsoft has confirmed that they do discover and index URLs that they find through users surfing the Internet with the Bing Toolbar installed.

And there might be more ways, of course.