Why would you stop Google from indexing pages in your website?


I've read some articles on how to stop the indexing, but I'm not clear WHY you would actually want to do that.

1) The explanation I found for why was:

"For marketers, one common reason is to prevent duplicate content (when there is more than one version of a page indexed by the search engines, as in a printer-friendly version of your content) from being indexed.

Another good example? A thank-you page (i.e., the page a visitor lands on after converting on one of your landing pages). This is usually where the visitor gets access to whatever offer that landing page promised, such as a link to an ebook PDF." [Basically you don't want the user to find your Thank You page with freebies through search without signing up]

However, in both these cases it actually seems like a bad idea to prevent indexing? You'd rather just redirect to the sign-in page (in the second example) after your user finds you? At least the user will be able to reach your website.

2) It's also mentioned that being indexed is not the same as appearing in Google search results, but it's not really clear what the difference is. Could someone enlighten me?

TIA.


There is 1 answer

Peter K On

Let me provide a few good reasons from my experience, though I believe many more exist.

The traditional primary reason is to avoid wasting computing resources. Imagine a search engine: it probably would not want another search engine (or itself) to index all of its result pages, since crawling them can take a long time. The same applies to binary data that contains no text.
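As a rough illustration of the resource-saving case, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `/search` and `/downloads/` paths, and the `example.com` URLs are all made up for the example. Keep in mind that robots.txt controls crawling rather than indexing as such, so this only shows how a well-behaved crawler would decide what to skip.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a site with its own search pages and
# a directory of binary downloads (both paths are assumptions).
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /downloads/
"""


def build_parser(robots_txt):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A polite crawler asks before fetching each URL.
search_allowed = parser.can_fetch("*", "https://example.com/search?q=foo")
about_allowed = parser.can_fetch("*", "https://example.com/about")
```

With these rules, `search_allowed` is false and `about_allowed` is true, so the crawler never spends time on the site's own result pages.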

Your example somewhat falls into this category:

"For marketers, one common reason is to prevent duplicate content (when there is more than one version of a page indexed by the search engines, as in a printer-friendly version of your content) from being indexed."

But this is no longer considered a valid reason, since resource consumption is generally low, and duplicates are better disambiguated with HTML metadata such as:

<link rel='canonical' href='<permanent link>' ...>
<link rel='alternate' media='print' ...>
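To make the disambiguation concrete, here is a small sketch of how a crawler might read the canonical hint from a page, using Python's standard `html.parser`. The class name, function name, and sample markup are hypothetical, and real search engines treat the hint as one signal among several rather than a command.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens.
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the markup, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical printer-friendly page pointing back at the main article.
PRINT_PAGE = """
<html><head>
<link rel='canonical' href='https://example.com/article'>
</head><body>Printer-friendly version</body></html>
"""

canonical_url = find_canonical(PRINT_PAGE)
```

Here the printer-friendly copy declares the main article as its canonical URL, so both versions can stay crawlable without competing as duplicates.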

Another big reason to prevent indexing is privacy. For example, Facebook profiles are not indexed if the owner chooses so.

"Another good example? A thank-you page (i.e., the page a visitor lands on after converting on one of your landing pages). This is usually where the visitor gets access to whatever offer that landing page promised, such as a link to an ebook PDF." [Basically you don't want the user to find your Thank You page with freebies through search without signing up]

This falls into the privacy category. As a cautionary example, a search engine once indexed a set of these "thank you" pages from a mobile operator's website, and those pages also included the message sent.
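For pages like these, one option is to send an `X-Robots-Tag: noindex` response header, which major search engines document as a way to keep a URL out of their results. The sketch below assumes hypothetical path prefixes and a made-up helper name. It only decides which headers to attach, and it does not stop anyone who already has the URL from opening the page.

```python
# Hypothetical path prefixes for pages that should stay out of search results.
NOINDEX_PREFIXES = ("/thank-you", "/print/")


def robots_headers(path):
    """Return extra response headers for a request path."""
    if path.startswith(NOINDEX_PREFIXES):
        # Ask search engines not to list this URL in their results.
        return {"X-Robots-Tag": "noindex"}
    return {}


thank_you_headers = robots_headers("/thank-you/ebook")
home_headers = robots_headers("/")
```

A web framework or server config would merge the returned dictionary into the response; how that is wired up depends on the stack, which the thread does not specify.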

One more reason I have observed is general newbie paranoia. It is a bad reason, because anything you are paranoid about exposing is much better protected with HTTP authentication.
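A minimal sketch of what that could look like with HTTP Basic authentication follows. The function names, the `alice` / `s3cret` credentials, and the `private` realm are placeholders, and a real deployment would normally rely on the web server's or framework's built-in authentication over HTTPS rather than hand-rolled checks.

```python
import base64
import hmac


def check_basic_auth(header_value, username, password):
    """Return True if an Authorization header carries the expected Basic credentials."""
    if not header_value:
        return False
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return False
    user, sep, pwd = decoded.partition(":")
    if not sep:
        return False
    # Constant-time comparisons avoid leaking information through timing.
    user_ok = hmac.compare_digest(user.encode(), username.encode())
    pwd_ok = hmac.compare_digest(pwd.encode(), password.encode())
    return user_ok and pwd_ok


def respond(header_value):
    """Return (status, headers) for a protected page; credentials are placeholders."""
    if check_basic_auth(header_value, "alice", "s3cret"):
        return 200, {}
    # Crawlers and anonymous visitors get a challenge instead of the content.
    return 401, {"WWW-Authenticate": 'Basic realm="private"'}


good_header = "Basic " + base64.b64encode(b"alice:s3cret").decode("ascii")
ok_status, _ = respond(good_header)
anon_status, anon_headers = respond(None)
```

Unlike a noindex hint, the 401 response means a crawler never sees the content at all, so there is nothing for it to index.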