Does a 'user'-url based website create issues with Google / Search Engines?

I currently maintain the backend PHP code for a website that allows our sales representatives to sell products and services. Each active sales representative is given a "custom" website URL, which tags any activity on that particular site to that representative. Sales are only collected on representative websites (we do this to 'protect' our employees and make sure they feel we are not selling behind their backs on an open parent site).

For example:

  • www.site.com may highlight all the products and services available but does not give a customer the ability to purchase

  • www.site.com/SOMEREPCODE where SOMEREPCODE is a unique identifier to a specific agent, presents the same options but opens the ability to sell that product. There are thousands of these sales representatives, therefore thousands of links pointing to the same page and content.

There has recently been a lot of debate as to whether we should open the site up to front-end sales as well. Our industry is very specific, so we are not too worried about lost sales from web shoppers, but I do believe they exist. Some of our front-end developers have put "noindex, nofollow" code on the pages, and we are told this is to prevent Google and others from 'blacklisting' the site for having multiple links all going to the same content (think SOMEREPCODE representing over 1000 sales representatives with nearly the exact same page, minus the name and contact number shown).

edit - showing htaccess file

# if the file or directory does not exist, try the path as a rep ID
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}.php !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([0-9a-zA-Z-]+)$ index?Rep=$1 [QSA,NC,L]

The htaccess logic above checks that the requested code is not an existing file or directory. If it is not, SOMEREPCODE is passed to the index page as the Rep query parameter (index?Rep=SOMEREPCODE).

At the top of my index page, I include a function to then check if the value of Rep is a valid sales representative and if they are active. If invalid or not active, the page is redirected to a landing page giving an error. If the rep is active and exists, the page continues to load after setting the appropriate SESSION variables.

indexInclude

<?php
// Assumes session_start() has already run and $db is a connected PDO instance.
if (isset($_GET['Rep']) && $_GET['Rep'] !== '') {

    // DB connectors called
    $sql = "SELECT * FROM reps WHERE repcode = ? AND status = 'Active' LIMIT 1";
    $stmt = $db->prepare($sql);
    $stmt->execute(array($_GET['Rep']));
    $row = $stmt->fetch();

    // Decide from the query result, not from the session: a repname left over
    // from an earlier valid visit must not let an invalid code through.
    if ($row === false) {
        header("Location: unavailable");
        exit;
    }

    $_SESSION['repname'] = $row['repname'];
    // collect other rep information
    $_SESSION['sales'] = "Y";

} elseif (!isset($_SESSION['sales']) && !isset($_GET['Rep'])) {
    // No rep code and no earlier rep visit in this session: browsing only.
    $_SESSION['sales'] = "N";
}
?>

The index page itself does not change at all in this case; only the areas of the site that 'display' when $_SESSION['sales'] == 'Y' are affected.

Is the 'blacklisting' concern in fact true? Are there ways to handle this situation that would allow us to open the site up for web sales as well?

There are 3 answers

5
Franz Enzenhofer On BEST ANSWER

if it's not a complete mirror, then it's not a big issue.

best practice would be

www.site.com/SOMEREPCODE -> set a selling cookie -> HTTP 301 redirect -> www.site.com

basically all /SOMEREPCODE URLs redirect to a canonical version of the URL, and only the canonical version gets communicated to Google. if you can't do an HTTP 301 redirect, try the canonical element: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=139394
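A minimal PHP sketch of the cookie-then-301 flow, under stated assumptions: the helper names `planRepRedirect` and `applyRepRedirect`, the cookie name `rep`, and the 30-day lifetime are all hypothetical, and `$isActiveRep` stands in for the question's database lookup. The planning step is kept separate from the header calls so it can be checked without sending a response.

```php
<?php
// Hypothetical helper: decide whether $path is a rep URL and, if so, which
// cookie to set and where to redirect. Returns null when no redirect applies.
function planRepRedirect(string $path, callable $isActiveRep): ?array
{
    // Same character set as the RewriteRule in the question.
    if (!preg_match('/^\/([0-9a-zA-Z-]+)\z/', $path, $m)) {
        return null; // not a rep-style URL
    }
    $code = $m[1];
    if (!$isActiveRep($code)) {
        return null; // unknown or inactive rep
    }
    return [
        'cookie'   => ['name' => 'rep', 'value' => $code], // cookie name is an assumption
        'location' => '/',                                  // the canonical URL
        'status'   => 301,
    ];
}

// Apply the plan with PHP's built-in setcookie() and header().
function applyRepRedirect(array $plan): void
{
    setcookie($plan['cookie']['name'], $plan['cookie']['value'], [
        'expires'  => time() + 30 * 24 * 3600, // 30 days, an arbitrary choice
        'path'     => '/',
        'httponly' => true,
        'samesite' => 'Lax',
    ]);
    header('Location: ' . $plan['location'], true, $plan['status']);
}
```

The index page would call `planRepRedirect()` with the request path and, when it returns a plan, call `applyRepRedirect()` and exit. One thing to weigh: browsers may cache a 301, so a returning visitor can be redirected without the request reaching the server again to refresh the cookie.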

with the canonical element the flow would look like this

www.site.com/SOMEREPCODE -> set a selling cookie -> HTTP 200 (deliver the page content) -> page has <link rel="canonical" href="http://www.site.com/"/> in the HEAD section
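For the canonical-element variant, the only change to the page is one extra element in the HEAD. A small sketch, assuming a hypothetical helper named `canonicalLinkTag` and the canonical URL from the flow above:

```php
<?php
// Hypothetical helper: build the canonical <link> element for the page HEAD.
// The URL is escaped so it is safe to place inside the href attribute.
function canonicalLinkTag(string $canonicalUrl): string
{
    return '<link rel="canonical" href="'
        . htmlspecialchars($canonicalUrl, ENT_QUOTES, 'UTF-8')
        . '"/>';
}

// In the <head> of the index template, for every /SOMEREPCODE request:
//   echo canonicalLinkTag('http://www.site.com/');
```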

get rid of the "nofollow": it does not make sense and devalues all links that point from these pages to other pages. if you use an HTTP 301 redirect (or the canonical element), the noindex is unnecessary (but doesn't hurt).
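If the noindex is kept, it can be emitted on rep URLs only and without the nofollow. A sketch, assuming a hypothetical helper `robotsMetaTag` and that the caller already knows whether the current request came in through a rep URL:

```php
<?php
// Hypothetical helper: rep URLs get "noindex" (links on the page are still
// followed); the main site gets no robots meta tag at all.
function robotsMetaTag(bool $isRepUrl): string
{
    return $isRepUrl ? '<meta name="robots" content="noindex">' : '';
}

// In the <head> of the index template:
//   echo robotsMetaTag(isset($_GET['Rep']));
```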

but as a matter of fact: if you do not know how much sales potential you are missing and are not sure how to handle that situation (plus you obviously have devs who do not understand SEO but think they do, because they use "nofollow" and talk about 'blacklisting'), you should think about consulting a serious SEO. any good SEO can give you good enough answers to all of these questions.

1
James L. On

If I am understanding you correctly www.site.com/SOMEREPCODE is an exact mirror of www.site.com, the one difference being the ability to purchase.

The main concern here for SEO would be duplicate content on different URLs: http://googlewebmastercentral.blogspot.com/2008/09/demystifying-duplicate-content-penalty.html

ex: www.site.com/producta.html contains the same data as www.site.com/SOMEREPCODE/producta.html

All links that go to www.site.com or www.site.com/page.html, as opposed to www.site.com/SOMEREPCODE/page.html, should not have the noindex nofollow set. All links going to mirrors (www.site.com/SOMEREPCODE/.../) should have the noindex nofollow set.

If you allow selling on the main site, have the /SOMEREPCODE/ pages place a cookie so that your rep still gets credit if someone buys later but only navigates to the main site.
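The cookie idea can be sketched as a lookup at purchase time. Everything named here is an assumption for illustration: the helper `repToCredit`, the cookie name `rep`, and the rule that a rep code in the current URL wins over a stored cookie.

```php
<?php
// Hypothetical helper: pick the rep to credit for a sale.
// $urlCode is the rep code from the current URL (null on the main site);
// $cookies is normally $_COOKIE.
function repToCredit(?string $urlCode, array $cookies): ?string
{
    if ($urlCode !== null && $urlCode !== '') {
        return $urlCode; // the visitor is on a rep URL right now
    }
    $stored = $cookies['rep'] ?? null;
    // Accept only values shaped like a rep code; the caller should still
    // check the code against the reps table before crediting the sale.
    if (is_string($stored) && preg_match('/^[0-9a-zA-Z-]+\z/', $stored)) {
        return $stored;
    }
    return null; // unattributed main-site sale
}
```

Because a cookie is client-controlled, the returned code is only a candidate; the active-rep check from the question's index include would still apply.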

1
Stephen Ostermiller On

To sum up: each sales rep has a different set of URLs that they use (essentially a copy of your website for each rep as far as search engines can tell) and there is NO "canonical" website (no generic website that has no sales rep, and you are not playing favorites and choosing one sales rep's site as the canonical one).

I can see two issues:

  1. Depending on the number of pages on your site (n) and the number of sales reps (m), search bots are going to have to do a lot of crawling to index your entire site (n × m URLs). This could put extra load on your servers, or it could mean that search bots will give up and not crawl your entire site.
  2. You are going to have duplicate content issues with the search engines. Googlebot won't rank multiple copies of the same content. This may or may not cause your site to get penalties, but it will dilute the power of your site as any inbound links to your content will be spread between the "sites" for each of your sales reps.

Your options as far as I see them are:

Leave everything as is

  • Search engines will have to sort out the duplicate content for themselves (and they may be doing a decent job)
  • You will need to monitor that search bots don't overload your servers
  • Your organic rankings won't be as high as they could be due to duplicate content

Block the site using robots.txt

  • Search bot load on your servers will be under control
  • You will get almost no rankings at all and just have to rely on your sales reps

Launch a non-sales-rep site and canonicalize all traffic to it

  • You indicated that your reps might not like this

Favor one sales rep as the canonical sales rep

  • You'd have to choose a favored sales rep (or create a fake one)
  • Sales reps may or may not notice the presence of canonical tags on their site pointing to another sales rep's site
  • Sales reps other than the favored one would lose any organic search traffic and resulting sales they currently get.