CouchDB URL Rewriting for SEO


I'm trying to build an entire site hosted purely on CouchDB (with no nginx reverse proxy), relying heavily on client-side jQuery/AJAX. I'm now making it SEO friendly. I use vhosts and URL rewrites to route traffic from the root to my index.html file:

vhost:

example.com /dbname/_design/dd/_rewrite/

In my rewrite definition:

rewrites:[
   {
       "from": "/db/*",
       "to": "/../../../*",
       "query": {
       }
   },
   {
       "from": "/",
       "to": "../../static/index.html",
       "query": {
       }
   }
]

To make an AJAX site crawlable, Google's AJAX crawling scheme asks you to do two things:

  • Use the hashbang (#!) in your friendly URL to tell the web crawler that this is an AJAX site with crawlable content: http://example.com/index.html#!home
  • Serve an HTML snapshot of that AJAX page when the fragment arrives as an HTTP query argument (the parameter name ends with an underscore): http://example.com/index.html?_escaped_fragment_=home
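As a small illustration of the mapping between the two URL forms, here is a sketch of how a hashbang URL translates into the query-argument form a crawler would request. The function name `toEscapedFragmentUrl` is made up for this example, the parameter is spelled `_escaped_fragment_` as in Google's scheme, and `encodeURIComponent` escapes somewhat more characters than the scheme strictly requires.

```javascript
// Hypothetical helper: turn "…#!state" into "…?_escaped_fragment_=state".
function toEscapedFragmentUrl(prettyUrl) {
  var idx = prettyUrl.indexOf('#!');
  if (idx === -1) {
    return prettyUrl; // no hashbang, nothing to translate
  }
  var base = prettyUrl.slice(0, idx);      // everything before "#!"
  var fragment = prettyUrl.slice(idx + 2); // everything after "#!"
  // Append to an existing query string if the base already has one.
  var separator = base.indexOf('?') === -1 ? '?' : '&';
  return base + separator + '_escaped_fragment_=' + encodeURIComponent(fragment);
}
```

For the URL in the list above, this produces the second form from the first.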

I tried the following with no luck:

rewrites:[
   {
       "from": "/db/*",
       "to": "/../../../*",
       "query": {
       }
   },
   {
       "from": "/",
       "to": "../../static/index.html",
       "query": {
       }
   },
   /* FIRST ATTEMPT */
   {
       "from": "/?_escaped_fragment=:_escaped_fragment",
       "to": "/_show/escaped_fragment/:_escaped_fragment",
       "query": {
       }
   },
   /* SECOND ATTEMPT */
   {
       "from": "/?_escaped_fragment=*",
       "to": "/_show/escaped_fragment/*",
       "query": {
       }
   },
   /* THIRD ATTEMPT */
   {
       "from": "/",
       "to": "/_show/escaped_fragment/:_escaped_fragment",
       "query": {
       }
   }
]

From what I've seen, CouchDB's URL rewriter cannot distinguish a URL with query arguments from one without. Has anyone managed to create such a rule with CouchDB URL rewrites?
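One possible workaround, sketched here and not verified against a running instance, is to stop asking the rewriter to tell the two cases apart: send "/" to a single show function and branch on the query argument inside it. This assumes the rewriter forwards the original query string to the target, and that `/index.html` has its own rewrite rule pointing at the static attachment. The show name `root`, the function name `rootShow`, and the snapshot markup are all hypothetical. In a design document the function would be stored as an anonymous function string under `shows.root`.

```javascript
// Assumed rewrite rule: { "from": "/", "to": "_show/root" }
// Hypothetical show function; doc is null because no document id is given.
function rootShow(doc, req) {
  var fragment = req.query._escaped_fragment_;

  if (fragment === undefined) {
    // Regular visitor: redirect to the static AJAX page
    // (assumes a separate rewrite rule serves /index.html).
    return { code: 302, headers: { 'Location': '/index.html' }, body: '' };
  }

  // Crawler request: return a placeholder HTML snapshot for the fragment.
  var safe = String(fragment)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
  return {
    headers: { 'Content-Type': 'text/html; charset=utf-8' },
    body: '<!DOCTYPE html><html><body><h1>' + safe + '</h1></body></html>'
  };
}
```

A real snapshot would be built from the data behind that fragment; the placeholder heading only shows where the branch happens.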

There is 1 answer

fiatjaf

I don't have an answer to the question itself, but I've developed a solution to the bigger problem of making sites hosted on CouchDB crawlable. It is a system that uses Facebook's React, list and show functions, AJAX on the client, and window.history to render the same HTML components, filled with data, both in CouchDB and in the browser:

https://github.com/fiatjaf/reactive-couch

This solution doesn't need the hashbang: for each unique URL the browser navigates to, whether through AJAX and window.history or through plain links (be it _list/listName/viewName/_show/displayKind/c305ee4d-8611-4e08-b9d3-3318835632a9 or something rewritten as /name//kind/c305ee4d-8611-4e08-b9d3-3318835632a9), the server can render the pertinent content.
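The general idea of rendering the same markup in both places can be sketched as follows. This is not the code from the linked repository: plain string templating stands in for React components, and the names `renderDoc`, `showDoc`, `navigateTo`, the `app` element id, and the `title`/`body` document fields are assumptions made for illustration.

```javascript
// Escape text before placing it into HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Shared renderer: the same function runs in CouchDB and in the browser.
function renderDoc(doc) {
  return '<article><h1>' + escapeHtml(doc.title) + '</h1>' +
         '<p>' + escapeHtml(doc.body) + '</p></article>';
}

// Server side: body of a hypothetical show function, so every URL
// returns complete HTML that a crawler can read without running scripts.
function showDoc(doc, req) {
  if (!doc) {
    return { code: 404, body: 'Not found' };
  }
  return {
    headers: { 'Content-Type': 'text/html; charset=utf-8' },
    body: '<!DOCTYPE html><html><body><div id="app">' +
          renderDoc(doc) + '</div></body></html>'
  };
}

// Browser side: after fetching a document over AJAX, render it in place
// and update the address bar so the URL matches what the server would serve.
function navigateTo(url, doc) {
  document.getElementById('app').innerHTML = renderDoc(doc);
  window.history.pushState({ id: doc._id }, '', url);
}
```

Because the address set by `pushState` is a real path the server can render, no hashbang or escaped-fragment translation is involved.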