My SPA uses the Backbone.js router, which uses pushState with hashed URLs as a fallback. I intend to follow Google's suggestion for making an AJAX web app crawlable. That is, I want to render my site into static .html snapshots generated by PhantomJS and deliver them to Google via the URL mysite.com/?_escaped_fragment_=key=value.
Keep in mind that the site does not serve static pages to end users (it only works in a JavaScript-enabled browser). If you navigate to mysite.com/some/url, the .htaccess file is set up to always serve mysite.com/index.php, and the Backbone router reads the URL in order to display the JavaScript-generated content for that URL.
Furthermore, so that Google will index my entire site, I plan on creating a sitemap that lists hashbang URLs. The URLs must be hashbanged so that Google knows to crawl each page through its _escaped_fragment_ URL.
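To make the mapping concrete, here is a rough sketch of how a hashbang URL from the sitemap corresponds to the _escaped_fragment_ URL the crawler requests. The function name toEscapedFragmentUrl is made up for illustration, and the character escaping is simplified (it only handles %, #, &, + and whitespace), so treat it as an approximation of the scheme rather than a reference implementation.

```javascript
// Hypothetical helper (not part of Backbone or any Google library):
// maps "http://mysite.com/#!key=value" to
// "http://mysite.com/?_escaped_fragment_=key=value".
function toEscapedFragmentUrl(url) {
  var idx = url.indexOf('#!');
  if (idx === -1) {
    return url; // no hashbang, nothing to convert
  }
  var base = url.slice(0, idx);
  var fragment = url.slice(idx + 2);

  // Simplified escaping: percent-encode characters that would otherwise
  // break the query string.
  var escaped = fragment.replace(/[%#&+\s]/g, function (ch) {
    return encodeURIComponent(ch);
  });

  // Append to an existing query string if the base URL already has one.
  var separator = base.indexOf('?') === -1 ? '?' : '&';
  return base + separator + '_escaped_fragment_=' + escaped;
}
```

The server (or the .htaccess rules) would then answer requests carrying the _escaped_fragment_ parameter with the matching PhantomJS snapshot.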
Soooo....
(1) Will this approach work?
and
(2) Since Backbone.js does not use hashbang URLs, how can I convert the hashbang URL to the pushState URL when a user arrives via Google?
reference: https://stackoverflow.com/a/6194427/1102215
I ended up stumbling through the implementation as I've outlined in my questions. So...
(1) Yes, the approach seems to work rather well. The only downside is that even though the app works without hashbangs, my sitemap.xml is full of hashbang URLs. This is necessary to tip off Google that it should query the _escaped_fragment_ URL when crawling these pages. So when the site appears in Google search results there is a hashbang in the URL, but that's a small price to pay.
(2) This part was a lot easier than I had imagined. It only required one line of code before initializing the Backbone.js router...
After the hashbang is replaced with just a hash, the Backbone router automatically removes the hash for browsers that support pushState. Furthermore, those two URL state changes are not saved in the browser's history, so if the user clicks the back button there is no weirdness or unexpected redirect.
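The hashbang-to-hash replacement described above can be sketched roughly as follows. This is a guess at the shape of such a one-liner, not necessarily the exact code I used: stripHashbang is a name invented here, and using location.replace (which swaps the current history entry rather than adding a new one) is an assumption.

```javascript
// Hypothetical helper: turn a "#!/some/url" fragment into "#/some/url".
function stripHashbang(hash) {
  return hash.replace(/^#!/, '#');
}

// Browser-only part: run this before Backbone.history.start().
// The typeof guard just lets the helper load outside a browser too.
if (typeof window !== 'undefined' &&
    window.location.hash.indexOf('#!') === 0) {
  // Assumption: location.replace is used so that the hashbang URL
  // does not remain as a separate entry in the browser history.
  window.location.replace(stripHashbang(window.location.hash));
}
```

Once the fragment is a plain hash, starting the router with pushState enabled lets Backbone take over from there.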
UPDATE: A better approach
It turns out that there is a dead simple approach which completely does away with hashbangs. Via BromBone:
This is a modified version of BromBone's suggested .htaccess rewrite rules: