The company I used to work for, Elastic Path in Vancouver, invested heavily in making sure that its ecommerce engine was search engine optimized. This was a complicated black art that involved shaping the URL and content of your web pages so that Google and Yahoo could pick them up. There are tons of books out there on the subject, and even people who specialize in it. The company I’m working for now, through its iProspect.com branch, also focuses heavily on search engine optimization for websites and pages.
SEO becomes harder with dynamic Rich UI applications that don’t rely on Ajax. While Google is able to pick up Flash SWF files during its crawl, this does not guarantee that the content is parsed correctly or given the same weight as other file formats or a pure HTML/Ajax page. Worse, if the application pulls its content from a web service, how can we guarantee that all the pages are crawled and returned correctly?
Tools like Silverlight or Flex make the SEO question very complicated. How do you design an application that has dynamic content and robust interaction while at the same time enabling a web crawler to understand and categorize the underlying content correctly? What good is a website with superb interaction and all the benefits of Flash/Flex when people can’t find it?
One possible solution is to break Flex and Flash projects into ‘pages’ and link them to different URLs. Compared to the previous approach, where a Flash SWF was just one big file, deep linking allows people to bookmark pages, and even to link to things like “my favorite shoe” from their blogs. The support for deep linking introduced in Flex 3, while a step in the right direction, still has problems. One of them might be the way in which deep linking is actually implemented. The #, which HTML also uses for anchors within a page, is used to tell the Flash SWF file where to go. So Flex URLs look like this: http://mydomain.com/doggie.swf?jakdsfl#browse=2 Does the Googlebot know that this is a separate page? Only God knows. SWFAddress works better by allowing a URL of the form http://mydomain.com/doggie/#/is/in/heaven. But even if we have the most SEO-friendly, carefully crafted URLs, it still won’t matter if the content lives only within web services and backend databases.
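To make the # concern concrete, here is a small sketch in Python (used purely as a neutral illustration language, not something our project runs on). It splits the two example URLs into the part that is sent in an HTTP request and the fragment after the #, which stays on the client side. The helper name split_deep_link is made up for this post.

```python
from urllib.parse import urldefrag


def split_deep_link(url):
    """Split a URL into (what is requested from the server, fragment).

    The fragment (everything after '#') is never sent in the HTTP request,
    so a client that ignores fragments sees only the first element.
    split_deep_link is a hypothetical helper name used for illustration.
    """
    result = urldefrag(url)
    return result.url, result.fragment


if __name__ == "__main__":
    examples = [
        "http://mydomain.com/doggie.swf?jakdsfl#browse=2",  # Flex 3 style
        "http://mydomain.com/doggie/#/is/in/heaven",        # SWFAddress style
    ]
    for url in examples:
        requested, fragment = split_deep_link(url)
        print("requested:", requested, "| fragment:", fragment)
```

In both styles the interesting part of the address (browse=2, /is/in/heaven) sits in the fragment, so two different ‘pages’ of the application map to the same requested URL. The SWFAddress form is easier on human eyes, but on its own it does not obviously change what a crawler fetches.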
I went out for lunch with our architect, Tim, when we were working on our last project in Flex, and we brainstormed about ways in which SEO could work in the Flash/Flex world. Could there be a way to intelligently present the data from our web services on the front end? Tim suggested that one way to do this was to grab the XML data being returned by the web services and present it on the front end whenever a web crawler was detected as the user agent. In other words, we would serve up static versions of our content pages to be cached and googlelized, making their chances of showing up in search results much higher.
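Tim’s idea can be sketched roughly as follows. This is a minimal illustration in Python, not what we built: the crawler token list is deliberately tiny and illustrative, and the function names (is_crawler, xml_to_static_html, choose_view) and the sample XML shape are all assumptions made up for this post.

```python
import xml.etree.ElementTree as ET
from html import escape

# Illustrative substrings only; a real list would be longer and maintained.
CRAWLER_TOKENS = ("googlebot", "slurp")


def is_crawler(user_agent):
    """Guess from the User-Agent header whether the client is a crawler."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def xml_to_static_html(xml_text):
    """Turn a flat XML record (assumed shape) into a plain HTML page."""
    root = ET.fromstring(xml_text)
    rows = "".join(
        "<dt>{}</dt><dd>{}</dd>".format(escape(child.tag), escape(child.text or ""))
        for child in root
    )
    return "<html><body><h1>{}</h1><dl>{}</dl></body></html>".format(
        escape(root.tag), rows
    )


def choose_view(user_agent):
    """Return which front end to serve: static HTML or the Flex application."""
    return "static_html" if is_crawler(user_agent) else "flex_app"


if __name__ == "__main__":
    googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    service_xml = "<shoe><name>Red sneaker</name><price>49</price></shoe>"
    if choose_view(googlebot) == "static_html":
        print(xml_to_static_html(service_xml))
```

The point of the sketch is only that the same web service response feeds both paths: the crawler gets a static rendering of the XML, everyone else gets the rich client.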
Fast forward one month, and we’re writing most of our backend code in Grails. Grails has an incredible content negotiation feature (that I blogged about before). With Grails, we get a default scaffolded HTML page generated pretty much for free. Moreover, with content negotiation, we can have both REST web services and static HTML pages backed by the same data. This means that instead of serving up some badly formatted junk XML page to a web crawler, we can serve up proper HTML pages, which also work for visitors who don’t have Flex/Flash/Silverlight installed. Plus, we could potentially adapt these pages for mobile devices like the iPhone that don’t yet support Flash.
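The general idea behind content negotiation, one set of data with several representations picked from the request’s Accept header, can be sketched like this. It is written in Python as a stand-in and is not how Grails implements it; RECORD, render_html, render_xml and negotiate are hypothetical names, and wildcard types such as */* simply fall through to the HTML default here.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical record standing in for whatever the backend stores.
RECORD = {"id": 49494, "title": "this is awesome"}


def render_html(record):
    """HTML representation, for browsers and crawlers."""
    rows = "".join(
        "<dt>{}</dt><dd>{}</dd>".format(escape(str(k)), escape(str(v)))
        for k, v in record.items()
    )
    return "<html><body><dl>{}</dl></body></html>".format(rows)


def render_xml(record):
    """XML representation, for the rich client calling the REST service."""
    root = ET.Element("content")
    for key, value in record.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


RENDERERS = {"text/html": render_html, "application/xml": render_xml}


def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, best first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so header order breaks ties between equal q values.
    return sorted(entries, key=lambda entry: -entry[1])


def negotiate(accept_header, record):
    """Pick a representation of the same record based on the Accept header."""
    for media_type, q in parse_accept(accept_header):
        if q > 0 and media_type in RENDERERS:
            return media_type, RENDERERS[media_type](record)
    return "text/html", render_html(record)  # default when nothing matches


if __name__ == "__main__":
    print(negotiate("application/xml", RECORD))
    print(negotiate("text/html,application/xml;q=0.9,*/*;q=0.8", RECORD))
```

Defaulting to HTML when nothing matches is a choice made for this sketch: a client that sends no useful Accept header still gets something readable and indexable.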
Grails also has a robust URL mapping and link generation mechanism, which lets us generate human-readable links ( http://www.mydomain.com/this+is+awesome vs http://www.mydomain.com/show/content/49494 ).
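As a rough illustration of the readable-link idea, the sketch below maps a content id to a title-based URL and back again. It is plain Python, not framework code; the CONTENT table and the helper names readable_link and resolve are invented for this example, and it assumes titles are unique.

```python
from urllib.parse import quote_plus, unquote_plus

BASE_URL = "http://www.mydomain.com/"

# Hypothetical content table: id -> title (titles assumed unique).
CONTENT = {49494: "this is awesome"}


def readable_link(content_id):
    """Build a human-readable link from the content's title."""
    return BASE_URL + quote_plus(CONTENT[content_id])


def resolve(path):
    """Map a readable path such as '/this+is+awesome' back to a content id."""
    title = unquote_plus(path.lstrip("/"))
    for content_id, known_title in CONTENT.items():
        if known_title == title:
            return content_id
    return None


if __name__ == "__main__":
    link = readable_link(49494)
    print(link)
    print(resolve(link[len(BASE_URL) - 1:]))
```

The readable form carries words a search engine can index, while the id-based form carries none; the lookup in resolve is what a URL mapping layer would do on the way back in.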
Finally, the layout functionality provided via SiteMesh in Grails means that we are able to embed the Flash SWF file into every page that gets generated. While we need to figure out a way to not have the SWF reload on every page, this means that all the pages that we make publicly visible could link to our RIA application with very little effort. On clients that have the Flash player installed, we could provide the rich interaction available in tools like Flex and Flash. On those that don’t, we can provide a functional page in HTML via Grails. More importantly, the search engines will be able to tag and index these generated pages correctly, making the content of our Flash applications visible in search results.
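One way to picture the layout step is a function that wraps every generated page body in the same shell, with the SWF embedded through an object element and the HTML content nested inside it as fallback. Browsers render the nested content when they cannot display the object, which matches the "HTML for clients without Flash" idea. This is a Python sketch of the idea rather than SiteMesh itself, and render_layout and the /app.swf path are assumptions for illustration.

```python
from html import escape


def render_layout(title, body_html, swf_url):
    """Wrap a page body in a shared layout that embeds the SWF.

    body_html is trusted markup produced by our own page generator; it is
    nested inside <object> so it shows up only when the SWF cannot be
    displayed, and it is always present in the source for crawlers.
    """
    return (
        "<html><head><title>{title}</title></head><body>"
        '<object type="application/x-shockwave-flash" data="{swf}">'
        "{body}"
        "</object>"
        "</body></html>"
    ).format(
        title=escape(title),
        swf=escape(swf_url, quote=True),
        body=body_html,
    )


if __name__ == "__main__":
    page = render_layout(
        "My favorite shoe",
        "<h1>My favorite shoe</h1><p>Red sneaker, size 10.</p>",
        "/app.swf",  # hypothetical location of the compiled Flex application
    )
    print(page)
```

Because every page goes through the same wrapper, adding the SWF to a new page costs nothing extra, which is the "very little effort" part. The reload-on-every-page problem mentioned above is not addressed by this sketch.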
Could this be? Does Grails provide all the technical pieces for us to build our super-SEO-friendly Rich Internet Application? Can we have robust and funky Flex applications that are also search engine friendly? All the pieces seem to be there, but we won’t know until we plug them together.
Many of these techniques could be ported to Ruby on Rails, Django and similar rapid development platforms, and once we figure them out, hopefully someone smarter than I am will work out how to do that.
This is all very exciting, but we have yet to figure out and implement the technical details, which we plan to do over the next week or so. Stay tuned for future blog posts about the success or failure of this technique.