A couple of months ago I discussed Web Analytics, and the application of its existing methods to mobile content, with one of my colleagues. It was quite a useless discussion for him - he did not know anything about mobile communications and mobile content and, in fact, he did not know much about Web Analytics either. He had mostly specialized in passive Web traffic monitoring for the last couple of years of his career and somehow believed that Web Analytics is almost the same thing, with just a couple of additional useless reports for the marketing guys :) However, after this discussion I really started thinking about the differences between typical Web Analytics methods and tools and the ones we would need to do the same kind of analysis for mobile content. Here I would like to present some of my thoughts.

First of all, what is this Web Analytics everyone is talking about these days? Briefly, Web Analytics is a set of methods for collecting, storing, analyzing and reporting Web traffic data. The goal of Web Analytics is to optimize the traffic in order to achieve various (business) goals. It is important to mention that Web traffic is essentially what you generate when surfing through a Web site using one of the modern browsers. Web Analytics does not require the complete traffic information and sometimes does not require traffic information at all. Think of it more as a set of events associated with the user navigating through your Web site - “user X used the search page”, then “user X added an item to the shopping cart”, then “user X accessed the checkout page”, and then the absence of further traffic from this user would be interpreted as the user leaving the site without completing the purchase.
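To make the "set of events" view a bit more concrete, here is a minimal sketch in Python. The event names, the sample data and the `funnel_counts` helper are all made up for illustration; this is not how any particular analytics product works, just one simple way to turn per-user events into a funnel report.

```python
from collections import defaultdict

# Hypothetical event stream: (user_id, event_name) in chronological order.
EVENTS = [
    ("X", "search"),
    ("X", "add_to_cart"),
    ("X", "checkout"),
    ("Y", "search"),
    ("Z", "search"),
    ("Z", "add_to_cart"),
]

# Hypothetical funnel: the steps a user is expected to pass through, in order.
FUNNEL = ["search", "add_to_cart", "checkout"]


def funnel_counts(events, funnel):
    """Count how many users reached each funnel step, in order."""
    progress = defaultdict(int)  # user_id -> number of funnel steps completed
    for user_id, name in events:
        step = progress[user_id]
        if step < len(funnel) and name == funnel[step]:
            progress[user_id] = step + 1
    return [sum(1 for done in progress.values() if done > i)
            for i in range(len(funnel))]


if __name__ == "__main__":
    for step, count in zip(FUNNEL, funnel_counts(EVENTS, FUNNEL)):
        print(step, count)
```

Note that nothing here depends on raw traffic: as long as the events can be collected somehow and attributed to a user, the analysis works.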

Thanks to the modern (and very old) Web standards, Web traffic is very easy to describe. The only protocol used is HTTP (over plain TCP or over TLS), and the content transferred to the user is primarily HTML. JavaScript running in the browser allows you to do complicated things on the user's side. A little bit of magic and you have a wonderful tool like Google Analytics that produces tons of useful information with minimal configuration effort, for free and almost seamlessly for the end user. This tool will work for an e-commerce Web site and for a personal blog without any changes. We will not go into the details; I assume you are familiar with the technology, since you are reading this post.

Now we are approaching the most interesting part. Why do the same methods not work for mobile content? And what is mobile content, first of all?

The mobile world is quite different from the modern desktop world with its powerful personal computers connected to multi-megabit-per-second networks. First of all, not all mobile data traffic is HTTP-based. And not all data being transferred is HTML. The mobile world is full of rich applications using their own data formats (not always something more or less standard like XML), quite frequently compressed and often very application-specific. Even if HTTP is used as the transport, the navigation is mostly controlled by the rich client application UI rather than by the links and buttons on a Web page. As a result, while the application is downloading the avatar image for one of the Instant Messaging contacts, the mobile user may be engaged in a chat with someone else. Just by looking at the data exchange between the handset and the service, even if you can decode the traffic, you will never be able to figure out the navigation path.

It may also be difficult to track unique users of rich client applications. First of all, some services may be session-less or anonymous, just like in the Web world. Secondly, even if the service supports a notion of session, the association between the user ID and the session ID is established somewhere in the initial handshake, and after that the session ID is hidden somewhere in the protocol. Unlike with HTML transferred via HTTP, there is an unlimited number of ways to embed the session ID in the protocol; it is application-specific. Thus, even with complete access to the traffic between the handset and the service destination, the user tracking method has to be specific to the application protocol.

In most cases it is also impossible to tag the data being transferred. Even if the rich client application is using HTTP as the transport, its HTTP implementation is usually quite limited. It does not support cookies (except the ones that the application itself requires). The application may not even be able to handle very basic HTTP features like redirects.

However, it is worth mentioning that sometimes mobile communications offer something you will not get on the Internet. For example, quite often when the HTTP traffic passes through a WAP gateway, additional information about the device and the subscriber gets inserted into the HTTP headers. Sometimes it can be the phone number of the subscriber, but this usually has to be configured on the WAP gateway. However, since in Web Analytics we are interested primarily in detecting unique users, any kind of value provided in the "x-up-subno" header would work. The biggest issue with using these values is that they can be easily spoofed. However, this is no worse than spoofing HTTP cookies, so I do not believe this is an issue.
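As a rough illustration, this is how a server-side collector might turn such a header into a visitor ID. It is only a sketch: the `visitor_id` function, the salt and the truncation to 16 characters are my own assumptions, and the only header name taken from the text is "x-up-subno" (other gateways may use other names, so the list is meant to be adjusted). The value is hashed because it may contain something as sensitive as a phone number, and for counting unique users an opaque but stable value is all we need.

```python
import hashlib

# Header names to check, in order of preference. Only "x-up-subno" is
# mentioned in the text; add gateway-specific names here as needed.
ID_HEADERS = ("x-up-subno",)


def visitor_id(headers, salt="example-salt"):
    """Return a stable, opaque visitor ID from gateway headers, or None.

    `headers` is a plain dict of HTTP request headers. `salt` is a
    hypothetical per-site secret used so that raw values are not stored.
    """
    # HTTP header names are case-insensitive, so normalize them first.
    lowered = {k.lower(): v.strip() for k, v in headers.items()}
    for name in ID_HEADERS:
        value = lowered.get(name)
        if value:
            digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
            return digest[:16]
    return None  # no usable header; fall back to cookies, IP address, etc.
```

The same subscriber always maps to the same ID, which is exactly the property needed for unique user detection.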

Of course, a significant part of the mobile traffic is generated by mobile browsers. And I believe that over time the portion of the traffic generated by rich client applications will decrease, as we get better devices with high resolution screens and browser software that gets as close as possible to what we have on the PC. However, at this point there are a number of completely different mobile browsers, ranging from the OpenWave browser that understands only WML and does not even support WMLScript, up to real Firefox running on Maemo-based devices. Most of these browsers are not the same as the ones we run on PCs; even if they do support JavaScript, the support is usually quite limited. The same applies to HTML, CSS and Flash. Even some basic things like the HTTP Referer header may not be supported, or this field may be filtered out. And the extreme case is Opera Mini. When you access a Web site using Opera Mini, a server at Opera actually downloads the contents of the page for you, renders it and sends it to the handset in the very compact OBML format. Because of this variety of Web browsers, content providers quite often host special versions of their Web sites optimized for mobile users.

Due to these numerous browser issues, the only method that will really work is server log parsing. And you will be lucky if cookies work for some of the users :)
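A minimal sketch of what such log parsing could look like is given below. It assumes the common Apache/NCSA "combined" log format, and the sample lines, paths and browser names are invented. Since cookies cannot be relied on, it falls back to treating each distinct pair of IP address and User-Agent as one visitor, which is a crude approximation: many subscribers may share one gateway IP address.

```python
import re
from collections import Counter

# Apache/NCSA "combined" log format, assumed here; real servers may differ.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Invented sample data, including one malformed line.
SAMPLE = [
    '10.0.0.1 - - [05/Aug/2009:10:00:00 +0000] "GET /m/index.wml HTTP/1.1" 200 512 "-" "ExampleBrowser/1.0"',
    '10.0.0.1 - - [05/Aug/2009:10:00:09 +0000] "GET /m/news.wml HTTP/1.1" 200 734 "-" "ExampleBrowser/1.0"',
    '10.0.0.2 - - [05/Aug/2009:10:01:00 +0000] "GET /m/index.wml HTTP/1.1" 200 512 "-" "OtherBrowser/2.0"',
    'garbage line that does not match',
]


def summarize(lines):
    """Return (page views per path, number of distinct ip+agent visitors)."""
    views = Counter()
    visitors = set()
    for line in lines:
        match = LOG_RE.match(line)
        if match is None:
            continue  # skip malformed lines instead of failing
        views[match["path"]] += 1
        visitors.add((match["ip"], match["agent"]))
    return views, len(visitors)


if __name__ == "__main__":
    page_views, unique_visitors = summarize(SAMPLE)
    print(dict(page_views), unique_visitors)
```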

Again, just like in the case of rich applications, there may be some interesting opportunities out there: additional HTTP headers or even the original client IP address (if provided) could be used to uniquely identify the users.

The following methods will most likely work for the mobile Web traffic:

  • Image tags - if the handset browser supports images (and they are not disabled because of the limited amount of traffic offered with the data plan!)
  • Redirecting the links - modifying the links on your pages so that they point to your analytics engine, which then redirects the user to the real content
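The link redirection method can be sketched in a few lines. Everything here is hypothetical: the tracker URL, the "to" and "from" parameter names and both function names are chosen just for the example. The idea is that the page generator wraps every link with `tracked_link`, and the analytics engine uses something like `resolve_click` to log the hit before answering with an HTTP redirect to the real content.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical analytics endpoint; not a real service.
TRACKER = "http://analytics.example.com/r"


def tracked_link(target_url, page_id):
    """Wrap target_url so that the click passes through the tracker first."""
    return TRACKER + "?" + urlencode({"to": target_url, "from": page_id})


def resolve_click(request_url, allowed_hosts):
    """Return (page_id, target) for a tracker hit, or None if not allowed.

    The tracker would log the hit and then answer with a redirect to
    `target`. Restricting targets to known hosts keeps the tracker from
    acting as an open redirector.
    """
    query = parse_qs(urlparse(request_url).query)
    target = query.get("to", [""])[0]
    page_id = query.get("from", [""])[0]
    if urlparse(target).hostname not in allowed_hosts:
        return None
    return page_id, target


if __name__ == "__main__":
    link = tracked_link("http://m.example.com/news?id=7", "home")
    print(link)
    print(resolve_click(link, {"m.example.com"}))
```

This works only with browsers that follow redirects properly, and it costs the user one extra round trip per click, which is noticeable on a slow mobile network.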

All of this being said, in the mobile world most of the standard data collection methods used for Web Analytics either do not work at all or are not reliable enough. Even if a method appears to be working, there is always a danger that it provides partial data and completely ignores an entire user segment. This is not acceptable for Web Analytics (unlike regular data loss, which is perfectly acceptable as long as it is not specific to a particular user segment).

Thus, I believe that it is quite hard or even impossible to come up with a solution for mobile content that is similar (in terms of simplicity and portability) to Google Analytics, unless you are targeting mobile Web users with top-of-the-line handsets (and data plans). But if that is the case, there are not too many differences between the regular Web users and the mobile Web users, except that the latter probably access a special mobile version of the Web site.

Web Analytics for rich client applications is definitely an interesting area. It is quite obvious that any Analytics solution that targets this kind of traffic either has to be passive and highly customizable (to be specific to the application protocol), or the mobile application has to be modified in order to collect the client-side events and pass this information to the Analytics server (just like instrumented Flash applications do).
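To give an idea of the second option, here is a minimal sketch of client-side instrumentation. A real handset application would of course be written in whatever language the platform offers; Python is used here only to show the shape of it. The `EventBuffer` class, its parameters and the JSON payload layout are assumptions made for the example. Events are collected in memory and handed over in batches, so that the analytics traffic does not eat too much of the user's data plan.

```python
import json
import time


class EventBuffer:
    """Collects client-side events and hands them to `send` in batches."""

    def __init__(self, send, batch_size=3, clock=time.time):
        self._send = send              # callable taking one JSON string
        self._batch_size = batch_size  # events per batch before auto-flush
        self._clock = clock            # injectable for testing
        self._events = []

    def record(self, user_id, name):
        """Remember one event; flush automatically when the batch is full."""
        self._events.append(
            {"user": user_id, "event": name, "ts": self._clock()}
        )
        if len(self._events) >= self._batch_size:
            self.flush()

    def flush(self):
        """Send all pending events as one JSON array, if there are any."""
        if self._events:
            self._send(json.dumps(self._events))
            self._events = []


if __name__ == "__main__":
    # Here `print` stands in for the code that would actually post the
    # payload to the Analytics server.
    buffer = EventBuffer(print, batch_size=2)
    buffer.record("X", "open_chat")
    buffer.record("X", "send_message")
```

With this approach the navigation path comes directly from the application UI, so there is no need to guess it from the traffic.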


05 August 2009