That JUST happened!

28/05/2008

Navigating the "NOISE" of the Internet

VC Stu Phillips has a great post on his blog talking about the ever-growing noise of the Internet.  The basic premise is that every day new services are launched online and as a result the overall universe of possible information to consume expands at an exponential rate.

One major side effect of this situation is that the pool of information that is personally relevant and ultimately useful to you becomes a smaller and smaller percentage of the available universe. 

One very small real-world example is the Facebook news feed.  At the beginning, the news feed only showed you basic profile and status updates of your friends, which was fairly relevant to you.  Then they added advertising within the feed, which was relevant, but marginally useful (especially given the purportedly abysmal clickthrough rates of Facebook ads).  Now add on top of that a never-ending slew of “application spam” from all of the newest Funwalls, Superpokes, and “Which Spice Girl are you most like?” Facebook Apps.  As much as I want to know my friend Tim is totally like Scary Spice, I could hardly consider that relevant and/or useful in any way, shape or form.  And this is in a WALLED GARDEN!!!  Think about all the new services, sites, etc. that are launching every day online and the flood of new available information that comes with them.  Information OVERLOAD!

Shouldn’t search solve this problem?

Stu’s reaction:

“Search only gets you so far – it’s good if you are looking for the same information as everyone else or it’s already been indexed but if you are looking for new or breaking information that is relevant to you – you have a much bigger challenge than you did in the past. “

Now there have been attempts by search engines to track your personal search history (in addition to other behavioral metrics), most notably Google’s personal search history, as a means of understanding what exactly you might be looking for/interested in as an individual.  In fact, Microsoft has been researching this very topic for a while (see MSFT paper from 2005).  Sure, this can help make your search results somewhat more accurate, but the truth is that it’s really only going to get us so far.  We need to do better.

What does that mean?  Stu uses the engineering metaphor of a Signal-to-Noise ratio exhibited by a radio.  

“Think of yourself as a high performance radio receiver…

The challenge in building a receiver is to make sure that you have a high enough signal-to-noise ratio to get the job done – detect the faintest signal in which you are interested from the ambient noise.

Part of the system engineering choices of a high quality receiver is balancing the capture and amplification of the signal you want while at the same time minimizing the sources of noise in the receiver. The more signal you can capture and amplify while minimizing system noise, the better the signal-to-noise ratio of the receiver and the clearer the received signal.

One of the other important aspects of receiver design is minimizing spurious signals – signals that are a by-product of how the receiver has been implemented – these spurious signals are sources of interference and noise that mask the signal that really has your interest.”

What we need is a better “radio”.  

My friends and I have been discussing this phenomenon at length over the past couple of years and we are thoroughly convinced there is a HUGE opportunity in the market for a product/company that figures out a transparent method of making the Internet more personally relevant and useful for individual users.

One major point we seem to debate over and over is whether the problem can be solved on either the server or the client side alone.  On the server side, solutions could include behavioral tracking technologies in the form of recommendation engines (see Aggregate Knowledge) or cross-site tracking tags (see Tacoda) or even something fairly narrow in scope, but with many data points, such as Google’s personal search history.  On the client side we’ve seen all sorts of Firefox plugins from players like StumbleUpon, me.dium, atten.tv (now bunk I believe), etc.

For a short while, I was convinced that privacy concerns and industry adoption were going to force a client side solution.  One friend and I even went as far as to scope out and build an experimental test product to collect data points and see if we could determine causal relationships between surfing habits and relevance of new information.

The data does show that an individual’s past browsing/usage history serves as a reasonable indicator of future interest in yet-to-be-consumed information.  What was remarkable is what happened when we linked data from multiple individuals who were friends on a primitive “social graph” of sorts and tried to glean insights from their cumulative behavior.  The initial data suggests that people you know and trust tend to influence, either directly or indirectly, what you might look at online.
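To make the idea a bit more concrete, here is a minimal Python sketch of how one might blend a person's own browsing history with that of their friends.  To be clear, this is not the experimental product described above; the function name, the friend_weight knob, and the sample topics are all made up for illustration.

```python
from collections import Counter

def interest_scores(own_history, friends_histories, friend_weight=0.5):
    """Blend a user's own topic shares with the average of their friends'.

    own_history: list of topic labels the user has viewed.
    friends_histories: list of such lists, one per friend.
    friend_weight: hypothetical knob (0..1) for how much friends count.
    """
    own = Counter(own_history)
    own_total = sum(own.values()) or 1
    # Start with the user's own share of attention per topic.
    scores = {t: (1 - friend_weight) * c / own_total for t, c in own.items()}

    # Add each friend's share, split evenly across all friends.
    for history in friends_histories:
        counts = Counter(history)
        total = sum(counts.values()) or 1
        for topic, c in counts.items():
            share = friend_weight * (c / total) / len(friends_histories)
            scores[topic] = scores.get(topic, 0.0) + share
    return scores

# Invented sample data: one user and two friends.
me = ["baseball", "baseball", "movies", "cars"]
friends = [["cars", "cars"], ["movies", "politics"]]

scores = interest_scores(me, friends)
top = max(scores, key=scores.get)
print(top, scores)
```

With this toy data, the friend who looks at nothing but cars is enough to push "cars" past "baseball" for a user who mostly reads baseball, which is roughly the kind of social influence the data hinted at.  A real system would need far better signals than raw topic counts.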

This actually shouldn’t be much of a surprise to anyone.  Think about the last 5 hilarious videos you watched online.  How did you find them?  How about the last 5 news articles you found genuinely personally relevant?  The smart money says that chances are you were emailed, instant messaged, twittr’d or sent the link by a friend/family member vs. finding it by searching randomly on your own.  I would also note that a mediocre proxy for this might be found in tech-savvy users who manage to subscribe to and read a large number (probably > 50) of RSS feeds on a regular basis.  Google Reader has an OK idea of what I’m interested in and, based upon my subscription habits and those of people like me, it manages to recommend new feeds I’d probably be interested in with what I would describe as a 2-5% hit rate.

So what IS the answer?

I am now fairly convinced that the ultimate winner will be some combination of both server side and client side technologies since there is specific value in having insight into broader trends as well as personal data points.  Could this be Firefox 4.0?  Imagine a browser that understands what you look at on the web, not just in terms of the URLs you visit, but the actual context of the pages you are looking at.  Oh yeah, the same browser also understands broader behavioral trends captured in a vast server side tracking database.  Now instead of visiting a news site and perusing for the things you like (let’s say Yankees scores, movie reviews, and Middle Eastern current events), which may be below the fold or buried in a subsection, you are presented immediately with those items front and center as soon as the site renders for the first time.  Let’s also say that I have been visiting quite a few car manufacturer sites recently and the browser’s model decides to pull some auto reviews and prominently feature them as well.  Now we’re cooking with grease!
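As a rough illustration of the "front and center" idea, here is a tiny Python sketch that reorders the sections of a page by how well they match a user's interests.  It assumes the browser already has a topic label for each section and an interest score per topic; reorder_sections, the headlines, and the scores are all hypothetical.

```python
def reorder_sections(sections, interests):
    """Sort page sections so those matching the user's interests come first.

    sections: list of (title, topic) pairs in the site's original order.
    interests: dict mapping topic -> score (higher means more interesting).
    Python's sort is stable, so sections with equal scores keep site order.
    """
    return sorted(sections, key=lambda s: interests.get(s[1], 0.0), reverse=True)

# Invented front page, in the order the site would normally render it.
front_page = [
    ("World markets slide", "finance"),
    ("Yankees win in extras", "baseball"),
    ("Celebrity gossip roundup", "celebrity"),
    ("Sedan road test", "cars"),
]

# Invented interest profile, e.g. the output of a model of the user's habits.
interests = {"baseball": 0.4, "cars": 0.3, "movies": 0.2}

personalized = reorder_sections(front_page, interests)
for title, topic in personalized:
    print(title, "|", topic)
```

The hard parts, of course, are everything this sketch takes for granted: figuring out what each section is about and building an interest profile worth trusting.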

The movement towards enabling the broader semantic web will ensure that the web pages of the future are structured so that browsers can understand their content in a deep contextual way and use that data to build more accurate models of users’ preferences.  In my opinion, this evolution of how pages are authored will be critical to fully maximizing the potential of the new personally relevant Internet.  That said, we should not expect it to be available or even completely necessary for the first successful products to launch in the space.

Contextual relevance isn’t just for advertisers anymore.

So who wants to build a better radio?  Any takers?
