IMPROVING BIT BY BIT
BusinessWeek has an article on Microsoft potentially taking steps to add "social search" features to its service à la Yahoo! and Google. The opening sentence in the article promises good things to come:
"Software giant Microsoft is taking its MSN Search division on a comeback tour."
As a number of folks on memeorandum are pointing out, Microsoft may be playing catch-up on the social search front with Google and Yahoo!
However, there's one area where Microsoft may be ahead of both Google and Yahoo!: it has a search platform built on a 64-bit architecture. And if this architecture turns out to be a competitive differentiator, it may be Google and Yahoo! who have to play catch-up to Microsoft.
A few days ago, John Battelle featured an interview on his blog with Gary Flake, formerly of Yahoo! and Overture and now at Microsoft's Search Research Labs, in which this possible differentiating advantage came up.
The discussion was in response to a key question that Battelle asked. Here's the whole segment for your convenience:
"Say more about 64 bit architecture. Why is it different? What does it allow you to do with search you can't with 32 bits? How easy or hard might it be for Google and others to migrate to a similar architecture?
"64 bits" refers to the native address space of the CPUs that we use in our architecture. Having the entire architecture be 64 bit carries several implications that are quite subtle and mostly of interest only to engineers, but I'll take a shot at highlighting some of the important differences.
First, all search architectures tier their data stores so that the data can reside on relatively slow disks or fast RAM. For performance purposes, you want the most frequently accessed data to be on the fastest store (RAM). A 64 bit system can have vastly more RAM than a 32 bit system, which means that we can have a higher ratio of RAM to disk space (per CPU), versus others in the industry.
Second, we've only scratched the surface on the possibilities for using advanced hyperlink analysis for web relevance. There are a class of algorithms (like Pagerank) that work well in either 32 or 64 bit systems (technically, because the algorithms can be realized via a linear sweep of hyperlinks). However, there is an even richer class of algorithms that can only be efficiently built on a 64 bit system because you essentially have to have a significant part of the web stored close to a single CPU.
So, 64 bit systems pave the way for entirely new forms of relevance that look at how pages relate to one another. Finally, with the bigger address space, we may also see 64 bit systems critical to realizing more sophisticated relevance algorithms that analyze a result set in the aggregate as a final step. In this scenario, you may have a huge amount of data in memory, which 64 bit systems can do just fine. So, in short, it's a real big deal and it will be more important as the entire industry matures.
Google and others can certainly start to migrate towards 64 bit. In fact, everyone will have to at some point. However, there is a huge cost to make the jump, in terms of hardware, porting existing code, but especially in revamping algorithms to make the most use of the new resources. We took our medicine early because we could (we didn't have a legacy system as baggage). This is an example of us taking a longer-term view, in the sense that we increased our startup costs, but now have a more strategic technology platform."
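To make the "linear sweep of hyperlinks" remark a bit more concrete, here's a rough sketch of a simplified PageRank-style computation. This is my own illustration, not anything from Microsoft's, Google's, or Yahoo!'s actual code; the function name, the damping value of 0.85, and the tiny edge list are all assumptions made for the example. The point is only that each iteration reads the link list once, in order, so the links themselves could be streamed from disk while just the per-page score arrays sit in memory.

```python
def pagerank_sweep(edges, num_pages, damping=0.85, iterations=20):
    """Simplified PageRank: every iteration is one sequential pass over `edges`.

    `edges` is a list of (source_page, destination_page) pairs. All names and
    defaults here are illustrative assumptions, not a real search engine's API.
    """
    # Count outgoing links per page (also a single pass over the edges).
    out_degree = [0] * num_pages
    for src, _dst in edges:
        out_degree[src] += 1

    rank = [1.0 / num_pages] * num_pages
    for _ in range(iterations):
        # Pages with no outgoing links spread their score evenly over all pages.
        dangling = sum(rank[p] for p in range(num_pages) if out_degree[p] == 0)
        base = (1.0 - damping) / num_pages + damping * dangling / num_pages
        new_rank = [base] * num_pages

        # The "linear sweep": read each hyperlink once, in whatever order
        # it is stored, and pass a share of the source's score along it.
        for src, dst in edges:
            new_rank[dst] += damping * rank[src] / out_degree[src]

        rank = new_rank
    return rank


# Three pages linking in a ring: every page ends up with the same score.
ring = [(0, 1), (1, 2), (2, 0)]
print(pagerank_sweep(ring, num_pages=3))
```

Algorithms that need to hop around the link graph in arbitrary order don't fit this streaming pattern, which is presumably where having much of the graph in RAM next to one CPU starts to matter.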
The question that I have for you all, especially those of you with a better understanding of the architectural directions of Google and Yahoo! Search, is
"How big of a deal is this for the next generation of search technologies?"
I'm also trying to figure out how expensive this migration would be for Google and Yahoo!, both in terms of infrastructure and transition costs.
Gary makes it sound like it's pretty heavy lifting.
I'd love to get a second opinion from the infinite wisdom of the crowds. Thanks.
The massive address space you get from going 64-bit is breathtaking. However, moving from 32-bit to 64-bit is often a huge deal: infrastructure, transitions, code, testing (lots of testing).
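For a rough sense of scale, here's the back-of-the-envelope arithmetic as a small sketch. It only counts how many bytes a pointer of a given width can name; the function name is made up for the example, and real CPUs and operating systems typically expose fewer usable address bits than the full pointer width.

```python
GIB = 2 ** 30  # bytes in one gibibyte
EIB = 2 ** 60  # bytes in one exbibyte


def addressable_bytes(pointer_bits):
    """Number of distinct byte addresses a pointer of this width can name."""
    return 2 ** pointer_bits


print(addressable_bytes(32) // GIB, "GiB")  # 4 GiB
print(addressable_bytes(64) // EIB, "EiB")  # 16 EiB
```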
I've been fixing stuff so that it works on Alpha for years, and there are some things that end up requiring complete rewrites. Non-trivial stuff.
In basic, basic terms, the biggest problem that gets introduced is that all of a sudden your pointers are bigger than standard integers (usually; sometimes the integers are bigger too, but that causes other problems), and lots of optimizations and "magic" code go straight out the window.
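As a hedged illustration of that pointer-versus-integer problem, the sketch below simulates what happens when code stashes an address in a 32-bit integer. It uses plain bit masking to mimic the truncation rather than real pointers, and the addresses and function names are made up for the example.

```python
INT32_MASK = 0xFFFFFFFF  # keeps only the low 32 bits


def store_pointer_in_int32(address):
    """Simulate squeezing an address into a 32-bit unsigned integer.

    Any bits above the low 32 are silently dropped, which is roughly what
    old "pointer fits in an int" code does on a 64-bit platform.
    """
    return address & INT32_MASK


def survives_round_trip(address):
    """True if the address comes back unchanged after the 32-bit detour."""
    return store_pointer_in_int32(address) == address


low_address = 0x7FFF1000       # hypothetical address that fits in 32 bits
high_address = 0x7F3A7FFF1000  # hypothetical address needing more than 32 bits

print(survives_round_trip(low_address))            # True
print(survives_round_trip(high_address))           # False
print(hex(store_pointer_in_int32(high_address)))   # 0x7fff1000
```

The nasty part is that the truncated value still looks like a perfectly plausible address, so this kind of bug tends to show up as a crash somewhere far away from the code that caused it.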
Posted by: candice | Sunday, April 16, 2006 at 12:27 AM