Hadoop in Practice
I first encountered Hadoop in the fall of 2008 when I was working on an internet
crawl and analysis project at Verisign. My team was making discoveries similar to those
that Doug Cutting and others at Nutch had made several years earlier regarding how
to efficiently store and manage terabytes of crawled and analyzed data. At the time, we
were getting by with our home-grown distributed system, but it couldn’t handle
the influx of a new data stream, or the requirement to join that stream with our
crawl data, within the required timelines....