I am not able to understand the real essence of Hadoop. If I have enough resources to buy a supercomputer that can process petabytes of data, why would I need a Hadoop infrastructure to manage such huge data?
Having enough resources often makes us dumb. Let me give you an example (don't worry, it involves Hadoop) which will make it clear. The cost of Cray's cheapest supercomputer, the XC30-AC, is $500,000 (IIRC). Now, what is the cost of a decent computer with decent RAM, CPU and disk? How much would you need to spend to buy a bunch of them and use their power collectively? How much space and how many resources do you need to house and manage these machines? How difficult is it to find folks with decent programming skills who can write MapReduce (MR) jobs for you?
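To put rough numbers on that comparison, here is a back-of-the-envelope sketch. The $500,000 figure is the one quoted above (from memory); the per-node price is purely a placeholder I made up for illustration, so plug in whatever a decent machine actually costs you. It also ignores space, power, networking and staffing, which you would have to budget for separately.

```python
def commodity_nodes_for_budget(budget_usd: int, node_price_usd: int) -> int:
    """Return how many whole commodity machines fit in the given budget."""
    if node_price_usd <= 0:
        raise ValueError("node_price_usd must be positive")
    return budget_usd // node_price_usd


# Price quoted in the answer above (recalled from memory, not verified).
SUPERCOMPUTER_PRICE_USD = 500_000

# Hypothetical price of one decent commodity machine; an assumption, not a quote.
ASSUMED_NODE_PRICE_USD = 1_000

nodes = commodity_nodes_for_budget(SUPERCOMPUTER_PRICE_USD, ASSUMED_NODE_PRICE_USD)
print(f"Same budget buys about {nodes} commodity nodes at the assumed price")
```

The useful part is not the exact count but that you can start with a handful of machines and add more as the data grows, instead of paying for everything upfront.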
These are just a few things. Hadoop is open source. Use it and tweak it as you wish. Get awesome support through the mailing list for free. Not only support, but suggestions as well. I hope you get the point.
Utilizing your resources wisely is more important than just having them.