Impala looks like an interesting tool. I’m not sure when I’ll have a chance to dig into it first hand.

A couple weeks ago Cloudera released Impala , with the promise of Hive like functionality with much lower latency. I decided to give Hadoop-based SQL -like technologies another shot.

We set up a couple of machines in a cluster, pulled together a few sample datasets, and ran a few benchmarks comparing Impala, Hive, and MySQL, and the results were encouraging for Impala.