November 7th, 2013

We ran into a problem where the map tasks generate a large amount of intermediate results, which degrades our performance considerably. After searching for solutions online, we decided to set up our Hadoop cluster with the LZO module to compress both map and reduce output. A couple of good resources and tutorials for setting up LZO on Hadoop are listed below:


Details of setting up the LZO package:

  1. Download the lzo package: git clone git://
  2. Install the required tools: lzo-devel, ant, java and gcc
  3. Build it with ant: ant clean compile-native tar
  4. Copy the built library to ~/hadoop-1.2.1/lib/native/Linux-amd64-64/ on all nodes (master and slaves)
  5. Add the configuration to both core-site.xml and mapred-site.xml
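
For step 5, here is a rough sketch of what the added properties might look like on Hadoop 1.2.1. This is not our exact configuration: it assumes the codec classes shipped with the hadoop-lzo package (com.hadoop.compression.lzo.LzoCodec and LzopCodec) and the Hadoop 1.x property names, and the choice of codec for each stage is only one reasonable option. The properties go inside the existing <configuration> element of each file.

```xml
<!-- core-site.xml: register the LZO codecs next to the built-in ones.
     Assumes the hadoop-lzo jar and native library are already installed. -->
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
</property>
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>

<!-- mapred-site.xml: compress the intermediate map output ... -->
<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>
<property>
  <name>mapred.map.output.compression.codec</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>

<!-- ... and the final job (reduce) output. Codec choice here is an assumption. -->
<property>
  <name>mapred.output.compress</name>
  <value>true</value>
</property>
<property>
  <name>mapred.output.compression.codec</name>
  <value>com.hadoop.compression.lzo.LzopCodec</value>
</property>
```

LzoCodec writes a raw LZO stream, which is enough for intermediate map output that only Hadoop reads back. LzopCodec writes files in the lzop container format, so the final output can also be read with the lzop command line tool.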




