Note (2025): This article documents behaviour from the CDH 4.x/5.x Oozie + MapReduce stack. Oozie is legacy and many teams migrate to Apache Airflow, Dagster, Argo or cloud-native schedulers, but troubleshooting old clusters still requires understanding these patterns. The solutions here apply to Hadoop MR/LZO deployments still in maintenance or migration mode.
Symptom: Oozie fails after enabling LZO compression
When you add LZO codecs to core-site.xml, Oozie may suddenly fail to start any MapReduce job. Typical configuration looks like:
<property>
<name>io.compression.codecs</name>
<value>
org.apache.hadoop.io.compress.GzipCodec,
org.apache.hadoop.io.compress.DefaultCodec,
com.hadoop.compression.lzo.LzoCodec,
com.hadoop.compression.lzo.LzopCodec,
org.apache.hadoop.io.compress.BZip2Codec
</value>
</property>
<property>
<name>io.compression.codec.lzo.class</name>
<value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
After restarting services, Oozie reports:
java.lang.ClassNotFoundException:
Class com.hadoop.compression.lzo.LzoCodec not found
Root cause
Even though LZO is installed on Hadoop nodes, Oozie runs inside its own server JVM and classpath. If hadoop-lzo.jar is missing from /var/lib/oozie/ (or the directory your Oozie installation loads from), Oozie cannot load the LZO codec classes and refuses to start jobs.
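A quick way to confirm this diagnosis is to check whether any hadoop-lzo JAR is actually visible in the Oozie library directory. A minimal sketch — the /var/lib/oozie path is the typical CDH default and the function name is illustrative; adjust the path for your installation:

```shell
# Sketch: report whether a hadoop-lzo JAR is present in a given
# Oozie library directory. /var/lib/oozie is the usual CDH default.
check_lzo_jar() {
    dir="$1"
    # The glob matches versioned jars (hadoop-lzo-0.4.x.jar) as well
    # as an unversioned symlink (hadoop-lzo.jar).
    if ls "$dir"/hadoop-lzo*.jar >/dev/null 2>&1; then
        echo "hadoop-lzo jar present in $dir"
    else
        echo "hadoop-lzo jar MISSING from $dir"
    fi
}

check_lzo_jar /var/lib/oozie
```

If this reports the JAR as missing while `ls /usr/lib/hadoop/lib/` shows it, you are looking at exactly the classpath split described above: Hadoop can load the codec, but the Oozie server JVM cannot.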
Fix: Install the LZO JAR on the Oozie server
Copy (or symlink) the hadoop-lzo.jar into the Oozie library directory and restart Oozie:
cp /usr/lib/hadoop/lib/hadoop-lzo.jar /var/lib/oozie/
service oozie restart
Once Oozie can load com.hadoop.compression.lzo.LzoCodec, MapReduce jobs using LZO compression start normally.
The second common issue: missing or incorrect ShareLib
Even with the LZO JAR fixed, Oozie may still fail if the sharelib is missing or has wrong permissions. The sharelib provides the job launcher classpath for Oozie workflows.
Typical setup:
# Create Oozie home in HDFS
sudo -u hdfs hadoop fs -mkdir /user/oozie
sudo -u hdfs hadoop fs -chown oozie:oozie /user/oozie
# Extract the sharelib
mkdir /tmp/share
cd /tmp/share
tar xvfz /usr/lib/oozie/oozie-sharelib.tar.gz
# Upload it to HDFS
sudo -u oozie hadoop fs -put share /user/oozie/share
After uploading, restart Oozie or run:
oozie admin -sharelibupdate
Without a valid sharelib, Oozie cannot assemble the runtime classpath for MapReduce actions and will fail even if the LZO JAR is present on the Oozie server.
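To confirm what Oozie actually sees, you can inspect the sharelib from both sides. The HDFS listing works on any version; the `oozie admin -shareliblist` subcommand is only available on Oozie 4.x+ (CDH 5), and the server URL below is the default — adjust both to your cluster:

```
# List the sharelib contents in HDFS as the oozie user
sudo -u oozie hadoop fs -ls /user/oozie/share/lib

# On Oozie 4.x+ (CDH 5), ask the server which sharelibs it has loaded
oozie admin -shareliblist -oozie http://localhost:11000/oozie
```

If the HDFS listing looks right but the server reports nothing, suspect permissions (the directory must be readable by the oozie user) or a stale server that still needs `-sharelibupdate` or a restart.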
Uber JAR support (CDH 4.1+)
Starting with CDH 4.1, Oozie supports uber-jars for MapReduce actions. An uber-jar here is not a "fat" JAR with all dependency classes merged into one namespace; it is a JAR that carries its dependent JARs inside an internal lib/ directory, which Oozie unpacks and distributes when launching the job.
To enable it globally, set the following in oozie-site.xml:
<property>
<name>oozie.action.mapreduce.uber.jar.enable</name>
<value>true</value>
</property>
Once enabled, workflow authors can tag a MapReduce action with:
oozie.mapreduce.uber.jar = <path-to-jar>
This tells Oozie that the provided JAR should be treated as an uber-jar, and Oozie will expand and distribute its libraries accordingly when launching the job.
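In a workflow definition, the property goes inside the map-reduce action's configuration block. A hypothetical action might look like this — the action name, jar path, and `${jobTracker}`/`${nameNode}` parameters are illustrative placeholders, not values from your cluster:

```
<action name="mr-with-uber-jar">
    <map-reduce>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
                <name>oozie.mapreduce.uber.jar</name>
                <!-- HDFS path to a JAR whose internal lib/ holds its dependencies -->
                <value>${nameNode}/user/example/lib/my-uber.jar</value>
            </property>
        </configuration>
    </map-reduce>
    <ok to="end"/>
    <error to="fail"/>
</action>
```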
Summary
- LZO in core-site.xml often breaks Oozie because Oozie's server classpath cannot see hadoop-lzo.jar.
- Fix by copying hadoop-lzo.jar into /var/lib/oozie/.
- Ensure /user/oozie/share exists and contains a valid sharelib.
- Enable uber-jar support if your workflows depend on it.
With these steps, Oozie resumes launching MapReduce jobs correctly even when LZO compression is enabled cluster-wide.
If you need help with distributed systems, backend engineering, or data platforms, check my Services.