Since Cloudera doesn't support Tez in their Distribution right now (but it'll come, I'm pretty confident), we experimented with Apache Tez and CDH 5.4 a bit. To use Tez with CDH isn't so hard - and it works quite well. And our ETL and Hive jobs finished around 30 - 50% faster. Anyway, here the blueprint. We use CentOS 6.7 with Epel Repo. 1. Install maven 3.2.5 wget http://archive.apache.org/dist/maven/maven-3/3.2.5/binaries/apache-maven-3.2.5-bin.tar.gz tar xvfz apache-maven-3.2.5-bin.tar.gz -C /usr/local/ cd /usr/local/ ln -s apache-maven-3.2.5 maven => Compiling Tez with protobuf worked only with 3.2.5 in my case 1.1 Install 8_u40 JDK mkdir development && cd development (thats my dev-root) wget --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie" "http://download.oracle.com/otn-pub/java/jdk/8u40-b26/jdk-8u40-linux-x64.tar.gz" tar xvfz jdk...
Hey, I'm Alex. I founded X-Warp, Infinimesh, Infinite Devices, Scalytics and worked with Cloudera, E.On, Google, Evariant, and had the incredible luck to build products with outstanding people in my life, across the globe.