novatechflow


Hadoop based SQL engines

Apache Hadoop is increasingly moving into the focus of business-critical architectures and applications. Naturally, SQL-based solutions are the first to be considered, but the market is evolving and new tools keep appearing, many of them going unnoticed. Below is an overview of currently available Hadoop-based SQL technologies. The must-haves are: open source (various contributors), low-latency querying possible, support for CRUD (mostly!), and statements like CREATE, INSERT INTO, SELECT * FROM (limit..), UPDATE Table SET A1=2 WHERE, DELETE, and DROP TABLE.

Apache Hive (SQL-like, with interactive SQL (Stinger))
Apache Drill (ANSI SQL support)
Apache Spark (Spark SQL, queries only, add data via Hive, RDD or Parquet)
Apache Phoenix (built atop Apache HBase, lacks full transaction support, relational operators and some built-in functions)
Presto from Facebook (can query Hive, Cassandra, relational DBs, etc. Doesn't seem to be designed for low-latency responses across...
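The must-have statements named above can be turned into a small smoke test to run against any candidate engine. The sketch below is only an illustration: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in, not one of the Hadoop engines listed, and the table name t1, its columns, and the function name run_crud_smoke_test are hypothetical. Against a real engine the connection, the parameter placeholder style, and the exact SQL dialect would differ (some of the engines above do not accept every statement).

```python
import sqlite3


def run_crud_smoke_test(conn):
    """Run the must-have statements in order and report what the queries saw.

    Assumption: `conn` is a DB-API 2.0 connection that uses `?` placeholders.
    SQLite is only a stand-in here; table `t1` is a hypothetical example.
    """
    cur = conn.cursor()

    # CREATE
    cur.execute("CREATE TABLE t1 (id INTEGER PRIMARY KEY, a1 INTEGER)")

    # INSERT INTO
    cur.executemany(
        "INSERT INTO t1 (id, a1) VALUES (?, ?)",
        [(1, 10), (2, 20), (3, 30)],
    )

    # SELECT * FROM ... LIMIT
    cur.execute("SELECT * FROM t1 ORDER BY id LIMIT 2")
    first_two = cur.fetchall()

    # UPDATE ... SET ... WHERE
    cur.execute("UPDATE t1 SET a1 = 2 WHERE id = 1")
    cur.execute("SELECT a1 FROM t1 WHERE id = 1")
    updated = cur.fetchone()[0]

    # DELETE
    cur.execute("DELETE FROM t1 WHERE id = 3")
    cur.execute("SELECT COUNT(*) FROM t1")
    remaining = cur.fetchone()[0]

    # DROP TABLE
    cur.execute("DROP TABLE t1")
    conn.commit()

    return {"first_two": first_two, "updated": updated, "remaining": remaining}


result = run_crud_smoke_test(sqlite3.connect(":memory:"))
print(result)
```

An engine that raises an error on one of these steps, for example on UPDATE or DELETE, fails the corresponding CRUD requirement, which is what the "(mostly!)" caveat in the list of must-haves allows for.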