Setup

Elasticsearch for Apache Hadoop is an open-source, stand-alone, self-contained, small library that allows Hadoop jobs (whether using Map/Reduce or libraries built upon it such as Hive, Pig or Cascading) to interact with Elasticsearch. Data flows bi-directionally, so applications can transparently leverage the Elasticsearch engine to enrich their capabilities and improve performance. Elasticsearch for Apache Hadoop offers first-class support for vanilla Map/Reduce, Cascading, Pig and Hive, so that using Elasticsearch feels like using a resource within the Hadoop cluster.

While the official name of the project is Elasticsearch for Apache Hadoop, throughout the documentation the term elasticsearch-hadoop is used instead to improve readability.

Tip

If you are looking for the Elasticsearch HDFS Snapshot/Restore plugin (a separate project), please refer to its home page.