Search your Hadoop data and get real-time results

Deep API integration makes searching data in Hadoop easy

  • Real-time search for Hadoop data
  • Works with popular libraries
  • Simple to use
  • Documentation

Need Support? Don't forget that we've got your back.

Great fit for “Big Data” and Hadoop

Elasticsearch’s distributed nature allows it to search and store vast amounts of information in near real time.

Rich Language to Ask Better Questions

Elasticsearch provides a rich language for users to ask better questions in order to get clearer answers, significantly faster.
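As a rough illustration of what such a question can look like, the sketch below builds a Query DSL request body that combines a full-text match with a time-range filter. The field names (message, @timestamp) and the helper build_search_body are hypothetical, and the bool/filter layout assumes a recent Elasticsearch version; check it against the release you run.

```python
import json


def build_search_body(text, field="message", since="now-1h", time_field="@timestamp"):
    """Build a Query DSL body: full-text match on `field`, limited to recent documents.

    Field names are placeholders for illustration, not part of any real mapping.
    """
    return {
        "query": {
            "bool": {
                # Scored full-text clause: documents must match the text.
                "must": [{"match": {field: text}}],
                # Non-scoring clause: only keep documents newer than `since`.
                "filter": [{"range": {time_field: {"gte": since}}}],
            }
        }
    }


body = build_search_body("connection timeout")
print(json.dumps(body, indent=2))
```

The resulting JSON is what would be sent as the body of a search request; the same structure can be reused from a Hadoop job wherever a query is accepted.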

Leave Search to the Search Engine and Focus on Data Transformation

Elasticsearch is natively integrated with Hadoop, so there is no gap for the user to bridge. We provide a dedicated InputFormat and OutputFormat for vanilla Map/Reduce, Taps for reading and writing data in Cascading, and Storages for Pig and Hive, so you can access Elasticsearch just as if the data were in HDFS.
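To make the Map/Reduce integration a little more concrete, here is a minimal sketch of the settings a job would typically carry to read from Elasticsearch. The class names and the es.nodes, es.resource and es.query keys are those used by the elasticsearch-hadoop project as far as we are aware; the helper es_read_conf, the host and the index name are assumptions made up for this example, so verify everything against the connector version you use.

```python
# Format classes shipped by elasticsearch-hadoop (set on the Hadoop job).
ES_INPUT_FORMAT = "org.elasticsearch.hadoop.mr.EsInputFormat"
ES_OUTPUT_FORMAT = "org.elasticsearch.hadoop.mr.EsOutputFormat"


def es_read_conf(nodes, resource, query=None):
    """Return job configuration properties for reading from Elasticsearch.

    nodes:    Elasticsearch host(s) to connect to (hypothetical value below)
    resource: index (and type) to read from (hypothetical value below)
    query:    optional query; omitted means "read everything"
    """
    conf = {"es.nodes": nodes, "es.resource": resource}
    if query is not None:
        conf["es.query"] = query
    return conf


conf = es_read_conf("localhost:9200", "logs/event", "?q=error")
for key in sorted(conf):
    print(f"{key}={conf[key]}")
```

These key/value pairs would be copied into the job configuration alongside the input format class; the job itself then consumes documents much as it would consume records from HDFS.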

Scale Your Hadoop Cluster Alongside Elasticsearch

The distributed nature of the Map/Reduce model fits well on top of Elasticsearch: the number of Map/Reduce tasks can be correlated with the number of Elasticsearch shards, so both systems scale together.
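The relationship between shards and tasks can be pictured with a small model. This is a conceptual sketch only, not the connector's code: the Shard type, plan_splits and the node names are invented for illustration, and it assumes the simplest case of one input split per shard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shard:
    index: str
    shard_id: int
    node: str  # host that currently holds this shard


def plan_splits(shards):
    """Create one input split (and therefore one map task) per shard."""
    return [
        {"split": f"{s.index}[{s.shard_id}]", "preferred_host": s.node}
        for s in shards
    ]


# Five shards spread over three hypothetical nodes.
shards = [Shard("logs", i, f"es-node-{i % 3}") for i in range(5)]
splits = plan_splits(shards)
print(len(splits), "map tasks for", len(shards), "shards")
```

Under this model, adding shards to an index raises the read parallelism available to the job without any change to the job itself.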

Documentation

Read the elasticsearch-hadoop documentation. If that isn’t sufficient, you can email the mailing list. See the community page for all your options.

Enhanced Workflow

Elasticsearch enables Hadoop users (including Map/Reduce, Hive, Pig, Cascading and Spark) to enhance their workflow with a full-blown search engine.

Eliminate Network Traffic and Improve Performance

Our integration enables cluster co-location by exposing shard information to Hadoop. Job tasks run on the same machines as the Elasticsearch shards themselves, which eliminates network traffic and improves performance through data locality.
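A toy scheduler shows why exposing shard locations matters. The sketch below is an assumption-laden model, not how Hadoop or the connector actually schedule work: assign_tasks, the shard names and the host names are all hypothetical. A task is placed on the host that holds its shard when that host is also a Hadoop worker, and otherwise falls back to a remote worker.

```python
def assign_tasks(shard_hosts, workers):
    """Place each shard's task on the host holding the shard when possible.

    shard_hosts: dict mapping shard name -> host that stores it
    workers:     list of hosts that can run Hadoop tasks
    Returns a dict mapping shard name -> (worker, is_local).
    """
    assignments = {}
    fallback = 0
    for shard, host in sorted(shard_hosts.items()):
        if host in workers:
            # Data-local task: reads the shard without crossing the network.
            assignments[shard] = (host, True)
        else:
            # No co-located worker: pick a remote one round-robin.
            assignments[shard] = (workers[fallback % len(workers)], False)
            fallback += 1
    return assignments


shard_hosts = {"logs[0]": "node-a", "logs[1]": "node-b", "logs[2]": "node-c"}
workers = ["node-a", "node-b"]
plan = assign_tasks(shard_hosts, workers)
local = sum(1 for _, is_local in plan.values() if is_local)
print(f"{local} of {len(plan)} tasks read from a local shard")
```

In a fully co-located cluster every shard host is also a worker, so every task in this model would be local.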

Real-time Responses Improve Job Execution and Cost

Elasticsearch provides near real-time responses (think milliseconds) that significantly improve a Hadoop job’s execution time and its associated cost, especially when running on ‘rented resources’ such as Amazon EMR.

Sounds cool, doesn’t it?

Download Elasticsearch for Apache Hadoop