Moving Elasticsearch indexes with elasticdump

Elasticdump is an open-source tool which, according to its official description, has the goal of moving and saving Elasticsearch indexes.
Elasticdump works by requesting data from an input and subsequently shipping it to an output. Either the input or the output may be an Elasticsearch URL or a file.
To install elasticdump, we can use either its npm package or its Docker image.
For those of you wondering what npm is: it is short for Node Package Manager, which is first and foremost an online repository hosting open-source Node.js projects, along with the command-line tool used to install them.
You can install npm through:
# Ubuntu
sudo apt-get install npm

# CentOS
sudo yum install npm
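
To confirm npm installed correctly before proceeding, you can ask it for its version:

npm --version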

With npm installed, install elasticdump with:

npm install elasticdump -g

Using elasticdump

Using elasticdump is as simple as running something similar to:
elasticdump \
  --input={{INPUT}} \
  --output={{OUTPUT}} \
  --type={{TYPE}}
Where:
{{INPUT}} and {{OUTPUT}} can each be either:
an Elasticsearch URL: {protocol}://{host}:{port}/{index}
or
a file: {FilePath}
{{TYPE}} can be one of the following: analyzer, mapping, data
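
If you opted for the Docker image instead of npm, the same invocation can be run through the container. A minimal sketch, assuming the elasticdump/elasticsearch-dump image from Docker Hub:

docker run --rm -ti elasticdump/elasticsearch-dump \
  --input={{INPUT}} \
  --output={{OUTPUT}} \
  --type={{TYPE}}

When a file is used as input or output, remember to mount the host directory into the container (for example with -v /data:/data) so the path is visible inside it.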

Export my data – Option 1

OK. Let’s get our hands dirty and export some data.
In my case, I want to export data from a docker-daemon index and push it into a remote index.
Using an Elasticsearch URL for both input and output, this is my command:
elasticdump \
  --input=http://user:password@old_node:9200/docker-daemon \
  --output=http://user:password@new_node:9200/docker-daemon \
  --type=data
If you follow the output, you will see something similar to:
Thu, 21 Sep 2017 14:40:29 GMT | starting dump
Thu, 21 Sep 2017 14:40:31 GMT | got 53 objects from source elasticsearch (offset: 0)
Thu, 21 Sep 2017 14:40:33 GMT | sent 53 objects to destination elasticsearch, wrote 53
Thu, 21 Sep 2017 14:40:33 GMT | got 0 objects from source elasticsearch (offset: 53)
Thu, 21 Sep 2017 14:40:33 GMT | Total Writes: 53
Thu, 21 Sep 2017 14:40:33 GMT | dump complete
Let’s now confirm that the index was transferred successfully.
On the target Elasticsearch node, run:
[kelson@localhost ~]# curl -u user:password localhost:9200/_cat/indices?v | grep docker-daemon
A positive output would look similar to:
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                             Dload  Upload   Total   Spent    Left  Speed
100  2394  100  2394    0     0   245k      0 --:--:-- --:--:-- --:--:--  259k
green  open   logstash-docker-daemon        eilJdiZvSGixTNIfMwP-kw   5   2         41            0    292.3kb        292.3kb
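
Beyond checking that the index exists, you may also want to compare document counts on both sides. Elasticsearch’s _count API works well for this; a quick sketch against the same hypothetical nodes:

curl -u user:password 'http://old_node:9200/docker-daemon/_count?pretty'
curl -u user:password 'http://new_node:9200/docker-daemon/_count?pretty'

If both calls return the same count value, all documents made it across.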

Export my data – Option 2

Unlike in option 1, here we first export the index into a file before moving it to the new node.
This can be achieved with something similar to:
elasticdump \
  --input=http://user:password@old_node:9200/docker-daemon \
  --output=/data/docker-daemon.json \
  --type=data
elasticdump \
  --input=/data/docker-daemon.json \
  --output=http://user:password@new_node:9200/docker-daemon \
  --type=data
Note that we first export the data from the index into the /data/docker-daemon.json file.
We then use this file as the input and push its contents into the new node.
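
If you run these two steps often, they can be wrapped in a small script so the import only runs when the export succeeds. A minimal sketch using the same hypothetical nodes and path:

#!/bin/bash
set -e  # stop immediately if the export fails

DUMP=/data/docker-daemon.json

# Step 1: export the index into a local file
elasticdump \
  --input=http://user:password@old_node:9200/docker-daemon \
  --output="$DUMP" \
  --type=data

# Step 2: import the file into the new node
elasticdump \
  --input="$DUMP" \
  --output=http://user:password@new_node:9200/docker-daemon \
  --type=data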

Analyzers and Mappings

What was shown above is the most basic method of moving an index from one node to another.
In a more likely scenario, you will want to move the index together with its appropriate analyzers and field mappings.
If this is the case, you will want to move these before moving the data itself. This is achieved by chaining the three commands as shown (or with the loop sketched after them):
elasticdump \
  --input=http://user:password@old_node:9200/docker-daemon \
  --output=http://user:password@new_node:9200/docker-daemon \
  --type=analyzer
elasticdump \
  --input=http://user:password@old_node:9200/docker-daemon \
  --output=http://user:password@new_node:9200/docker-daemon \
  --type=mapping
elasticdump \
  --input=http://user:password@old_node:9200/docker-daemon \
  --output=http://user:password@new_node:9200/docker-daemon \
  --type=data
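
Since the three commands differ only in the --type flag, and the types must run in this order, a small loop keeps things tidy. A sketch against the same hypothetical nodes:

for type in analyzer mapping data; do
  elasticdump \
    --input=http://user:password@old_node:9200/docker-daemon \
    --output=http://user:password@new_node:9200/docker-daemon \
    --type=$type
done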

Extra Options

What was shown so far is the basic usage of elasticdump.
A series of other parameters may be used depending on your requirements.
Some commonly used parameters include (a combined example follows the list):
--searchBody: Useful when you do not want to export an entire index. Example: --searchBody='{"query":{"term":{"containerName": "nginx"}}}'
--limit: Indicates how many objects to move per batch operation. Defaults to 100.
--delete: Delete documents from the input as they are moved.
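
As a quick illustration, these flags can be combined with the commands shown earlier. A sketch that copies only the nginx container’s documents, 500 at a time (the query and batch size are arbitrary examples):

elasticdump \
  --input=http://user:password@old_node:9200/docker-daemon \
  --output=http://user:password@new_node:9200/docker-daemon \
  --type=data \
  --limit=500 \
  --searchBody='{"query":{"term":{"containerName": "nginx"}}}'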
A full list of these can be found on the official elasticdump project page.
