How to Manually Clean Indexes from Elasticsearch

In a previous post we covered getting started with the ELK stack (Elasticsearch, Logstash, Kibana).  Now we need to manually remove old indexes from an Elasticsearch datastore, selected by age.  We’ll be using the curator tool inside a sandbox built with pip and virtualenv on a CentOS/EL system.  Let’s get started.

 

 

Install Pip and Virtualenv
Pip is to Python what CPAN is to Perl: a package manager.
Virtualenv installs Python libraries and runs applications in an isolated environment, loosely similar to a system chroot.  You almost always want virtualenv when installing Python libraries that might conflict with the ones shipped by the operating system distribution.

yum install python-pip python-virtualenv

Create a Virtualenv
Now you’re going to create a virtualenv sandbox from which to install all the necessary Python libraries via pip for running the curator.

virtualenv elk_cleanup

At this point you should see Python setuptools run and complete:

New python executable in elk_cleanup/bin/python
Installing Setuptools............................
done.

Enter the Virtualenv
Now we’re going to activate our Python virtualenv.  You should see your shell prompt change.

. elk_cleanup/bin/activate
(elk_cleanup)[root@bullwinkle ~]#
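
The prompt change is easy to miss in scripts, so here is a minimal sketch of a guard that checks for an active sandbox before touching pip. It relies on the `VIRTUAL_ENV` variable that the activate script exports; the function names `in_virtualenv` and `safe_pip` are made up for this example.

```shell
# Succeeds only when a virtualenv is active. The activate script exports
# VIRTUAL_ENV with the sandbox path, so an empty value means system Python.
in_virtualenv() {
  [ -n "${VIRTUAL_ENV:-}" ]
}

# Hypothetical wrapper: run pip only from inside a sandbox.
safe_pip() {
  if in_virtualenv; then
    pip "$@"
  else
    echo "no virtualenv active; run '. elk_cleanup/bin/activate' first" >&2
    return 1
  fi
}
```

With this in place, `safe_pip install elasticsearch-curator` refuses to run against the system Python.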

Install Curator and Libs
Now that we’re safely inside our cozy virtualenv, install curator:

pip install elasticsearch-curator

You’ll see a bunch of stuff happen as libraries and dependencies install.  You only need the elasticsearch-curator, elasticsearch, click and urllib3 Python modules.  You might see some warnings too, but that’s ok so long as they aren’t fatal.

Downloading/unpacking elasticsearch-curator
  Downloading elasticsearch-curator-3.4.1.tar.gz (89kB): 89kB downloaded
  Running setup.py egg_info for package elasticsearch-curator
  Downloading/unpacking elasticsearch>=1.8.0,<2.4.0 (from elasticsearch-curator)
  Downloading elasticsearch-2.2.0.tar.gz (57kB): 57kB downloaded
  Running setup.py egg_info for package elasticsearch
  Downloading/unpacking click>=3.3 (from elasticsearch-curator)
  Downloading click-6.2.tar.gz (281kB): 281kB downloaded
  Running setup.py egg_info for package click
  Downloading/unpacking urllib3>=1.8,<2.0 (from elasticsearch>=1.8.0,<2.4.0->elasticsearch-curator)
  Downloading urllib3-1.14.tar.gz (161kB): 161kB downloaded
  Running setup.py egg_info for package urllib3
  Installing collected packages: elasticsearch-curator, elasticsearch, click, urllib3
  Running setup.py install for elasticsearch-curator
  Installing curator script to /root/elk_cleanup/bin
  Installing es_repo_mgr script to /root/elk_cleanup/bin
  Running setup.py install for elasticsearch
  Running setup.py install for click  
  Running setup.py install for urllib3

Successfully installed elasticsearch-curator elasticsearch click urllib3
Cleaning up...

What’s important is the “Successfully installed” line near the end: the necessary Python packages are installed.
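
The install log above shows curator 3.4.1, while readers in the comments report that 4.0.1 rejects the `--host` flag, so it may be worth checking which series pip actually gave you. The sketch below parses the `curator, version X.Y.Z` line that `curator --version` prints; the helper names `curator_major` and `is_curator_3` are made up for this example.

```shell
# Print the major version number from a "curator, version X.Y.Z" line on stdin.
curator_major() {
  sed -n 's/^curator, version \([0-9][0-9]*\)\..*/\1/p'
}

# Succeed only for the 3.x series, the one shown in the install log above.
is_curator_3() {
  [ "$(curator_major)" = "3" ]
}

# Usage (inside the virtualenv):
#   curator --version | is_curator_3 || echo "CLI may differ from this post" >&2
# One option, assuming 3.x suits your cluster, is to pin the logged version:
#   pip install 'elasticsearch-curator==3.4.1'
```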

Run the Curator
Ensure your ELK stack is up and running, or at the least that Elasticsearch is accepting API connections.  Now run curator.  The following example uses the curator 3.x command syntax (3.4.1 was installed above), will cull anything older than 30 days, and assumes you’re running it on the host where Elasticsearch resides.

curator --host 127.0.0.1 delete indices --older-than 30 --time-unit days --timestring '%Y.%m.%d'

You should see output like below, and if all goes well – success!

2016-02-18 17:57:11,655 INFO      Job starting: delete indices
2016-02-18 17:57:13,696 INFO      Pruning Kibana-related indices to prevent accidental deletion.
2016-02-18 17:57:13,696 INFO      Action delete will be performed on the following indices: [u'logstash-2016.01.01', u'logstash-2016.01.02', u'logstash-2016.01.03', u'logstash-2016.01.04', u'logstash-2016.01.05', u'logstash-2016.01.06', u'logstash-2016.01.07', u'logstash-2016.01.08', u'logstash-2016.01.09', u'logstash-2016.01.10', u'logstash-2016.01.11', u'logstash-2016.01.12', u'logstash-2016.01.13', u'logstash-2016.01.14', u'logstash-2016.01.15', u'logstash-2016.01.16', u'logstash-2016.01.17', u'logstash-2016.01.18', u'logstash-2016.01.19']
2016-02-18 17:57:13,699 INFO      Deleting indices as a batch operation:
2016-02-18 17:57:13,699 INFO      ---deleting index logstash-2016.01.01
2016-02-18 17:57:13,699 INFO      ---deleting index logstash-2016.01.02
2016-02-18 17:57:13,700 INFO      ---deleting index logstash-2016.01.03
2016-02-18 17:57:13,700 INFO      ---deleting index logstash-2016.01.04
2016-02-18 17:57:13,700 INFO      ---deleting index logstash-2016.01.05
2016-02-18 17:57:13,700 INFO      ---deleting index logstash-2016.01.06
2016-02-18 17:57:13,700 INFO      ---deleting index logstash-2016.01.07
2016-02-18 17:57:13,700 INFO      ---deleting index logstash-2016.01.08
2016-02-18 17:57:13,700 INFO      ---deleting index logstash-2016.01.09
2016-02-18 17:57:13,700 INFO      ---deleting index logstash-2016.01.10
2016-02-18 17:57:13,700 INFO      ---deleting index logstash-2016.01.11
2016-02-18 17:57:13,700 INFO      ---deleting index logstash-2016.01.12
2016-02-18 17:57:13,700 INFO      ---deleting index logstash-2016.01.13
2016-02-18 17:57:13,700 INFO      ---deleting index logstash-2016.01.14
2016-02-18 17:57:13,701 INFO      ---deleting index logstash-2016.01.15
2016-02-18 17:57:13,701 INFO      ---deleting index logstash-2016.01.16
2016-02-18 17:57:13,701 INFO      ---deleting index logstash-2016.01.17
2016-02-18 17:57:13,701 INFO      ---deleting index logstash-2016.01.18
2016-02-18 17:57:13,701 INFO      ---deleting index logstash-2016.01.19
2016-02-18 17:57:43,831 INFO      Job completed successfully.
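
To get a feel for what the age filter selects before deleting anything, here is a rough sketch that mimics the date comparison with plain shell tools. It is an approximation for previewing, not how curator is implemented. In the log above, the run on 2016-02-18 with a 30 day limit removed indices up to and including logstash-2016.01.19, which is why the comparison is "on or before" the cutoff. The function names are made up, and `cutoff_stamp` assumes GNU date as shipped on CentOS/EL.

```shell
# Print the %Y.%m.%d stamp for N days before now (GNU date syntax).
cutoff_stamp() {
  date -u -d "$1 days ago" +%Y.%m.%d
}

# Read index names on stdin and print the logstash-YYYY.MM.DD ones whose
# stamp is on or before the given cutoff. Zero-padded dates sort correctly
# as plain strings, so a string comparison is enough.
indices_on_or_before() {
  awk -v cutoff="logstash-$1" \
    '/^logstash-[0-9][0-9][0-9][0-9]\.[0-9][0-9]\.[0-9][0-9]$/ && $0 <= cutoff'
}

# Usage, taking the index name from the third column of _cat/indices:
#   curl -s http://localhost:9200/_cat/indices | awk '{print $3}' \
#     | indices_on_or_before "$(cutoff_stamp 30)"
```

Names that do not match the logstash date pattern, such as .kibana, are never printed.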

Exit and Cleanup
You can exit a virtualenv at any time by running deactivate:

(elk_cleanup)[root@bullwinkle ~]# deactivate
[root@bullwinkle ~]#

You can also switch between multiple virtualenv environments if needed via the workon command, which is provided by the separate virtualenvwrapper package.

(elk_cleanup)[root@bullwinkle ~]# workon ospurge
(ospurge)[root@bullwinkle ~]#

Deleting Specific Indices
Sometimes you want to remove specific indices.  You can use curl against the API to do this.  First, query the indices that are available:

curl -XGET http://localhost:9200/_cat/indices
yellow open logstash-2016.12.14 5 1 134956540 0 35.9gb 35.9gb 
red    open .kibana             1 1                           
red    open logstash-2016.12.07 5 1  66032152 0 20.1gb 20.1gb 
red    open logstash-2016.12.06 5 1 103104019 0 33.4gb 33.4gb 
red    open logstash-2016.02.19 5 1                           
red    open logstash-2016.02.18 5 1                           
yellow open logstash-2016.12.10 5 1 286599323 0 74.9gb 74.9gb 
yellow open logstash-2016.12.11 5 1 102210780 0 26.9gb 26.9gb
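
If the listing is long, a small filter can pull out just the indices in a given health state. This is a sketch that assumes the column layout shown above (health, status, index name, then counts and sizes); other Elasticsearch versions may print different columns. The function name `indices_with_health` is made up for this example.

```shell
# Read `_cat/indices` output on stdin and print the names of indices whose
# first column matches the requested health state (green, yellow or red).
# Assumes the index name is the third whitespace-separated column.
indices_with_health() {
  awk -v want="$1" '$1 == want { print $3 }'
}

# Usage:
#   curl -s http://localhost:9200/_cat/indices | indices_with_health red
```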

You can delete them by name:

curl -XDELETE http://localhost:9200/logstash-2016.12.10

If you wanted to delete all indices:

curl -XDELETE http://localhost:9200/_all
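
Since a single DELETE against _all removes every index, it can help to build by-name deletes through a small guard. The sketch below only prints the curl command so you can review it first, and it refuses empty names, _all, wildcards and comma-separated lists. The function name `delete_index_cmd` and the `ES_HOST` variable are made up for this example.

```shell
# Print (not run) the curl command that deletes exactly one named index.
# ES_HOST is a hypothetical override; it defaults to localhost:9200.
delete_index_cmd() {
  case "$1" in
    ''|_all|*'*'*|*,*)
      echo "refusing to build a delete for '$1'" >&2
      return 1
      ;;
  esac
  printf 'curl -XDELETE http://%s/%s\n' "${ES_HOST:-localhost:9200}" "$1"
}

# Usage: review the printed command, then pipe it to sh to execute it.
#   delete_index_cmd logstash-2016.12.10
#   delete_index_cmd logstash-2016.12.10 | sh
```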

F$#% Ruin It: Starting from Scratch
Useful in a testing environment or for starting over: remove everything.

systemctl stop elasticsearch
rm -rf /usr/share/elasticsearch
yum erase elasticsearch -y
yum install elasticsearch -y
systemctl start elasticsearch

Another reason to do this is data loss on disk, where Elasticsearch is still trying to recover a non-existent index.  The data is gone and you don’t care, but Elasticsearch won’t start because of it.

About Will Foster

hobo devop/sysadmin/SRE

3 Responses to How to Manually Clean Indexes from Elasticsearch

  1. jab2805 says:

    When I run the above cleanup I get the following error:

    (elk_cleanup)[root@va-log-mon ~]# curator --host 127.0.0.1 show indices --older-than 30 --time-unit days --timestring '%Y.%m.%d'
    Error: no such option: --host
    (elk_cleanup)[root@va-log-mon ~]#

    version:

    (elk_cleanup)[root@va-log-mon ~]# curator --version
    curator, version 4.0.1
    (elk_cleanup)[root@va-log-mon ~]#

    • Will Foster says:

      You might try running without the --older-than syntax and see, but what you are doing is valid syntax. Worst case, try the packaged version of it.

    • I got the same issue and found that it happens because of a version incompatibility. You can use the curator_cli command to get around it. The following sample command should help:
      Sample command: curator_cli --host 127.0.0.1 delete_indices --filter_list '{"filtertype":"age","source":"creation_date","direction":"older","unit":"days","unit_count":4}'
      My curator version: curator, version 5.4.1
