

Elasticsearch automatically cleans up indexes to free up disk space

Posted on 3/13/2022 1:22:29 PM
We use Elasticsearch to store the logs of our business system. A new index is created every day, named by date, and each one holds millions of documents. If old indexes are not cleaned up in time, they take up a large amount of disk space, put pressure on the disk, and slow down searches. Previously we cleaned up ES indexes manually or with a scheduled shell script, but ES can also do this itself: an index lifecycle policy can delete old historical indexes for us.

The version of Elasticsearch used in this article is: 7.10.2

Today is 2022-03-11, and you can see that the indexes and documents from January 2022 still exist, as shown in the figure below:



Create an index template

Open the Kibana admin panel and go to Index Management -> Index Templates -> Create template. Name the template logger-template, fill in the index patterns that the template should match, and then click Next through the remaining steps, as shown in the figure below:



Note: When a new index is created, Elasticsearch automatically matches it against the index templates and applies the settings of the matching template. For example, when the itsvse-2022-03-13 index is created, it matches logger-template and receives the settings defined there. Historical indexes are not associated with the index template, even if their names match its pattern. In the setup described in this article, indexes that were not created from the template are therefore not deleted automatically.

Create an index lifecycle policy

Open the Kibana admin panel and go to Index Lifecycle Policies -> Create policy, as shown in the following image:



Enter the name auto-clean-30-days-logger and turn off rollover, as shown in the following figure:



Enable the delete phase so that indexes older than 30 days are deleted automatically, as shown in the following image:



Finally, add the policy to the index template, as shown in the following image:



From now on, each newly created index is kept for 30 days and deleted once it is older than that. Historical indexes still have to be cleaned up manually. To manually delete an index, refer to:

Elasticsearch deletes and indexes all document data
https://www.itsvse.com/thread-6321-1-1.html
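
Before deleting historical indexes by hand, it helps to list which ones are actually past the retention period. The following is a minimal sketch, assuming the daily indexes are named with the prefix itsvse- followed by a yyyy-MM-dd date, as in the itsvse-2022-03-13 example above. The function name and the sample index names are hypothetical, and the sketch only selects names; it does not call Elasticsearch.

```python
from datetime import date, datetime, timedelta


def indexes_older_than(index_names, today, keep_days=30, prefix="itsvse-"):
    """Return the names whose date suffix is more than keep_days before today."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in index_names:
        if not name.startswith(prefix):
            continue  # not one of the daily log indexes
        try:
            day = datetime.strptime(name[len(prefix):], "%Y-%m-%d").date()
        except ValueError:
            continue  # suffix is not a date, so leave the index alone
        if day < cutoff:
            expired.append(name)
    return sorted(expired)


# Hypothetical index names, for illustration only.
names = ["itsvse-2022-01-15", "itsvse-2022-02-20", "itsvse-2022-03-11", ".kibana_1"]
print(indexes_older_than(names, today=date(2022, 3, 13)))
# prints ['itsvse-2022-01-15']
```

The returned names can then be deleted one by one as described in the linked article.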

Review the index template you just created. The preview shows the final settings that will be applied to indexes matching the template. Details are as follows:

{
  "template": {
    "settings": {
      "index": {
        "lifecycle": {
          "name": "auto-clean-30-days-logger"
        }
      }
    },
    "aliases": {},
    "mappings": {}
  }
}
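
The same template could also be created through the REST API instead of Kibana. The following is a minimal sketch, assuming a node reachable at http://localhost:9200 without authentication and an index pattern of itsvse-* (the article does not state which pattern was entered in Kibana, so both are assumptions). It targets the composable index template endpoint PUT _index_template/<name>; the helper names build_template_body and put_json are hypothetical.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local node, no authentication


def build_template_body(index_patterns, policy_name):
    """Build the body for PUT _index_template/<name> with an ILM policy attached."""
    return {
        "index_patterns": list(index_patterns),
        "template": {
            "settings": {"index": {"lifecycle": {"name": policy_name}}},
            "aliases": {},
            "mappings": {},
        },
    }


def put_json(path, body):
    """Send a JSON body with HTTP PUT and return the decoded JSON response."""
    request = urllib.request.Request(
        url=f"{ES_URL}/{path}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# "itsvse-*" is an assumed pattern; replace it with the real one.
body = build_template_body(["itsvse-*"], "auto-clean-30-days-logger")
print(json.dumps(body, indent=2))

# To apply it against a running cluster, uncomment the next line:
# put_json("_index_template/logger-template", body)
```

Only indexes created after the template exists pick up the policy, which matches the note about historical indexes above.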


To review the policy configuration, query the ILM policy API, for example: http://192.168.1.182:9200/_ilm/policy/auto-clean-30-days-logger?pretty

{
  "auto-clean-30-days-logger" : {
    "version" : 1,
    "modified_date" : "2022-03-11T06:26:01.972Z",
    "policy" : {
      "phases" : {
        "hot" : {
          "min_age" : "0ms",
          "actions" : {
            "set_priority" : {
              "priority" : 100
            }
          }
        },
        "delete" : {
          "min_age" : "30d",
          "actions" : {
            "delete" : {
              "delete_searchable_snapshot" : true
            }
          }
        }
      }
    }
  }
}
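
For comparison, the request body that creates such a policy is smaller than the response above: version and modified_date are added by Elasticsearch, and the phases are wrapped in a single policy object. The following is a minimal sketch of that body for PUT _ilm/policy/auto-clean-30-days-logger; the function name build_delete_policy and its parameters are hypothetical, and the sketch only builds and prints the JSON.

```python
import json


def build_delete_policy(keep="30d", priority=100):
    """Build an ILM policy body with a hot phase and a delete phase after `keep`."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "min_age": "0ms",
                    "actions": {"set_priority": {"priority": priority}},
                },
                "delete": {
                    "min_age": keep,
                    "actions": {"delete": {"delete_searchable_snapshot": True}},
                },
            }
        }
    }


policy = build_delete_policy()
print(json.dumps(policy, indent=2))
```

Changing keep to, say, "7d" would give a one-week retention with the same structure.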


ILM (Index Lifecycle Management)

Our new index lifecycle policy is executed by periodic polling: the ILM service runs in the background and checks the policies at a fixed interval.

ILM defines four lifecycle phases:
Hot: The index is actively being updated and queried.
Warm: The index is no longer being updated but is still being queried.
Cold: The index is no longer being updated and is queried infrequently. The information still needs to be searchable, but it's okay if those queries are slower.
Delete: The index is no longer needed and can be safely deleted.

ILM runs periodically (indices.lifecycle.poll_interval, 10 minutes by default) to check whether indexes meet the policy criteria and to take any steps required.
To avoid race conditions, ILM may need to run more than once to perform all the steps required to complete an action. So, even if indices.lifecycle.poll_interval is set to 10 minutes and an index meets the rollover criteria, it may take 20 minutes before the rollover is complete.
For example, suppose rollover is configured to trigger at more than 10 documents. Because the rollover is not performed immediately, the 11th and 12th documents may still land in the previous index; new documents go to the new index only after the rollover has succeeded.
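
The same polling delay applies to the delete phase of the policy in this article. The following is a rough sketch of the timing, assuming rollover is disabled (so min_age is counted from the index creation time) and the default 10-minute poll interval. The function name delete_window and the creation time are hypothetical, and the result is only an estimate, since an action may need several polls to finish.

```python
from datetime import datetime, timedelta, timezone


def delete_window(created_at,
                  min_age=timedelta(days=30),
                  poll_interval=timedelta(minutes=10)):
    """Return (eligible, first_poll_by) for the delete phase of one index.

    eligible:      the moment the index reaches min_age
    first_poll_by: the latest time at which the first poll after that happens
    """
    eligible = created_at + min_age
    return eligible, eligible + poll_interval


# Hypothetical creation time for an index created on 2022-03-13.
created = datetime(2022, 3, 13, 0, 0, tzinfo=timezone.utc)
eligible, first_poll_by = delete_window(created)
print(eligible.isoformat())       # 2022-04-12T00:00:00+00:00
print(first_poll_by.isoformat())  # 2022-04-12T00:10:00+00:00
```

In other words, an index is not removed at the exact 30-day mark; the deletion starts at some later poll.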

For the full list of options, see the index lifecycle management settings in the Elasticsearch documentation.

View ILM status



It is now 2022-03-13. Two indexes are created automatically every day, and you can see that the newly created indexes have been associated with the policy automatically, as shown in the figure below:
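
The association can also be checked without Kibana through the ILM explain API (GET <index>/_ilm/explain), which reports for each index whether it is managed, by which policy, and in which phase. The following is a minimal sketch that summarizes such a response. The function name is hypothetical, and the sample dictionary is hand-written for illustration, trimmed to a few fields, and is not output captured from the cluster in this article.

```python
def summarize_ilm_explain(explain_response):
    """Return (index, policy, phase) tuples; policy and phase are None if unmanaged."""
    rows = []
    for name, info in sorted(explain_response.get("indices", {}).items()):
        if info.get("managed"):
            rows.append((name, info.get("policy"), info.get("phase")))
        else:
            rows.append((name, None, None))
    return rows


# Illustrative response shape only, not real cluster output.
sample = {
    "indices": {
        "itsvse-2022-03-13": {
            "index": "itsvse-2022-03-13",
            "managed": True,
            "policy": "auto-clean-30-days-logger",
            "phase": "hot",
        },
        "itsvse-2022-01-15": {
            "index": "itsvse-2022-01-15",
            "managed": False,
        },
    }
}

for row in summarize_ilm_explain(sample):
    print(row)
```

An index created before the template existed would show up as unmanaged here, which is why it has to be cleaned up manually.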



(End)




