Tutorial on installing elasticsearch-analysis-ik in elasticsearch-6.5.2

Posted on 12/13/2018 10:21:21 AM
The default analyzer that ships with Elasticsearch does a poor job of segmenting Chinese text. The concrete example below shows why.
Reference Documentation:

https://www.elastic.co/guide/en/ ... ting_analyzers.html
https://www.elastic.co/guide/en/ ... ndices-analyze.html


We submit some text to the _analyze interface with a POST request, as follows:

POST http://ip:9200/_analyze

{
  "analyzer": "standard",
  "text": "Architecthttp://www.itsvse.com, beautiful"
}
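The request above can also be sent from a script. This is a minimal sketch using only the Python standard library. The function names are made up for illustration, and `http://ip:9200` is the placeholder address from the post, not a real host.

```python
import json
import urllib.request


def build_analyze_request(base_url, analyzer, text):
    """Build (but do not send) a POST request for the _analyze interface."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_analyze",
        data=body,
        # Elasticsearch 6.x rejects request bodies without a content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(base_url, analyzer, text):
    """Send the request and return the decoded JSON response."""
    request = build_analyze_request(base_url, analyzer, text)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `analyze("http://ip:9200", "standard", "...")` against a running node should return the token list for the given text. Passing a different analyzer name reuses the same helper.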


If you use Elasticsearch out of the box, you will run into an awkward problem when searching Chinese content: Chinese words are split into individual characters. When you build a chart in Kibana and group by term, each single character ends up as its own group.

Fortunately, medcl (one of the earliest people to study ES in China) has written two Chinese word segmentation plugins, ik and mmseg. This article only covers ik.


The IK Analysis plugin integrates the Lucene IK analyzer into elasticsearch and supports custom dictionaries.

elasticsearch-analysis-ik project address: https://github.com/medcl/elasticsearch-analysis-ik

Install elasticsearch-analysis-ik

First, stop Elasticsearch: use jps to find the process ID, then kill it with kill -9 <process ID>. (I did not test whether the plugin can be installed while Elasticsearch is running; stopping it first is the safer option.)
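As a rough sketch of the "find the process ID" step, the helper below picks Elasticsearch PIDs out of jps output. It assumes jps prints one `<pid> <main class>` pair per line and lists the node under the short class name `Elasticsearch`. The helper name and the sample PIDs are made up for illustration.

```python
def elasticsearch_pids(jps_output, main_class="Elasticsearch"):
    """Return the PIDs whose main class matches, given the text printed by jps."""
    pids = []
    for line in jps_output.splitlines():
        parts = line.split()
        # Expect exactly "<pid> <class>"; skip blank or unexpected lines.
        if len(parts) == 2 and parts[0].isdigit() and parts[1] == main_class:
            pids.append(int(parts[0]))
    return pids


# Hypothetical jps output; these PIDs are invented for the example.
sample_output = "4321 Jps\n1234 Elasticsearch\n"
print(elasticsearch_pids(sample_output))
```

Each PID returned is what you would pass to kill.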

Install using elasticsearch-plugin (supported since v5.5.1):



Note: Replace 6.3.0 with your own version of elasticsearch

The Elasticsearch installed on my server is version 6.5.2, so the command is as follows:
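The sketch below shows how the install command changes with the version. It assumes the release URL pattern from the project's README at the time of the post (`.../releases/download/v<version>/elasticsearch-analysis-ik-<version>.zip`), so check the project's releases page before relying on it. The install directory is the one that appears in the error message in this post, and the function names are made up for illustration.

```python
# Assumed release URL prefix, taken from the project README at the time.
IK_RELEASES = "https://github.com/medcl/elasticsearch-analysis-ik/releases/download"


def ik_plugin_url(es_version):
    """Build the download URL for the ik release matching an Elasticsearch version."""
    return f"{IK_RELEASES}/v{es_version}/elasticsearch-analysis-ik-{es_version}.zip"


def install_command(es_home, es_version):
    """Return the elasticsearch-plugin invocation as an argument list."""
    return [f"{es_home}/bin/elasticsearch-plugin", "install", ik_plugin_url(es_version)]


# Print the command for the 6.5.2 installation described in this post.
print(" ".join(install_command("/usr/local/elasticsearch-6.5.2", "6.5.2")))
```

The plugin version has to match the Elasticsearch version, which is why the version number appears twice in the URL.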




The installation failed with the following error:

Exception in thread "main" java.nio.file.FileSystemException: /usr/local/elasticsearch-6.5.2/config/analysis-ik: Operation not permitted


In this case the plugin installation had to be run as root, because the current user was not permitted to modify the config directory. We switch to the administrator account with su root and run the installation again, and this time it succeeds.
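Before installing, you can roughly check whether the current user is able to write to the directories the installer touches. This is only a sketch. It assumes the installer writes under `config` and `plugins` inside the Elasticsearch home directory, and the function name is made up for illustration. A passing check does not guarantee that the installation will succeed, since the installer may also change file permissions.

```python
import os


def writable_plugin_dirs(es_home):
    """Map each directory the installer is assumed to write to onto True/False."""
    return {
        name: os.access(os.path.join(es_home, name), os.W_OK)
        for name in ("config", "plugins")
    }


# Path taken from the error message in this post.
print(writable_plugin_dirs("/usr/local/elasticsearch-6.5.2"))
```

If either value is False, run the installation as a user that owns the Elasticsearch directory, or as root as described above.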



We test again by submitting a POST request to the same interface, this time with the following body:

{
  "analyzer": "ik_max_word",
  "text": "Architecthttp://www.itsvse.com, beautiful"
}


This time the analyzer understands the text properly: "architect", "beautiful", and "architecture" are each recognized as a single word.

What is the difference between ik_max_word and ik_smart?

ik_max_word: Splits the text at the finest granularity. For example, "中华人民共和国国歌" (National Anthem of the People's Republic of China) is split into "中华人民共和国, 中华人民, 中华, 华人, 人民共和国, 人民, 人, 民, 共和国, 共和, 和, 国歌", exhausting all possible combinations.

ik_smart: Splits the text at the coarsest granularity. For example, "中华人民共和国国歌" is split into "中华人民共和国, 国歌".
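To compare the two analyzers side by side, it helps to reduce an _analyze response to its token strings. The sketch below assumes the usual response shape, a `tokens` list whose entries each carry a `token` field. The sample response is written by hand for the ik_smart split of the Chinese phrase 中华人民共和国国歌 ("National Anthem of the People's Republic of China"). It was not captured from a live server, and the `type` value is an assumption.

```python
def token_strings(analyze_response):
    """Extract just the token text from a decoded _analyze response."""
    return [entry["token"] for entry in analyze_response.get("tokens", [])]


# Hand-written sample in the assumed response shape (not real server output).
sample_ik_smart = {
    "tokens": [
        {"token": "中华人民共和国", "start_offset": 0, "end_offset": 7,
         "type": "CN_WORD", "position": 0},
        {"token": "国歌", "start_offset": 7, "end_offset": 9,
         "type": "CN_WORD", "position": 1},
    ]
}

smart_tokens = token_strings(sample_ik_smart)
```

Running the same text through ik_max_word and comparing the two lists shows the extra overlapping tokens that the finer-grained analyzer produces.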




