Installing Elasticsearch in Docker
(1) Pull the Elasticsearch and Kibana images
docker pull elasticsearch:7.6.2
docker pull kibana:7.6.2
(2) Configuration
mkdir -p /mydata/elasticsearch/config
mkdir -p /mydata/elasticsearch/data
echo "http.host: 0.0.0.0" >/mydata/elasticsearch/config/elasticsearch.yml
chmod -R 777 /mydata/elasticsearch/
(3) Start Elasticsearch
docker run --name elasticsearch -p 9200:9200 -p 9300:9300 \
-e "discovery.type=single-node" \
-e ES_JAVA_OPTS="-Xms64m -Xmx512m" \
-v /mydata/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
-v /mydata/elasticsearch/data:/usr/share/elasticsearch/data \
-v /mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
-d elasticsearch:7.6.2
Make Elasticsearch start automatically on boot
docker update elasticsearch --restart=always
(4) Start Kibana:
docker run --name kibana -e ELASTICSEARCH_HOSTS=http://192.168.137.14:9200 -p 5601:5601 -d kibana:7.6.2
Make Kibana start automatically on boot
docker update kibana --restart=always
(5) Test
View the Elasticsearch version information: http://192.168.137.14:9200/
{
"name": "0adeb7852e00",
"cluster_name": "elasticsearch",
"cluster_uuid": "9gglpP0HTfyOTRAaSe2rIg",
"version": {
"number": "7.6.2",
"build_flavor": "default",
"build_type": "docker",
"build_hash": "ef48eb35cf30adf4db14086e8aabd07ef6fb113f",
"build_date": "2020-03-26T06:34:37.794943Z",
"build_snapshot": false,
"lucene_version": "8.4.0",
"minimum_wire_compatibility_version": "6.8.0",
"minimum_index_compatibility_version": "6.0.0-beta1"
},
"tagline": "You Know, for Search"
}
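The version check above can also be scripted. The snippet below is a minimal sketch that only parses a response body shaped like the one shown; the function name es_version is made up for illustration, and fetching the body over HTTP is left out so that the example runs without a cluster.

```python
import json

def es_version(body: str) -> str:
    """Return version.number from the JSON served at the Elasticsearch root URL."""
    info = json.loads(body)
    return info["version"]["number"]

# Trimmed copy of the response shown above.
sample = (
    '{"name": "0adeb7852e00", '
    '"version": {"number": "7.6.2", "build_type": "docker"}, '
    '"tagline": "You Know, for Search"}'
)
print(es_version(sample))
```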
Show the Elasticsearch node information: http://192.168.137.14:9200/_cat/nodes
127.0.0.1 76 95 1 0.26 1.40 1.22 dilm * 0adeb7852e00
Access Kibana: http://192.168.137.14:5601/app/kibana
(1) GET /_cat/nodes: list all nodes
For example: http://192.168.137.14:9200/_cat/nodes
127.0.0.1 61 91 11 0.08 0.49 0.87 dilm * 0adeb7852e00
Note: * marks the master node of the cluster.
(2) GET /_cat/health: check the health of Elasticsearch
For example: http://192.168.137.14:9200/_cat/health
1588332616 11:30:16 elasticsearch green 1 1 3 3 0 0 0 0 - 100.0%
Note: green means the cluster is healthy.
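The single line returned by _cat/health is easier to read once each value is paired with its column. The sketch below assumes the column names that Elasticsearch 7.x prints when ?v is appended to the URL; treat the list as an assumption for other versions. The helper name parse_cat_health is hypothetical.

```python
# Column names as printed by GET /_cat/health?v on Elasticsearch 7.x (assumed here).
HEALTH_COLUMNS = [
    "epoch", "timestamp", "cluster", "status", "node.total", "node.data",
    "shards", "pri", "relo", "init", "unassign", "pending_tasks",
    "max_task_wait_time", "active_shards_percent",
]

def parse_cat_health(line: str) -> dict:
    """Pair each whitespace-separated value with its column name."""
    values = line.split()
    if len(values) != len(HEALTH_COLUMNS):
        raise ValueError(f"expected {len(HEALTH_COLUMNS)} columns, got {len(values)}")
    return dict(zip(HEALTH_COLUMNS, values))

health = parse_cat_health(
    "1588332616 11:30:16 elasticsearch green 1 1 3 3 0 0 0 0 - 100.0%"
)
print(health["status"], health["shards"])
```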
(3) GET /_cat/master: show the master node
For example: http://192.168.137.14:9200/_cat/master
vfpgxbusTC6-W3C2Np31EQ 127.0.0.1 127.0.0.1 0adeb7852e00
(4) GET /_cat/indices: list all indices, roughly equivalent to SHOW DATABASES in MySQL.
For example: http://192.168.137.14:9200/_cat/indices
green open .kibana_task_manager_1 KWLtjcKRRuaV9so_v15WYg 1 0 2 0 39.8kb 39.8kb
green open .apm-agent-configuration cuwCpJ5ER0OYsSgAJ7bVYA 1 0 0 0 283b 283b
green open .kibana_1 PqK_LdUYRpWMy4fK0tMSPw 1 0 7 0 31.2kb 31.2kb
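To store a document, specify which index and which type it goes under, and which unique identifier to use.
PUT customer/external/1 stores document 1 under the external type of the customer index: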
PUT customer/external/1
{
"name":"John Doe"
}
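Both PUT and POST can be used.
POST creates a document. If no id is given, one is generated automatically. If an id is given and the document already exists, the document is modified and its version number is incremented.
PUT can create or modify a document. PUT must specify an id, so it is generally used for modifications; omitting the id results in an error.
The following is test data from Postman:
When the document is created successfully, the status 201 Created indicates that the record was inserted.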
{
"_index": "customer",
"_type": "external",
"_id": "1",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1
}
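Meaning of the returned JSON: the fields that begin with an underscore are called metadata and describe basic information about the document.
"_index": "customer": the index the document is stored in (comparable to a database);
"_type": "external": the type the document is stored under;
"_id": "1": the id of the stored document;
"_version": 1: the version of the stored document;
"result": "created": a new document was created. If the same document is PUT again, the result becomes updated and the version number changes.
The following uses POST:
When adding data without specifying an id, an id is generated automatically and the result is created:
POSTing again without an id still creates a new document:
When adding data with an id, that id is used and the result is created:
POSTing again with the same id gives the result updated.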
GET /customer/external/1
http://192.168.137.14:9200/customer/external/1
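{
"_index": "customer", // which index
"_type": "external", // which type
"_id": "1", // document id
"_version": 3, // version number
"_seq_no": 6, // concurrency-control field, incremented on every update; used for optimistic locking
"_primary_term": 1, // also used for concurrency control; changes when the primary shard is reassigned, e.g. after a restart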
"found": true,
"_source": {
"name": "John Doe"
}
}
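With "if_seq_no=1&if_primary_term=1", the modification is applied only when the sequence number and primary term match the current values; otherwise it is rejected.
Example: update the document with id=1 to name=1, then update it again to name=2, starting from _seq_no=6 and _primary_term=1.
(1) Update name to 1
http://192.168.137.14:9200/customer/external/1?if_seq_no=6&if_primary_term=1
(2) Update name to 2, still using seq_no=6
http://192.168.137.14:9200/customer/external/1?if_seq_no=6&if_primary_term=1
The update fails with a version conflict, because _seq_no is no longer 6.
(3) Query the document again
http://192.168.137.14:9200/customer/external/1
_seq_no is now 7.
(4) Update again with the new _seq_no; this time the update succeeds
http://192.168.137.14:9200/customer/external/1?if_seq_no=7&if_primary_term=1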
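The four steps above can be mimicked with a toy in-memory model. This is only a sketch of the compare-then-write idea, not Elasticsearch code: ToyDocument and VersionConflict are hypothetical names, and the model assumes _seq_no grows by exactly one per write, whereas a real index assigns sequence numbers per shard, so the value can jump by more than one.

```python
class VersionConflict(Exception):
    """Stands in for the HTTP 409 version conflict that Elasticsearch returns."""

class ToyDocument:
    def __init__(self, source, seq_no=0, primary_term=1):
        self.source = source
        self.seq_no = seq_no
        self.primary_term = primary_term

    def update(self, source, if_seq_no, if_primary_term):
        # Write only if the caller saw the latest state of the document.
        if (if_seq_no, if_primary_term) != (self.seq_no, self.primary_term):
            raise VersionConflict(
                f"expected seq_no={if_seq_no}, current seq_no={self.seq_no}"
            )
        self.source = source
        self.seq_no += 1  # simplification: one document, one increment per write

doc = ToyDocument({"name": "John Doe"}, seq_no=6, primary_term=1)
doc.update({"name": "1"}, if_seq_no=6, if_primary_term=1)  # step (1): succeeds

try:
    doc.update({"name": "2"}, if_seq_no=6, if_primary_term=1)  # step (2): stale
    conflict = False
except VersionConflict:
    conflict = True

doc.update({"name": "2"}, if_seq_no=7, if_primary_term=1)  # step (4): succeeds
print(conflict, doc.seq_no, doc.source)
```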
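(1) POST update of a document, with _update
http://192.168.137.14:9200/customer/external/1/_update
If the same update is executed again, nothing happens and the sequence number does not change.
With this form of POST update, the new data is compared with the existing document; if they are identical, no operation is performed and neither _version nor _seq_no changes.
(2) POST update of a document, without _update
In this case, repeating the same update still succeeds every time; the data is not compared with the existing document.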
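The difference between the two update styles can be sketched as follows. Note that the _update endpoint expects the partial document wrapped in a "doc" object in the request body. The two functions below are hypothetical helpers that imitate the behaviour with plain dictionaries; the merge is shallow here, which is a simplification.

```python
def partial_update(current: dict, doc: dict):
    """Imitates POST .../_update with {"doc": {...}}: merge, then detect no-ops."""
    merged = {**current, **doc}
    if merged == current:
        return current, "noop"      # nothing changed: version and seq_no stay put
    return merged, "updated"

def full_index(current: dict, doc: dict):
    """Imitates POST/PUT without _update: the document is always rewritten."""
    return dict(doc), "updated"

source = {"name": "John Doe"}
source, first = partial_update(source, {"name": "Jane"})
source, second = partial_update(source, {"name": "Jane"})
source, third = full_index(source, {"name": "Jane"})
print(first, second, third)
```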
DELETE customer/external/1
DELETE customer
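Note: Elasticsearch does not provide an operation for deleting a type; only indices and documents can be deleted.
Example: delete the document with id=1, then query it again
Example: delete the entire customer index
All indices before the deletion: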
green open .kibana_task_manager_1 KWLtjcKRRuaV9so_v15WYg 1 0 2 0 39.8kb 39.8kb
green open .apm-agent-configuration cuwCpJ5ER0OYsSgAJ7bVYA 1 0 0 0 283b 283b
green open .kibana_1 PqK_LdUYRpWMy4fK0tMSPw 1 0 7 0 31.2kb 31.2kb
yellow open customer nzDYCdnvQjSsapJrAIT8Zw 1 1 4 0 4.4kb 4.4kb
Delete the "customer" index
All indices after the deletion:
green open .kibana_task_manager_1 KWLtjcKRRuaV9so_v15WYg 1 0 2 0 39.8kb 39.8kb
green open .apm-agent-configuration cuwCpJ5ER0OYsSgAJ7bVYA 1 0 0 0 283b 283b
green open .kibana_1 PqK_LdUYRpWMy4fK0tMSPw 1 0 7 0 31.2kb 31.2kb
Syntax:
{action:{metadata}}\n
{request body }\n
{action:{metadata}}\n
{request body }\n
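In a bulk operation, when one action fails the others still go on to execute; the actions are independent of one another.
The bulk API executes all actions in order. If a single action fails for any reason, it continues with the remaining actions. When the bulk API returns, it reports the status of each action, in the same order they were sent, so you can check whether a particular action failed.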
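The newline-delimited format described above is easy to get wrong by hand, so it can help to generate it. The following is a minimal sketch; build_bulk_body and its tuple layout are assumptions made for illustration, not part of any Elasticsearch client.

```python
import json

def build_bulk_body(operations):
    """operations: iterable of (action, metadata, source_or_None) tuples."""
    lines = []
    for action, metadata, source in operations:
        lines.append(json.dumps({action: metadata}))  # the action/metadata line
        if source is not None:                        # delete has no body line
            lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline

body = build_bulk_body([
    ("index", {"_id": "1"}, {"name": "John Doe"}),
    ("index", {"_id": "2"}, {"name": "John Doe"}),
    ("delete", {"_id": "3"}, None),
])
print(body, end="")
```

The resulting string would be sent as the request body of POST customer/_bulk with the Content-Type application/x-ndjson.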
Example 1: indexing multiple documents
POST customer/external/_bulk
{"index":{"_id":"1"}}
{"name":"John Doe"}
{"index":{"_id":"2"}}
{"name":"John Doe"}
Result:
#! Deprecation: [types removal] Specifying types in bulk requests is deprecated.
{
"took" : 491,
"errors" : false,
"items" : [
{
"index" : {
"_index" : "customer",
"_type" : "external",
"_id" : "1",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1,
"status" : 201
}
},
{
"index" : {
"_index" : "customer",
"_type" : "external",
"_id" : "2",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 1,
"_primary_term" : 1,
"status" : 201
}
}
]
}
Example 2: bulk operations that are not scoped to a single index (each action names its own index)
POST /_bulk
{"delete":{"_index":"website","_type":"blog","_id":"123"}}
{"create":{"_index":"website","_type":"blog","_id":"123"}}
{"title":"my first blog post"}
{"index":{"_index":"website","_type":"blog"}}
{"title":"my second blog post"}
{"update":{"_index":"website","_type":"blog","_id":"123"}}
{"doc":{"title":"my updated blog post"}}
Result:
#! Deprecation: [types removal] Specifying types in bulk requests is deprecated.
{
"took" : 608,
"errors" : false,
"items" : [
{
"delete" : {
"_index" : "website",
"_type" : "blog",
"_id" : "123",
"_version" : 1,
"result" : "not_found",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1,
"status" : 404
}
},
{
"create" : {
"_index" : "website",
"_type" : "blog",
"_id" : "123",
"_version" : 2,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 1,
"_primary_term" : 1,
"status" : 201
}
},
{
"index" : {
"_index" : "website",
"_type" : "blog",
"_id" : "MCOs0HEBHYK_MJXUyYIz",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 2,
"_primary_term" : 1,
"status" : 201
}
},
{
"update" : {
"_index" : "website",
"_type" : "blog",
"_id" : "123",
"_version" : 3,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 3,
"_primary_term" : 1,
"status" : 200
}
}
]
}
A sample of fictitious JSON documents with customer bank account information has been prepared. Each document has the following schema.
{
"account_number": 1,
"balance": 39225,
"firstname": "Amber",
"lastname": "Duke",
"age": 32,
"gender": "M",
"address": "880 Holmes Lane",
"employer": "Pyrami",
"email": "[email protected]",
"city": "Brogan",
"state": "IL"
}
The test data is at https://github.com/elastic/elasticsearch/blob/master/docs/src/test/resources/accounts.json ; import it with:
POST bank/account/_bulk
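Elasticsearch supports two basic ways of searching: sending the search parameters in the REST request URI, or sending them in the REST request body.
Retrieving information
Searching with URI + request body: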
GET /bank/_search
{
"query": { "match_all": {} },
"sort": [
{ "account_number": "asc" },
{"balance":"desc"}
]
}
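Some HTTP client tools cannot send a request body with a GET request; in that case the search parameters can be passed in the URI: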
GET bank/_search?q=*&sort=account_number:asc
Response:
{
"took" : 235,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1000,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "0",
"_score" : null,
"_source" : {
"account_number" : 0,
"balance" : 16623,
"firstname" : "Bradshaw",
"lastname" : "Mckenzie",
"age" : 29,
"gender" : "F",
"address" : "244 Columbus Place",
"employer" : "Euron",
"email" : "[email protected]",
"city" : "Hobucken",
"state" : "CO"
},
"sort" : [
0
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "1",
"_score" : null,
"_source" : {
"account_number" : 1,
"balance" : 39225,
"firstname" : "Amber",
"lastname" : "Duke",
"age" : 32,
"gender" : "M",
"address" : "880 Holmes Lane",
"employer" : "Pyrami",
"email" : "[email protected]",
"city" : "Brogan",
"state" : "IL"
},
"sort" : [
1
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "2",
"_score" : null,
"_source" : {
"account_number" : 2,
"balance" : 28838,
"firstname" : "Roberta",
"lastname" : "Bender",
"age" : 22,
"gender" : "F",
"address" : "560 Kingsway Place",
"employer" : "Chillium",
"email" : "[email protected]",
"city" : "Bennett",
"state" : "LA"
},
"sort" : [
2
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "3",
"_score" : null,
"_source" : {
"account_number" : 3,
"balance" : 44947,
"firstname" : "Levine",
"lastname" : "Burks",
"age" : 26,
"gender" : "F",
"address" : "328 Wilson Avenue",
"employer" : "Amtap",
"email" : "[email protected]",
"city" : "Cochranville",
"state" : "HI"
},
"sort" : [
3
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "4",
"_score" : null,
"_source" : {
"account_number" : 4,
"balance" : 27658,
"firstname" : "Rodriquez",
"lastname" : "Flores",
"age" : 31,
"gender" : "F",
"address" : "986 Wyckoff Avenue",
"employer" : "Tourmania",
"email" : "[email protected]",
"city" : "Eastvale",
"state" : "HI"
},
"sort" : [
4
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "5",
"_score" : null,
"_source" : {
"account_number" : 5,
"balance" : 29342,
"firstname" : "Leola",
"lastname" : "Stewart",
"age" : 30,
"gender" : "F",
"address" : "311 Elm Place",
"employer" : "Diginetic",
"email" : "[email protected]",
"city" : "Fairview",
"state" : "NJ"
},
"sort" : [
5
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "6",
"_score" : null,
"_source" : {
"account_number" : 6,
"balance" : 5686,
"firstname" : "Hattie",
"lastname" : "Bond",
"age" : 36,
"gender" : "M",
"address" : "671 Bristol Street",
"employer" : "Netagy",
"email" : "[email protected]",
"city" : "Dante",
"state" : "TN"
},
"sort" : [
6
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "7",
"_score" : null,
"_source" : {
"account_number" : 7,
"balance" : 39121,
"firstname" : "Levy",
"lastname" : "Richard",
"age" : 22,
"gender" : "M",
"address" : "820 Logan Street",
"employer" : "Teraprene",
"email" : "[email protected]",
"city" : "Shrewsbury",
"state" : "MO"
},
"sort" : [
7
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "8",
"_score" : null,
"_source" : {
"account_number" : 8,
"balance" : 48868,
"firstname" : "Jan",
"lastname" : "Burns",
"age" : 35,
"gender" : "M",
"address" : "699 Visitation Place",
"employer" : "Glasstep",
"email" : "[email protected]",
"city" : "Wakulla",
"state" : "AZ"
},
"sort" : [
8
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "9",
"_score" : null,
"_source" : {
"account_number" : 9,
"balance" : 24776,
"firstname" : "Opal",
"lastname" : "Meadows",
"age" : 39,
"gender" : "M",
"address" : "963 Neptune Avenue",
"employer" : "Cedward",
"email" : "[email protected]",
"city" : "Olney",
"state" : "OH"
},
"sort" : [
9
]
}
]
}
}
(1) Only 10 documents are returned although 1000 match, because search results are paginated and the default page size is 10.
(2) For details on the response fields, see: https://www.elastic.co/guide/en/elasticsearch/reference/current/getting-started-search.html
The response also provides the following information about the search request:
took – how long it took Elasticsearch to run the query, in milliseconds
timed_out – whether or not the search request timed out
_shards – how many shards were searched and a breakdown of how many shards succeeded, failed, or were skipped
max_score – the score of the most relevant document found
hits.total.value – how many matching documents were found
hits.sort – the document's sort position (when not sorting by relevance score)
hits._score – the document's relevance score (not applicable when using match_all)
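Elasticsearch provides a JSON-style DSL for executing queries. It is called the Query DSL, and it is a very comprehensive query language.
The typical structure of a query clause: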
QUERY_NAME:{
ARGUMENT:VALUE,
ARGUMENT:VALUE,...
}
When the query targets a specific field, the structure is:
{
QUERY_NAME:{
FIELD_NAME:{
ARGUMENT:VALUE,
ARGUMENT:VALUE,...
}
}
}
GET bank/_search
{
"query": {
"match_all": {}
},
"from": 0,
"size": 5,
"sort": [
{
"account_number": {
"order": "desc"
}
}
]
}
query defines how to search; from and size together provide pagination; sort defines the ordering.
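Because from is an offset rather than a page number, a small helper can do the arithmetic. This is a sketch under the assumption that pages are numbered from 1; page_query is a hypothetical name, and the body it returns mirrors the request shown above.

```python
import json

def page_query(page: int, page_size: int, sort_field: str, order: str = "desc") -> dict:
    """Build a match_all search body for the given 1-based page."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {
        "query": {"match_all": {}},
        "from": (page - 1) * page_size,  # offset of the first hit on this page
        "size": page_size,
        "sort": [{sort_field: {"order": order}}],
    }

first_page = page_query(1, 5, "account_number")
third_page = page_query(3, 5, "account_number")
print(json.dumps(third_page, indent=2))
```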
GET bank/_search
{
"query": {
"match_all": {}
},
"from": 0,
"size": 5,
"sort": [
{
"account_number": {
"order": "desc"
}
}
],
"_source": ["balance","firstname"]
}
Result:
{
"took" : 18,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1000,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "999",
"_score" : null,
"_source" : {
"firstname" : "Dorothy",
"balance" : 6087
},
"sort" : [
999
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "998",
"_score" : null,
"_source" : {
"firstname" : "Letha",
"balance" : 16869
},
"sort" : [
998
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "997",
"_score" : null,
"_source" : {
"firstname" : "Combs",
"balance" : 25311
},
"sort" : [
997
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "996",
"_score" : null,
"_source" : {
"firstname" : "Andrews",
"balance" : 17541
},
"sort" : [
996
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "995",
"_score" : null,
"_source" : {
"firstname" : "Phelps",
"balance" : 21153
},
"sort" : [
995
]
}
]
}
}
GET bank/_search
{
"query": {
"match": {
"account_number": "20"
}
}
}
match returns the documents whose account_number is 20.
Result:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "20",
"_score" : 1.0,
"_source" : {
"account_number" : 20,
"balance" : 16418,
"firstname" : "Elinor",
"lastname" : "Ratliff",
"age" : 36,
"gender" : "M",
"address" : "282 Kings Place",
"employer" : "Scentric",
"email" : "[email protected]",
"city" : "Ribera",
"state" : "WA"
}
}
]
}
}
GET bank/_search
{
"query": {
"match": {
"address": "kings"
}
}
}
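This is a full-text search: the search condition is analyzed into terms and matched, and the results are sorted by relevance score.
Result: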
{
"took" : 30,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 5.990829,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "20",
"_score" : 5.990829,
"_source" : {
"account_number" : 20,
"balance" : 16418,
"firstname" : "Elinor",
"lastname" : "Ratliff",
"age" : 36,
"gender" : "M",
"address" : "282 Kings Place",
"employer" : "Scentric",
"email" : "[email protected]",
"city" : "Ribera",
"state" : "WA"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "722",
"_score" : 5.990829,
"_source" : {
"account_number" : 722,
"balance" : 27256,
"firstname" : "Roberts",
"lastname" : "Beasley",
"age" : 34,
"gender" : "F",
"address" : "305 Kings Hwy",
"employer" : "Quintity",
"email" : "[email protected]",
"city" : "Hayden",
"state" : "PA"
}
}
]
}
}
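match_phrase treats the value to match as a single phrase: its terms must all appear, adjacent and in the same order, rather than being matched independently.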
GET bank/_search
{
"query": {
"match_phrase": {
"address": "mill road"
}
}
}
Find all records whose address contains the phrase mill road, and return their relevance scores.
Result:
{
"took" : 32,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 8.926605,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "970",
"_score" : 8.926605,
"_source" : {
"account_number" : 970,
"balance" : 19648,
"firstname" : "Forbes",
"lastname" : "Wallace",
"age" : 28,
"gender" : "M",
"address" : "990 Mill Road",
"employer" : "Pheast",
"email" : "[email protected]",
"city" : "Lopezo",
"state" : "AK"
}
}
]
}
}
To see the difference between match_phrase and match, consider the following examples:
GET bank/_search
{
"query": {
"match_phrase": {
"address": "990 Mill"
}
}
}
Result:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 10.806405,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "970",
"_score" : 10.806405,
"_source" : {
"account_number" : 970,
"balance" : 19648,
"firstname" : "Forbes",
"lastname" : "Wallace",
"age" : 28,
"gender" : "M",
"address" : "990 Mill Road",
"employer" : "Pheast",
"email" : "[email protected]",
"city" : "Lopezo",
"state" : "AK"
}
}
]
}
}
Using match on the keyword sub-field:
GET bank/_search
{
"query": {
"match": {
"address.keyword": "990 Mill"
}
}
}
Result: nothing matches.
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
Change the match condition to "990 Mill Road":
GET bank/_search
{
"query": {
"match": {
"address.keyword": "990 Mill Road"
}
}
}
One document is returned:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 6.5032897,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "970",
"_score" : 6.5032897,
"_source" : {
"account_number" : 970,
"balance" : 19648,
"firstname" : "Forbes",
"lastname" : "Wallace",
"age" : 28,
"gender" : "M",
"address" : "990 Mill Road",
"employer" : "Pheast",
"email" : "[email protected]",
"city" : "Lopezo",
"state" : "AK"
}
}
]
}
}
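When a text field is matched through its keyword sub-field, the condition must equal the entire field value; it is an exact match.
match_phrase performs phrase matching: a document matches as long as its text contains the phrase.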
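To make the difference concrete, here is a toy model in Python. It is only a sketch: analyze is a crude stand-in for the standard analyzer (lowercase, split on non-alphanumeric characters), the three helper functions are hypothetical names rather than Elasticsearch APIs, and relevance scoring is ignored entirely.

```python
import re

def analyze(text: str) -> list:
    """Rough stand-in for the standard analyzer: lowercase and split into tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def match(field_value: str, query: str) -> bool:
    """match: any analyzed query token appears in the field."""
    field_tokens = set(analyze(field_value))
    return any(token in field_tokens for token in analyze(query))

def match_phrase(field_value: str, query: str) -> bool:
    """match_phrase: the query tokens appear adjacent and in order."""
    field_tokens, query_tokens = analyze(field_value), analyze(query)
    n = len(query_tokens)
    return n > 0 and any(
        field_tokens[i:i + n] == query_tokens
        for i in range(len(field_tokens) - n + 1)
    )

def keyword_match(field_value: str, query: str) -> bool:
    """address.keyword: the whole, unanalyzed value must be identical."""
    return field_value == query

address = "990 Mill Road"
print(match(address, "mill lane"))              # one token in common is enough
print(match_phrase(address, "990 Mill"))        # contiguous phrase
print(keyword_match(address, "990 Mill"))       # only part of the value
print(keyword_match(address, "990 Mill Road"))  # exact value
```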
GET bank/_search
{
"query": {
"multi_match": {
"query": "mill",
"fields": [
"state",
"address"
]
}
}
}
This matches documents whose state or address contains mill; the query string is analyzed during the search.
Result:
{
"took" : 28,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : 5.4032025,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "970",
"_score" : 5.4032025,
"_source" : {
"account_number" : 970,
"balance" : 19648,
"firstname" : "Forbes",
"lastname" : "Wallace",
"age" : 28,
"gender" : "M",
"address" : "990 Mill Road",
"employer" : "Pheast",
"email" : "[email protected]",
"city" : "Lopezo",
"state" : "AK"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "136",
"_score" : 5.4032025,
"_source" : {
"account_number" : 136,
"balance" : 45801,
"firstname" : "Winnie",
"lastname" : "Holland",
"age" : 38,
"gender" : "M",
"address" : "198 Mill Lane",
"employer" : "Neteria",
"email" : "[email protected]",
"city" : "Urie",
"state" : "IL"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "345",
"_score" : 5.4032025,
"_source" : {
"account_number" : 345,
"balance" : 9812,
"firstname" : "Parker",
"lastname" : "Hines",
"age" : 38,
"gender" : "M",
"address" : "715 Mill Avenue",
"employer" : "Baluba",
"email" : "[email protected]",
"city" : "Blackgum",
"state" : "KY"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "472",
"_score" : 5.4032025,
"_source" : {
"account_number" : 472,
"balance" : 25571,
"firstname" : "Lee",
"lastname" : "Long",
"age" : 32,
"gender" : "F",
"address" : "288 Mill Street",
"employer" : "Comverges",
"email" : "[email protected]",
"city" : "Movico",
"state" : "MT"
}
}
]
}
}
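A compound query can combine any other query clauses, including other compound queries. This means compound queries can be nested inside one another to express very complex logic.
must: all the conditions listed under must have to be satisfied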
GET bank/_search
{
"query":{
"bool":{
"must":[
{"match":{"address":"mill"}},
{"match":{"gender":"M"}}
]
}
}
}
must_not: none of the conditions listed under must_not may match.
should: the conditions listed under should ought to be satisfied.
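When such queries are assembled in application code, a small builder keeps the nesting readable. The following is a sketch only: bool_query is a hypothetical helper that returns a plain dictionary in the shape of the bool query, and it omits any clause that was not supplied.

```python
import json

def bool_query(must=None, must_not=None, should=None, filter=None) -> dict:
    """Assemble a bool query body, leaving out clauses that were not given."""
    clauses = {"must": must, "must_not": must_not, "should": should, "filter": filter}
    return {"query": {"bool": {name: value for name, value in clauses.items() if value}}}

q = bool_query(
    must=[{"match": {"gender": "M"}}, {"match": {"address": "mill"}}],
    must_not=[{"match": {"age": "38"}}],
    should=[{"match": {"lastname": "Wallace"}}],
)
print(json.dumps(q, indent=2))
```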
Example: find documents with gender=M whose address contains mill
GET bank/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"gender": "M"
}
},
{
"match": {
"address": "mill"
}
}
]
}
}
}
Result:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 6.0824604,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "970",
"_score" : 6.0824604,
"_source" : {
"account_number" : 970,
"balance" : 19648,
"firstname" : "Forbes",
"lastname" : "Wallace",
"age" : 28,
"gender" : "M",
"address" : "990 Mill Road",
"employer" : "Pheast",
"email" : "[email protected]",
"city" : "Lopezo",
"state" : "AK"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "136",
"_score" : 6.0824604,
"_source" : {
"account_number" : 136,
"balance" : 45801,
"firstname" : "Winnie",
"lastname" : "Holland",
"age" : 38,
"gender" : "M",
"address" : "198 Mill Lane",
"employer" : "Neteria",
"email" : "[email protected]",
"city" : "Urie",
"state" : "IL"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "345",
"_score" : 6.0824604,
"_source" : {
"account_number" : 345,
"balance" : 9812,
"firstname" : "Parker",
"lastname" : "Hines",
"age" : 38,
"gender" : "M",
"address" : "715 Mill Avenue",
"employer" : "Baluba",
"email" : "[email protected]",
"city" : "Blackgum",
"state" : "KY"
}
}
]
}
}
must_not: the document must not match the given condition
Example: find documents with gender=M whose address contains mill, but whose age is not 38
GET bank/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"gender": "M"
}
},
{
"match": {
"address": "mill"
}
}
],
"must_not": [
{
"match": {
"age": "38"
}
}
]
}
}
}
Result:
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 6.0824604,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "970",
"_score" : 6.0824604,
"_source" : {
"account_number" : 970,
"balance" : 19648,
"firstname" : "Forbes",
"lastname" : "Wallace",
"age" : 28,
"gender" : "M",
"address" : "990 Mill Road",
"employer" : "Pheast",
"email" : "[email protected]",
"city" : "Lopezo",
"state" : "AK"
}
}
]
}
}
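should: the document ought to satisfy the conditions listed under should. A match raises the relevance score of the document but does not change which documents are returned. However, if the query contains only should, with a single matching rule, the should condition is used as the default matching condition and therefore does change the result set.
Example: prefer documents whose lastname is Wallace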
GET bank/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"gender": "M"
}
},
{
"match": {
"address": "mill"
}
}
],
"must_not": [
{
"match": {
"age": "18"
}
}
],
"should": [
{
"match": {
"lastname": "Wallace"
}
}
]
}
}
}
Result:
{
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 12.585751,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "970",
"_score" : 12.585751,
"_source" : {
"account_number" : 970,
"balance" : 19648,
"firstname" : "Forbes",
"lastname" : "Wallace",
"age" : 28,
"gender" : "M",
"address" : "990 Mill Road",
"employer" : "Pheast",
"email" : "[email protected]",
"city" : "Lopezo",
"state" : "AK"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "136",
"_score" : 6.0824604,
"_source" : {
"account_number" : 136,
"balance" : 45801,
"firstname" : "Winnie",
"lastname" : "Holland",
"age" : 38,
"gender" : "M",
"address" : "198 Mill Lane",
"employer" : "Neteria",
"email" : "[email protected]",
"city" : "Urie",
"state" : "IL"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "345",
"_score" : 6.0824604,
"_source" : {
"account_number" : 345,
"balance" : 9812,
"firstname" : "Parker",
"lastname" : "Hines",
"age" : 38,
"gender" : "M",
"address" : "715 Mill Avenue",
"employer" : "Baluba",
"email" : "[email protected]",
"city" : "Blackgum",
"state" : "KY"
}
}
]
}
}
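As the result shows, the more relevant a document is, the higher its score.
Not every query needs to produce a score, in particular clauses that are used only to filter documents. To avoid computing scores unnecessarily, Elasticsearch automatically detects these cases and optimizes the execution of the query.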
GET bank/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"address": "mill"
}
}
],
"filter": {
"range": {
"balance": {
"gte": "10000",
"lte": "20000"
}
}
}
}
}
}
This first finds all documents whose address matches mill, then filters the results by 10000<=balance<=20000.
Result:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 5.4032025,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "970",
"_score" : 5.4032025,
"_source" : {
"account_number" : 970,
"balance" : 19648,
"firstname" : "Forbes",
"lastname" : "Wallace",
"age" : 28,
"gender" : "M",
"address" : "990 Mill Road",
"employer" : "Pheast",
"email" : "[email protected]",
"city" : "Lopezo",
"state" : "AK"
}
}
]
}
}
Each must, should, and must_not element in a Boolean query is referred to as a query clause. How well a document meets the criteria in each must or should clause contributes to the document's relevance score. The higher the score, the better the document matches your search criteria. By default, Elasticsearch returns documents ranked by these relevance scores.
The criteria in a must_not clause is treated as a filter. It affects whether or not the document is included in the results, but does not contribute to how documents are scored. You can also explicitly specify arbitrary filters to include or exclude documents based on structured data.
A filter clause does not compute a relevance score. With only a filter in the bool query:
GET bank/_search
{
"query": {
"bool": {
"filter": {
"range": {
"balance": {
"gte": "10000",
"lte": "20000"
}
}
}
}
}
}
Result:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 213,
"relation" : "eq"
},
"max_score" : 0.0,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "20",
"_score" : 0.0,
"_source" : {
"account_number" : 20,
"balance" : 16418,
"firstname" : "Elinor",
"lastname" : "Ratliff",
"age" : 36,
"gender" : "M",
"address" : "282 Kings Place",
"employer" : "Scentric",
"email" : "[email protected]",
"city" : "Ribera",
"state" : "WA"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "37",
"_score" : 0.0,
"_source" : {
"account_number" : 37,
"balance" : 18612,
"firstname" : "Mcgee",
"lastname" : "Mooney",
"age" : 39,
"gender" : "M",
"address" : "826 Fillmore Place",
"employer" : "Reversus",
"email" : "[email protected]",
"city" : "Tooleville",
"state" : "OK"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "51",
"_score" : 0.0,
"_source" : {
"account_number" : 51,
"balance" : 14097,
"firstname" : "Burton",
"lastname" : "Meyers",
"age" : 31,
"gender" : "F",
"address" : "334 River Street",
"employer" : "Bezal",
"email" : "[email protected]",
"city" : "Jacksonburg",
"state" : "MO"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "56",
"_score" : 0.0,
"_source" : {
"account_number" : 56,
"balance" : 14992,
"firstname" : "Josie",
"lastname" : "Nelson",
"age" : 32,
"gender" : "M",
"address" : "857 Tabor Court",
"employer" : "Emtrac",
"email" : "[email protected]",
"city" : "Sunnyside",
"state" : "UT"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "121",
"_score" : 0.0,
"_source" : {
"account_number" : 121,
"balance" : 19594,
"firstname" : "Acevedo",
"lastname" : "Dorsey",
"age" : 32,
"gender" : "M",
"address" : "479 Nova Court",
"employer" : "Netropic",
"email" : "[email protected]",
"city" : "Islandia",
"state" : "CT"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "176",
"_score" : 0.0,
"_source" : {
"account_number" : 176,
"balance" : 18607,
"firstname" : "Kemp",
"lastname" : "Walters",
"age" : 28,
"gender" : "F",
"address" : "906 Howard Avenue",
"employer" : "Eyewax",
"email" : "[email protected]",
"city" : "Why",
"state" : "KY"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "183",
"_score" : 0.0,
"_source" : {
"account_number" : 183,
"balance" : 14223,
"firstname" : "Hudson",
"lastname" : "English",
"age" : 26,
"gender" : "F",
"address" : "823 Herkimer Place",
"employer" : "Xinware",
"email" : "[email protected]",
"city" : "Robbins",
"state" : "ND"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "222",
"_score" : 0.0,
"_source" : {
"account_number" : 222,
"balance" : 14764,
"firstname" : "Rachelle",
"lastname" : "Rice",
"age" : 36,
"gender" : "M",
"address" : "333 Narrows Avenue",
"employer" : "Enaut",
"email" : "[email protected]",
"city" : "Wright",
"state" : "AZ"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "227",
"_score" : 0.0,
"_source" : {
"account_number" : 227,
"balance" : 19780,
"firstname" : "Coleman",
"lastname" : "Berg",
"age" : 22,
"gender" : "M",
"address" : "776 Little Street",
"employer" : "Exoteric",
"email" : "[email protected]",
"city" : "Eagleville",
"state" : "WV"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "272",
"_score" : 0.0,
"_source" : {
"account_number" : 272,
"balance" : 19253,
"firstname" : "Lilly",
"lastname" : "Morgan",
"age" : 25,
"gender" : "F",
"address" : "689 Fleet Street",
"employer" : "Biolive",
"email" : "[email protected]",
"city" : "Sunbury",
"state" : "OH"
}
}
]
}
}
Every document has "_score" : 0.0.
term, like match, matches the value of a field. Use match for full-text fields and term for non-text fields.
Avoid using the term query for text fields.
By default, Elasticsearch changes the values of text fields as part of analysis. This can make finding exact matches for text field values difficult.
To search text field values, use the match query instead.
https://www.elastic.co/guide/en/elasticsearch/reference/7.6/query-dsl-term-query.html
Querying with term:
GET bank/_search
{
"query": {
"term": {
"address": "mill Road"
}
}
}
Result:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
Not a single document matches.
After switching to match, 32 documents match.
In other words: use match for full-text fields and term for non-text fields.
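Why term finds nothing here can be illustrated with a toy model. This is a simplified sketch, assuming the text field was indexed with an analyzer that lowercases and splits on non-alphanumeric characters; term_hits and match_hits are hypothetical helpers, not Elasticsearch APIs.

```python
import re

def analyze(text: str) -> list:
    """Rough stand-in for the analysis applied to a text field at index time."""
    return re.findall(r"[a-z0-9]+", text.lower())

def term_hits(indexed_tokens: list, term: str) -> bool:
    """term: the query value is compared verbatim, with no analysis."""
    return term in indexed_tokens

def match_hits(indexed_tokens: list, query: str) -> bool:
    """match: the query value is analyzed first, then compared token by token."""
    return any(token in indexed_tokens for token in analyze(query))

indexed = analyze("990 Mill Road")  # tokens held in the index for this address
print(term_hits(indexed, "mill Road"))   # no single token equals "mill Road"
print(match_hits(indexed, "mill Road"))  # "mill" and "road" are both indexed
```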
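Aggregations provide the ability to group data and extract statistics from it. The simplest aggregations are roughly equivalent to SQL GROUP BY and the SQL aggregate functions. In Elasticsearch, a search can return hits and, at the same time, aggregation results that are kept separate from the hits in the response. This is powerful and efficient: you can run a query and multiple aggregations and get the results of both (or either) in a single request, avoiding network round trips with a concise, simplified API.
"size": 0 suppresses the search hits, so that only the aggregation results are shown.
aggs: runs the aggregations. The syntax is as follows:
"aggs":{
"aggs_name (the name of this aggregation, shown in the result set)":{
"AGG_TYPE (the aggregation type: avg, term, terms)":{}
}
},
Search for the age distribution, average age and average balance of everyone whose address contains mill, without showing the individual documents: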
GET bank/_search
{
"query": {
"match": {
"address": "Mill"
}
},
"aggs": {
"ageAgg": {
"terms": {
"field": "age",
"size": 10
}
},
"ageAvg": {
"avg": {
"field": "age"
}
},
"balanceAvg": {
"avg": {
"field": "balance"
}
}
},
"size": 0
}
Result:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"ageAgg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 38,
"doc_count" : 2
},
{
"key" : 28,
"doc_count" : 1
},
{
"key" : 32,
"doc_count" : 1
}
]
},
"ageAvg" : {
"value" : 34.0
},
"balanceAvg" : {
"value" : 25208.0
}
}
}
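The response above can be reproduced in plain Python, which may help relate aggregations to the GROUP BY analogy. This is a sketch rather than anything Elasticsearch runs: the four age and balance pairs are copied from the documents whose address contains mill in the earlier multi_match result.

```python
from collections import Counter

# age and balance of the four documents whose address contains "mill"
docs = [
    {"age": 28, "balance": 19648},
    {"age": 38, "balance": 45801},
    {"age": 38, "balance": 9812},
    {"age": 32, "balance": 25571},
]

# terms aggregation on age: one bucket per distinct value, with a document count
age_buckets = Counter(d["age"] for d in docs).most_common(10)

# avg aggregations on age and balance
age_avg = sum(d["age"] for d in docs) / len(docs)
balance_avg = sum(d["balance"] for d in docs) / len(docs)

print(age_buckets, age_avg, balance_avg)
```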
A more complex example:
Aggregate by age, and compute the average balance of the people in each age bucket
GET bank/_search
{
"query": {
"match_all": {}
},
"aggs": {
"ageAgg": {
"terms": {
"field": "age",
"size": 100
},
"aggs": {
"ageAvg": {
"avg": {
"field": "balance"
}
}
}
}
},
"size": 0
}
Output:
{
"took" : 49,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1000,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"ageAgg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 31,
"doc_count" : 61,
"ageAvg" : {
"value" : 28312.918032786885
}
},
{
"key" : 39,
"doc_count" : 60,
"ageAvg" : {
"value" : 25269.583333333332
}
},
{
"key" : 26,
"doc_count" : 59,
"ageAvg" : {
"value" : 23194.813559322032
}
},
{
"key" : 32,
"doc_count" : 52,
"ageAvg" : {
"value" : 23951.346153846152
}
},
{
"key" : 35,
"doc_count" : 52,
"ageAvg" : {
"value" : 22136.69230769231
}
},
{
"key" : 36,
"doc_count" : 52,
"ageAvg" : {
"value" : 22174.71153846154
}
},
{
"key" : 22,
"doc_count" : 51,
"ageAvg" : {
"value" : 24731.07843137255
}
},
{
"key" : 28,
"doc_count" : 51,
"ageAvg" : {
"value" : 28273.882352941175
}
},
{
"key" : 33,
"doc_count" : 50,
"ageAvg" : {
"value" : 25093.94
}
},
{
"key" : 34,
"doc_count" : 49,
"ageAvg" : {
"value" : 26809.95918367347
}
},
{
"key" : 30,
"doc_count" : 47,
"ageAvg" : {
"value" : 22841.106382978724
}
},
{
"key" : 21,
"doc_count" : 46,
"ageAvg" : {
"value" : 26981.434782608696
}
},
{
"key" : 40,
"doc_count" : 45,
"ageAvg" : {
"value" : 27183.17777777778
}
},
{
"key" : 20,
"doc_count" : 44,
"ageAvg" : {
"value" : 27741.227272727272
}
},
{
"key" : 23,
"doc_count" : 42,
"ageAvg" : {
"value" : 27314.214285714286
}
},
{
"key" : 24,
"doc_count" : 42,
"ageAvg" : {
"value" : 28519.04761904762
}
},
{
"key" : 25,
"doc_count" : 42,
"ageAvg" : {
"value" : 27445.214285714286
}
},
{
"key" : 37,
"doc_count" : 42,
"ageAvg" : {
"value" : 27022.261904761905
}
},
{
"key" : 27,
"doc_count" : 39,
"ageAvg" : {
"value" : 21471.871794871793
}
},
{
"key" : 38,
"doc_count" : 39,
"ageAvg" : {
"value" : 26187.17948717949
}
},
{
"key" : 29,
"doc_count" : 35,
"ageAvg" : {
"value" : 29483.14285714286
}
}
]
}
}
}
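The sub-aggregation above amounts to "group by age, then average the balance inside each group". A minimal sketch of that idea with plain Java collections follows; the Account class, the method name and the sample values are hypothetical and are not taken from the bank index.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical illustration of a terms aggregation with an avg sub-aggregation.
public class NestedAvgSketch {

    /** Hypothetical stand-in for a bank document; only the two fields used here. */
    public static final class Account {
        public final int age;
        public final long balance;

        public Account(int age, long balance) {
            this.age = age;
            this.balance = balance;
        }
    }

    /** Groups accounts by age (the buckets) and averages the balance in each bucket. */
    public static Map<Integer, Double> avgBalanceByAge(List<Account> accounts) {
        return accounts.stream().collect(
                Collectors.groupingBy(
                        (Account a) -> a.age,
                        TreeMap::new,
                        Collectors.averagingLong((Account a) -> a.balance)));
    }

    public static void main(String[] args) {
        // Made-up sample data, not real documents from the index.
        List<Account> accounts = Arrays.asList(
                new Account(31, 1000L),
                new Account(31, 3000L),
                new Account(39, 500L));
        System.out.println(avgBalanceByAge(accounts)); // {31=2000.0, 39=500.0}
    }
}
```

Unlike the response above, this sketch orders the groups by age rather than by document count, simply because a TreeMap is used.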
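Find the age distribution and, within each age bucket, the average balance for gender M, the average balance for gender F, and the overall average balance of that bucket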
GET bank/_search
{
"query": {
"match_all": {}
},
"aggs": {
"ageAgg": {
"terms": {
"field": "age",
"size": 100
},
"aggs": {
"genderAgg": {
"terms": {
"field": "gender.keyword"
},
"aggs": {
"balanceAvg": {
"avg": {
"field": "balance"
}
}
}
},
"ageBalanceAvg": {
"avg": {
"field": "balance"
}
}
}
}
},
"size": 0
}
Output:
{
"took" : 119,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1000,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"ageAgg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 31,
"doc_count" : 61,
"genderAgg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "M",
"doc_count" : 35,
"balanceAvg" : {
"value" : 29565.628571428573
}
},
{
"key" : "F",
"doc_count" : 26,
"balanceAvg" : {
"value" : 26626.576923076922
}
}
]
},
"ageBalanceAvg" : {
"value" : 28312.918032786885
}
}
]
....... // other buckets omitted
}
}
}
Mapping
A mapping defines how a document and the fields it contains are stored and indexed. For example, the mapping of the bank index looks like this:
{
"bank" : {
"mappings" : {
"properties" : {
"account_number" : {
"type" : "long"
},
"address" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"age" : {
"type" : "long"
},
"balance" : {
"type" : "long"
},
"city" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"email" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"employer" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"firstname" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"gender" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"lastname" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"state" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
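Elasticsearch 7: removal of the type concept
In a relational database two tables are independent: even if they contain columns with the same name, neither affects the other. Elasticsearch does not work this way. It is a search engine built on Lucene, and fields with the same name under different types of one index are ultimately handled as the same field in Lucene.
In Elasticsearch 7.x the type parameter in the URL is optional. For example, indexing a document no longer requires a document type.
Elasticsearch 8.x no longer supports the type parameter in the URL.
Solution:
Migrate indices from multiple types to a single type, giving each document type its own index.
Migrate all the data stored under the types of an existing index to the designated new index. See the data migration steps for details.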
Elasticsearch 7.x
- Specifying types in requests is deprecated. For instance, indexing a document no longer requires a document type. The new index APIs are PUT {index}/_doc/{id} in case of explicit ids and POST {index}/_doc for auto-generated ids. Note that in 7.0, _doc is a permanent part of the path, and represents the endpoint name rather than the document type.
- The include_type_name parameter in the index creation, index template, and mapping APIs will default to false. Setting the parameter at all will result in a deprecation warning.
- The _default_ mapping type is removed.
Elasticsearch 8.x
- Specifying types in requests is no longer supported.
- The include_type_name parameter is removed.
Create an index and specify its mapping
PUT /my_index
{
"mappings": {
"properties": {
"age": {
"type": "integer"
},
"email": {
"type": "keyword"
},
"name": {
"type": "text"
}
}
}
}
Output:
{
"acknowledged" : true,
"shards_acknowledged" : true,
"index" : "my_index"
}
GET /my_index
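Output (captured after the employee-id field had been added with the PUT /my_index/_mapping request shown further down):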
{
"my_index" : {
"aliases" : { },
"mappings" : {
"properties" : {
"age" : {
"type" : "integer"
},
"email" : {
"type" : "keyword"
},
"employee-id" : {
"type" : "keyword",
"index" : false
},
"name" : {
"type" : "text"
}
}
},
"settings" : {
"index" : {
"creation_date" : "1588410780774",
"number_of_shards" : "1",
"number_of_replicas" : "1",
"uuid" : "ua0lXhtkQCOmn7Kh3iUu0w",
"version" : {
"created" : "7060299"
},
"provided_name" : "my_index"
}
}
}
}
PUT /my_index/_mapping
{
"properties": {
"employee-id": {
"type": "keyword",
"index": false
}
}
}
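Here "index": false means the new field is not indexed and so cannot be searched; it is only kept as a redundant field.
The mapping of a field that already exists cannot be updated. To change one, create a new index with the correct mapping and migrate the data into it.
First create new_twitter with the correct mapping, then migrate the data as follows (the _reindex endpoint name is fixed):
POST _reindex
{
"source":{
"index":"twitter"
},
"dest":{
"index":"new_twitter"
}
}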
Migrate the data stored under a type of the old index (again, the _reindex endpoint name is fixed):
POST _reindex
{
"source":{
"index":"twitter",
"type":"twitter"
},
"dest":{
"index":"new_twitter"
}
}
For more details, see: https://www.elastic.co/guide/en/elasticsearch/reference/7.6/docs-reindex.html
GET /bank/_search
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1000,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "bank",
"_type" : "account", // the type is account
"_id" : "1",
"_score" : 1.0,
"_source" : {
"account_number" : 1,
"balance" : 39225,
"firstname" : "Amber",
"lastname" : "Duke",
"age" : 32,
"gender" : "M",
"address" : "880 Holmes Lane",
"employer" : "Pyrami",
"email" : "amberduke@pyrami.com",
"city" : "Brogan",
"state" : "IL"
}
},
...
GET /bank/_search
We want to change the type of age to integer, so create a new index with the desired mapping:
PUT /newbank
{
"mappings": {
"properties": {
"account_number": {
"type": "long"
},
"address": {
"type": "text"
},
"age": {
"type": "integer"
},
"balance": {
"type": "long"
},
"city": {
"type": "keyword"
},
"email": {
"type": "keyword"
},
"employer": {
"type": "keyword"
},
"firstname": {
"type": "text"
},
"gender": {
"type": "keyword"
},
"lastname": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"state": {
"type": "keyword"
}
}
}
}
View the mapping of "newbank":
GET /newbank/_mapping
The mapping type of age is now integer.
Migrate the data from bank to newbank
POST _reindex
{
"source": {
"index": "bank",
"type": "account"
},
"dest": {
"index": "newbank"
}
}
Output:
#! Deprecation: [types removal] Specifying types in reindex requests is deprecated.
{
"took" : 768,
"timed_out" : false,
"total" : 1000,
"updated" : 0,
"created" : 1000,
"deleted" : 0,
"batches" : 1,
"version_conflicts" : 0,
"noops" : 0,
"retries" : {
"bulk" : 0,
"search" : 0
},
"throttled_millis" : 0,
"requests_per_second" : -1.0,
"throttled_until_millis" : 0,
"failures" : [ ]
}
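The migrated data can then be viewed in newbank, for example with GET /newbank/_search.
A tokenizer receives a stream of characters, breaks it up into individual tokens (usually individual words), and outputs a stream of tokens.
For example, the whitespace tokenizer splits text whenever it encounters whitespace. It would split the text "Quick brown fox!" into [Quick, brown, fox!].
The tokenizer is also responsible for recording the order or position of each term (used for phrase and word proximity queries) and the start and end character offsets of the original word that the term represents (used for highlighting search results).
Elasticsearch provides many built-in tokenizers, which can be used to build custom analyzers.
About analysis: https://www.elastic.co/guide/en/elasticsearch/reference/7.6/analysis.html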
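The whitespace-splitting behaviour described above, together with the position and offset bookkeeping, can be sketched in a few lines of plain Java. This is only an illustration under simple assumptions: the class and method names are made up, and it is not how Elasticsearch or Lucene implement their tokenizers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not an Elasticsearch or Lucene class.
public class WhitespaceTokenizerSketch {

    /** One emitted token: its text, its ordinal position, and its character offsets. */
    public static final class Token {
        public final String text;
        public final int position;
        public final int startOffset;
        public final int endOffset; // exclusive

        Token(String text, int position, int startOffset, int endOffset) {
            this.text = text;
            this.position = position;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }

        @Override
        public String toString() {
            return text + "@" + position + "[" + startOffset + "," + endOffset + ")";
        }
    }

    /** Splits the input on whitespace and records position and offsets per token. */
    public static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        int position = 0;
        int start = -1; // start offset of the token being read, or -1 between tokens
        for (int i = 0; i <= input.length(); i++) {
            boolean boundary = i == input.length() || Character.isWhitespace(input.charAt(i));
            if (!boundary && start < 0) {
                start = i; // first character of a new token
            } else if (boundary && start >= 0) {
                tokens.add(new Token(input.substring(start, i), position++, start, i));
                start = -1;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints Quick@0[0,5), brown@1[6,11) and fox!@2[12,16), one per line.
        for (Token t : tokenize("Quick brown fox!")) {
            System.out.println(t);
        }
    }
}
```

Note that the punctuation stays attached to "fox!", because this tokenizer splits on whitespace only.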
POST _analyze
{
"analyzer": "standard",
"text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
Result:
{
"tokens" : [
{
"token" : "the",
"start_offset" : 0,
"end_offset" : 3,
"type" : "<ALPHANUM>",
"position" : 0
},
{
"token" : "2",
"start_offset" : 4,
"end_offset" : 5,
"type" : "<NUM>",
"position" : 1
},
{
"token" : "quick",
"start_offset" : 6,
"end_offset" : 11,
"type" : "<ALPHANUM>",
"position" : 2
},
{
"token" : "brown",
"start_offset" : 12,
"end_offset" : 17,
"type" : "<ALPHANUM>",
"position" : 3
},
{
"token" : "foxes",
"start_offset" : 18,
"end_offset" : 23,
"type" : "<ALPHANUM>",
"position" : 4
},
{
"token" : "jumped",
"start_offset" : 24,
"end_offset" : 30,
"type" : "<ALPHANUM>",
"position" : 5
},
{
"token" : "over",
"start_offset" : 31,
"end_offset" : 35,
"type" : "<ALPHANUM>",
"position" : 6
},
{
"token" : "the",
"start_offset" : 36,
"end_offset" : 39,
"type" : "<ALPHANUM>",
"position" : 7
},
{
"token" : "lazy",
"start_offset" : 40,
"end_offset" : 44,
"type" : "<ALPHANUM>",
"position" : 8
},
{
"token" : "dog's",
"start_offset" : 45,
"end_offset" : 50,
"type" : "<ALPHANUM>",
"position" : 9
},
{
"token" : "bone",
"start_offset" : 51,
"end_offset" : 55,
"type" : "<ALPHANUM>",
"position" : 10
}
]
}
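By default, text in every language is analyzed with the Standard Analyzer, which does not handle Chinese well. A Chinese analyzer therefore needs to be installed.
Note: the default elasticsearch-plugin install xxx.zip cannot be used here for automatic installation.
Download the release that matches your Elasticsearch version from https://github.com/medcl/elasticsearch-analysis-ik/releases/download
When elasticsearch was installed earlier, the container directory "/usr/share/elasticsearch/plugins" was mapped to the host directory "/mydata/elasticsearch/plugins", so the convenient approach is to download "elasticsearch-analysis-ik-7.6.2.zip" and unzip it into a subdirectory (for example ik) of that folder. After the installation, restart the elasticsearch container.
Alternatively, if you do not mind a few more steps, install it from inside the container as follows.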
[root@hadoop-104 ~]# curl http://localhost:9200
{
"name" : "0adeb7852e00",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "9gglpP0HTfyOTRAaSe2rIg",
"version" : {
"number" : "7.6.2", # the version number is 7.6.2
"build_flavor" : "default",
"build_type" : "docker",
"build_hash" : "ef48eb35cf30adf4db14086e8aabd07ef6fb113f",
"build_date" : "2020-03-26T06:34:37.794943Z",
"build_snapshot" : false,
"lucene_version" : "8.4.0",
"minimum_wire_compatibility_version" : "6.8.0",
"minimum_index_compatibility_version" : "6.0.0-beta1"
},
"tagline" : "You Know, for Search"
}
[root@hadoop-104 ~]#
[root@hadoop-104 ~]# docker exec -it elasticsearch /bin/bash
[root@0adeb7852e00 elasticsearch]#
[root@0adeb7852e00 elasticsearch]# pwd
/usr/share/elasticsearch
#download IK 7.6.2
[root@0adeb7852e00 elasticsearch]# wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.6.2/elasticsearch-analysis-ik-7.6.2.zip
[root@0adeb7852e00 elasticsearch]# unzip elasticsearch-analysis-ik-7.6.2.zip -d ik
Archive: elasticsearch-analysis-ik-7.6.2.zip
creating: ik/config/
inflating: ik/config/main.dic
inflating: ik/config/quantifier.dic
inflating: ik/config/extra_single_word_full.dic
inflating: ik/config/IKAnalyzer.cfg.xml
inflating: ik/config/surname.dic
inflating: ik/config/suffix.dic
inflating: ik/config/stopword.dic
inflating: ik/config/extra_main.dic
inflating: ik/config/extra_stopword.dic
inflating: ik/config/preposition.dic
inflating: ik/config/extra_single_word_low_freq.dic
inflating: ik/config/extra_single_word.dic
inflating: ik/elasticsearch-analysis-ik-7.6.2.jar
inflating: ik/httpclient-4.5.2.jar
inflating: ik/httpcore-4.4.4.jar
inflating: ik/commons-logging-1.2.jar
inflating: ik/commons-codec-1.9.jar
inflating: ik/plugin-descriptor.properties
inflating: ik/plugin-security.policy
[root@0adeb7852e00 elasticsearch]#
#move it into the plugins directory
[root@0adeb7852e00 elasticsearch]# mv ik plugins/
[root@0adeb7852e00 elasticsearch]# rm -rf elasticsearch-analysis-ik-7.6.2.zip
Verify that the analyzer is installed
Using the default analyzer:
GET my_index/_analyze
{
"text":"我是中國人"
}
Observe the result:
{
"tokens" : [
{
"token" : "我",
"start_offset" : 0,
"end_offset" : 1,
"type" : "<IDEOGRAPHIC>",
"position" : 0
},
{
"token" : "是",
"start_offset" : 1,
"end_offset" : 2,
"type" : "<IDEOGRAPHIC>",
"position" : 1
},
{
"token" : "中",
"start_offset" : 2,
"end_offset" : 3,
"type" : "<IDEOGRAPHIC>",
"position" : 2
},
{
"token" : "國",
"start_offset" : 3,
"end_offset" : 4,
"type" : "<IDEOGRAPHIC>",
"position" : 3
},
{
"token" : "人",
"start_offset" : 4,
"end_offset" : 5,
"type" : "<IDEOGRAPHIC>",
"position" : 4
}
]
}
GET my_index/_analyze
{
"analyzer": "ik_smart",
"text":"我是中國人"
}
Output:
{
"tokens" : [
{
"token" : "我",
"start_offset" : 0,
"end_offset" : 1,
"type" : "CN_CHAR",
"position" : 0
},
{
"token" : "是",
"start_offset" : 1,
"end_offset" : 2,
"type" : "CN_CHAR",
"position" : 1
},
{
"token" : "中國人",
"start_offset" : 2,
"end_offset" : 5,
"type" : "CN_WORD",
"position" : 2
}
]
}
GET my_index/_analyze
{
"analyzer": "ik_max_word",
"text":"我是中國人"
}
Output:
{
"tokens" : [
{
"token" : "我",
"start_offset" : 0,
"end_offset" : 1,
"type" : "CN_CHAR",
"position" : 0
},
{
"token" : "是",
"start_offset" : 1,
"end_offset" : 2,
"type" : "CN_CHAR",
"position" : 1
},
{
"token" : "中國人",
"start_offset" : 2,
"end_offset" : 5,
"type" : "CN_WORD",
"position" : 2
},
{
"token" : "中國",
"start_offset" : 2,
"end_offset" : 4,
"type" : "CN_WORD",
"position" : 3
},
{
"token" : "國人",
"start_offset" : 3,
"end_offset" : 5,
"type" : "CN_WORD",
"position" : 4
}
]
}
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<comment>IK Analyzer extension configuration</comment>
<!-- Configure your own extension dictionary here -->
<entry key="ext_dict"></entry>
<!-- Configure your own extension stopword dictionary here -->
<entry key="ext_stopwords"></entry>
<!-- Configure a remote extension dictionary here -->
<entry key="remote_ext_dict">http://192.168.137.14/es/fenci.txt</entry>
<!-- Configure a remote extension stopword dictionary here -->
<!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>
The original XML:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<comment>IK Analyzer extension configuration</comment>
<!-- Configure your own extension dictionary here -->
<entry key="ext_dict"></entry>
<!-- Configure your own extension stopword dictionary here -->
<entry key="ext_stopwords"></entry>
<!-- Configure a remote extension dictionary here -->
<!-- <entry key="remote_ext_dict">words_location</entry> -->
<!-- Configure a remote extension stopword dictionary here -->
<!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>
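After the change, restart the elasticsearch container; otherwise the change does not take effect.
After the update, Elasticsearch applies the new dictionary only to newly indexed data; existing data is not re-analyzed. To re-analyze existing data, run:
POST my_index/_update_by_query?conflicts=proceed
http://192.168.137.14/es/fenci.txt is the URL under which nginx serves the dictionary file.
Before running the example below, install nginx (see the nginx installation steps), then create the "fenci.txt" file under the es directory of the nginx web root so that it matches the URL above, with the following content:
mkdir -p /mydata/nginx/html/es
echo "櫻桃薩其馬,帶你甜蜜入夏" > /mydata/nginx/html/es/fenci.txt
Test the result: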
GET my_index/_analyze
{
"analyzer": "ik_max_word",
"text":"櫻桃薩其馬,帶你甜蜜入夏"
}
Output:
{
"tokens" : [
{
"token" : "櫻桃",
"start_offset" : 0,
"end_offset" : 2,
"type" : "CN_WORD",
"position" : 0
},
{
"token" : "薩其馬",
"start_offset" : 2,
"end_offset" : 5,
"type" : "CN_WORD",
"position" : 1
},
{
"token" : "帶你",
"start_offset" : 6,
"end_offset" : 8,
"type" : "CN_WORD",
"position" : 2
},
{
"token" : "甜蜜",
"start_offset" : 8,
"end_offset" : 10,
"type" : "CN_WORD",
"position" : 3
},
{
"token" : "入夏",
"start_offset" : 10,
"end_offset" : 12,
"type" : "CN_WORD",
"position" : 4
}
]
}
Start a throwaway nginx instance, only to copy its configuration out
docker run -p80:80 --name nginx -d nginx:1.10
Copy the configuration files from the container to /mydata/nginx/conf/
mkdir -p /mydata/nginx/html
mkdir -p /mydata/nginx/logs
mkdir -p /mydata/nginx/conf
docker container cp nginx:/etc/nginx /mydata/nginx/conf/
#The copy creates an nginx folder inside conf, so move its contents up into conf
mv /mydata/nginx/conf/nginx/* /mydata/nginx/conf/
rm -rf /mydata/nginx/conf/nginx
Stop the original container:
docker stop nginx
Remove the original container:
docker rm nginx
Create the new nginx container with the following command
docker run -p 80:80 --name nginx \
-v /mydata/nginx/html:/usr/share/nginx/html \
-v /mydata/nginx/logs:/var/log/nginx \
-v /mydata/nginx/conf/:/etc/nginx \
-d nginx:1.10
Make nginx start automatically on boot
docker update nginx --restart=always
Create the file "/mydata/nginx/html/index.html" to test that nginx can be accessed
echo '<h2>hello nginx!</h2>' > /mydata/nginx/html/index.html
Access: http://<IP of the nginx host>:80/index.html
The client version here must match the installed Elasticsearch version.
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<version>7.6.2</version>
</dependency>
The Elasticsearch version managed by spring-boot-dependencies is 6.8.7
<elasticsearch.version>6.8.7</elasticsearch.version>
It needs to be overridden to 7.6.2 in the project
<properties>
...
<elasticsearch.version>7.6.2</elasticsearch.version>
</properties>
https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-document-index.html
@Test
public void indexData() throws IOException {
IndexRequest indexRequest = new IndexRequest ("users");
User user = new User();
user.setUserName("張三");
user.setAge(20);
user.setGender("男");
String jsonString = JSON.toJSONString(user);
// set the content to store
indexRequest.source(jsonString, XContentType.JSON);
// create the index if needed and store the document
IndexResponse index = client.index(indexRequest, GulimallElasticSearchConfig.COMMON_OPTIONS);
System.out.println(index);
}
https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-search.html
@Test
public void searchData() throws IOException {
GetRequest getRequest = new GetRequest(
"users",
"_-2vAHIB0nzmLJLkxKWk");
GetResponse getResponse = client.get(getRequest, RequestOptions.DEFAULT);
System.out.println(getResponse);
String index = getResponse.getIndex();
System.out.println(index);
String id = getResponse.getId();
System.out.println(id);
if (getResponse.isExists()) {
long version = getResponse.getVersion();
System.out.println(version);
String sourceAsString = getResponse.getSourceAsString();
System.out.println(sourceAsString);
Map<String, Object> sourceAsMap = getResponse.getSourceAsMap();
System.out.println(sourceAsMap);
byte[] sourceAsBytes = getResponse.getSourceAsBytes();
} else {
}
}
Query the documents with state="AK":
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 22, // 22 documents matched
"relation": "eq"
},
"max_score": 3.7952394,
"hits": [{
"_index": "bank",
"_type": "account",
"_id": "210",
"_score": 3.7952394,
"_source": {
"account_number": 210,
"balance": 33946,
"firstname": "Cherry",
"lastname": "Carey",
"age": 24,
"gender": "M",
"address": "539 Tiffany Place",
"employer": "Martgo",
"email": "cherrycarey@martgo.com",
"city": "Fairacres",
"state": "AK"
}
},
.... // other hits omitted
]
}
}
Search for everyone whose address contains mill and return their age distribution, average age and average balance
GET bank/_search
{
"query": {
"match": {
"address": "Mill"
}
},
"aggs": {
"ageAgg": {
"terms": {
"field": "age",
"size": 10
}
},
"ageAvg": {
"avg": {
"field": "age"
}
},
"balanceAvg": {
"avg": {
"field": "balance"
}
}
}
}
Java implementation
/**
* Complex search: in bank, find everyone whose address contains mill, with their age distribution, average age and average balance
* @throws IOException
*/
@Test
public void searchData() throws IOException {
//1. Create the search request
SearchRequest searchRequest = new SearchRequest();
//1.1) Specify the index
searchRequest.indices("bank");
//1.2) Build the search conditions
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(QueryBuilders.matchQuery("address","Mill"));
//1.2.1) Aggregate by age distribution
TermsAggregationBuilder ageAgg=AggregationBuilders.terms("ageAgg").field("age").size(10);
sourceBuilder.aggregation(ageAgg);
//1.2.2) Compute the average age
AvgAggregationBuilder ageAvg = AggregationBuilders.avg("ageAvg").field("age");
sourceBuilder.aggregation(ageAvg);
//1.2.3) Compute the average balance
AvgAggregationBuilder balanceAvg = AggregationBuilders.avg("balanceAvg").field("balance");
sourceBuilder.aggregation(balanceAvg);
System.out.println("Search request: "+sourceBuilder);
searchRequest.source(sourceBuilder);
//2. Execute the search
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
System.out.println("Search response: "+searchResponse);
//3. Convert the hits into beans
SearchHits hits = searchResponse.getHits();
SearchHit[] searchHits = hits.getHits();
for (SearchHit searchHit : searchHits) {
String sourceAsString = searchHit.getSourceAsString();
Account account = JSON.parseObject(sourceAsString, Account.class);
System.out.println(account);
}
//4. Read the aggregation results
Aggregations aggregations = searchResponse.getAggregations();
Terms ageAgg1 = aggregations.get("ageAgg");
for (Terms.Bucket bucket : ageAgg1.getBuckets()) {
String keyAsString = bucket.getKeyAsString();
System.out.println("Age: "+keyAsString+" ==> "+bucket.getDocCount());
}
Avg ageAvg1 = aggregations.get("ageAvg");
System.out.println("Average age: "+ageAvg1.getValue());
Avg balanceAvg1 = aggregations.get("balanceAvg");
System.out.println("Average balance: "+balanceAvg1.getValue());
}
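Try comparing the printed request and response with the earlier Elasticsearch query and its result.
Ctrl+Home: jump to the start of the document;
Ctrl+End: jump to the end of the document.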