FreezeJ' Blog

Elasticsearch 7.x Python client

2022-04-13

Official API documentation: https://elasticsearch-py.readthedocs.io/en/v7.14.2/

Setup and initialization

Install the module:

pip3 install elasticsearch==7.14

Test the connection:

from elasticsearch import Elasticsearch
from pprint import pprint

es = Elasticsearch(
    ['xxxxxx.com'],
    http_auth=('username', 'password'),  # replace with your own credentials
    scheme="https",
    port=443,
)
pprint(es.info())

Result:

{'cluster_name': 'elasticsearch',
 'cluster_uuid': 'xxxxxxxxxxxxxx',
 'name': 'xxxxxxxxx',
 'tagline': 'You Know, for Search',
 'version': {'build_date': '2021-09-15T10:18:09.722761972Z',
             'build_flavor': 'default',
             'build_hash': '6bc13727ce758c0e943c3c21653b3da82f627f75',
             'build_snapshot': False,
             'build_type': 'tar',
             'lucene_version': '8.9.0',
             'minimum_index_compatibility_version': '6.0.0-beta1',
             'minimum_wire_compatibility_version': '6.8.0',
             'number': '7.14.2'}}
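
Since the client and server major versions should match, it can be handy to check the 'number' field of this response before doing anything else. The following is only a minimal sketch: parse_version and is_compatible are hypothetical helper names, not part of elasticsearch-py, and the sample dictionary stands in for a real es.info() response.

```python
def parse_version(info):
    """Return the server version from an es.info()-style dict as a tuple of ints."""
    number = info["version"]["number"]  # e.g. '7.14.2'
    # Drop any suffix such as '-SNAPSHOT' before splitting on dots.
    core = number.split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def is_compatible(info, client_major=7):
    """True when the server's major version equals the client's major version."""
    return parse_version(info)[0] == client_major


# Stand-in for the dict returned by es.info(); only the field we need.
sample_info = {"version": {"number": "7.14.2"}}
print(parse_version(sample_info))
print(is_compatible(sample_info))
```

With a live connection, the same check would presumably be is_compatible(es.info()).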

If you see a message beginning "The client is unable to verify that the server is Elasticsearch due security", enable authentication in the configuration file elasticsearch.yml:

http.cors.enabled: true
http.cors.allow-origin: "*"
http.cors.allow-headers: Authorization
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true

Then set the Elasticsearch passwords:

./bin/elasticsearch-setup-passwords auto  # generate random passwords automatically
./bin/elasticsearch-setup-passwords interactive  # enter your own passwords
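
Once passwords exist, the credentials passed to http_auth do not have to be hard-coded in the script. Here is a rough sketch that builds the same connection arguments used above from environment variables. The variable names ES_HOST, ES_USER, ES_PASSWORD, ES_SCHEME and ES_PORT are my own assumption, as is the helper name client_kwargs_from_env; only hosts, http_auth, scheme and port come from the connection code in this post.

```python
import os


def client_kwargs_from_env(env=None):
    """Build keyword arguments for Elasticsearch(...) from environment variables.

    The ES_* variable names are hypothetical; pick whatever fits your setup.
    """
    env = os.environ if env is None else env
    return {
        "hosts": [env["ES_HOST"]],
        "http_auth": (env["ES_USER"], env["ES_PASSWORD"]),
        # Default to HTTPS on 443, matching the connection example in this post.
        "scheme": env.get("ES_SCHEME", "https"),
        "port": int(env.get("ES_PORT", "443")),
    }


# Demonstration with an explicit dict instead of the real environment.
demo_env = {"ES_HOST": "xxxxxx.com", "ES_USER": "elastic", "ES_PASSWORD": "secret"}
print(client_kwargs_from_env(demo_env))
```

The result can then be unpacked into the constructor, e.g. es = Elasticsearch(**client_kwargs_from_env()).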

Index operations

from elasticsearch import Elasticsearch
from pprint import pprint
from datetime import datetime

es = Elasticsearch(
    ['xxxxxx.com'],
    http_auth=('username', 'password'),  # replace with your own credentials
    scheme="https",
    port=443,
)

# Insert a document (the index is created automatically if it does not exist)
doc = {
    'author': 'test',
    'text': 'this is a test',
    'timestamp': datetime.now(),
}
res = es.index(index="test-index", id=1, body=doc)
res = es.index(index="test-index", body=doc)  # without an id, one is generated automatically
print(res['result'])

# Fetch a document by id
res = es.get(index="test-index", id=1)
print(res['_source'])

# Refresh the index so the new documents become searchable
es.indices.refresh(index="test-index")

# Search the index
res = es.search(index="test-index", body={
  "query": {
    "match_all": {}
  }
})
print("Got %d Hits:" % res['hits']['total']['value'])
for hit in res['hits']['hits']:
    print("%(timestamp)s %(author)s: %(text)s" % hit["_source"])
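
The print loop above raises a KeyError if any document lacks one of the three fields. A slightly more forgiving variant might look like the sketch below. summarize_hits is a hypothetical helper of mine, and sample_res is a hand-written dict that only mimics the parts of a 7.x search response used here, so it runs without a cluster.

```python
def summarize_hits(res):
    """Return (total, lines) for a 7.x-style search response dict."""
    total = res["hits"]["total"]["value"]
    lines = []
    for hit in res["hits"]["hits"]:
        src = hit.get("_source", {})
        # Fall back to placeholders instead of failing on missing fields.
        lines.append("%s %s: %s" % (
            src.get("timestamp", "?"),
            src.get("author", "?"),
            src.get("text", ""),
        ))
    return total, lines


# Hand-written stand-in for the value returned by es.search(...).
sample_res = {
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "hits": [
            {"_source": {"timestamp": "2022-04-13T10:00:00",
                         "author": "test",
                         "text": "this is a test"}},
            {"_source": {"text": "no author or timestamp"}},
        ],
    }
}

total, lines = summarize_hits(sample_res)
print("Got %d Hits:" % total)
for line in lines:
    print(line)
```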

This is only a quick test for now; I will add more demos as I use the client further.