Syncing data from different tables into one index without using a join

Source: 9-13 [Stage Summary] Upgrading data ingestion with ES tools - index building

白平衡

2020-06-18

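Scenario: I have two tables that store different data, but both have a teacher_id column. I want to merge the two tables into a single index without using a join (my company does not allow join queries when syncing ES index documents).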

白平衡

Asker

2020-06-22

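Solution: I'm back. I dug through the official documentation and the problem is solved. The solution is as follows:

Scenario: without joins, data from multiple tables maps to one index. Configure multiple conf files, each with a different jdbc input, all pointing to the same output.

At startup, run ./logstash -f mysql/conf/ so that -f points to the folder containing all the conf files.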

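input {
  jdbc {
    # MySQL connection string; dianpingdb is the database name
    jdbc_connection_string => "jdbc:mysql://127.0.0.1:3306/dianpingdb"
    # Username and password
    jdbc_user => "root"
    jdbc_password => "0821"
    # JDBC driver jar
    jdbc_driver_library => "E:\soft\logstash-7.3.0\bin\mysql\mysql-connector-java-5.1.34.jar"
    # Driver class name
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_paging_enabled => "true"
    jdbc_page_size => "50000"
    # Path and name of the SQL file to execute
    statement_filepath => "E:\soft\logstash-7.3.0\bin\mysql\jdbc_2.sql"
    # Polling schedule in cron syntax; fields from left to right are minute, hour,
    # day of month, month, day of week. All * means the query runs every minute.
    schedule => "* * * * *"
  }
}

output {
  elasticsearch {
    action => "update"
    doc_as_upsert => "true"
    # IP address and port of ES
    hosts => ["localhost:9200"]
    # Index name
    index => "shop"
    document_type => "_doc"
    # Auto-increment ID: the source table needs an id column, which is used as the
    # document ID in the index. Rows from different tables are merged only when
    # every conf file produces the same document ID for them.
    document_id => "%{id}"
  }
  stdout {
    # Print events as JSON lines
    codec => json_lines
  }
}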

The two key parameters configured are:

action => "update"

doc_as_upsert => "true"
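
To make the effect of these two parameters concrete, here is a minimal Python sketch that imitates the behaviour with plain dicts. It does not call Elasticsearch or Logstash. The function name `upsert`, the lists `teacher_rows` and `course_rows`, and the field values are all hypothetical, and the sketch assumes both pipelines use the shared `teacher_id` as the document ID. It only merges top-level fields, which is enough for flat rows coming from a table.

```python
# Imitates, with plain dicts, what action => "update" plus doc_as_upsert => "true"
# does when two tables feed one index. No Elasticsearch client is involved.

def upsert(index, doc_id, partial_doc):
    """Merge partial_doc into the document stored under doc_id, creating it if absent."""
    existing = index.setdefault(doc_id, {})
    existing.update(partial_doc)  # top-level merge; fields not in partial_doc are kept
    return existing

# Hypothetical rows from two tables that share teacher_id.
teacher_rows = [{"teacher_id": 1, "name": "Li"}]
course_rows = [{"teacher_id": 1, "course": "Search basics"}]

shop_index = {}
for row in teacher_rows + course_rows:
    upsert(shop_index, row["teacher_id"], row)

print(shop_index)
# {1: {'teacher_id': 1, 'name': 'Li', 'course': 'Search basics'}}
```

With the default action "index", the second row would replace the whole document instead of being merged into it, which is why the action has to be changed to "update".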

The official documentation reads as follows:

https://www.elastic.co/guide/en/logstash/7.3/plugins-outputs-elasticsearch.html#plugins-outputs-elasticsearch-action

action

Value type is string

Default value is "index"

The Elasticsearch action to perform. Valid actions are:

index: indexes a document (an event from Logstash).

delete: deletes a document by id (An id is required for this action)

create: indexes a document, fails if a document by that id already exists in the index.

update: updates a document by id. Update has a special case where you can upsert — update a document if not already present. See the upsert option. NOTE: This does not work and is not supported in Elasticsearch 1.x. Please upgrade to ES 2.x or greater to use this feature with Logstash!

A sprintf style string to change the action based on the content of the event. The value %{[foo]} would use the foo field for the action

doc_as_upsert

Value type is boolean

Default value is false

Enable doc_as_upsert for update mode. Create a new document with source if document_id doesn’t exist in Elasticsearch
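
The difference the doc_as_upsert flag makes can be sketched in Python as follows. This is a simulation over a plain dict under stated assumptions, not the Elasticsearch client API: the function `update` and its `doc_as_upsert` keyword are hypothetical stand-ins, and the `KeyError` stands in for the rejection a plain update gets when no document with that ID exists yet.

```python
# Plain-dict stand-in for an index; this is not the Elasticsearch client API.

def update(index, doc_id, partial_doc, doc_as_upsert=False):
    """Apply a partial update, optionally creating the document when it is missing."""
    if doc_id not in index:
        if not doc_as_upsert:
            # Stand-in for the "document missing" rejection of a plain update.
            raise KeyError(f"document {doc_id} does not exist")
        index[doc_id] = {}
    index[doc_id].update(partial_doc)

index = {}

# doc_as_upsert off: the first row for a new ID is rejected.
try:
    update(index, 1, {"name": "Li"})
    rejected = False
except KeyError:
    rejected = True

# doc_as_upsert on: the same row creates the document instead.
update(index, 1, {"name": "Li"}, doc_as_upsert=True)

print(rejected, index)  # True {1: {'name': 'Li'}}
```

This matters for the multi-table setup because whichever table's row arrives first for a given ID has no existing document to update, so it has to be allowed to create one.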