Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

elasticsearch - logstash jdbc connector time based data

With the new Logstash JDBC input plugin, documented here:

https://www.elastic.co/guide/en/logstash/current/plugins-inputs-jdbc.html

how do subsequent Logstash runs affect what is already indexed in Elasticsearch? Does each run create new documents in the ES index, or does it update the documents that match rows that have already been indexed?

My use case is indexing rows with timestamps into Elasticsearch, but the table is continually updated. I would like to index only new data or, if I have to read the table again, add documents only for new rows.

Any suggestions? Or more documentation around the logstash jdbc plugin?

1 Reply

I would reference the timestamp of the plugin's last run (the sql_last_start parameter) in the query statement, so that each run loads only the records added or updated since then. Note that newer releases of the plugin name this parameter sql_last_value.

For instance, your jdbc input plugin would look like this:

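input {
  jdbc {
    # Path to the MySQL JDBC driver jar and its driver class
    jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "mysql"
    # Cron-style schedule: run the statement once every minute
    schedule => "* * * * *"
    # Only fetch rows changed since the previous run started
    statement => "SELECT * FROM mytable WHERE timestamp > :sql_last_start"
  }
}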

Make sure to replace timestamp with the name of the column that holds the last-updated date, and mytable with the actual name of your table.
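
As for whether a row that is read again creates a new document or updates the existing one: that depends on the document ID. Without an explicit ID, Elasticsearch generates one for every event, so a row that is read twice ends up as two documents. Here is a minimal sketch of an output section that reuses the row's primary key as the document ID, so that a re-read row overwrites its earlier document instead of duplicating it. It assumes your table has a primary key column called id and that the target index is called myindex; both names are placeholders of mine, not something taken from the question. Depending on your Logstash version the option may be host rather than hosts, so check the elasticsearch output documentation for your release.

output {
  elasticsearch {
    # Assumed local node; adjust to your cluster
    hosts => ["localhost:9200"]
    # Hypothetical index name
    index => "myindex"
    # Assumes a primary key column named "id"; the same row always maps to the same document
    document_id => "%{id}"
  }
}

Combined with the timestamp filter in the input, this should mean that new rows become new documents and updated rows replace their existing documents.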

