Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
713 views
in Technique[技术] by (71.8m points)

elasticsearch - JSON Bulk import to Elasticstearch

Elasticsearch Bulk import.

I need to import the Products as individual items.

I have a json file that looks similar to the following:

{
   "Products":[
      {
         "Title":"Product 1",
         "Description":"Product 1 Description",
         "Size":"Small",
         "Location":[
            {
               "url":"website.com",
               "price":"9.99",
               "anchor":"Prodcut 1"
            }
         ],
         "Images":[
            {
               "url":"product1.jpg"
            }
         ],
         "Slug":"prodcut1"
      },
      {
         "Title":"Product 2",
         "Description":"Prodcut 2 Desctiption",
         "Size":"large",
         "Location":[
            {
               "url":"website2.com",
               "price":"99.94",
               "anchor":"Product 2"
            },
            {
               "url":"website3.com",
               "price":"79.95",
               "anchor":"discount product 2"
            }
         ],
         "Images":[
            {
               "url":"image.jpg"
            },
            {
               "url":"image2.jpg"
            }
         ],
         "Slug":"product2"
      }
   ]
}

I've tried the following (I'm a novice at this):

curl -s -XPOST 'http://localhost:9200/_bulk' --data-binary @products.json
curl -s -XPOST 'http://localhost:9200/_bulk' -d @products.json
curl -XPOST http://localhost:9200/cp/products -d "@products.json"
curl -XPOST http://localhost:9200/products -d "@products.json"

Some gave an error others didn't. What do I need to do?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Following the Bulk API documentation. You need to supply the bulk operation with a file formatted very specifically:

NOTE: the final line of data must end with a newline character .

The possible actions are index, create, delete and update. index and create expect a source on the next line, and have the same semantics as the op_type parameter to the standard index API (i.e. create will fail if a document with the same index and type exists already, whereas index will add or replace a document as necessary). delete does not expect a source on the following line, and has the same semantics as the standard delete API. update expects that the partial doc, upsert and script and its options are specified on the next line.

If you’re providing text file input to curl, you must use the --data-binary flag instead of plain -d. The latter doesn’t preserve newlines.

So you will need to change the contents of your products.json file to the following:

 {"index":{"_index":"cp", "_type":"products", "_id": "1"}}
 { "Title":"Product 1", "Description":"Product 1 Description", "Size":"Small", "Location":[{"url":"website.com", "price":"9.99", "anchor":"Prodcut 1"}],"Images":[{ "url":"product1.jpg"}],"Slug":"prodcut1"}
 {"index":{"_index":"cp", "_type":"products", "_id":"2"}}
 {"Title":"Product 2", "Description":"Prodcut 2 Desctiption", "Size":"large","Location":[{"url":"website2.com", "price":"99.94","anchor":"Product 2"},{"url":"website3.com","price":"79.95","anchor":"discount product 2"}],"Images":[{"url":"image.jpg"},{"url":"image2.jpg"}],"Slug":"product2"}

And be sure to use --data-binary in your curl command (like your first command). Also note the index and type can be omitted if you use the index and type specific endpoint. Yours is /cp/products like your 3rd curl command.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...