Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
678 views
in Technique[技术] by (71.8m points)

c# - Bulk Indexing in Elasticssearch using the ElasticLowLevelClient client

I'm using the ElasticLowLevelClient client to index elasticsearch data as it needs to be indexed as a raw string as I don't have access to the POCO objects. I can successfully index an individual object by calling:

client.Index<object>(indexName, message.MessageType, message.Id, 
    new Elasticsearch.Net.PostData<object>(message.MessageJson));

How can I do a bulk insert into the index using the ElasticLowLevelClient client? The bulk inset APIs all require a POCO of the indexing document which I don't have e.g.:

 ElasticsearchResponse<T> Bulk<T>(string index, PostData<object> body,
      Func<BulkRequestParameters, BulkRequestParameters> requestParameters = null)

I could make the API calls in parallel for each object but that seems inefficient.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The low level client generic type parameter is the type for the response expected.

If you're using the low level client exposed on the high level client, through the .LowLevel property, you can send a bulk request where your documents are JSON strings as follows in 5.x

var client = new ElasticClient(settings);


var messages = new [] 
{
    new Message 
    { 
        Id = "1", 
        MessageType = "foo", 
        MessageJson = "{"name":"message 1","content":"foo"}" 
    },  
    new Message 
    { 
        Id = "2", 
        MessageType = "bar", 
        MessageJson = "{"name":"message 2","content":"bar"}" 
    }   
};

var indexName = "my-index";

var bulkRequest = messages.SelectMany(m => 
    new[]
    {
        client.Serializer.SerializeToString(new
            {
                index = new
                {
                    _index = indexName,
                    _type = m.MessageType,
                    _id = m.Id
                }
            }, SerializationFormatting.None),
        m.MessageJson
    });

var bulkResponse = client.LowLevel.Bulk<BulkResponse>(string.Join("
", bulkRequest) + "
");

which sends the following bulk request

POST http://localhost:9200/_bulk
{"index":{"_index":"my-index","_type":"foo","_id":"1"}}
{"name":"message 1","content":"foo"}
{"index":{"_index":"my-index","_type":"bar","_id":"2"}}
{"name":"message 2","content":"bar"}

A few important points

  1. We need to build the bulk request ourselves to use the low level bulk API call. Since our documents are already strings, it makes sense to build a string request.
  2. We serialize an anonymous type with no indenting for the action and metadata for each bulk item.
  3. The MessageJson cannot contain any newline characters in it as this will break the bulk API; newline characters are the delimiters for json objects within the body.
  4. Because we're using the low level client exposed on the high level client, we can still take advantage of the high level requests, responses and serializer. The bulk request returns a BulkResponse, which you can work with as you normally do when sending a bulk request with the high level client.

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...