I am currently writing a Golang client for Elasticsearch which uses the native b...

sagichmal · on June 19, 2014

    > I am currently writing a Golang client for Elasticsearch 
    > which uses the native binary protocol

Why on earth would you do that? Is request serialization and transfer time via the JSON API even approaching 1% of mean request duration?

silenteh · on June 19, 2014

Why would I not?

This led me to dig deeper into the Elasticsearch code, find out more about its code quality, deal with machine endianness, deal with byte shifting, think about how to structure code in Golang and overall enjoy the feeling of touching the bare metal again...
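To make the endianness and byte-shifting point concrete, here is a minimal sketch of encoding a 32-bit integer in big-endian order by hand and with Go's standard library. It assumes a big-endian wire format purely for illustration; the thread does not describe the actual Elasticsearch transport layout, and the helper names `putUint32BE` and `uint32BE` are made up for this example.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// putUint32BE returns v as 4 bytes in big-endian (network) order.
// Explicit shifts make the result independent of the host's endianness.
// Hypothetical helper, not taken from any Elasticsearch client.
func putUint32BE(v uint32) []byte {
	return []byte{
		byte(v >> 24), // most significant byte first
		byte(v >> 16),
		byte(v >> 8),
		byte(v), // least significant byte last
	}
}

// uint32BE reads a big-endian uint32 from the first 4 bytes of b.
func uint32BE(b []byte) uint32 {
	return uint32(b[0])<<24 | uint32(b[1])<<16 | uint32(b[2])<<8 | uint32(b[3])
}

func main() {
	manual := putUint32BE(0x01020304)

	// The standard library offers the same encoding without manual shifts.
	std := make([]byte, 4)
	binary.BigEndian.PutUint32(std, 0x01020304)

	fmt.Printf("% x\n", manual) // 01 02 03 04
	fmt.Printf("% x\n", std)    // 01 02 03 04
	fmt.Println(uint32BE(manual) == 0x01020304)
}
```

In practice `encoding/binary` is probably the better choice; the manual version only shows what the shifts are doing.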

TheBiv · on June 20, 2014

I guess because, theoretically, someone is paying you to write applications at a fair clip. Granted, we have no idea what your role is or what the goals of the project are, so we are probably completely wrong about your situation! :)

silenteh · on June 20, 2014

You are right, I probably should have mentioned that I am doing it in my free time and no one is paying me. It's just pure curiosity. :)

phungleson · on June 21, 2014

Writing a distributed system to talk to Lucene directly might be more rewarding? If it is possible?

diminish · on June 19, 2014

So if Elasticsearch is a stringified (JSON/HTTP) wrapper around Lucene with simplified setup for the web app crowd, you are making a binary-fied wrapper around it. Why not skip ES altogether, or use the JSON api?

atombender · on June 19, 2014

Elasticsearch is a lot more than just a "stringified wrapper around Lucene". Lucene is used for the underlying inverted indexes, the item store and tokenization/analysis, and that's pretty much it. ES adds clustering, a query DSL, configuration, a data mapping system, "river" functionality, an HTTP API etc.

silenteh · on June 19, 2014

The client actually acts as a cluster node and therefore has knowledge of the cluster state, its indexes and shards, because it receives notifications from the cluster once it joins.

This allows operations to be executed on a specific shard of a specific index on a specific node of the cluster, which results in better performance than going through the HTTP interface.
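As a rough illustration of what "knowing the cluster state" buys a client, the sketch below hashes a routing key to a shard and looks up the node holding that shard. Everything here is an assumption made for the example: the `clusterState` type, the node addresses, and the use of FNV-1a as the hash are stand-ins, not Elasticsearch's actual routing function or data structures.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// clusterState is a hypothetical, heavily simplified view of what a
// node-like client might learn after joining a cluster.
type clusterState struct {
	numShards int            // number of primary shards for one index
	primaries map[int]string // shard number -> address of the node holding it
}

// shardFor maps a routing key to a shard number in [0, numShards).
// FNV-1a is a stand-in hash chosen for illustration only.
// numShards must be positive.
func shardFor(routing string, numShards int) int {
	h := fnv.New32a()
	h.Write([]byte(routing)) // Write on a hash.Hash never returns an error
	return int(h.Sum32() % uint32(numShards))
}

// nodeFor returns the node that should receive an operation for the
// given routing key, so the client can contact it directly.
func (c clusterState) nodeFor(routing string) (string, bool) {
	node, ok := c.primaries[shardFor(routing, c.numShards)]
	return node, ok
}

func main() {
	state := clusterState{
		numShards: 3,
		primaries: map[int]string{
			0: "10.0.0.1:9300", // made-up addresses
			1: "10.0.0.2:9300",
			2: "10.0.0.3:9300",
		},
	}
	node, ok := state.nodeFor("log-2014-06-19-0001")
	fmt.Println(node, ok)
}
```

The point of the sketch is only that the lookup happens on the client, so the request does not need to be forwarded by an intermediate node.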

It can be used to efficiently store big quantities of data, for instance logs, which then can be visualized with Kibana.

It's just unfortunate that Elasticsearch has the problems mentioned in the article, which I also experience in production, because its range of plugins makes it a good solution for specific use cases.

coolsunglasses · on June 20, 2014

I'd recommend not using the native binary protocol unless you have proof it makes a substantial difference for your application.

If you need to do bulk work, then connection pooling, keep-alive, and batching on the client side over HTTP can easily exceed, by a wide margin, what an ES cluster can handle. Users of my library have confirmed this.
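For readers following along in Go, the three techniques mentioned above could be sketched roughly as follows: a pooled `http.Client` that keeps connections alive, and a helper that batches documents into one newline-delimited request for the `_bulk` endpoint. The function names, pool sizes, timeout values, index name and `localhost:9200` address are all assumptions for illustration, and older Elasticsearch versions also expected a `_type` field in the action line.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// newPooledClient returns a client whose transport keeps idle
// connections open for reuse (HTTP keep-alive). Sizes are arbitrary.
func newPooledClient() *http.Client {
	return &http.Client{
		Timeout: 30 * time.Second,
		Transport: &http.Transport{
			MaxIdleConns:        64,
			MaxIdleConnsPerHost: 64,
			IdleConnTimeout:     90 * time.Second,
		},
	}
}

// bulkBody batches docs into one newline-delimited JSON payload:
// an "index" action line followed by the document source, per doc.
func bulkBody(index string, docs []map[string]interface{}) ([]byte, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf) // Encode appends '\n' after each value
	for _, doc := range docs {
		action := map[string]interface{}{
			"index": map[string]string{"_index": index},
		}
		if err := enc.Encode(action); err != nil {
			return nil, err
		}
		if err := enc.Encode(doc); err != nil {
			return nil, err
		}
	}
	return buf.Bytes(), nil
}

// sendBulk posts one batch. The body is drained before closing so the
// underlying connection can go back into the pool.
func sendBulk(c *http.Client, baseURL string, body []byte) error {
	resp, err := c.Post(baseURL+"/_bulk", "application/x-ndjson", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return err
	}
	if resp.StatusCode >= 300 {
		return fmt.Errorf("bulk request failed: %s", resp.Status)
	}
	return nil
}

func main() {
	docs := []map[string]interface{}{{"msg": "hello"}, {"msg": "world"}}
	body, err := bulkBody("logs", docs)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(body))
	// Sending requires a running cluster; the address is an assumption:
	// err = sendBulk(newPooledClient(), "http://localhost:9200", body)
}
```

Reusing one client across all batches is what makes the pooling effective; creating a new client per request would defeat it.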

You could use my library as a guide to the abstract data types, even though I don't use the native protocol.

https://github.com/bitemyapp/bloodhound