
For your use case, set up both and then do some testing.

Speaking for myself, I did that and found that SOLR was a lot more performant. I needed a high-traffic solution without a bunch of servers.

I find that ES tries to do too much, with all the dashboards, monitoring, etc.

SOLR keeps it simple and thereby does not incur the performance penalty.

Also, when an ES cluster goes south (cluster health "yellow" or "red"), it is a pain to troubleshoot and determine the real reason WHY. SOLR seems more durable, and when something needs investigation you get a clear message in the log.

If you are starting something new, don't know your traffic requirements, and are scared of it "going viral" like a hot new mobile game, ES may work for you though. The one thing it excels at is adding more ES servers to the cluster quickly. So if you need more servers ASAP and don't care about the cost, ES has that covered.




Does Solr have the same problem as ES where you can't modify an index mapping after creating it? With ES you have to reindex everything into a new temporary index and then swap the new and old indexes. It's a terrible design, especially considering that ES already has all the original data and should be perfectly capable of doing it itself, incrementally.
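For anyone who hasn't done the swap step, here is a minimal sketch of it, assuming the application queries through an alias rather than the index name directly. It only builds the request body for the ES `_aliases` endpoint; the alias and index names are made up, and sending it to a cluster is left out.

```python
import json


def alias_swap_body(alias, old_index, new_index):
    """Build the body for POST /_aliases that moves `alias` from the old
    index to the new one. Both actions go in a single request so readers
    never see a moment where the alias points at nothing.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    # Hypothetical names: "products" is the alias the app queries,
    # "products_v1" / "products_v2" are the old and freshly built indexes.
    body = alias_swap_body("products", "products_v1", "products_v2")
    print(json.dumps(body, indent=2))
```

The old index can be deleted once the alias has moved and nothing else reads from it.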


Often when you want to reindex, you want to reindex a lot of data, quickly. You may not want to use the same hardware for that. So what you do, if you're in AWS for instance, is take your data mounts and mount them to a set of extremely high-powered instances. Then reindex with the new mappings, and move your mounts back to your smaller instances.

You save money and can do a reindex very quickly.

A basic "reindex" command would be a cool feature though.


It is similar in SOLR (and other NoSQL data stores) currently, at least as far as I am aware. It can be quite intensive, especially since ES, SOLR, etc. like to use as much memory as they can for fast access. Any non-trivial application should have duplicate servers / instances to soak up the traffic while a change like that is migrated.


The primary limitation comes from Lucene, which powers both. With Solr, you could change the definition and reload the core, but you will be getting some weird artifacts for the previously-indexed content.

Neither Solr nor Elasticsearch should be treated as a primary data store, so that's probably why reindex-in-place is not the highest priority.


I think treating Solr/ES as a primary data store would be a horrible idea. But the problem is equally annoying if you're using it as an expendable search index.

There is no technical reason why Solr/ES could not do diff-based indexing. ES (I don't know about Solr) admittedly uses a single Lucene index per logical index, so changing a single field mapping involves reindexing the whole index, not just that one field.

But if the mappings were properly versioned, ES could simply create a new version, index everything (from the original contents), and then swap. Locking the original index should be a non-issue.


No, they don't. Behind the scenes, it's Lucene that is creating a set of inverted indexes. While extremely performant for text search, the process is also destructive and lossy compared to the initial dataset. So there is no way to migrate one index to another.


ES, by default, stores the original document in the "_source" field, which could be used to reindex data from scratch.
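A rough sketch of what that looks like, assuming you have already pulled hits out of the old index (e.g. with a scan/scroll) as dicts carrying "_id" and "_source". The helper name is made up; it just turns those hits into a bulk API payload aimed at the new index. Older ES versions also expect a "_type" in the action line, which is omitted here.

```python
import json


def bulk_reindex_lines(hits, new_index):
    """Turn search hits into a newline-delimited bulk payload.

    Each hit becomes two lines: an "index" action naming the target index
    and the original document id, then the stored _source unchanged.
    """
    lines = []
    for hit in hits:
        action = {"index": {"_index": new_index, "_id": hit["_id"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(hit["_source"]))
    if not lines:
        return ""
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical hits, shaped like entries of a search response.
    sample_hits = [
        {"_id": "1", "_source": {"title": "first"}},
        {"_id": "2", "_source": {"title": "second"}},
    ]
    print(bulk_reindex_lines(sample_hits, "products_v2"), end="")
```

This only works if "_source" was left enabled on the old index; if it was turned off, the original documents are gone and you have to go back to your primary store.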



