
I'm impressed.

I have a database with 15k documents, each with around 70 pages of text, HTML formatted.

I'm using ElasticSearch currently, with the Searchkick gem.

30 min playing with MeiliSearch. So far:

- Blazing fast to index, like 10x more performant than using ElasticSearch / Searchkick;

- Blazing fast to search, at least 3x faster in all my random tests so far;

- Literally zero config;

- Uses 140MB of RAM currently, while in my experience ElasticSearch would crash with anything less than 1GB, and needs at least 1.5GB to be usable in production.




Since this got upvoted and I see the devs are replying to questions, here are some! I'm also going to point out how ElasticSearch works for comparison.

- The docs state that `Only a single filter is supported in a query`. This is kind of a dealbreaker for my use case, since I need at least a `user_id` and a `status` filter. ElasticSearch can work with multiple filters. Also, I don't understand why you call it `filters` instead of `filter` then. Are multiple filters on the roadmap?

- My search UI has a sort by `<select>`, where you can choose, for instance, `last updated asc` or `last updated desc`, amongst others. In my understanding, that would be cumbersome with MeiliSearch, since it would require either (1) a settings change to alter the ranking rules order beforehand [0], which would not even work in production due to race conditions, or (2) maintaining multiple indexes, each with a pre-defined ranking rule order, and switching between them depending on the UI criteria?

- As an extension of the last question, I see that a lot of what you call "search settings" are treated by ElasticSearch as query parameters. For instance, I can easily query ES for the title or description fields just by setting that as a parameter. In MeiliSearch that would require a change in the index settings beforehand, right?
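Option (2) from the sorting question could look roughly like this on the application side. This is only a sketch: the index names and the `index_for_sort` helper are hypothetical, and it assumes each index was configured ahead of time with its own ranking rule order.

```ruby
# Hypothetical index names: one index per sort order, each assumed to be
# configured in advance with its own ranking rule order.
SORT_INDEXES = {
  "last_updated_asc"  => "documents_last_updated_asc",
  "last_updated_desc" => "documents_last_updated_desc"
}.freeze

DEFAULT_SORT = "last_updated_desc"

# Returns the name of the index to query for the sort option chosen in the UI.
# Unknown or missing options fall back to the default sort.
def index_for_sort(sort_option)
  SORT_INDEXES.fetch(sort_option.to_s, SORT_INDEXES.fetch(DEFAULT_SORT))
end
```

The obvious cost is that every document has to be indexed once per sort order.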

PS: The docs, especially for the Ruby SDK, could use some work in the filters section. It took me a while to understand I should pass a string, like `index.search("query", filters: "user_id:3")`. I was trying a hash like `filters: { user_id: 3 }`.

[0] https://docs.meilisearch.com/references/ranking_rules.html#u...
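For anyone hitting the same hash-versus-string confusion, a small helper can bridge the two. This is a minimal sketch, not part of the SDK: `single_filter` is a hypothetical name, it only produces the `attribute:value` string form shown above, and it does no quoting or escaping of values.

```ruby
# Builds the string form of a single filter, e.g. { user_id: 3 } -> "user_id:3".
# Raises unless given exactly one attribute, since the docs quoted above say
# only a single filter is supported in a query.
def single_filter(conditions)
  unless conditions.is_a?(Hash) && conditions.size == 1
    raise ArgumentError, "expected exactly one attribute => value pair"
  end

  attribute, value = conditions.first
  "#{attribute}:#{value}"
end

# Usage with the call from the PS above (index is assumed to be an SDK index object):
#   index.search("query", filters: single_filter(user_id: 3))
```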


Hi, many answers to these questions. But first, I'll point you to the public roadmap. A lot of the stuff we're working on is in there. If you need/love a feature, please add a heart emoji on it. https://github.com/orgs/meilisearch/projects/2

- Currently, we only support single filters. The multiple filters option is coming soon. https://github.com/meilisearch/MeiliSearch/issues/425

- Custom ranking rules on the fly are something we can imagine in our solution. We haven't done it yet because it complicates the search query parameters. We are waiting for feedback like yours to implement this kind of feature.

- To return only the fields you need, it's already possible during the search https://docs.meilisearch.com/guides/advanced_guides/search_p.... As for restricting the attributes to search in at query time, we had this feature in a previous version. But as with the last answer, no one used it, and it complicates the search query.


Just to add a little note here, we are currently working on multi-filter queries, because we listen to our community!


But... what happens if I need more than one instance? I'm genuinely curious. I hope this doesn't come off as an asshole comment. Isn't the whole point of ES versus just plain ol' lucene or solr the horizontal scalability of it?


We are currently working on the sharding and the replication (Raft). Development is progressing well and the functionality should come out soon.


I agree with all your points, but a minor nitpick: Solr has been horizontally scalable for quite a while now.


> in production.

I went looking, but found nothing regarding any operations management.

* How does this scale?

* How is it monitored? Where do I get the metrics for it? (Indexing performance, search performance, etc. Stuff not found in the OS.)

* Is there any kind of throttling or queueing capability?

* What's the redundancy/HA approach?

* I'll ask about backups, though it's the least of my worries, as indexing databases like this and ES should be able to be rehydrated from source. However, snapshots may be faster to restore than reindexing.

This might be a nice local dev tool for something, but I'm not sure how you run a business critical application with it? I'm wondering if I'm missing something.

Edit: formatting

Edit2: also wondering about security too


Hi, to answer your questions.

* 2 parts.

- Vertical scale: We use LMDB as a key-value store, which relies on memory mapping. This lets our search engine work mainly from disk, so it does not need a machine with terabytes of RAM.

- Horizontal scale: We are working on sharding and replication (Raft). Development is progressing well, and the functionality should come out soon.

* Currently, it is not monitored at all. This feature is planned. https://github.com/meilisearch/MeiliSearch/issues/523

* We use a queue for updates. You can find the complete guide here: https://docs.meilisearch.com/guides/advanced_guides/asynchro...

* As I said previously, we are working on HA with Raft consensus.

* We will add snapshots in no time (disk folder saved in s3). A little more time for backups (version agnostic, need indexing).

We are already working with Louis Vuitton on an application in production. The app has been in production for 9 months, and there hasn't been a single problem.


If you are looking for alternatives, check out Typesense as well:

https://github.com/typesense/typesense

It supports multiple filters and has HA for reads as well.


You can add more nodes to scale search speed with ES, can you do the same with this?


More nodes is more throughput, not lower latency.

You’re always bounded by max single shard latency AND by coordination latency.

Ignoring how expensive it would be, over-sharding and over-scaling (i.e. low volumes of data per shard and few shards per host) could reduce max single shard/host latency. However, it’ll increase coordination latency and also memory use (which directly or indirectly will cause more coordination latency).

Perfect data per shard and perfect shard per host numbers are currently an unsolved problem. They heavily depend on the domain, i.e. data types, data volume, data ingest, mappings, query types, query load.

:) if anyone has found a way to consistently add hosts to reduce latency, please let me know!
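To make the bound concrete, here is a toy model. It is a deliberate simplification and an assumption on my part, not how ES measures anything: a query fans out to every shard, waits for the slowest one, then pays a coordination cost. The function name and all the numbers are made up.

```ruby
# Toy model (assumption): total latency = slowest shard + coordination cost.
def query_latency_ms(shard_latencies_ms, coordination_ms)
  shard_latencies_ms.max + coordination_ms
end

# Three larger shards, cheap coordination.
few_shards = query_latency_ms([40, 42, 45], 5)

# Six smaller shards: most are faster, but one straggler and the extra
# coordination cost cancel out the gain.
many_shards = query_latency_ms([20, 22, 21, 38, 19, 23], 12)

puts few_shards   # 50
puts many_shards  # 50
```

In this model, doubling the shard count roughly doubles how much work can run in parallel, but the latency of a single query does not move.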


That's marked for Q3: https://github.com/meilisearch/MeiliSearch/issues?q=is%3Aiss...

But probably 99% of users using ES don't need sharding.


More accurately, 99% of the use cases ES is appropriate for don't need sharding. Every time I've needed to shard, it has been a nightmare bad enough that ES was abandoned.


I had a typical case of ingesting a ton of logs into ES. I needed sharding to keep up with multi-threaded writes while something else was doing intensive search queries. I think sharding was very useful in processing a lot of data efficiently.


Yes, it's marked for Q3 because it's a pretty complex feature, and we had a lot of other features to do at the same time. But the good news is that it's very well advanced and is likely to be released in mid-Q2.



