ScyllaDB Open Source 3.0 (scylladb.com)
210 points by manigandham on Jan 17, 2019 | 68 comments



I'm interested in the http://seastar.io/ networking library that the Scylla guys built. Looks like it runs on DPDK and might be a nice way to do user mode networking. Anyone tried using it in other projects?


If you are thinking about starting a Seastar project, definitely check out SMF [0]. It's a highly optimized RPC framework that uses Seastar, and makes it easy to build high-performance servers and clients. Seastar itself is RPC agnostic, and this project has a great implementation of functionality most developers will eventually need.

[0]: https://github.com/smfrpc/smf


Yep! Talk to Kefu Chai at Red Hat. He's using Seastar to rebuild Ceph. Here's his presentation from Scylla Summit 2018 last October:

https://www.scylladb.com/tech-talk/redhat-on-rebuilding-the-...


Do you need to have a certain network card for DPDK to work? Can I just rent a random dedicated server, install DPDK and get userland networking?


Here's a list of DPDK supported NICs:

https://core.dpdk.org/supported/


I've also been interested in this project. Several years ago, I was toying with the idea of building a Kafka-compatible broker implementation with it. Although I didn't get very far, it's a very powerful and well-designed framework.


I have a prototype of that same idea working with Seastar.


They need to get rid of that email registration to download the software.


Why?


Because at every road block you bleed potential users.

I know I only bother with these more annoying products after I've already tried and failed with the alternatives. It's not so much the few minutes spent dealing with registration, or the additional time working out whether access to the private repositories is going to be a hassle with my environment and security policies; it's the first sign that you are dealing with a company that is going to be a hassle.

So please leave up the registration forms, because they signal to me that I will be dealing with aggressive sales departments, fine print and hidden expenses. And I'm only going to do that when I've exhausted my alternatives. Maybe it isn't true, but it's what I've come to believe over the last few decades, and I'm not the loser in this scenario since I just investigate the easier-to-deal-with solutions first. And there's not much point continuing if I find something good enough and free, which is why I'm currently migrating my old DSE clusters to new hardware and Apache Cassandra 3.11 despite my poor experiences with older versions (I was pleasantly surprised with 3.11 and look forward to seeing 4.0).


Agreed, it's annoying (I'm one of the co-founders), but it's the minimal 'eval' step to make sure users succeed and get the most value from their downloads. We mainly push them to Slack and to share monitoring and logs with us, until they get to prod.


Translating: after a period you spam users with marketing designed to get people who had a poor first impression to spend time getting a second impression, and you inform users who didn't make the effort to google for support where the support channels are. I imagine it helps get the people who want babysitters, but it certainly drives away the people who don't want babysitters. I guess it makes sense if your goal is to sell managed services.


It also means we can't automate installation of ScyllaDB.


Most of our users automate it. When you register, you are given access to the repository file for your preferred distribution. Once you have the repo, you can just apt-get it into every one of your nodes.

Also, Scylla provides non-gated access for AWS users with ready-to-consume AMIs.


Oh, that's good to know!


Congrats to the Scylla team! These features and performance improvements are pretty huge for people working with Apache C* that want to evaluate Scylla. Compatible storage formats will certainly make evaluations much easier. Also, hope the Scylla experience with MVs is better...

As far as compatibility w/ Apache Cassandra 3.x+, is there anything outstanding?


https://docs.scylladb.com/using-scylla/cassandra-compatibili...

The big remaining item is lightweight transactions.


Exactly! We're working on LWT & the consensus protocol behind it, Raft. Anyone heading to FOSDEM '19 in Brussels can hear this talk by Duarte Nunes: https://fosdem.org/2019/schedule/event/raft_in_scylla/


Also, for those looking for the side-by-side comparison with Cassandra, there's this page:

https://docs.scylladb.com/using-scylla/cassandra-compatibili...


What's a light weight transaction?


A compare-and-set request. E.g., I can read version 3 of a row, and write version 4 if the current version is still 3, so that if someone else writes version 4 I don't clobber whatever they just did.
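To make the version 3 / version 4 example concrete, here is a minimal in-memory sketch of compare-and-set. It is only an illustration of the idea, not how Cassandra or Scylla implement lightweight transactions (those need a consensus protocol across replicas); the `VersionedStore` class and its method names are hypothetical. In CQL, the same intent is expressed with an `IF` clause on the write.

```python
import threading


class VersionedStore:
    """Toy single-process store (hypothetical); each row carries a version number."""

    def __init__(self):
        self._rows = {}  # key -> (version, value)
        self._lock = threading.Lock()

    def read(self, key):
        """Return (version, value); a missing row reads as version 0."""
        with self._lock:
            return self._rows.get(key, (0, None))

    def compare_and_set(self, key, expected_version, new_value):
        """Write new_value only if the row is still at expected_version."""
        with self._lock:
            current_version, _ = self._rows.get(key, (0, None))
            if current_version != expected_version:
                return False  # someone else wrote first; do not clobber it
            self._rows[key] = (expected_version + 1, new_value)
            return True


store = VersionedStore()
for i in range(3):  # bring the row up to version 3
    store.compare_and_set("row", i, f"v{i + 1}")

version, _ = store.read("row")  # both writers read version 3
first = store.compare_and_set("row", version, "mine")     # succeeds, row is now version 4
second = store.compare_and_set("row", version, "theirs")  # rejected, row is no longer at 3
print(first, second, store.read("row"))
```

The second writer gets `False` back and has to re-read the row before retrying, which is the "don't clobber whatever they just did" behaviour described above.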


That’s a really succinct definition. I like it. Thank you.


To be clear, "lightweight transactions" is a Cassandra-specific term; nobody else uses it. Compare-and-set, atomic updates, or just transactions are the normal terms.


Pythian blog on the topic; note that Cassandra relies on Paxos as a consensus algorithm. Scylla will be using Raft: https://blog.pythian.com/lightweight-transactions-cassandra/


From everything that I've heard about this (being a wire-compatible copy of Cassandra) and its performance, I'm pretty excited about this!


Would love to hear your feedback! Ping me on Twitter @PeterCorless


Anyone using this in production? Any thoughts or tips on how to adapt traditional MongoDB workflows to this?


We are (Numberly) running it in production indeed and our use cases keep on increasing!

As glommer and PeterCorless mentioned, I'd be happy to share thoughts and learnings about it.

Feel free to show up and ask questions mate: you can easily find me on the community Slack channel, freenode IRC or Twitter.


Which slack channel? I’d love to join.



Is it private? I might be dumb but slack doesn’t seem to give me a signup option.


Sorry, this link should work better: https://scylladb-users-slackin.herokuapp.com/


We are (Zenly), and have been for a year and a half.


Indeed! And for those interested in seeing how Zenly made the switch:

https://www.scylladb.com/2017/11/29/zenly-database-replaceme...


Scylla is a replacement for Cassandra which itself is based on the Dynamo model so most of the best practices from AWS DynamoDB will apply: https://docs.aws.amazon.com/amazondynamodb/latest/developerg...


The best person to talk to about this is @Ultrabug (Twitter and Github), CTO of Numberly, who moved from MongoDB to Scylla:

Fixed link: https://www.scylladb.com/tech-talk/numberly-on-joining-billi...


Thanks, everyone, for the tips.


More about Ultrabug's use case from Mongo in his awesome presentation at our summit: https://www.scylladb.com/tech-talk/numberly-on-joining-billi...


Best K8s operator for ScyllaDB?


Get the Scylla Cloud barrier to entry a bit lower. $200/mo is too much for tinkering. Even better if there is an autoscaling option available.


"Stay tuned!" winks and walks away, whistling innocently


Also Google Cloud. We'd be using it now if it were available.


It certainly is pricey for tinkering, but it requires a decent chunk of block storage and I was shocked how expensive that is when I looked the other day. It's unlikely to be news to most, but I've been on private cloud and DCs for a long time and it seems everyone is charging huge amounts of money if you need large filesystems; there aren't even slow but cheap options. Maybe Scylla needs to be backed by S3 rather than XFS :)


ScyllaDB uses local storage (not EBS) for sane performance. XFS is required for async reads/writes.


I also find the ScyllaDB C++ code nicely educational (at least for me). It uses modern idiomatic C++ with a healthy mix of Boost libraries.

For example:

https://github.com/scylladb/scylla/blob/master/db/view/view....


Awesome. Can't wait for the Amazon hosted version.


Note: Our new Scylla Cloud runs on AWS; it uses our Enterprise 2018 release, not the latest 3.0 Open Source. We'd still love to hear your thoughts! https://www.scylladb.com/product/scylla-cloud/


I think he's being sarcastic.


It's called DynamoDB and it's been available for a long time.

DynamoDB begat Cassandra which begat ScyllaDB.



You raise an interesting point: how are they going to prevent Amazon from just taking their business by offering ScyllaDB on demand? Confluent (the Kafka folks) did it by adjusting the license.


I hope this version does not lose data :/


According to their https://www.scylladb.com/open-source/ page:

Server License:

- Free Software Foundation’s GNU AGPL v3.0

- Commercial licenses are also available. Contact us for more information.

Driver Licenses:

- Apache Cassandra drivers: Apache License v2.0

Third-party drivers:

- Licenses will vary. See the individual driver documentation for details.

Documentation License:

- Creative Commons Attribution-ShareAlike 4.0 International

This angers me very much, they are lying about being OSS which was started by OSI (https://opensource.org/) as a response to GPL/AGPL/FSF/RMS's non-open policies, when in reality only their client drivers are OSS but their servers are commercial or AGPL.

I say this as someone who has spent years working on and consciously giving away popular DB software as zlib/MIT/apache2 with my project - which is now run in production by Internet Archive, a top 300 global site, and others (https://github.com/amark/gun) - just to see more and more other DBs steal the "OSS" label to falsify marketing, and more and more DBs become "open core" crippleware. We need more people to keep campaigning against these outright license lies and the faux-humble "I can't survive as OSS unless you pay for commercial license" junk; if you can't survive doing OSS then just don't do OSS!! Lying that you are OSS when you are actually AGPL is just a shady ploy to get more dev/biz clicks.


AGPL is an open source license as per OSI definition which you yourself point to!

You don't have to take my word for it, just check out their own site:

https://opensource.org/licenses/AGPL-3.0

The licenses you are talking about, such as MIT or Apache, are called _permissive licenses_ whereas AGPL is a _copyleft license_. However, both types of licenses can be open source licenses, if they fulfill the OSI requirements.


Really? Would have helped to check out: https://opensource.org/licenses/alphabetical

All the licenses (except the third-party driver licenses) look open source to me.


What the heck? You are right, AGPL is listed, yet clearly violates https://opensource.org/osd-annotated definition. Could somebody explain this to me?


It conforms with OSD for the same reason GPL v2/v3 does. As I understand it, AGPL simply treats "networked access" as a subset of "linking".

There's a ton of legal grey area that hasn't been tested or has been hand waved away (do I have to open source my web application if it connects to a database that is AGPL licensed?)


AGPL treats "networked access" as "distribution", not "linking".

With the GPL, you are required to make your modifications available as GPL if you distribute the modified program. This creates a loophole for ASPs, which can modify GPL’d code, but because they offer it as a service (and don’t distribute the program), they don’t have to share the modifications. AGPL closes this loophole by explicitly saying that network access is considered as “distribution”.

So no, you don't need to make your web application AGPL. Just like you don't make all your applications running on Linux GPL just because they happen to communicate with the kernel over system calls. Copyleft licenses are relevant only if your code is a derivative work of the original.


There is no clear violation, AGPL is a copyleft license based on the GPL v2 (one of the most prominent OSS licenses) but closes the application service provider loophole.

Open Source Software (OSS) ensures users have access to, and can freely modify, the source code of the software they use, which the AGPL does. It's even more copyleft than the GPL, as it ensures modifications to software by ASPs are also made available.


What part of the AGPL violates the open source definition?


As for MongoDB, I still qualify this as open source. I don't care about what OSI says. For me, as long as I can read the source and modify it for my business, it's fine. It doesn't need to be free as in free beer.


If you are interested in our CEO's take on the Mongo SSPL, we published this blog a few months ago.

https://www.scylladb.com/2018/10/22/the-dark-side-of-mongodb...


I think it's possible to still be very critical of MongoDB's licensing, and not have issues with Scylla.


Well that's exactly the issue...you can't modify it or use it for your business if you compete with Mongo corp


The “if you compete with mongo corp” is probably exactly what the parent comment means. It doesn’t affect a lot of people who would use that.


I thought I could modify as long as I either pay or give money to mongodb?


Michael from MongoDB here again. MongoDB Community is still free to view, download, modify, etc. The SSPL change affects no-one besides those offering the licensed software (MongoDB) as a public service.


Michael: Would love to speak with you about this very point. Ping me via LinkedIn: https://www.linkedin.com/in/petercorless/



