AWS S3 open source alternative written in Go

Ixiaus · on Aug 30, 2016

Or, run Riak with their S3 compatibility layer. Riak is extremely stable and the work Basho has done to make a truly robust distributed database is significant.

http://docs.basho.com/riak/cs/2.1.1/

viraptor · on Aug 30, 2016

Other alternatives:

ceph - http://docs.ceph.com/docs/master/radosgw/s3/

swift - https://wiki.openstack.org/wiki/Swift/APIFeatureComparison#A...

theanalyst · on Aug 30, 2016

Also ceph (& swift) are known to scale well in prod. clusters with over 30+ PB of data (at least looking at CERN's cluster) and the latest version of RGW does support geographic redundancy for S3 like apis

shinydevops · on Aug 30, 2016

+1 for Ceph. We're running several ~3.5 PB clusters in production. We've not taken advantage of the new RGW features in Jewel, but it works well as an object storage solution.

dekobon · on Aug 30, 2016

Don't forget:

manta - https://www.joyent.com/manta

dc2447 · on Aug 31, 2016

CEPH is a volume service not an object storage service.

SWIFT is indeed analogous to S3.

viraptor · on Aug 31, 2016

Come on, I literally linked to a website describing "CEPH OBJECT GATEWAY S3 API"

XorNot · on Aug 31, 2016

Ceph is object storage first. The volume service is implemented on top of that.

Natales · on Aug 31, 2016

+1 on RiakCS. They now call it RiakS2 for kicks. The scalability and reliability of their server is insane. You just can't beat Erlang software in that regard.

Unfortunately, Basho has been so successful with their TSDB and KV products that they have basically put S2 on maintenance mode. They are still "supporting" it, but no new features. I was hoping this Minio tool could do something similar, but with a single daemon is a single point of failure. Unacceptable for serious deployments.

bramd · on Aug 31, 2016

Another interesting project written in Erlang is LeoFS: http://leo-project.net/leofs/

bandris · on Aug 31, 2016

Worth adding that LeoFS is being used in production by Rakuten for years now. Still not a widely known project for some reason.

http://www.slideshare.net/rakutentech/scaling-and-high-perfo...

jdboyd · on Aug 31, 2016

Considering how many serious deployments still use non-clustered NASs, a single node object store seems equally reasonable.

owyn · on Aug 31, 2016

That sounds pretty nice. If it works does it need new features? :)

corobo · on Aug 30, 2016

There's also Skylable's SX Cluster if you use the libres3 daemon with it. Been using it for over a year with no problems. Set, forget, add more nodes when I need more disk.

Everyone's got their s3 of choice, always good to have more options on the table.

https://www.skylable.com/products/sx/

https://www.skylable.com/products/libres3/

ranman · on Aug 31, 2016

Ran Riak CS in production and had constant issues. It's not terrible but it's also not ideal. I would caution against anyone depending on it for mission critical systems. Many of the failure modes are undocumented.

0xmohit · on Aug 31, 2016

Could you elaborate on some of the specific issues you ran into?

mprev · on Aug 30, 2016

There's also Pithos from Exoscale. Runs on top of Cassandra. Code is Clojure and open source. Http://Pithos.io

merb · on Aug 30, 2016

guess it's possible, but riak is not designed to run on a single node. Guess even basho suggests using at least a 5 node cluster.

unlocksmith · on Aug 31, 2016

Minio is deliberately designed this way. Cloud native applications require strict multitenancy. Minio's approach is to build just enough to meet a single tenant's requirement. Deploy one minio server per tenant or user or customer .. whichever fits you the best. This will allow you to upgrade, customize or bug fix in isolation. To replicate for HA, use "mc mirror -watch SOURCE TARGET" command to pair them up. If you have multiple drives (JBOD), you can eliminate RAID or ZFS and use Minio's erasure code to pool them up. Distributed version is also in testing at the moment. It should be out in a month.

merb · on Aug 31, 2016

I know and that. And that's why I find minio interesting. Start with a single node. Raise up to 16.

dc2447 · on Aug 31, 2016

Dont you have to pay for an enterprise licence if you want multi region/datacentre/AZ?

davidu · on Aug 30, 2016

Theory here is that people will build apps that talk to S3. But sometimes those apps might need to run inside the perimeter and can't talk to the cloud. So rather than rewrite an app to talk to a new internal datastore, you just point it at a locally hosted Minio and you're up and running.

Smart.

tomjakubowski · on Aug 30, 2016

Versions of this (S3-compatible service for development use) have existed for years. One I used was https://github.com/jubos/fake-s3

notyourwork · on Aug 30, 2016

What kind of situations do you see this becoming a factor in? 5 or 10 years ago this was an issue with early cloud adopters. Now a days cloud providers are ramping up their DCs to be compliant and allow companies/government entities with strict policies to still onboard.

Its a good strategy but not one that I see being exercised frequently enough.

extrapickles · on Aug 30, 2016

The software I work on is targeted towards customers who generally have really spotty internet connections (eg: they all are in the less forgiving parts of the ocean or middle of nowhere if on land). This pretty much mandates using software like this to build out your app as you can't rely on internet connectivity.

There pretty much isn't anything you can do to improve their internet connections as cables to remote places are always getting dug up with week+ times to repair so you need something that can run locally for long periods. Ships have a different problem with very slow speeds that effectively means you can only transmit the absolute minimum off the ship when its out as sea (when they are at port they typically have normal internet connections to bulk dump data off on).

hhandoko · on Aug 31, 2016

I switched from Fake S3 [1] to Minio for local development. Fast and lightweight, good experience so far :)

Easy to setup with Vagrant, and linking / sharing the Minio shared folder to the host makes it quite convenient to quickly check the files without going to the UI [2].

[1] - https://github.com/jubos/fake-s3

[2] - It stores the files as-is in the local filesystem (files in folders, unchanged), as opposed to having it 'wrapped' like Fake S3 does.

krishnasrinivas · on Aug 30, 2016

Minio will always be 100% free software / open source. We have no plans to add any proprietary extensions or hold back on features for paying customers only. -- Minio Team

cyphar · on Aug 30, 2016

Then why not make the license AGPLv3-or-later, to avoid other people creating proprietary forks? I get that it's not a common occurence within the Golang world, but nothing will change unless more Golang projects start making their code copylefted.

y4m4b4 · on Aug 30, 2016

GNU AGPL is an ideal license for free software projects. We are a strong supporter of the GNU project. We chose Apache License for Minio purely for adoption reasons. Most of our users build proprietary software around Minio and their legal council has a default NO policy towards GNU licenses. Besides, FSF has also approved Apache License v2 as a free software license.

Proprietary forks are OK with us. It will be too expensive to maintain branches of their own and catch up with the upstream.

cookiengineer · on Aug 31, 2016

> It will be too expensive to maintain branches of their own and catch up with the upstream.

Haha, you guys are awesome! You've totally figured it out. Stay awesome!

bjoerns · on Aug 30, 2016

After evaluating a couple of options mentioned in the other comments here, we recently replaced our in-house built s3 clone with minio for our on-prem version of our app. Very robust and stable.

MichaelRenor · on Aug 30, 2016

Keep in mind that there are plenty of object stores that are robust and stable until you put 1 billion keys in them.

bjoerns · on Aug 30, 2016

That's a very good point - but for what we do (on-premise version control for Excel where each workbook version represents one object) we won't be getting even close to that number. But yes, agreed, it entirely depends on your use case.

MichaelRenor · on Aug 30, 2016

If you don't mind me asking, why not use Amazon S3? It's cheap and -- importantly -- somebody else is on-call for its uptime.

bjoerns · on Aug 30, 2016

We use S3 for our hosted version - but for our on-premise offering it simply has to be 100% on-premise, entirely on our client's infrastructure.

abakker · on Aug 30, 2016

I'm going to guess that it is part of the "on premises superstition" where companies feel that the stuff they own is more secure somehow. Obviously, this is not often true in practice, but 5 years of research/consulting has taught me that they feel this way all the same.

Occasionally, there are laws that also mandate certain controls that cloud providers in general did not have. That is also becoming rarer as time goes on.

brianwawok · on Aug 30, 2016

There really are pros to on prem.

There are cons, don't get me wrong, but to somehow claim that AWS is the end all be all of hosting choices is demonstrably wrong.

For example - You want to develop a financial exchange with a 100 microsecond average response time, peaks of 10Gbit traffic, and 5 9s of uptime. Do you host that on AWS? I wouldn't.

Another example - If I were a medium+ sized company (say 20+ employees), I would want my source control 100% on prem (excluding backup). Internet connections are too flakey, and Github gets DDOSed too often. I could not stake my entire business on github.

breakingcups · on Aug 30, 2016

I'll give you another pro, customization. Our on-prem Jira instance can have add-ons and changes that aren't allowed in the cloud-hosted version.

killbrad · on Aug 31, 2016

That's the Atlassian SaaS offering. You could still run it in any cloud yourself and get all the customizations you want.

takeda · on Aug 31, 2016

but then voiding the argument that was stated initially i.e. no one needs to maintain it and be on call for it

abakker · on Aug 31, 2016

This is true if you are first and foremost a company that provides technology to other customers, and in that case, you

a) have a very competent dev and ops team b) have a business where you are the provider of an SLA.

For many companies with 5K+ employees, they are already distributed, already have multiple data centers, have workers all over the globe, and, when they are not primarily in the IT delivery business, tend to have IT departments that have limited budgets, lack of training and poor organizational awareness.

This leads to poor security practice, poor cost analysis, and long/nonexistent upgrade cycles on many behind-the-scenes workloads.

You are of course right, there are many diverse reasons to be on-prem, but at some point, many of those reasons go away with sufficient size and differing priorities. Things change when a company's business involves the consumption of IT services, rather than the delivery of IT services.

matthewrudy · on Aug 31, 2016

Re: github

I can understand why on-premises git is better in some ways.

But you're overstating the frequency of github outages.

And it doesn't exactly kill the business when it's down for an hour.

Git is distributed after all.

mrweasel · on Aug 31, 2016

> But you're overstating the frequency of github outages.

It's a little of topic, but I don't think it's overstated. People complain about the stability of something like HipChat all the time, but GitHub is unavailable more often, in our area at least.

GitHub is huge target, and outages are extremely disruptive for companies.

vacri · on Aug 30, 2016

On-Premises is also "one less third-party service to manage". I work for a small company, and was asked to list the external services we use a couple of days ago, and the list went into dozens.

It's not the primary reason for On Premises, but it's one less thing to worry about: "our tech team has full control, instead of yet another company having some control that we can't see"

MichaelRenor · on Aug 30, 2016

I would argue that from a management point of view it's actually a lot nicer for small places to offload to third parties. As a point, my original comment pointed out that somebody else is on-call for that services uptime.

vacri · on Aug 31, 2016

I agree with you; outsourcing those services certainly helps this one-man ops team. But for a larger company with a more mature and staffed ops team, there's going to be that advantage above.

fizzbatter · on Aug 30, 2016

Does this have the ability to mirror to an encrypted remote? I'm looking for something like this for a simple home storage server, but emphasis on being able to replicate to something like B2 Storage for cheap backup.

Currently Infinit.sh has my attention the most, but it's quite young still.

edit: https://news.ycombinator.com/item?id=12125344 this thread seems to be talking about what i want. With that said, i'm not yet sure if `mc mirror` supports Backblaze, as that (per price point) is my prime need

rsync · on Aug 30, 2016

Current opinion is that "borg" is the holy grail of backup schemes ... it takes attic, which fixed all of the duplicity shortcomings, and improved on that ... [1]

We[2][3] tend to agree with that.

One reason it might not work for you is that we are an order of magnitude more expensive than B2, so perhaps that's a better bet for you. On the other hand, $7.20 per year for our smallest borg account is almost as close to zero as your B2 minimum order would be, so ... who knows.

One upside of choosing our service is that you can choose your location (US, Zurich, HK, etc.)

[1] https://www.stavros.io/posts/holy-grail-backups/

[2] rsync.net

[3] http://www.rsync.net/products/attic.html

RubyPinch · on Aug 30, 2016

from [3]

> If you're not sure what this means, our product is Not For You.

Please don't do that, its childish and unimpressive.

corobo · on Aug 30, 2016

There's no support for that service. Makes sense to ward people off who might need support at the headline.

rsync · on Aug 31, 2016

Just to be clear, there is no support for the deeply discounted borg/attic accounts at rsync.net.

Regular rsync.net accounts have full, unlimited support provided by a US-based engineer. As in, an honest to god unix engineer. Sometimes, but rarely, me.

krishnasrinivas · on Aug 30, 2016

Minio is object-storage server. You can use https://github.com/restic/restic to encrypt and mirror to remote minio server. For more help https://docs.minio.io/docs/restic-with-minio

fizzbatter · on Aug 30, 2016

Looks like restic doesn't support backblaze, as of yet: https://github.com/restic/restic/issues/512

howeyc · on Aug 30, 2016

True, but if you have the space to hold the encrypted data, you can "rclone"[0] that to most clouds.

[0] http://rclone.org/

fizzbatter · on Aug 30, 2016

That's really cool, appreciated! Surprised it doesn't offer an encrypt feature, seems really useful for the given featureset

mappu · on Aug 30, 2016

Rclone recently got support for on-the-fly encryption, it will be in the 1.33 release.

https://github.com/ncw/rclone/issues/219#issuecomment-239695...

koolhead17 · on Aug 30, 2016

We have documented how to use Minio with Rclone https://docs.minio.io/docs/rclone-with-minio-server hope it helps.

brightball · on Aug 30, 2016

I've been wondering about this for that use case myself: https://ipfs.io/

fizzbatter · on Aug 30, 2016

Yea IPFS is awesome, i have often pondered about the idea of using it for an internal family storage.

espadrine · on Aug 31, 2016

Does it support encryption yet? Also, you'd have to auto-pin all your files, or risk losing them.

I think GlusterFS (battle-proof, but file-wise and assumes an administrator with access to everything) assumes or infinit.sh (robust ACL, but young, not open-source) better addresses those use-cases.

diggan · on Aug 31, 2016

Encrypt the files before adding to IPFS and then decrypt after receiving them. Support for encryption will be built-in into IPFS in the future but for now, this solution works. There is a issue tracking this here: https://github.com/ipfs/faq/issues/6

Also, when adding files yourself, those files become pinned by the default. Getting files won't pin automatically however. You have some more information about pinning here: https://github.com/ipfs/examples/tree/master/examples/pinnin...

espadrine · on Aug 31, 2016

Ah, good to know. Still, you end up having to develop your own ACL system if you have multiple users (eg. family). Especially, shared directories are tricky.

fizzbatter · on Aug 31, 2016

I was actually thinking of using a custom network, not truly IPFS.

frugalmail · on Aug 30, 2016

The canonical open source alternative to S3 https://wiki.openstack.org/wiki/Swift

hansjorg · on Aug 30, 2016

Riak CS is another one:

https://github.com/basho/riak_cs

ranman · on Aug 30, 2016

Ran this in production and dealt with a lot of issues. I would caution people against it's use in anything critical or customer facing.

hansjorg · on Aug 30, 2016

For what it's worth, I've worked with it for a couple of customers with pretty large Riak stores and never ran into or heard of any problems myself.

I've used the official JS aws-sdk and boto3 as clients.

hashin · on Aug 30, 2016

Could you please elaborate it? What were the issues you were facing?

shinydevops · on Aug 31, 2016

As another user with nothing but negative experiences with Riak-CS in production, I thought I'd take a stab here. We had a 12-node cluster with ~10TB per node, fwiw. In no particular order:

- The restart times of the Riak process ranged from 10 minutes to 3+ hours, during which time the cluster was basically useless. Not a single suggestion from support sped up this process.

- Every single night from 0800 - 0900 UTC, the cluster would grind to a halt (as measured by canaries measuring upload/download cycle times). This continued even after we migrated all customer data and traffic off of the cluster.

- Riak-CS ships with garbage collection disabled despite it being a critical feature. I inherited a cluster that had been run for some months without gc enabled. Turning it on caused the cluster to catastrophically fail. Basho Support, over a period of close to a year, was unable to find a single solution that would get our cluster back to health. If our cluster were a house on a show like Hoarders, the garbage in it would be considered load bearing.

- We attempted to upgrade our way out of our un-garbage-collect-able mess, but the transfer crashed. Every. Single. Time.

- Even had transfers worked, all of the bloated manifests have to be copied in their entirety, so you can't gc the incoming data on the new cluster.

- Even while babying the cluster, it would become unusable at least once a month, requiring a restart of all nodes. The slowest node took 3+ hours to start, followed by another 3+ hours of transferring data. This was 6+ hours of system downtime every month.

- During these monthly episodes, we attempted to engage with support and try to debug the processes (we were a team of seasoned Erlang developers). We could attach Observer and/or use the REPL to grab stats, but not a single support resource was able or willing to engage.

- For giggles, once we had migrated all users off of the cluster, we attempted to let gc run. It never completed. Not once. We let this go on for a few months before nuking the entire cluster.

Now, I absolutely realize that we got ourselves into that mess by running the cluster without gc for an extended period. But in the grand scheme of things, this cluster wasn't storing a very large amount of data -- tens of TB spread over tens of millions of objects. Having the cluster get into a state where gc can never run and where this causes snowballing instability is unacceptable.

We switched to Ceph. We've never looked back.

ranman · on Aug 31, 2016

We didn't have any issues with lost data but we had a lot of operational issues that didn't have clear fixes. Primarily around TLS, migrations, and performance. We had to contact support for many issues because the documentation for various failure modes wasn't there.

takeda · on Aug 30, 2016

We use riak and riak-cs in production and in fact we are one of biggest riak users.

If you use riak in production you probably do want their (Basho) support. Their product when works, works great, but when there is a problem it's a bit hard to troubleshoot it without knowing erlang and being familiar with riak's source code.

spudfkc · on Aug 30, 2016

I use Swift at work, and while it is a great tool, it is a bitch to set up. I would be curious to learn how Minio works more technically on a distributed level: how is object replication handled? are downloads automatically routed to the closest server? can I make downloads temporarily available (think Swift tempURLs)?

y4m4b4 · on Aug 30, 2016

We are currently working on the distributed version and will be making a beta release soon.

Currently minio supports

- pure FS backend with single disk - pure Erasure coded backend with multiple disks on single node (like ZFS)

For more information you can read here - https://docs.minio.io/docs/minio-erasure-code-quickstart-gui...

We do not do any sort of replication erasure code handles disk failures and we also implement transparent bit-rot protection as well.

To replicate one setup to many you can use 'mc mirror -w' which would watch on events and do continuous replication.

Relevant docs can be found here

https://docs.minio.io/docs/minio-client-complete-guide#mirro...

y4m4b4 · on Aug 30, 2016

Additionally "SwiftTempURLs" equivalent is called PresignedURLs in S3 API so we indeed support that as well.

Relevant docs here https://docs.minio.io/docs/using-pre-signed-urls-to-download...

llambiel · on Aug 30, 2016

Also http://pithos.io/ backed by Cassandra

kjetijor · on Aug 30, 2016

And another good example is ceph+radosgw.

majewsky · on Aug 30, 2016

I run OpenStack Swift at work, currently working on deploying it on Kubernetes. Swift is a very fine piece of software, with a pleasant operations experience, but it will take a lot of time to set up initially. Plan at least half a man-year until you have it all up and running.

cdnsteve · on Aug 30, 2016

Practical use case:

- Spin up a bunch of droplets on DigitalOcean, because I want reliability, etc.

- What's the best way to share drive space across these to create a single Minio storage volume, so if one DO node goes away I don't lose my stuff?

krishnasrinivas · on Aug 30, 2016

We are working on distributed minio https://github.com/minio/minio/tree/distributed

The minio available today for production use can export single disk or aggregate multiple disks on the same machine using erasure coding.

For this, if you want backup you can use github.com/minio/mc tool to mirror, more help here https://docs.minio.io/docs/minio-client-complete-guide#mirro...

killbrad · on Aug 31, 2016

I think this should be made clear on your site. I spent a good amount of time trying to figure out how to actually get this to be distributed, but the answer is - you don't. So it's only like S3 in interface, not in durability or availability.

SteveNuts · on Aug 30, 2016

So far the best option I've found has been GlusterFS

krishnasrinivas · on Aug 30, 2016

Minio is by ex-GlusterFS developers!

nickpsecurity · on Aug 30, 2016

"You had my curiosity but now you have my attention."

That gives it some credibility. Especially ability to deal with tough challenges they'll encounter in this domain. Helps to have encountered most already. ;) I'll look at it in more detail later on. I'm also more interested in it if it has many-node, HA/SSI support. What's ETA on that feature?

y4m4b4 · on Aug 30, 2016

Currently we have a single node multi disk version relevant docs here - https://docs.minio.io/docs/minio-erasure-code-quickstart-gui...

We are also about to finish distributed server functionality you can track the work here https://github.com/minio/minio/tree/distributed

nickpsecurity · on Aug 30, 2016

Thanks for the links. Good docs, too. Definitely keep that up. :)

squiguy7 · on Aug 30, 2016

I was going to suggest using their new block storage but I read the docs some more:

> A volume may only be attached to one Droplet at a time. However, up to five volumes can be attached to a single Droplet.

Looks like you would have to roll your own solution.

bryanlarsen · on Aug 30, 2016

minio works awesome for dev & test deployments. It's dead simple to set up, just a single executable. Hopefully it doesn't lose that simplicity as it grows up and gains features.

powerbook5300CS · on Aug 31, 2016

It's a go binary, that's just how they work.

Keyframe · on Aug 30, 2016

Sorry for two posts (the other one was unrelated). If anyone has experience with this I have a few questions regarding a particular use case.

How does something like this behave with really large files. Video files in 100s of gigabytes, for example. I'm asking because if one could set up a resilient online (online as in available) storage with fat pipes like this it could be used as a platform to build a centralized video hub for editing. It's another question how much sense would it make over a filesystem though.

klodolph · on Aug 30, 2016

I think these days we should by default think of storing blobs of data (like video files) in storage systems like S3 or the alternatives, and that ordinary filesystems should be thought of as a special case where you want to attach storage to an individual computer.

Edit: I'm going to elaborate, because people are calling me naïve. Full disclosure: I work at a cloud provider on a storage team.

For most people and applications, you simply don't get good value for your money by using filesystems and hard drives directly. We've tried to make things more reliable and durable with backup policies, RAID, and ZFS but the fact is all of these things come with operational and capital expenditures that compare unfavorably with common cloud storage options. There are some good technical reasons why cloud storage is better: basically technologies like RAID and ZFS are attempts to make each layer of your storage stack completely durable and available, but this approach is not competitive with the way cloud storage is typically implemented, which is to build a reliable distributed service on top of cheap hardware. Consider RAID 1, for example. This gives you N+1 redundancy at the drive level for an individual computer. This worked in the 1990s but drives are bigger and RAID failure modes suck with larger drives—it's worrying how common it is to see errors when rebuilding a degraded RAID array, and at N+1 that means that your data is lost from that computer. Essentially, with modern drive sizes (4+ TB seems pretty common these days) a RAID 1 array should always be considered N+0 instead of N+1.

Cloud storage is implemented much more intelligently. If you have distributed storage, you can simply spread files across computers in different DCs and use error correction codes to increase the redundancy. You can get more nines of durability and availability for less money this way. You end up with something like 33% overhead on disk space instead of 300% overhead, and you're also off the hook for a big chunk of your capacity planning and various other operational expenditures.

These days I would consider starting from "this file is in cloud storage, and we have a local cache" rather than "this file is in local storage, but we have a cloud backup". That's really all I'm saying.

It also won't always be competitive. Sometimes cloud storage is more expensive than regular filesystems, depending on how you're using it. If you're a big company you can sometimes amortize the costs of doing it yourself better. That's all I mean by "default"—I'm going to put my data in cloud storage unless I have a compelling reason to store it some other way.

mi100hael · on Aug 30, 2016

That's awfully naive, especially for tasks like video editing that are significantly impacted by disk read/write speeds. Even a NAS on a gigabit network is going to be roughly 6x slower than a standard internal SATA III spinning disk.

klodolph · on Aug 30, 2016

I said "by default", the implication being that you'd do something else if your application needs it. But it's much easier from an operational perspective if you start with a reliable system (replicated, networked storage) and cache locally for speed, then to try and make local filesystems reliable and durable.

Keyframe · on Aug 30, 2016

I agree. Network-wise we start at 10GbE. It's a lot more complicated than simple file storage on network though. Many needs and solutions. And I mean MANY.

nickpsecurity · on Aug 30, 2016

I disagree. I think we should default on storing blobs of data in local storage to retain full legal and technological control of them. Storing them in 3rd party services under their EULA's, SLA's, and API's should be a special case to improve attributes of data like its availability or cost of distribution. The way most people and companies use them now. :)

alfalfasprout · on Aug 30, 2016

Uhhh have you seen the internet data transfer costs for S3? That would become absurdly expensive quickly. Even with a dedicated cross connect.

fwessels · on Aug 30, 2016

S3 data transfers costs are an issue -- that's why you can host minio yourself at any hosting company, and save significantly (multiple times) on data transfer and storage costs.

athrun · on Aug 30, 2016

You're right, but I would reframe it along the lines of Network filesystems (like NFS or OCFS3) vs. Distributed Object Storage (S3). In that sense, certainly, the current "default" is to use the latter and avoid the former.

Local filesystems and/or volume managers won't go away anytime soon. Internally, a system like S3 needs a unified access to the storage, which is provided by the filesystem.

I think we are going to see the emergence of new filesystems that are much simpler in design compared to ZFS (as reliability is left to an upper layer in the stack) for use in the Cloud. Somewhat similar to the trend toward lightweight OSes built for the cloud (CoreOS, Project Atomic, etc.). Many features that were in the realm of the operating system are now delegated to upper layers in the stack.

orestes910 · on Aug 30, 2016

Can you help me understand this statement better? Why should we do that?

I may sound like I'm playing dumb, but I'm really struggling to see whats compelling about this in its current state aside from the fact that its one tool as opposed to a RAID + filesystem + something to make the data available.

zzzcpan · on Aug 30, 2016

It's a bit confusing, but minio is not a resilient storage. It's just a server, kind of like webdav, but with s3 api and a capability to use multiple filesystem folders with erasure coding.

As for distributed object storages, I would expect them to work great for video editing, since they can saturate any link given enough servers. But not out of the box, you would need a client designed for it, splitting files into chunks in parallel, etc.

krishnasrinivas · on Aug 30, 2016

We are working on distributed minio (resilient to server failures) on the "distributed" branch here https://github.com/minio/minio/tree/distributed

Currently available minio is resilient to disk failures using erasure coding (similar to RAID)

krishnasrinivas · on Aug 30, 2016

Uses cases like these are a really good fit for Minio. i.e videos/photos ... actually any blob/file.

Keyframe · on Aug 30, 2016

Would one expect any issues with large files? Can one file span machines? For example, you have a 1TB single file, but one machine has 500GB free and the other 200GB and third 400GB or whatever (stupid example).

I really think this could be useful to build something like Avid Interplay on top of.

y4m4b4 · on Aug 30, 2016

Yes definitely you can read our docs here https://docs.minio.io/docs/minio-erasure-code-quickstart-gui... for more understanding and even hardware recommendations.

zx2c4 · on Aug 31, 2016

Their CLI client is called `mc`. This is an unfortunate conflict with the venerable Midnight Commander.

andrewchambers · on Aug 31, 2016

I love the website. I'm a lone developer who doesn't know any HTML, how would I go about getting such a nice design for my own projects? (Or how much would it cost)

zbuttram · on Aug 31, 2016

Wappalyzer (https://wappalyzer.com/) tells me they're using Bootstrap (http://getbootstrap.com/) (probably customized a bit). HTML isn't very difficult (just another markup language) and if you're not inclined toward design (I am also not) there are a plethora of CSS frameworks to choose from (like Bootstrap) that will get you up and running with something not completely ugly. Personally I like Bulma (http://bulma.io/) right now which showed up (I think as a Show HN) on here a while back. Currently using it for a project and I'm enjoying it.

andrewchambers · on Aug 31, 2016

Really my design sense isn't great, given time I can hack together something with bootstrap, but I do think I lack the designer training and probably instincts

zbuttram · on Aug 31, 2016

Same here. I recommend making friends with some designers or looking around at pre-customized versions of Bootstrap. I also spent some time looking for this: http://jgthms.com/web-design-in-4-minutes/ One of my favorite sites for this type of conversation.

jedisct1 · on Aug 31, 2016

Or run LeoFS http://leo-project.net/leofs/

Keyframe · on Aug 30, 2016

Unrelated question. What's the point of fullscreen button on those term session players (or whatever they are) if it doesn't stretch the playback to fullscreen? You only get a same-sized screen with black around it. It's not even centered to the screen.

eknkc · on Aug 30, 2016

I guess it is https://asciinema.org but their samples have centered full screen. Maybe a CSS issue here.

I'm not sure about the point either. Maybe if you embedded a small player it would be zoomed out and fullscreen would show the native style.

jdc0589 · on Aug 30, 2016

all my brain sees in the domain name is "ascii enema"

nulagrithom · on Aug 30, 2016

Is this just meant to emulate S3 for the sake of dev/test environments? Without clustering/HA I don't really see the point of using this over the plain old file system. Or am I missing something?

krishnasrinivas · on Aug 30, 2016

Absolutely, our focus currently is on multi-server minio which is being actively developed on the "distributed" branch https://github.com/minio/minio/tree/distributed

Our current stable version can export single disk or multiple disks (using erasure coding providing protection against disk failures) As it is very easy to get started with (single binary, thanks to Go) people find it attractive for dev/test environments.

To replicate for HA (even for the single server version), use "mc mirror -watch SOURCE TARGET" command to pair them up. If you have multiple drives (JBOD), you can eliminate RAID or ZFS and use Minio's erasure code to pool them up. Distributed version is also in dev/testing at the moment. It should be out in a month.

olalonde · on Aug 30, 2016

Previous discussion: https://news.ycombinator.com/item?id=12122998

helper · on Aug 30, 2016

How easy is it to embed this into go tests? Right now I use goamz/s3test for that, but it has a lot of limitations.

khc · on Aug 31, 2016

goofys and s3fs both use s3proxy for this, which works fine as long as you are ok with having Java as a test dependency: https://github.com/kahing/goofys/blob/master/test/run-tests....

y4m4b4 · on Aug 30, 2016

Quite easy actually you can look at

https://github.com/restic/restic/blob/master/run_integration...

helper · on Aug 31, 2016

I don't want to run it in an external process, I want to run it in a goroutine.

y4m4b4 · on Sept 7, 2016

For that you can just do

```

package main

import minio "github.com/minio/minio/cmd"

func main() {

        go minio.Main()

        ... do your stuff ...

}

```

scoopr · on Aug 30, 2016

So, I can use midnight commander as the client? ;) (half joking, half serious)

unboxed_type · on Aug 31, 2016

Why is it so important what language it is written in? :-)

avereveard · on Aug 30, 2016

couldn't find at a glance wheter it has the same read after write issue of s3, or in general what the consistency is.

also, failure and backup modes.

kparthas · on Sept 7, 2016

Minio server provides read-after-write consistency. For fault-tolerance, * protection against failed disks, you could deploy Minio erasure code setup. ref: https://docs.minio.io/docs/minio-erasure-code-quickstart-gui...

* Minio erasure code setup also provides protection against "bit-rot".

muminoff · on Aug 31, 2016

Do you guys have plans with multi-tenancy feature?

koolhead17 · on Sept 7, 2016

Absolutely, we are working on it. Please visit our "distributed" branch https://github.com/minio/minio/tree/distributed

anonymous7777 · on Aug 30, 2016

ok tired of people bragging about "Go". It underperforms than many GC based languages that are out there.

RubyPinch · on Aug 30, 2016

Generally, if you comment less about Go, then you end up in less discussions about it

beastman82 · on Aug 31, 2016

written in Go - Does this matter?

mrweasel · on Aug 31, 2016

Yes and no, if you're in the market for an S3 clone, but want to be able to add features, fix bug or hack on it in some way, it nice to know which language it's being developed in.

As you can tell from the other comments, there's plenty of alternatives to pick from, and if you're going to dive in to the code yourself the language may be a deciding factors.

unboxed_type · on Sept 2, 2016

It is important, because you will not find any Go-developers on the market, so if you are serious about using it then think twice ;)