Things like Sandstorm are really important as they allow people options without becoming a full-blown sysadmin. Just look at how many crowd funding ideas are based on 'personal cloud' concepts.
However, we also need to work on the fundamental problems to make it easier to build decentralised products in the first place (not everything is a web-app). Namely, how such apps are built, how they store/sync data, and how we deal with identity. The current tools simply aren't designed for the world we're heading towards, so we need to re-evaluate our assumptions. On top of this is the need for business models that don't rely on mass data collection (eg advertising) -- we can't rely on everything being open source but the underlying infrastructure must be.
There are many ways forward and the particular approach I'm taking is based on unikernels and creating a modern stack to deal with the above issues directly. There's more info at http://amirchaudhry.com/brewing-miso-to-serve-nymote/
If anyone happens to be at FOSDEM this weekend I'd be happy to chat about these ideas in person.
Open protocol for synchronized personal storage. It uses a decentralized model where users provide and pay for their own storage. Could be game-changing if it takes off.
I've quickly read through your RFC, and since I've recently added support for Content-Range to LibreS3[0] that is the first thing I looked for, and sure enough you support it.
Although it is not clear to me why you need to use webfinger to announce Content-Range support.
There is Accept-Ranges header, or you can detect that you get a 200 instead of a 206 reply on a Range request for GET.
For PUT RFC7233 says "A server MUST ignore a Range header field received with a request method other than GET." so I'm not sure how that would work there, can you give an example?
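To make the detection idea above concrete, here is a minimal sketch in Python. It is not code from remoteStorage or LibreS3; the function names `range_honored` and `advertises_ranges` are made up for illustration. It assumes you already have the status code and headers of a response to a GET that carried a Range header.

```python
def range_honored(status, headers):
    """Return True if a ranged GET was answered with partial content.

    status  -- HTTP status code of the response to a GET sent with a Range header
    headers -- dict of response headers (any capitalisation)
    """
    lowered = {k.lower(): v for k, v in headers.items()}
    # 206 Partial Content plus a Content-Range header means the range was applied.
    if status == 206 and "content-range" in lowered:
        return True
    # Anything else (typically 200) means the Range header was not applied.
    return False


def advertises_ranges(headers):
    """Return True if the server advertises byte-range support via Accept-Ranges."""
    lowered = {k.lower(): v for k, v in headers.items()}
    return lowered.get("accept-ranges", "none").strip().lower() == "bytes"
```

Neither check needs an out-of-band announcement, which is the point of the question about webfinger.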
I should have probably made it a bit more clear that I'm only working WITH remoteStorage on one of my side projects, and am not one of its developers. =)
remoteStorage is an elegant solution in a lot of ways, but there are some problems:
* Apps written in this style have a very hard time being real-time collaborative. Maybe this will be solved by WebRTC, but it's a lot easier with a server component. (Sandstorm has several real-time collaborative apps, like EtherPad, EtherCalc, and Wave. Currently you can share these apps with other users by copy/pasting the URL; the sharing model will get more sophisticated eventually).
* You need to set up a storage server for the apps to use, or use one of the big providers that like to do data mining. You then need to connect the app to the correct server, which is extra busy-work.
* The permission model of these storage servers isn't terribly sophisticated. Often you'll end up granting the app broader permissions than you really want.
* There is, of course, nothing stopping the app from storing data to other places as well, nor is there any isolation between separate documents opened by the same app. (On Sandstorm, each document is a separate instance of the app in a separate sandbox which cannot communicate with the outside world without permission.)
Edit: Missed that you said you're working on RemoteStorage. Probably would have phrased differently if I noticed. Wasn't meant to be an attack. Sorry! (We actually like remoteStorage apps a lot in that they are often easily ported to Sandstorm. :) )
Actually you got it right the first time: Working *with*, not on. =)
My wording could probably have been a bit more precise.
Having worked with it for the past few months, I do agree with a lot of your sentiments on the shortcomings of the server-less application model, but in many applications where user privacy is a high priority (which is increasingly the case since the Snowden revelations), having a fully portable, standards-based storage solution that you can host on your own if you choose is incredibly compelling.
With the remoteStorage-based serverless app I'm working on right now, it's been quite a challenge trying to reach feature parity with existing client-server apps on the market today. But once we even get close to achieving feature parity, I believe we'll have a very compelling solution.
I love the idea of unikernels! I took a much higher-level approach for p2p apps with Fire★ (http://firestr.com), which is GUI-centric and has a built-in app editor (all application code is viewable/editable).
My only concern with the unikernel approach is that you can end up with a system where the code is not viewable and editable.
I know this is going to sound ridiculous and maybe this is the wrong topic. I'm at a spot where I wish there was a closer to turn key framework for my project.
I wrote 3 interdependent web servers, and have them all running on the same virtual machine on digital ocean with varnish. Now I need to upgrade ubuntu and I don't want them to go down. So, apparently I need at least 3 machines if not more. One to run varnish or something similar, to direct to the other 2. Some way to bleed people over to one of the 2 machines. Once everyone is on one machine I can upgrade the unused machine. After that I can bleed them all back and then upgrade the other.
Is this too small a niche? Is the answer I should have used "google cloud"? It just seems like each step is so much work. I certainly learned a lot (vagrant, ansible, and other stuff) although all of that knowledge will be probably obsolete before I need to do this again.
Is there something I should have looked at? Sandstorm seems one level down. Like it's more like a replacement for the old lamp/cpanel isps.
In your scenario, you need "just" two machines - and really only one running most of the time. You have varnish+appservers running on one vm now. What you need is one more vm, running the same stack. Depending on how you handle sessions this could be as easy as:
0: reduce ttl of dns to ~60s
1: bring up vm2==vm1
2: upgrade vm2
2.1: test vm2
2.2: shutdown vm2, take DO snapshot
2.3: instantiate vm/droplet of vm2
3: point/change dns from vm1 to vm2
4: wait for traffic to die on vm1
5: shutdown vm1
6: possibly increase ttl in dns again
Next round, same procedure. This is not ready for HA -- easiest way there is running 2 vms like now, but with haproxy in front (so the vm has ha-varnish-3appservers), along with heart-beat and a shared ip. I don't think DO supports that -- and it is probably not worth the hassle if you can live with dns changeover-time==downtime (say, minutes modulo wrongly configured dns caches).
With a couple of dedicated ips, you can usually get away with vm2 taking vm1's ip (and vice-versa) -- but again that depends on your session set-up and/or if dropping sessions is ok. With sessions in some kind of replicated cache/db (eg: redis, mysql) -- you could probably set vm2 to slave over/replicate sessions, then do the dns [ed: or ip] switch[over].
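Step 4 in the list above ("wait for traffic to die on vm1") is the one part that is easy to script. A rough sketch, assuming you have some way to count active connections on the old vm; `count_connections` is a hypothetical callable you would supply yourself (for example by parsing your proxy's status output), not part of any DigitalOcean or varnish API:

```python
import time


def wait_for_drain(count_connections, poll_seconds=5.0, timeout_seconds=600.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll until the old vm reports zero active connections or the timeout passes.

    count_connections -- hypothetical callable returning the current number of
                         active client connections on vm1
    Returns True if traffic drained, False if the timeout was reached first.
    """
    deadline = clock() + timeout_seconds
    while True:
        if count_connections() == 0:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_seconds)
```

Only shut down vm1 (step 5) when this returns True; a False result usually means some resolver is still handing out the old address.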
My real point was why isn't this a finished product yet kind of like Sandstorm? Basically why hasn't someone made a system that just handles this for me. I start n services, when I want to upgrade something it provides a simple UI or a couple of command line options to just do it. Why is it all done by hand?
It seems like enough devs might need this now-a-days.
Note: I get why. I guess my point is there's an opportunity here and Sandstorm made me think of it.
I'm using Docker + Fig. The nice thing is that you can exactly replicate your production environment on your local machine and the Dockerfile syntax is really simple. The not so nice thing is that there's no "simple" way to run docker containers on multiple physical/virtual machines right now (that should become easier once Docker team releases their cluster solution). It would be nice to have a fig registry where people can publish complete Docker based systems (e.g. varnish + nginx + Node.js server) but I don't think there is such thing available at the moment. I wrote a bit about that idea here: http://syskall.com/crazy-and-not-so-crazy-startup-ideas-2015...
Big letters: "Sandstorm makes it easier to do XYZ" or "Sandstorm lets you do XYZ by doing XYZ"
E.g. "Sandstorm lets you run your own personal web applications without needing a background in IT!"
or "Sandstorm lets you install personal web apps as easily as you install mobile apps!"
3 examples of what this could actually mean for 80% of your users
"Run your own Dropbox!"
"Host your own WordPress Blog!"
"Get a mailbox to match your personalized email address!"
THEN drill down into what it actually is (Sandstorm is an open source platform that makes it easier to run and manage your own personal server, yadda yadda), and its more specific features, such as usability, security, etc.
(This advice operates under the assumption that "individuals" are your main target audience.)
+1
I went to Sandstorm's home page, and spent the first 5 minutes trying to figure out what it is & what it does.
"Sandstorm is an open source platform for personal servers". Ok, fine, but what is it REALLY? What does it do? Why is it better than (hosting / VPS hosting / AWS / Docker / PaaS)? Give me some examples of what I can do with it.
It's better because the internet was supposed to be a bunch of computers talking to each other. It was a beautiful vision. Instead, it's been centralized on two levels:
FB/Google/Instagram etc. serving content, and
AWS/DigitalOcean owning the hardware for those intrepid individuals who want to roll their own solutions.
The internet wasn't supposed to be Amazon, Google, and Facebook all talking to each other. It's scary that ISP's don't even want you to host your own (modest) server. It's SUPPOSED to be a bunch of computers networked together! Sandstorm makes it easier to live that vision where you own the hardware, or at LEAST have full control over your cloud. It doesn't necessarily need to be your home computer - a colo'ed odroid (or RPi if your needs are modest) would do the trick too. As more and more of the internet is gobbled up by VPS services I think it's important that the average Jane or Joe can still put together their own website, blog, game server, etc. and not be reliant on a company for it.
Unfortunately, I don't see how this solves the problem. The main problem for me is this part of my ISP's ToS: "Users may not run any type of server on the system."
Further, every ISP I've ever had has had some such clause. I'd have to get a business plan to actually be allowed to run a server. So who is this, or any home server software, even for?
Sandstorm is not necessarily about running the server at home (though you can). It's more about being able to choose what is on your server and control how your data is stored and accessed, whether on a home machine or running in a datacenter.
1) You understand how to use the Unix shell and everything else that goes into maintaining a Unix machine.
2) You have the time to do it. (This is what has always stopped me, FWIW.)
3) You are willing to spend money on a machine that has sufficient resources to be responsive when you use it but sits idle 99% of the time since you're the only user.
These obstacles are what drive people to SaaS, where they no longer have freedom to install arbitrary software.
Unfortunately we lost all of the small independent ISP's that offered any semblance of competition. I just host my own stuff anyway with an ISP known to be pretty relaxed about it. You're right, though, it's a major problem.
(I know it was kind of broken under all the traffic earlier.)
Usually the demo is what makes people "get it".
It's been surprisingly difficult to find a sentence or two that describe Sandstorm in a way that is effective on everyone. For any text we use, different sets of people get it or are confused. :/
People on the internet are easily distracted and have short attention spans. You want them to get interested enough to actually run your demo. I'm not going to take 10 minutes to delve deeper unless you hook me to begin with.
Also, you don't need to have text that appeals to everyone (there is no "average" user), but you should be able to write text that appeals to at least one of your groups (individuals, developers, enterprise). The two sentences you have currently are so generic that they don't say anything at all. An open source platform? An open source platform that does what?
Target the group with your messaging that you are targeting with your platform. Sure, sandstorm could be used by any of them, but which group is MOST important to your platform?
10 minutes? The demo allows you to set up a Wordpress blog in literally 10 seconds (it's four clicks and no typing or scrolling -- not even to log in). I'm not sure why that's so onerous, even for folks with short attention spans.
That's such a great line that you just wrote: "You can setup a WordPress blog in 10 seconds". Why don't you say that under the demo link? Or say, "Try our demo. It takes 10 seconds to install WordPress" or whatever app.
It's not onerous at all, but you have to get people to the point where they're actually at the demo. My "10 minutes" was based on the thought process that goes through my head when I see a "try our demo" link. If the demo takes only 10 seconds, that's highlighting a major selling point of your platform, so make that explicit.
Looks good! I would think about replacing your main tagline with that sentence, or something like it. Not to be overly harsh, but your main tagline doesn't say anything.
Edit: Actually, I think if you said something like "Sandstorm is an open source app platform for personal servers" that would be a major improvement. The whole "app" part is missing from the main tagline. Then, your sub-tagline goes into more detail about what apps.
Edit 2: Actually, I would remove the open source part altogether. It's redundant if you have a github link somewhere on your page, which you do, and I think the developer community you are targeting would assume that it's open source. Or, just keep "open".
Actually, the words "open source" are a recent addition to our header, whereas we've always had the github link. We discovered from feedback that many people who visited our page had no idea that it was open source, since most people don't look at nav bars, and this of course completely changed their perception of the project (for the worse, obviously). When we put "open source" into the header, we saw a marked increase in interest.
Thanks for the feedback, though! We'll think about inserting "app" in there.
"... Usually the demo is what makes people "get it"...."
Kenton, I'd agree with this approach.
The killer approach is for newly created applications, ported to sandstorm to take advantage of the isolation, security and scalability.
So one area to look at, might be extra tools/paths to port, maintain and expand development.
For cough, Microsoft (platform), it was VB, the killer app. For Linux (platform) it was Apache (killer app). So "the path" to get applications on Sandstorm (platform) to create a killer app, might be the answer.
==== background ====
In fact, one way would be to ask what apps people (HN for example) already use and what problems they have. You need a feel for the numbers of applications companies/startups use. Is it technical? Is it business related? Is it cost?
Install it (if it's ported) and work a discussion around it. For example the reader who chimed in on creating a page on Wordpress - show the path to do that.
Another one I'd suggest as a side-business/demo is a collaborative editor (hello etherpad). [0] I know for a fact google, for example, use some crappy Doc editor (sans the nice editor features) to screen candidates. So there's a demand there.
For the technical minded, poking around https://capnproto.org/ really explains what sandstorm servers can do.
There's this idea around that people don't read text. What if it's just that most text sucks? Web designers end up writing web pages with text that isn't really designed to be read, so people don't read it, and then it gets optimized away. The result is often really weird. It's this blank pastel page with some vague promises and a SIGN UP NOW button. Zombo.com all over again.
I thought your Cap'n Proto page did a good job of this actually! It tells the story.
Yes, I definitely think something along those lines, especially if you're targeting a group of people who are somewhat tech savvy, but not tech savvy enough to run their own server.
Good ideas, but make sure you hire a copy editor! "Let's" is short for "let us", like "let's go to the store." "Lets" is a conjugation of "to let" (i.e. to enable) like "Sandstorm lets you do XYZ."
Sounds cool but I think the WordPress implementation is TERRIBLE: it depends on a WordPress fork that is completely outdated, instead of downloading an up-to-date fresh archive.
I agree -- the current WordPress package needs work. Thank you for trying it and looking into it!
Community-wise, one thing we're going to need, as Sandstorm grows, is an ecosystem of app package maintainers. Part of what we're hoping is that more developers of the apps themselves will maintain the Sandstorm ports, like Audrey Tang is maintaining the EtherCalc port.
Tech-wise, one thing we're going to need is a solid story for how Sandstorm packages will easily stay up to date with the latest changes as the upstream author releases new updates.
I work on+for Sandstorm, and I'm also a Debian developer. Debian is not a shining example with regard to either of the above, and I'm sure we can do even better at Sandstorm.
It's not currently possible to run arbitrary Docker containers through Sandstorm, since we prefer app packages (we call them SPKs) to be:
* Self-contained -- if the app needs MySQL, bundle it;
* Able to run with external network access unavailable -- this improves security, since even if an app gets compromised, it's not a big deal since it can't leak any data out to the world;
and a few other constraints that are more technical than philosophical.
We actually don't want apps to bundle MySQL -- we'd prefer they use sqlite. :) But the point is, it's up to the app. The app gets a slice of filesystem and they can use whatever infrastructure they want to store stuff to it.
We want the experience for users to be install app, use app, without worrying about setting up databases and such. We also want to enforce isolation between apps so one app cannot access another's data, and that's a lot easier to do if they aren't sharing a database. Considering these desires, it makes sense to say that apps should simply bundle their database of choice.
Even so, I don't see how you could distinguish between a compromised app sending data to the outside world and an uncompromised app doing the same as part of its normal operation.
That's just marketing bullshit. Unless the API is magic (and I don't mean advanced technology "magic" but Harry Potter "magic") it has no way of knowing what the application is allowed to send or not and therefore cannot filter. It's like saying it cannot leak data because it has to use HTTP.
> Unless the API is magic (and I don't mean advanced technology "magic" but Harry Potter "magic") it has no way of knowing what the application is allowed to send or not and therefore cannot filter.
You're assuming that Sandstorm apps have arbitrary IP network access. They do not.
Sandstorm is based on capability-based security. Any outgoing request has to be addressed to a capability representing some specific permission that the user has granted to the app. A capability might point to another app, or it might point to a specific external host that the user has designated.
More specifically, a Sandstorm app's only connection to the outside world is through Cap'n Proto RPC, which is an object-capability protocol, meaning that an app can only send requests to objects to which it has explicitly received a reference.
Of course, for backwards-compatibility, we have translation layers so that apps written to use regular old HTTP need not be entirely rewritten. You just have to tweak it to make the correct permissions request first, which has proven not very hard in practice.
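A toy model of the object-capability idea described above may help. This is not Sandstorm's or Cap'n Proto's actual API; every class and method name here is invented for illustration, and plain Python does not enforce the isolation (a real sandbox does that at the OS level). The point is only the shape: the app is never handed a general "connect to any host" function, just objects that each stand for one specific grant.

```python
class HostCapability:
    """A reference that permits requests to exactly one host."""

    def __init__(self, host):
        self._host = host

    def request(self, path):
        # A real system would perform the call on the app's behalf;
        # this toy version just reports where the request would go.
        return f"GET https://{self._host}{path}"


class SandboxedApp:
    """An app whose only route to the outside world is the capabilities it holds."""

    def __init__(self):
        self._caps = {}

    def receive(self, name, cap):
        # Called by the platform when the user grants a permission.
        self._caps[name] = cap

    def fetch(self, name, path):
        if name not in self._caps:
            raise PermissionError(f"no capability named {name!r}")
        return self._caps[name].request(path)
```

In this model, an app that was never given a "feed" capability raises PermissionError on `fetch("feed", ...)`, and once it has one it can still only reach the host baked into that capability.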
Note that Sandstorm is still in development and for the moment we've created a hack to allow ttrss to make arbitrary HTTP requests in order to update feeds.
However, in a few more months this won't be necessary. Instead, when you click "subscribe to feed", the app will call a method on the Sandstorm API saying "Prompt the user for a URL and then give me permission to access it". So, you'll get a dialog box to enter the URL rendered by Sandstorm itself. If you enter a URL, it's plainly obvious that you want the app to have permission to fetch it, so Sandstorm grants said permission. We call this UI the "powerbox".
Notice how the UX here is equivalent to what we have today, where the app renders its own prompt. This technique of inferring security decisions from actions the user was doing anyway is the core of how we plan to implement tight security without inconveniencing the user.
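The powerbox flow described above can be sketched roughly as follows. This is an illustration of the idea only, not the real Sandstorm API: `powerbox_request_url`, `UrlCapability`, and `prompt_user` are hypothetical names, and `prompt_user` stands in for the dialog that the platform (not the app) would render.

```python
from urllib.parse import urlparse


class UrlCapability:
    """Permission to fetch one specific URL chosen by the user."""

    def __init__(self, url):
        self.url = url


def powerbox_request_url(prompt_user):
    """Hypothetical platform-side handler for 'prompt the user for a URL'.

    prompt_user -- callable standing in for the platform-rendered dialog;
                   returns the typed URL, or None if the user cancels.
    """
    url = prompt_user()
    if not url:
        return None  # user cancelled: no permission is granted
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return None  # not a fetchable web URL
    # Typing the URL is itself the act of granting permission.
    return UrlCapability(url)
```

The app only ever sees the returned capability, so the security decision falls out of something the user was going to do anyway.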
I've been using TinyTinyRSS on Sandstorm for a while. It even has a mobile app that works with Sandstorm's API. (Though it's a fork, not the official Play Store version.)
Sandboxed applications literally cannot send any data by default. They can't open a connection to <whatever server>, no matter what protocol.
The goal, once they've built their Powerbox, is to then implement a set of protocol drivers which the application can use. So it still can't connect to arbitrary servers, but it can ask the user for permission to, say, connect via SMTP to <wherever>, and the user has control over that.
Yes, they could leak anything that you put in them if you allow them to connect to someone you don't trust. However, even if you do so once, most applications will be per-document - you have an instance of your document editor for each document, and they don't know anything about any other documents you have.
In short: applications can only leak what you give them, and only to people you say to give them to. They can't call back to home base without your permission or the permission of someone you've given the app permission to contact. So for all reasonable definitions of "cannot leak data", applications cannot leak data without your permission.
It's worth keeping covert and side channels in mind, though: e.g. an instance can leak bits by timing variations. Capability security is a big big deal, a qualitative change in the game, but I think this comment is over-promising things.
Yes, covert side channels should always be assumed to be possible.
However, there are two reasons I think you don't need to worry about them too much:
1) They'll typically be fairly expensive and low-bandwidth.
2) They're unambiguously malicious. This is not a technical barrier to using them, but it's a huge political barrier. Today, major developers will happily stick covert statistics gathering into their code, and then when called out on it, will make some contrived argument about how it benefits users (if that's true, why don't you ask them first?) and how it's mentioned in the privacy policy so therefore it's legit. OTOH, you can't exploit a covert channel in Sandstorm and then plausibly claim you haven't done anything wrong.
Some hardcore security nerds will of course scoff at this argument, and to them I can only say: "OK, yes, there are possibly covert channels, sorry. Please don't put sensitive data into an app you don't trust."
A theoretical long-term solution is deterministic computing, but that probably requires apps to be written in a different language or be run in a heavy-handed VM. Not practical at the moment.
It's also worth noting that Sandstorm is designed to make it impossible for an app to leak capabilities via covert channels. They can only leak bits, and a capability is not just bits.
Yep, good points; I just think the GP was too absolute. It's good to hear Sandstorm's built on object capabilities instead of password capabilities; since I wasn't sure I didn't get into that, or deafening (determinism to eliminate side channels into a process; I gather that outward is much harder to control).
* Backend: Due to Linux network namespaces, the app can't communicate with the network (except over "sandstorm-http-bridge" which allows it to respond to inbound HTTP requests).
* Frontend: Due to Content-Security-Policy, the client part of the app can't communicate with any hostname other than the one the app runs on. The CSP header is set by Sandstorm, not the app.
So then it has no network access, and therefore even if it is compromised, can't leak anything.
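For the frontend point, here is a sketch of what a platform-set policy could look like. The directive list is an assumption for illustration, not the header Sandstorm actually sends, and `csp_for_app_host` and the hostname are hypothetical.

```python
def csp_for_app_host(app_host):
    """Build a Content-Security-Policy value limiting the page to its own origin.

    app_host -- hypothetical hostname the app instance is served from.
    The directives are illustrative, not the policy Sandstorm really uses.
    """
    origin = f"https://{app_host}"
    directives = [
        f"default-src {origin}",  # scripts, images, styles, etc. from this host only
        f"connect-src {origin}",  # XHR/fetch targets
        f"form-action {origin}",  # form submissions (not covered by default-src)
    ]
    return "; ".join(directives)
```

Because the platform's proxy adds the header, a compromised app cannot loosen it from inside the sandbox.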
This does hinge on the app's dynamic code only being run for logged-in users. For many apps -- imagine a Google Docs spreadsheet only accessible to people within your domain -- this is a pretty straightforwardly reasonable model. Sandstorm handles authentication for apps, so it can enforce this even if the app is 0wned.
An app does not have the ability to edit who has permissions to itself. In order to add yourself as an admin of some app, you'd have to compromise Sandstorm, not the app.
I'm not sure how that is supposed to work. You would have to rewrite every webapp so that its data can be protected by sandstorm. Which seems hugely impractical. And as long as the webapp has access to it compromising it will compromise the data.
Not "rewrite". You do have to tweak apps to be Sandstorm-appropriate, but it's usually somewhere between five minutes and a couple days of work. Namely:
* Delete the login system and use Sandstorm's. If you build on Meteor, for instance, this is a simple matter of swapping dependencies.
* Delete your sharing system. If the app hosts multiple things that can be independently shared, change it to host only one such thing. The user can create multiple instances of the app and use Sandstorm's sharing. This is probably the hardest part, but we've done it for several apps now without too much trouble. Since it's largely deleting code, it's not very difficult.
* Find the places where your app connects to the outside world and insert a bit of code to make a Sandstorm powerbox request to get permission first, then address the requests to that permission.
None of this involves "rewriting". We have 20+ apps on the Sandstorm app list, most of which were ported by two people who certainly didn't have time to rewrite each one.
I've ported apps to Sandstorm with literally no prior experience with the languages those apps were written in. Porting to Sandstorm involves more deleting stuff you don't need than actually writing code yourself. :D
The only "everything" you should be able to get, if the security is correct, is for the app you compromised, not the other ones running on Sandstorm. No, it does not magically secure applications put behind it (though IIRC it does put a couple of useful tweaks in place, but that's all it can do), but it can prevent "I compromised your WordPress and stole your entire machine's contents."
It's because of Sandstorm's security as a platform. Apps cannot see each other's files on disk, because each one runs in a container with only their own subdirectory mapped in.
It's the case of me scrolling the site and reading most of the GH readme - and still getting almost no idea what its status is, what the goal/vision is, and how I might use it..
"We do LXC stuff in secure and user friendly way" is the message?
Here's my own summary, which if you like it, I can try harder to make sure becomes more prominent somewhere:
Sandstorm is a way to run web apps as containers, gloriously sandboxed from each other, and moreover a web interface to install them easily and allow the user to create multiple instances of a web app easily.
It intends to grow features relating to sharing instances -- so that an instance of a web app is as easy to share with someone else as a Google Docs link -- and grow features relating to supporting more network protocols -- so that apps can safely communicate with the outside world, mediated by the person using Sandstorm.
Right now, the target audience is people who like running web apps like WordPress or Ethercalc on their own server. In the future, the target audience will grow to include companies whose IT departments want to enable users to install web apps safely without asking IT first -- they'll know it's safe due to the glorious sandboxing.
The primary idea being that open source web apps can be used without having to be a server admin. Non-technical users can install apps on a Sandstorm server as easily as installing apps on their phone.
After about a year -- how tied up is sandstorm to meteor? I confess I have issues with the our-way-or-the-highway nature of meteor (our js, our db, our app-server) -- even if I can see that it does appear to give some pretty nice benefits for rapid prototyping.
I'd love to see sandstorm as a handful of small tools with various uis on top: command line, web, etc. Seems like it should be possible to do with (on the extreme end) go and a berkeley/sqlite db+file system for images?
Sandstorm's front-end UI is built on Meteor, but this doesn't affect apps -- they can be written using any stack. We have apps written in Meteor, Express, Rails, PHP, Python, C++, and Rust.
Meteor is actually amazingly modular if you look under the hood. We use it in a fairly default configuration, but it's easy for me to see how I would go about using a different database or a different templating language. Those people write high-quality code.
Eventually I would like to ditch Mongo and instead have Meteor speaking Cap'n Proto RPC to a Cap'n Proto database. I don't expect that I'll have much trouble getting Meteor to do this.
> I'd love to see sandstorm as a handful of small tools with various uis on top: command line, web, etc.
Hmm, not sure I understand what you're suggesting. Sandstorm is all about UI and running web apps, so it seems to me that a "command-line interface to Sandstorm" would really be a whole different project. :)
I don't think Sandstorm wants to be a platform on which you host your large user-facing app. Instead it wants to be the platform on which you can install your personal wiki, ipython notebooks, your streaming media library, etc.
You as the owner of the Sandstorm instance will control whether the apps on your instance can send data to Google for Google Analytics, for example.
If you invite someone else to use an app on your Sandstorm instance, they will trust _you_ with their data and you can decide whether the apps on your instance share the data with Google or not.
Does anybody have any thoughts on the differences between Sandstorm, Camlistore, and Tent?
It seems there are a number of problems here. We need:
1. Better data stores
2. Better server environments
3. Better ways to share data with others
I wonder if Camlistore's approach might not be the cleanest, since it doesn't try to bundle (1) and (2) together.
EDIT: Not to get too sappy, but any of these would be _fantastic_ compared to our current Web 2.0 disaster, and I'm glad Sandstorm is picking up steam.
Camlistore and Tent are both complementary to Sandstorm. Sandstorm gives you a way to run apps easily, Camlistore is a structured storage system which other apps could connect to, and Tent is a federation protocol that apps could use to talk to each other. I'd like to see this all converge at some point. :)
I'm suspicious there's too much overlap between Camlistore and Tent for them to both be useful, since they each do data storage and sharing, but that's not your problem:)
This is exactly what needs to exist. I recently set up Ghost, Owncloud, and Gitlab on a personal server (odroid U3) that sits under my couch at home, and it's really rewarding to own the hardware which is my "cloud". However, it should be easier, and possible for anyone. Good for you guys.
I've been using dokku[0] for a while now and love how easy it is to just push random stuff up to a new subdomain. The other day I pushed up a doxygen html of a code base I was working with. I have my blog, portfolio site, random apps I use for myself, a cloud storage app etc.
It is definitely one more tool to learn, but it is pretty much a light wrapper around docker so it ended up being a great gradual introduction to the concepts and configurations of working with docker as well.
Be sure to install either the persistent storage[1] plugin or the docker options plugin[2] so that your apps can just use the file system on the server to make things a lot simpler.
Incidentally, running containers is probably a great way to "install" the GHOST glibc vulnerability[1] (assuming you're basing off of base-images made before the bug was patched, and you haven't updated your containers/images).
I'm not sure either Vagrant or Docker has this really fixed -- that is: easily patching the base system/image (and still being confident that the app keeps running).
Is there an easy way to update a container based off of a (possibly few generations remote) base-image? Eg: You've pulled down a bare-bones, official CoreOS/Ubuntu/Debian/RedHat image from docker -- set it up for your use-case (say made a base image with your own CA-cert bundled, wired it up for kerberos/ldap/AD, maybe set up a trusted ssh-server ca-cert) -- then made a handful of images based off that: db, cache, and web-app.
Is there an easy way to patch the base image and all descendants? I assume all state should be in other volumes, so maybe this is easier than I think?
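To make the question concrete: if you record which image each image was built from, patching a base implies rebuilding every descendant, parents first. Below is a minimal Python sketch of computing that order. The image names and the hand-written parent map are hypothetical, and this is not something Docker provides; it only works out the order in which rebuilds would have to happen.

```python
from collections import defaultdict, deque

def rebuild_order(parent_of, patched):
    """Return the patched image followed by all of its descendants, parents first."""
    children = defaultdict(list)
    for image, parent in parent_of.items():
        children[parent].append(image)
    order, queue = [], deque([patched])
    while queue:
        image = queue.popleft()
        order.append(image)
        # Sorted only to make the order deterministic for siblings.
        queue.extend(sorted(children[image]))
    return order

# Hypothetical names: an upstream image, a customised base, three app images.
PARENT_OF = {
    "my-base": "debian",
    "db": "my-base",
    "cache": "my-base",
    "web-app": "my-base",
}

print(rebuild_order(PARENT_OF, "debian"))
# ['debian', 'my-base', 'cache', 'db', 'web-app']
```

If all state really does live in separate volumes, each rebuilt image can replace its running container without touching the data.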
At any rate, it is something to keep in mind -- that grabbing images is great, but updates are still needed!
As others mention, the Ghost referred to by GP is a blogging platform.
Bad news: Sandstorm packages do not have any particular separation between "base system" and "app"; your app package is simply one big archive of the entire userspace filesystem needed. This is something we might conceivably do in the future, but for now we like the simplicity.
Good news: Once the app maintainer publishes an updated package, it is trivial to update your local app instances in-place. Much like installing apps on Android, the system just swaps out the old package for the new one without touching the user data. We are confident enough in the robustness of this that we plan to implement auto-updating of apps, again like Android (though you'll be able to turn it off if you prefer).
Curious news: With Sandstorm, it often (not always, but often) doesn't matter if an app has vulnerabilities. Each app instance is initially only accessible by its owner, and only accessible to others if the owner explicitly shares with them. Often, the people you are collaborating with aren't threats -- they're your friends.
Apps that publish a public web site -- like Ghost (the blog platform, not the glibc vulnerability :) ) -- actually do so strictly as static content. Sandstorm serves the content for them, without executing any of the app's code.
Admittedly, this starts to break down if you want to have a public web site in which users can make persistent changes -- say, post comments.
Of course, if someone does compromise one of your app instances, it's only that instance. The rest of your server remains safe, since each app is in an isolated container.
None of this is to say that patching exploits doesn't matter, but security is not about absolutes, it's about risk management. It's significantly less likely that a bug in a Sandstorm app will lead to real damage.
There's a lot of focus on how Sandstorm allows you to run web applications easily, without having to set up the back-end that they would need without Sandstorm.
There is an additional edge case: web applications that don't have a back-end at all. Ours falls into that category. Our web app is a by-product of two of our commercial products, but we don't actually have user management, storage, etc.
Online we integrate with Google Drive and Dropbox, but you can't create an account with us and store your data with us. Sandstorm allows people to deploy our web app, which previously wasn't possible at all. It saves us months of work creating and maintaining the functionality it provides.
Sandstorm - this is wickedly cool, tried the demo and it worked great, can't understand all the 'do this, do that' comments. As someone who is just learning to play with Docker (I just finished dockerizing all my VPS apps), the first thing I think of is whether there is a Dockerfile to build this, or a Docker image - off to have a look for one. Awesome stuff - like the collection of apps you have ready to go. Maybe you have fixed the landing web page in the meantime but I had no trouble understanding what you are about. 100.times upvote.
> No protection from getting your job done: Security can often be a hassle, getting in the way of your work. Sandstorm is different. When you tell a Sandstorm app to talk to some other app, or to talk to the internet, Sandstorm sees your intent and automatically grants it access. So, you are never interrupted by a prompt asking "Do you want to allow this app to do the thing you just told it to do?" And yet, the apps only get the permissions you actually wanted them to have.
I love that Jas (being on the Google security team) just instinctively XSS's any form he fills out, and I love that our code just leaves his tag there, properly escaped, for all to see.
There is no 1GB limit. I think you might be confused about compute units? Compute units are just a measure of RAM usage over time -- a compute unit is 1GB of RAM used for one hour. An app can use more than 1GB of RAM; it will just consume compute units faster. E.g. an app using 2GB will consume a compute unit in 30 minutes.
This all relates to our upcoming managed hosting. Self-hosted installs are not metered since it's your own hardware.
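The compute-unit arithmetic above can be written out in a couple of lines. This is a minimal Python sketch of the definition as stated (one compute unit is 1GB of RAM used for one hour); the function names are made up for illustration and say nothing about how metering is actually implemented.

```python
def compute_units(ram_gb, hours):
    """Compute units consumed: 1 unit = 1 GB of RAM held for 1 hour."""
    return ram_gb * hours

def hours_per_unit(ram_gb):
    """How long an app using ram_gb of RAM takes to consume one compute unit."""
    return 1.0 / ram_gb

print(compute_units(1, 1))     # 1 GB for one hour -> 1
print(hours_per_unit(2) * 60)  # minutes for a 2 GB app to use one unit -> 30.0
```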
I am not really sure why this is so exciting to everyone. I have seen a couple of comments asking what this software does, explicitly. I will admit I like the idea of personal servers but I am not sure how to apply this potentially amazing software to my life. Suggestions?
Very cool project. I've played around with this same idea, but as a CLI package manager, rather than a webapp.
I think it would be cool to have a custom VPS image, where you can easily install webapps from the CLI out of the box. Sort of like homebrew, but for your personal servers.
What makes me curious: do you plan to enable an option for selling apps? As a developer I would be more than happy to let users install my app, but since I am not a charity I would like to earn some money from it as well.
Yes, we'll have an app store much like iPhone/Android. Open source apps will have the option of using a "pay what you want" model, but we'll also allow proprietary apps with fixed prices and maybe even subscription-based.
Yeah the demo "app list" is just a placeholder. We're working on an "app store" with self-service uploads, searchability, paid apps, pay-what-you-want for open source apps, etc. Once it is ready it will be available to everyone, whether you use self-hosting or managed.
And indeed, when we were #1 on HN yesterday morning with hundreds of apps running concurrently, it got a bit slow and crashy. :) But things cleared up after the traffic died down a bit.
Our upcoming managed hosting service will, of course, use multiple machines with automagic scaling.
How do you get individually installed apps to selectively share data with other installations? I imagine something like diaspora tried solving this problem. Is there a general solution for any app?
This is something we're still building (Sandstorm is still in alpha), but what we have in mind we call the Powerbox.
The idea is that one app can say to the system "I implement such and such protocol at such and such endpoint". Later on, some other app can say "I need something implementing such and such protocol". The system itself displays a picker, showing the user all of their other apps that may satisfy the requirement. When the user makes a choice, the requesting app is told how to contact the providing app and is implicitly given permission to do so, whereas prior to the exchange the apps had no ability to talk to each other. The user can inspect these connections later and potentially revoke them.
This is a whole lot easier for the user than going to the providing app and editing an ACL, then going to the requesting app and giving it an endpoint address, etc.
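For readers who find code easier to follow than prose, here is a toy Python model of the flow described above: one app offers a protocol, another requests it, the user picks, and only then does a grant exist. Every name here (the `Powerbox` class, its methods, the app and protocol names) is hypothetical. It is not Sandstorm's API and it ignores the Cap'n Proto layer entirely.

```python
class Powerbox:
    """Toy model of the offer / request / pick / revoke flow; illustrative only."""

    def __init__(self):
        self.offers = {}      # protocol -> list of (app, endpoint)
        self.grants = set()   # (requesting_app, providing_app, endpoint)

    def offer(self, app, protocol, endpoint):
        """An app declares: 'I implement this protocol at this endpoint'."""
        self.offers.setdefault(protocol, []).append((app, endpoint))

    def request(self, requester, protocol, choose):
        """Show matching providers to the user (the `choose` callback) and
        grant access only to the provider the user picks."""
        candidates = list(self.offers.get(protocol, []))
        if not candidates:
            return None
        provider, endpoint = choose(candidates)
        self.grants.add((requester, provider, endpoint))
        return endpoint

    def can_talk(self, requester, provider):
        return any(r == requester and p == provider for r, p, _ in self.grants)

    def revoke(self, requester, provider):
        self.grants = {g for g in self.grants if g[:2] != (requester, provider)}


box = Powerbox()
box.offer("calendar-app", "ical-feed", "/feed")
box.offer("other-calendar", "ical-feed", "/export")

def pick_first(candidates):
    """Stands in for the user's choice in the picker."""
    return candidates[0]

endpoint = box.request("todo-app", "ical-feed", pick_first)
print(endpoint)                                      # /feed
print(box.can_talk("todo-app", "calendar-app"))      # True
print(box.can_talk("todo-app", "other-calendar"))    # False
```

The point of the design is visible in `request`: the grant is a side effect of the user's choice, so there is no separate ACL-editing step.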
The way this is implemented under the hood is in terms of Cap'n Proto capability-based RPC. Blog post on that:
That's pretty neat; it's like the Android intent system.
How does the system get the list of all other apps that satisfy the requirement? I'm guessing all apps register with sandstorm server somewhere that has a centralized list of other servers?
It'd be neat to have standard protocols, in the same way we have standard media types.
The system knows about other apps installed on your server, but not necessarily apps on other servers. To connect to something on another server you will usually want to obtain a Cap'n Proto capability to it. You might do this through, say, a messaging app that has the ability to embed capabilities. Your friend sends you a message with a capability to some object on their server, and then your messaging app publishes that capability on the receiving server, such that it will now appear in the powerbox for other apps on that server.
Alternatively (less cool, but more practical), you might just drop a URL into the Powerbox and Sandstorm will connect to it and turn it into a capability.
I'm a DevOps developer, so keep that in mind as you read below.
Sandstorm looks insecure and inefficient, but that probably won't matter.
The ease of use for the end user trumps all. Users will love this, but I'm not looking forward to having to administer boxes running Sandstorm.
I imagine there will be a fair bit of work from my end to re-building apps so they can take advantage of tuned settings, shared services, caches, and the like. Plus figuring out a way to automate the usual securing, managing, monitoring, and cleanup around the Sandstorm environment.
Identifying bottlenecks will be fun too, though my first instinct will probably be to look at the Cap'n'Protocol bridge which Sandstorm runs everything through.
As the lead developer, I emphatically disagree with this. :)
If you'd like to state why you think it's insecure, I'd love to hear it, but security is incredibly important to us and something we've put lots of effort into. I don't deny that there may be bugs (it's an alpha), but by design Sandstorm is a highly secure way to run other people's code.
> and inefficient
While it's true that running lots of small per-user (or per-document) instances of apps is necessarily less efficient than running one large multi-tenant server, it's not nearly as bad as it sounds. Instances of the same app share their code and assets (read-only) and are aggressively shut down when not in use, which makes up for the vast majority of the inefficiency. Meanwhile, infrastructure continues to get cheaper...
> If you'd like to state why you think it's insecure, I'd love to hear it
The long-term security of Linux containers has not been well explored yet. There have been exploits against the Kernel found, and there are likely to be more.
Plus, how much effort has gone into hardening the Cap'n'Protocol bridge? Do you have a security expert reviewing the code and looking for vulnerabilities? If so, great! I take that part back wholeheartedly.
I appreciate that you've worked past the Docker failing of not signing and validating files; this is a huge step in the right direction.
> Instances of the same app share their code and assets (read-only)
But not their in-memory caches. Code and shared assets make up a fraction of their presence in memory.
> infrastructure continues to get cheaper
This has always sounded like a lazy cop-out to me. Yes, infrastructure is getting cheaper, but our applications are getting bloated at the same rate. And if we're running potentially dozens of PostgreSQL instances on a single machine, your infrastructure costs to make all of the apps performant are not going to be cheap.
> The long-term security of Linux containers has not been well explored yet. There have been exploits against the Kernel found, and there are likely to be more.
> Plus, how much effort has gone into hardening the Cap'n'Protocol bridge? Do you have a security expert reviewing the code and looking for vulnerabilities?
Among our advisors are Mark Miller and Jas Nagra, both members of the Google security team (though advising us in their free time, not on behalf of Google). Cap'n Proto is based heavily on Mark Miller's previous work in capability-based security.
Also among our advisors is Andy Lutomirski, a kernel developer who specializes in security and sandboxing. He has been cranking out CVEs against the kernel lately. Most of them haven't affected Sandstorm, due largely to our seccomp filter which Andy himself wrote and continues to work on (see link above).
My own background is diverse but includes a few years working on security at Google.
That said, we have not yet commissioned a thorough security review of Cap'n Proto's own implementation. That is something we plan to do before any 1.0 release (of Cap'n Proto or Sandstorm).
> But not their in-memory caches. Code and shared assets make up a fraction of their presence in memory.
Depends. If the app is written in C++ or Rust, then the code is mmap'd in (with that memory being shared across instances). The runtime memory overhead tends to be very low if the app is single-user.
For apps written in dynamic languages that parse their code at startup, yes, memory usage is a lot larger -- as a rule of thumb, many apps use around 100MB. One idea we have for fixing this is to checkpoint an app at the point when it first tries to read its per-instance data and restore from that checkpoint on future runs. This checkpoint could theoretically be shared between all instances and mmap'd copy-on-write.
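To illustrate what "mmap'd copy-on-write" means in practice, here is a small Python sketch using the standard `mmap` module. It is only a demonstration of the mapping mode, not of process checkpointing: the temporary file stands in for a shared checkpoint, and the two mappings stand in for two app instances.

```python
import mmap
import os
import tempfile

# A stand-in for a shared checkpoint file (contents are made up).
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"checkpoint-state")
    path = f.name

with open(path, "r+b") as f:
    # ACCESS_COPY: writes land in private copies of pages, never in the file.
    instance_a = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)
    instance_b = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)

    instance_a[0:10] = b"instance-A"   # only instance_a sees this change

    a_view = bytes(instance_a[:])
    b_view = bytes(instance_b[:])
    instance_a.close()
    instance_b.close()

with open(path, "rb") as f:
    on_disk = f.read()
os.remove(path)

print(a_view)    # b'instance-A-state'
print(b_view)    # b'checkpoint-state'
print(on_disk)   # b'checkpoint-state'
```

Pages that an instance never writes to stay backed by the one shared file, which is where the memory saving would come from.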
That said, we don't feel this trick is immediately needed. For our upcoming managed hosting service, we've run the numbers and are confident that the vast majority of users will not come anywhere near hitting the resource limits we've set even for the "standard" service level, and we aren't losing money if they do.
> And if we're running potentially dozens of PostgreSQL instances on a single machine
For the scale of a Sandstorm app, it makes tons of sense to switch to sqlite, which mostly solves this problem. :)
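As a rough illustration of why SQLite sidesteps the "dozens of PostgreSQL instances" problem: the database is just a file opened in-process, so there is no server to keep running per instance. A minimal Python sketch follows, with a hypothetical per-instance directory and table; it is not taken from any Sandstorm app.

```python
import os
import sqlite3
import tempfile

# Hypothetical per-instance data directory; each app instance gets its own file.
instance_dir = tempfile.mkdtemp()
db_path = os.path.join(instance_dir, "app.sqlite3")

conn = sqlite3.connect(db_path)   # opens a file; no separate database process
conn.execute("CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO notes (body) VALUES (?)", ("hello",))
conn.commit()

rows = conn.execute("SELECT id, body FROM notes").fetchall()
conn.close()
print(rows)   # [(1, 'hello')]
```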
That's pretty freaking awesome - thanks for taking the time to point all this out. Might I request that you make some of this information more prominent on your site?
> [memory]
You're still talking about program code memory, not the allocated stacks and heaps. The heaps are the important part to me, because they represent db buffer pools, Redis queues, and cached responses - data which will be duplicated if multiple instances of the same command are run.
> For the scale of a Sandstorm app, it makes tons of sense to switch to sqlite, which mostly solves this problem. :)
Which unfortunately references back to my comment about re-writing apps which come in, in an effort to increase performance.
> Might I request that you make some of this information more prominent on your site?
Yes, we should do that. (Tricky, though -- there's so much information we want people to know, but most people will only read two lines. :) )
> Which unfortunately references back to my comment about re-writing apps which come in, in an effort to increase performance.
We've found that a lot of SQL-based apps support sqlite already. For those that don't, adding support may be some work but it's not a rewrite.
For Mongo-based apps, we actually have a patched version of Mongo that reduces the resource usage pretty well. (Basically we just reduced all their hard-coded "pre-allocate at least this much space" constants.) At some point we'll try to do the same for some SQL database...
> For the scale of a Sandstorm app, it makes tons of sense to switch to sqlite, which mostly solves this problem. :)
Case in point: EtherCalc, which usually runs with Redis storage, deliberately uses the fallback "toy" JSON file storage with Sandstorm, which saves 1MB RAM per document instance and makes migration easier.
This works because there's only a few concurrent writers per document at most, instead of the multi-tenant scenario where there's thousands of concurrent writers at any given time.
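A sketch of what per-document JSON file storage can look like, for anyone curious. This is written in Python purely for illustration and is not EtherCalc's code; the class name, file name, and cell contents are made up. It relies on the same assumption as above: only a few writers per document, so rewriting one small file on each change is acceptable.

```python
import json
import os
import tempfile

class JsonFileStore:
    """Tiny key-value store kept in a single JSON file (illustrative only)."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                self.data = json.load(f)

    def set(self, key, value):
        self.data[key] = value
        self._flush()

    def get(self, key, default=None):
        return self.data.get(key, default)

    def _flush(self):
        # Write to a temporary file, then rename, so a crash mid-write
        # does not leave a truncated store behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(self.data, f)
        os.replace(tmp, self.path)


workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "document.json")
store = JsonFileStore(path)
store.set("A1", "=SUM(B1:B3)")

reopened = JsonFileStore(path)    # simulates the instance restarting
print(reopened.get("A1"))         # =SUM(B1:B3)
```

Migration is also simple in this model: moving the document means copying one file.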
"Sandstorm looks insecure and inefficient, but that probably won't matter."
You may need to expand on that. It is not clear that you know what they're doing with sandboxing, etc. If you do, then I'm definitely interested in your further criticism; if this was a knee-jerk response, I think it's unjustified.
"I imagine there will be a fair bit of work from my end to re-building apps so they can take advantage of tuned settings, shared services, caches, and the like."
Are you building apps that you expect people to deploy to their personal servers on a routine basis? For normal DevOps folks working inside of a corporation, Sandstorm is a complete non-event. It's not targeted at any part of your pipeline. The fact you're asking about bottlenecks makes me wonder a bit if you understand where this is targeted, too; frankly Sandstorm could slow everything it runs down by a factor of 10 and I wouldn't notice. My VPS that I would run sandstorm on is 99.9% idle anyhow, if not 99.99%.
I addressed the security concern in a response to kentonv, please feel free to comment more on that one.
> Sandstorm is a complete non-event.
I am speaking from the point of view of someone who may be asked to run and manage Sandstorm servers. Either as a service to sell, or as a service for internal customers. This is a very different use case than using it for just myself.
And if I'm honest, this isn't something I'd need to or want to run for myself. I'm comfortable with managing shared services and propping up web frontends. So no, I am not the targeted user of this software; however, I am the one who gets to write tools and processes to support those targeted users at some point in the future.
You might also be interested to know that for large-scale users we're developing tools to manage Sandstorm clusters, with the goal of making your life really easy. :) The idea was introduced as part of this blog post:
You really should follow what your fellows are saying, your claims of this being a complete non-event for DevOps contradict their claims of building out cluster management to help DevOps, or the future target audience being IT departments.
"The fact you're asking about bottlenecks makes me wonder a bit if you understand where this is targeted, too"
If your own team can't seem to understand where this is targeted, how would anyone else?
kentonv, any chance of getting a nice extended (eg, support for more than just text; file formats like images etc as well) (or even just regular) pastebin app added to Sandstorm? Perhaps something like bepasty ( https://github.com/bepasty/bepasty-server ) or Hosty ( https://bitbucket.org/xrstf/hosty ).
It might help to emphasize where this sits between installing Docker images from the repo, and using something like Webuzo or Softaculous to install popular web apps.
The web site says apps can't perform psych experiments, and links to the contentious Facebook emotion split test. I'm pretty sure you'll be able to do split testing on these apps...
That small criticism aside, I'm very curious how developing for Sandstorm would be different from developing for a typical host. Anyone know the major differences?
Once our sandbox is complete, apps will not be able to "phone home" unless the user grants permission, so users will need to opt into any experiments.
Currently there are two reasons this isn't true yet:
* We haven't implemented client-side sandboxing yet. It's not incredibly hard (content-security-policy header, some tweaking of apps), but hasn't hit the top of the priority queue.
* The server sandbox currently has some intentionally-poked holes allowing apps to do things like pull RSS feeds from the internet. We plan to close these once the Powerbox UI is implemented, which is the main permission-granting interface, but that's a major project and we wanted to get some useful apps working in the meantime.
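For the client-side part, the kind of Content-Security-Policy header mentioned above might look like the output of this Python sketch. The specific directives are an assumption chosen for illustration, not Sandstorm's actual policy, and `build_csp` is a made-up helper.

```python
def build_csp(directives):
    """Serialise a dict of CSP directives into a header value."""
    return "; ".join(
        f"{name} {' '.join(sources)}" for name, sources in directives.items()
    )

# Assumed policy: resources may only be loaded from the app's own origin.
POLICY = {
    "default-src": ["'self'"],
    "img-src": ["'self'", "data:"],
    "frame-src": ["'none'"],
}

header = ("Content-Security-Policy", build_csp(POLICY))
print(header[1])
# default-src 'self'; img-src 'self' data:; frame-src 'none'
```

A browser that honours such a header refuses to load images, scripts, or frames from other origins, which is what stops a page from quietly reporting back to a third-party server.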
Couldn't you do an AB test by having the code choose a random number on install (or first write to the datastore) and have it leak the results by loading an image or iframe with a special url/query parameter?
Sure, you could write an app that does A/B testing, but the whole idea is that people will use Sandstorm to run _personal_ servers/apps. And A/B testing your own reaction to an app doesn't make much sense.
Runtime permissions of these apps will be easy to control, so the Sandstorm platform will prevent apps from phoning home without your permission.
Again thanks! But using Facebook was still a poorly chosen example of what their apps won't do, since they aren't aiming for hosting large customer facing apps like a social network. Perhaps that is what contributed to my confusion in the first place. Now I have a better understanding of what it's for.
Ah, but you could write an A/B testing page and then share it easily with your friends - part of the value proposition is 1) hosting your personal apps and 2) fine-grained sharing of it with others. So it wouldn't be a huge sampling, just your friends.
I see that many old and new web applications run inside sandstorm, so it's a framework to manage apps. They probably have to be adapted a little but I don't think that WordPress has been rewritten to fit into sandstorm. Anything will do, probably.
Furthermore you can download sandstorm and install it on your server.
One major thing that I can think of off the top of my head is that they provide the login and authentication for you and you just plug into it. They also have a sharing system in place that I think you can easily plug into.