""""If you’re using some established framework, Heroku can figure it out. For example, in Ruby on Rails, it’s typically rails server, in Django it’s python <app>/manage.py runserver and in Node.js it’s node web.js."""
I've been working in Node for years now and have never named anything web.js. Why not index.js, server.js, or app.js?
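For what it's worth, the framework detection is just a default; a Procfile lets you name the entry point whatever you like (the file name here is only an example):

    # Procfile
    web: node index.js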
Well, they're obviously working on improving their documentation and have made public statements about some of the high-profile complaints. Is that not what you want? Or would you prefer they do nothing so that you can continue making tired accusations against them?
How do you know the documentation is accurate this time? 'Obviously' compared to what? Unless you have internal access to their source code and can verify it against the documentation.
I am always happy to give honest mistakes a chance, but not to a company that ignored customer complaints for years, sold inaccurate tools for big bucks, and tried to obfuscate its documentation to hide the change in the request-routing algorithm at the dyno level. Sorry, I just find these acts border on evil, and my money will go to other businesses. If you're not sure what I'm talking about, read the following link and see how Heroku's CTO and COO responded to complaints.
For example, dynos are cycled at least once per day
Why so often? I wonder if this is just to protect against memory leaks leading to poor performance from poorly coded apps or if it's an issue with their infrastructure.
I would bet it's horizontal migration to "compact" dynos down onto fewer servers during off-hours, so they can terminate some of their EC2 instances. Basically, generational garbage collection performed on VM containers.
Actually, I believe the containers are implemented with something similar to http://lxc.sourceforge.net/, which is more akin to a process-level chroot plus kernel resource limits. This is also the same technology that fundamentally underpins Docker and some other similar super-lightweight virtualisation technologies. With lxc and buildroot, for example, it's trivial to create a self-contained 20 MB image for running a PostgreSQL server. With all the security guarantees [sic] that would be provided by paravirtualisation.
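To give a feel for the primitives involved, here's a rough sketch with the stock lxc userland tools (the container name and limits are made up, and this isn't necessarily how Heroku does it):

    # create and start a minimal container
    lxc-create -n pgbox -t ubuntu
    lxc-start -n pgbox -d

    # cap memory and CPU share via cgroups, the same kind of knobs a dyno limit maps to
    lxc-cgroup -n pgbox memory.limit_in_bytes 512M
    lxc-cgroup -n pgbox cpu.shares 512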
Certainly, it was provocative of me to imply it was that simple: when you have to deal with processes at this scale, you can't do it without sandboxing if you want to keep users from messing with one another.
It's still the basic scheme, though: when you rent dynos, you actually rent processes. That's the Heroku business model (with all the deployment sugar on top of it, of course), and I'm always surprised people are OK with paying that much to rent processes.
Of course, one could argue you can spawn multiple processes in a dyno. But 1. those processes have to consume very little memory because of the dynos' low memory limits, and 2. it can turn booting into a nightmare.
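For example, nothing stops you from cramming several processes into one Procfile entry (a purely hypothetical sketch, file names invented):

    # Procfile: two processes sharing one dyno and its 512 MB,
    # with boot ordering and signal handling getting messy fast
    web: sh -c 'node background.js & exec node server.js'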
Before the "rap-genius-gate", the company I worked for at the time had scalability issues with Heroku, namely the random router timeout problem described later. Our dyno costs were already very high, even when using HireFire, so adding more dynos did not sound like a good option.
We tried to replace Thin with Unicorn, which was later the solution proposed by Heroku. But even with two Unicorn workers, we hit the memory limits. Even worse: when the dyno booted the Unicorn master process, it considered itself ready while Unicorn was still spawning its children, so we ended up with even more failed requests.
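For reference, our config was roughly the following (a sketch from memory; paths and numbers are approximate):

    # config/unicorn.rb (approximate)
    worker_processes 2      # two workers already flirt with the 512 MB dyno limit
    timeout 15              # fail fast rather than pile up behind the random router
    preload_app true        # load the app in the master, share memory with workers

    before_fork do |server, worker|
      # the master's DB connection must not be inherited by the workers
      defined?(ActiveRecord::Base) and ActiveRecord::Base.connection.disconnect!
    end

    after_fork do |server, worker|
      defined?(ActiveRecord::Base) and ActiveRecord::Base.establish_connection
    end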
Bottom line: using (and paying for) a Heroku dyno is basically the same as using a process on a dedicated server. That may not be a problem, provided Heroku clients are aware of it and are not sold dynos as if they were what we all know as virtual servers.
> when you rent dynos, you actually rent processes
True; more dynos, more processing capacity. I was replying to the suggestion that dynos are processes, but they aren't. Agreed, in common discussions this distinction doesn't matter much, and you correctly mentioned that it is possible to run multiple processes in one dyno.
> using (and paying for) a heroku dyno is basically the same as using a process on a dedicated server
Partly true. Using a Heroku dyno is basically the same as getting a small (although 2X dynos have 1 GB of memory allocated), temporary virtual server in which a limited number of processes run, often just one. No running process equals no dyno, I think.
Paying for a Heroku dyno is basically the same as renting (and releasing) a small virtual server for the duration of a running process (and possibly for very short periods of time). That is especially true for one-off dynos: a dyno that exists only for the duration of one command, though these are charged just like web or worker dynos.
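One-off dynos make that billing model very visible; for example (the command is just an illustration):

    # spins up a fresh dyno for the duration of this single command,
    # then destroys it; you pay roughly for the seconds it was alive
    heroku run rake db:migrate

    # same deal for an interactive shell
    heroku run bash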
Normally, we aren't paying for processes on (dedicated or virtual) servers; we are buying the server capacity, whatever we do with it (limited by the server's resource allocation).
So that is indeed one of the differences between Heroku's PaaS solution and IaaS offerings (and one we both agree on): paying for the (running) processes instead of paying for the capacity made available to you.
> Paying for an Heroku dyno is basically the same as renting (and releasing) a small virtual server for the duration of a running process (and for possibly very short periods of time)
> Normally, we aren't paying for processes on (dedicated or virtual) servers, but are buying the server capacity whatever we do with it (limited by the server resource allocation).
That's one of the things that makes me say we pay for processes on Heroku (even if that's indeed oversimplified). A dedicated or virtual server is not only about CPU and memory. You have access to sendmail on it, can handle uploads, store long and exhaustive logs, perform specific tasks via third-party applications (like resizing images), ssh into the machine if a problem needs deep inspection, etc.
The Heroku model is about extremely specialized services. Whatever you need that is not a processor share or memory requires an external add-on. I'm OK with that, but it has to be stated clearly. It's not at all like having a standard virtual server.
Your phrasing, "Paying for a Heroku dyno is basically the same as renting (and releasing) a small virtual server for the duration of a running process", would do perfectly. Way better than Heroku's own, "Dynos are isolated, virtualized Unix containers, that provide the environment required to run an application", which could lead you to think it's a regular virtual server. Well, sendmail, for me, belongs to an "environment required to run an application".
> ... getting a small... temporary virtual server in which
> a limited number of processes run, often just one.
Is it "often just one" process running because of some limit enforced by Heroku, or is it simply because, well, many people don't bother to spawn additional processes? Do additional processes "wait" for other processes to finish before proceeding? (I'm about to run another experiment on Heroku...)
Heroku enforces a 512 MB memory limit (or 1024 MB on the 2X dynos), so that limits the practical number of processes running in the same dyno... especially if those are (heavy) full-stack web applications. Additionally, processes share the same CPU, so you won't get the same performance as you would with two processes in their own dynos.
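If you want to see how close a dyno gets to that ceiling, there is (if I remember right) a labs flag that writes memory samples into the log stream; roughly (the app name is made up):

    # opt the app into runtime metrics (a labs feature, so subject to change)
    heroku labs:enable log-runtime-metrics -a myapp

    # watch memory_total creep toward the 512 MB limit
    heroku logs --tail -a myapp | grep memory_total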
No, dynos are 'containers', not processes; processes run inside the dynos. From the documentation at https://devcenter.heroku.com/articles/dynos: "The commands run within dynos include web processes, worker processes".
Yes absolutely. A process is something that can run within a dyno. The root process is typically the one defined in the Procfile. It may spawn sub-processes, all running within the same dyno.
Has the whole architecture, or parts of it, been made available as open source?
I want to create a container-based PaaS architecture myself, based on Docker etc. Can anyone recommend a ready-made solution similar to Heroku for serving Ruby web apps?
Deis will be released soon. It's an open-source PaaS based on Docker & Chef, with a user experience modeled after Heroku. You create a "formation" which contains a configurable number of backends and Nginx proxies. After you git push your app to the formation, you can scale web=N worker=N just like on Heroku. Deis will automatically balance Docker containers across the backends, reconfigure routing, etc., all using Chef. The goal is to provide a Heroku-like platform where you control everything: Chef Server, PaaS controllers, hosting providers, routing layer, etc.
https://github.com/ddollar/foreman is what's used for local testing, but also for actually running the app on the Cedar stack. One could use Foreman, Docker, and AWS to make a pretty reasonable PaaS.
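Foreman reads the same Procfile locally and can even multiply process types, which mirrors what Heroku does with dynos (the concurrency numbers here are arbitrary):

    # run every process type in the Procfile once
    foreman start

    # or run several of each, the local analogue of `heroku ps:scale web=2 worker=1`
    foreman start -c web=2,worker=1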
(I'm working on an S3 static-site-only version of Heroku so people don't have to serve static sites behind 3 layers of web servers.)
"""A random selection algorithm is used for HTTP request load balancing across web dynos""
and a bit more...
"""and this routing handles both HTTP and HTTPS traffic. It also supports multiple simultaneous connections, as well as timeout handling."""