Hacker News
Node.js cures cancer (brianbeck.com)
156 points by kfalter on Oct 3, 2011 | 89 comments



While node is doing the calculation, it won't be doing anything else (like serving the next request). If a more traditional server is doing the calculation, it will spin up another process to handle the next request.

If all you do is calculate Fibonacci numbers, you can get roughly (number of CPUs) times the performance. You could use multinode for the same effect, but that is additional work.

In the end, it's a matter of the type of service you are building. If it's a Fibonacci generator, you'd use something better suited than node/JS or any other scripting language.

If you are doing something that's I/O heavy (which is probably the majority of today's web applications), node or other scripting languages might be better suited because they are easier to work with for most of us.

It's just tools. Not religion. I wouldn't use a hammer to remove a screw. And I definitely wouldn't write hateful articles (I'm referring to the original one) because somebody was using a screwdriver instead of a hammer to remove the screw.


It's a troll article, but I'd like to point out that the "UNIX way" is alive and well in node.js. I spawn new processes to do heavy lifting (e.g. thumbnailing, hashing files) all the time. I put jobs into queues in redis and handle them in other node processes. Obviously in co-operative multitasking it's wrong to block the rest of the server.


> Obviously in co-operative multitasking it's wrong to block the rest of the server.

I think that obviousness is exactly what the author missed.


The original Node.JS is cancer article is a silly troll.

But it's totally ridiculous how in response, people keep writing these terrible, straw-man Python servers to try to prove that Python is so horribly slow.

If you want to write Python web apps, there is a correct way to do it, and it isn't to use SimpleHTTPServer. Write a WSGI application, and serve it with any of a number of decent WSGI servers (just for starters, try uwsgi behind nginx; but if you really insist on directly serving requests out of a server written in an interpreted language, you could try gevent).
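For anyone who hasn't seen one, a WSGI application is tiny. A minimal sketch (names are illustrative; `wsgiref` is the stdlib reference server):

```python
# A minimal WSGI application: a callable taking the request environ
# and a start_response callback, returning an iterable of byte strings.
def app(environ, start_response):
    body = b"Hello from WSGI\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]

# To try it locally (blocks; in production you'd put it behind
# uwsgi + nginx instead):
#   from wsgiref.simple_server import make_server
#   make_server("127.0.0.1", 8000, app).serve_forever()
```

Because the app is just a callable, any WSGI server can host it unchanged.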


It should be noted that, with PyPy, the Python example executes in 3.924s.


Yes, since PyPy (and Rubinius) have JITs (like V8), that is the fairer comparison. Nevertheless, this highlights just how helpful JITs can be for CPU-bound code -- it makes me wish PyPy were more ubiquitous (and got py3k support sooner rather than later).


And F# running under Mono executes in 0.950s and the code is shorter, where standard Python 2.6 takes 1m36s. This isn't about pure speed.


But, does it webscale?


    try:
        my_var
    except NameError:
        pass
    else:
        if my_var is not None:
            # Ted needs better examples
            ...
When would you EVER need to use this code? There is no situation in which you should ever need to use a variable in Python that may or may not be defined. While Ted's example may seem like a cheap shot, it does highlight an important problem with JavaScript: all the craziness with regards to types that aren't "real" types like undefined, arguments, and Array.
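For comparison, idiomatic Python avoids the question entirely by initializing the name up front, or by using a container lookup with a default (a trivial sketch; `config` and `timeout` are made-up names):

```python
# Instead of probing whether a name exists, give it a value up front:
my_var = None

# ...later, the check is a plain None test -- no try/except needed:
if my_var is not None:
    print(my_var)

# For optional values coming from elsewhere, a dict with .get() is
# the usual pattern; it returns a default for missing keys:
config = {"debug": True}
timeout = config.get("timeout", 30)
print(timeout)  # 30
```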


I was thinking about the same thing. There's no auto-vivification in Python. Things just blow up and you fix your code. You NEVER do this in real Python code. I'm tired of straw man examples...


IMHO node.js sucks because it forces you to manually pass callbacks around. Can't it remember my call site for me and use coroutines, call/cc, yield, or something? Even fork() exists on UNIX (as does pthread_create()). Why is passing callbacks around the answer? It's like using GOTO.


It's painful but I can see why they don't want to be touching the V8 engine itself (as tempting as it probably is). Short of doing that you are stuck doing things the old-fashioned way (that is, callbacks).

I'm sure things will be a lot better if/when the V8 engine supports let/yield.



As a cancer survivor, just wanted to let you know that the poor taste exhibited here is pretty sad. When you want to make your point next time, use a title that doesn't include something that kills people. Thank you.


Like, node is the bomb?

I am sure cancer sucks more than anything I have had to experience, but the world is not going to change its figures of speech for you, and the title is not directly derogatory at all; it just reminds you of your misfortune.

Btw I am not criticizing you for asking the OP to change his headline I just think it is an impossible crusade.


See how I just typed crusade and pressed send without thinking.


To be clear, the title is intended to point out how absurd the original title was (node.js is a cancer). I strongly suspect the author agrees with you about using this kind of language to describe a language/framework, which is exactly why he included it in the title.


It's the equivalent of tying a giant boulder to the back of a Ferrari 599 and scoffing at how Ferrari can dare to call it a "high performance" car. Stop trying to drag giant boulders around.


Not really. It's like saying "this ferrari doesn't need fuel!". Which is just bullshit.


It's not the main point of the article, but I just thought I'd point out that node already does have a WSGI/Rack-style library for folks who like that kind of thing. It's called Strata (http://stratajs.org). Disclaimer: I'm the author.


Thanks, I didn't know that. I've added a link to the article.


I think one thing that the article hints at is still relevant: "developers" nowadays throw around terms like "scalability" and other hype phrases and think that since they know the slightest thing about some new technology, they're a real developer. Sadly, this is incredibly naive. Any script kiddie / code monkey can throw together some application in the latest over-hyped framework, say "HEY LOOK, IT CAN HAS NOSQL, IT CAN HAS SCALABILITY, IT DOES BIGGGGDATA", and write some inefficient code for it. The truth is that many don't even understand basic data structures or how a computer processes information at a lower level. If you don't understand what algorithmic complexity is, and can't look at your code from a more scientific and critical point of view, don't call yourself a freaking developer. Pick up a book and learn what REAL computer science is, not what the latest and greatest over-hyped framework is called.


Is it just me, or has Brian Beck misunderstood Python idioms? What's with the try/except block? There's no auto-vivification in Python, so you just don't try to catch a NameError. You let it blow up in your test and fix your code afterwards. I'm really tired of these straw man examples.


The article uses two examples to demonstrate that V8 is in fact fast. However, when using Python or Ruby to create a web server, the server will actually run in parallel (multiple threads or processes), so the average waiting time could be less than with the node.js version.


Suppose, for example, two people connect to the server at the same time. With the node.js server, one person waits 5 seconds and the other waits 10 seconds, so the average waiting time is (5+10)/2 = 7.5 seconds.

Assume the python server is less efficient and takes 7 seconds to finish the job, but runs in parallel: both people wait 7 seconds.

Therefore, on average, the python server is in fact faster.
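Spelling out that arithmetic (numbers straight from the example above):

```python
# Serial (single event loop): the second request waits behind the first.
node_waits = [5, 5 + 5]                 # finish times: 5s and 10s
node_avg = sum(node_waits) / len(node_waits)

# Parallel but slower per request: both run at once, both take 7s.
parallel_waits = [7, 7]
parallel_avg = sum(parallel_waits) / len(parallel_waits)

print(node_avg, parallel_avg)  # 7.5 7.0
```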


I am not a node.js expert. One thing I am wondering: modern computers have more than one core, so how does node.js utilize these multiple cores?


By starting multiple processes.

See cluster (https://github.com/LearnBoost/cluster), fugue (https://github.com/pgte/fugue), multi-node (https://github.com/kriszyp/multi-node)


This is true for Python, Ruby, PHP. (I can't see ted's issues with node other than a misunderstanding of the framework's runtime.)


I read your benchmarks on Ruby, but you didn't list the implementation or version of Ruby you used.

I'd guess you used MRI 1.8.X.

I decided to benchmark other versions(and implementations) of Ruby.

= jruby 1.6.4 (7.3 seconds)

      user     system      total        real
  7.388000   0.000000   7.388000 (  7.349000)
= Rubinius 1.2.4 (Little under 6 seconds)

      user     system      total        real
  5.940015   0.006878   5.946893 (  5.842485)
= CRuby 1.9.2 (38 seconds)

      user     system      total        real
 38.250000   0.090000  38.340000 ( 38.376857)
= CRuby 1.8.7 (Little under 137 seconds)

      user     system      total        real
  136.960000   0.240000 137.200000 (137.437748)
Thanks!


Exactly -- for a fair comparison, V8 (which has a JIT) should be compared against Rubinius (for Ruby) and PyPy (for Python), both of which will be nearly as fast as V8.

Of course, it is true that V8 is ubiquitous, whereas Rubinius and PyPy are not -- that is the one major advantage of JavaScript.


Node has a mantra which addresses this whole thread, right back to the beginning: "Everything in node runs in parallel, except your code". Manuel Kiessling has written a great tutorial (http://nodebeginner.org) that shows how node noobs fall into the blocking pitfall and how they can escape it. Node is the right tool when your requests block on I/O. I'm not convinced about CPU-heavy apps, yet.


I wonder if:

1) node.js async I/O is any different from Haskell I/O?

2) the author knows something about strongly-typed languages, or deliberately banned them from the server side? IMHO, dziuba tried to drop a hint about strongly-typed languages, not Python or Ruby.


Guys, guys, guys. You do realize ted is trolling us all, right? (Go look at his twitter @dozba right now; I think he is having a great time.) He is a pro at this (http://teddziuba.com/2011/07/the-craigslist-reverse-programm...)

Also, if his thoughts on node.js don't annoy you enough, go take a look at his archive: http://teddziuba.com/archives.html. He blogs/trolls/thinks about NoSQL, OS X, twisted/tornado, python, queues and more.


There are actually some great posts in there. If you've been around long enough you'll overlook the asshole facade and see that while he often presents the material in a trollish, absolute manner there's wisdom to be found.


Those new to NodeJS will still read this troll post and find real gems of info here. So all is not lost.


Except it contains a lot of FUD, like saying that Node has no SSL support.


> he is a pro at this (http://teddziuba.com/2011/07/the-craigslist-reverse-programm...)

The Craigslist Reverse Programmer Troll is clever. While it's openly a troll, it makes a good point. Worth reading.


The nice thing about node (or any other event-based platform) is that memoizing the Fibonacci function is trivial, whereas in a multi-threaded implementation it would be tricky and error-prone.



I just hope you will never have to experience what cancer is, otherwise you will understand how important a "cure" could be.


I'm getting 0.020s tops for that fibonacci code on node (time curl http://localhost/), even going up to `1.1210238130165696e+167` (800th number). OSX Lion on a C2d 2.3ghz.

Python 2.7.1 took 1m25.259s (no server).

Am I doing something wrong? Or is there some incredibly optimized code path for OSX?

edit: even weirder, `time node fibonacci.js` without a server takes 0.090s.


Sure you're calling fibonacci(40) and not just defining the function?


Yes. It turns out that the recursion overhead in Python is huge; it can get down to <100ms with a non-recursive function.


I have no idea whether that is true and I'm not arguing with it.

Why are you writing web request handlers containing heavily recursive code, and why do you seem to think that indicates anything meaningful about Python?

Please tell me you are not also using SimpleHTTPServer to try to prove points about Python's performance (like http://joshuakehn.com/2011/10/3/Diagnosis-No-Cancer.html and http://blog.brianbeck.com/post/node-js-cures-cancer)


I wasn't. I was just curious about the huge difference in performance. These numbers are from printing to console, no servers.

    def fibonacci(n):
        t = [0, 1]
        for i in xrange(n):
            t.append(t[-1] + t[-2])
        return t[-2]
    
    print fibonacci(800)


Technically the above is a dynamic programming algorithm, in that you are avoiding the recomputation of fib(1..n-1) in each step. That is equivalent to recursion+memoization, which would perform roughly the same as the above. So it is not that "recursion" is dramatically slower than "iteration" in general; it's just that for these kinds of computations, recursion should be memoized.
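For the curious, here is recursion+memoization next to the iterative version; both do O(n) additions (a sketch; the plain-dict cache is just one way to memoize):

```python
def fib_iter(n):
    # Iterative / bottom-up dynamic programming: O(n) additions.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_memo(n, cache={0: 0, 1: 1}):
    # Top-down recursion with a memo table: each value is computed
    # once and stored, so this is also O(n) rather than the naive
    # exponential-time recursion.
    if n not in cache:
        cache[n] = fib_memo(n - 1) + fib_memo(n - 2)
    return cache[n]

print(fib_iter(30), fib_memo(30))  # 832040 832040
```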


I like how every rebuttal turns into "how fast can you compute a fibonacci number". I was looking for a function that burned a nontrivial amount of CPU, the choice of fibonacci was arbitrary. Let's move on from that.

What I was showing was that if your request handler does a nontrivial amount of CPU work, it will hold up the event loop and kill any "scalability" you think you're getting from Node.

If you Node guys were really that irritated by this, you're going to be super pissed when you learn how computers work.

I ain't even mad.


I would argue that heavily CPU-bound stuff shouldn't be run in a web app. It's much better to offload the task to a worker system via redis/zeromq/kestrel/etc. The majority of activity you see in every web app I've ever designed has been almost exclusively I/O bound.
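The shape of that pattern, sketched with the stdlib's multiprocessing pool standing in for a redis/zeromq-backed worker system (illustrative only; the function names are made up):

```python
from multiprocessing import Pool

def slow_fib(n):
    # Stand-in for a CPU-heavy job you'd never run in a request handler.
    return n if n < 2 else slow_fib(n - 1) + slow_fib(n - 2)

def handle_request(pool, n):
    # The web process only enqueues the job and gets a handle back;
    # the CPU burn happens in a worker process, so the event loop
    # (or request thread) stays free to serve other requests.
    return pool.apply_async(slow_fib, (n,))

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        ticket = handle_request(pool, 25)
        # ...serve other requests here...
        print(ticket.get(timeout=30))  # 75025
```

A real deployment would put the queue in an external broker so workers can live on other machines, but the division of labor is the same.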


Don't you think there's at least some validity to Ted's argument that the statement "Because nothing blocks, less-than-expert programmers are able to develop fast systems." from the node.js homepage is somewhat misleading?

As Ted points out, there are things like Fugue and Nginx, which people who are not "less-than-expert programmers" use; "experts" will be fine whether they've got magical behind-the-scenes async stuff going on or not. The question as I see it is: are the node.js docs/homepage misleading about how easy it is to "develop fast systems"?


That's cool and all but in the real world we gotta get it done.


Funny, because I avoid putting CPU-intensive code in any request handler... in any language. If the calculation takes time, it's better done async.

(and I don't even use node)


Wait, I think I've missed something because that response seems overly dismissive. How is offloading CPU-intensive stuff to a worker system not getting it done?


Because you don't fix everything with another layer of indirection. Adding in some more queuing (which, as we know, never has a problem) just to get around "no threads" seems stupid.


Um... the whole point of asynchronous event-based processing is that you can (ideally) dispense with threads for many use cases.


You will have to do more of it, of course.

In a classic thread-based system, it's okay if a single page takes a bit longer (e.g. >1 sec) to render, as long as it's within your users' expectations.

You can't do that here because you'll block all the others.

But any decent developer knows that, so they take the advantages Node.js offers and fix the disadvantages that come along with it. Big deal.

Is this discussion really only about the scalability tagline? Some taglines are misleading, really?


That comment sort of destroyed all your credibility. Would you care to elaborate where in the "real world" there is ever a situation where a tight, asynchronous event-processing loop is required to do heavy CPU lifting?


But Ted, do you not understand that people use node.js because it's a familiar language with a (relatively) fast runtime?

Nobody has been using it for computing fib, though. Nobody has been putting in things that burn non-trivial amounts of CPU.

Its all about this: http://jlouisramblings.blogspot.com/2011/10/one-major-differ...


The problem is that fibonacci was a bad choice in your example and it proved nothing.

It is fairly easy to scale node with multiple processes, as long as you don't have a long-running operation (such as fibonacci). If you do have tasks like that, process them outside node and check for completion, like how Tasks work in Google App Engine.

Also, most other web stacks will discourage you from running a 30s fib on a thread processing web requests. This isn't specific to node.

Node and coffeescript have worked really well for us. Product coming out later this month.

[EDIT: Just noticed that several other people pointed out the same thing. Looks like most node users are aware of potential problems, but I can see such issues being confusing for new users.]


> Also, most other web stacks will discourage you from running a 30s fib on a thread processing web requests. This isn't specific to node.

Difference being, with other stacks a request running for 30s will have little impact on the rest of the machine. With node, the whole server gets stuck, not just that precise request and the machine resources necessary to perform the computation (or whatever).

The fib example is extreme, but it's rooted in a real issue with cooperative multitasking: code does not always behave correctly and is not always perfect. You might have used a quadratic algorithm that ran in 10ms on 10 items or so, but in production a user gets it to run on a hundred or a thousand items, and now other users are severely affected, in that their requests are completely frozen while the computation is going on. There are hundreds of other possibilities -- small inefficiencies, shortcuts, plain bugs, etc. -- which are basically going to break your node application.


Only if you're crazy enough to put something in production running a single node instance.


Even if you're not "crazy enough" to do what's prescribed, every user routed to the locked node instance will still be locked. You're just reducing the surface area of the freeze.


Presumably not many requests will be routed to the locked instance - that's why it's called load "balancing".

And you shouldn't have any long-running computation on your server process anyway.


I think this happens fairly often.


It is fairly easy to scale node with multiple processes. As long as you don't have a long running (such as fibonacci) operation.

- - - -

Which is like the WHOLE POINT of the original article...


If you're using node and doing work other than serving simple pages, then you're probably sending that work off. Most serious production Node users know not to do heavy lifting within the system. I really don't know where anyone from the Node community has ever recommended stuffing their single-threaded, V8-backed event processor with CPU-heavy tasks.

I still don't know why people are so maximalist with their tools. NodeJS is a tool. It works well with your _other_ tools.

There - is - no - perfect - tool, no perfect programmer, not even perfect intent.


> Most serious production Node users know not to do heavy lifting within the system.

So Node is reduced to the trivial work? Then why make it unnecessarily hard on yourself?

> I still don't know why people are so maximist with their tools.

Because moving parts = risk.


It is up to you to figure things out based on the possible worst-case runtime scenario coupled with your expected usage on _your_ hardware. You choose what work counts as 'trivial' (based on your resources). Trivial is always moving, and dependent on the scenario at hand. My trivial is not your trivial.

If the 'work' is too much, you move it to another process: either another NodeJS process or some agnostic queue-based managed worker. That worker could be anything.

OR you decide to use another tool.


One Node process is reduced to doing the trivial work, like you'd have in any other system.

Nothing is preventing the other processes from also being Node.


True, and in fact this is how nginx works: the process that owns the event loop is separate from the process(es) that do the work.

This is a decent way of mixing an event loop and multi-core processing, but with Node, you're forced to marry the HTTP server to the application, which is a dangerously tight coupling of responsibilities.

If you really want to do something silly like write your application in server-side JS because you're familiar with it, then it should be through some interface like WSGI in Python (or even CGI in days of yore), which properly separates HTTP connection handling from application serving.


There's nothing keeping you from running a separate web server and then proxying requests to Node using HTTP. Why is it a problem that you do this using HTTP? What alternative protocol would be so much better? FastCGI isn't really that different (each have their pros and cons) and CGI has obvious problems...

WSGI applications can use a built-in HTTP server too (or FastCGI or ...). Node has an internal interface (similar to WSGI) for handing over requests to the web application and so on. That's not fundamentally different from WSGI. The main difference is that WSGI is a standard API, so there are several "WSGI servers" implementing the same thing (each of them similar to Node in some sense).


You're not "forced to marry" the HTTP server to the application, you can keep them as separated as you want.


I'm not sure if many people are choosing Node just because they're familiar with JavaScript. Most people would probably rather be less familiar with JavaScript. More likely they're choosing it because the runtime is very fast and a lot of libraries are packaged for it.

The WSGI/CGI bit is in fact mentioned in the article. :)


Node is an asynchronous programming framework bundled with a largely async library. If you have 40 cores on your system, you would presumably run 40 instances of node for a CPU-intensive webserver (using multinode etc.). So the event handler won't get stuck as long as there are available cores.

What about a single-core system? Well, I guess a threaded/multi-process solution would time-slice the fibonacci requests between two threads so that both requests are served in 10 seconds, instead of one request in 5 seconds and the next one in 10 seconds as with the node.js solution. That does not sound much better.

If you have done any kind of systems programming, you will know that the availability of asynchronous I/O is a life saver, and can simplify your locking model greatly. 90% of the issues you face when building such systems is that some module deep inside grabbed a lock and issued a blocking I/O request, and now the rest of the system is bottlenecked behind it. Node.js is basically trying to eliminate the possibility of the existence of such a module.

This complicates the issue of I/O calls, but simplifies locking in the sense that you don't really need all those locks in your system. In node.js, of course, there are no locks. The complexity moves from reasoning about locks to reasoning about correctly handling I/O calls and responses. IMO, this is the correct place to move the complexity to, because locks are simply an abstraction the programmer built. When debugging the system, we end up dealing with "How do I get rid of this monolithic lock?", when the real problem is "This I/O is taking too long; we shouldn't be blocking on it". An async programming framework tackles this problem head on.

If you use Python/Perl, you will never really know the number of instances of the process to run. Too many, and you time-slice requests, slow down all of them, and grow your queueing buffers instead of just dropping the extra requests. Too few, and you start dropping requests that you could have served. With a framework like node.js, the number of instances you want is equal to the number of cores on the server.

Of course, node.js can be an inappropriate solution for a wide variety of reasons, but I could not find anything really relevant regarding that in your post. Alex Payne discusses some issues here; you may want to read it: http://al3x.net/2010/07/27/node.html


The point of the original article is that there's no point in avoiding blocking on IO while allowing blocking on CPU.

Furthermore, since node.js is single-threaded, what's wrong with blocking on I/O in that single thread? The process hangs, is put to sleep by the OS, and wakes when there's I/O available. You gain a simpler programming model than using callbacks/continuations.


Most web applications are I/O bound, not CPU bound, as they spend most of their time talking to other systems across the network, like your db, queue, or some REST service. The cost of spinning up threads is memory and context switches by the OS. Async is just a way to avoid this for I/O operations. Node.js is just an incredibly convenient way of doing this, as there are fewer ways of shooting yourself in the foot than if you apply the async patterns on other tech platforms.

CPU bound is still CPU bound on any tech platform and needs another strategy to deal with it. I think it's better to look at node.js as a tool in your toolbox that is well adapted to running high-load web services and apps that are I/O bound, instead of a swiss army knife that does everything great.

But as with any toolbox, you need more tools in the box to be a good carpenter. That includes picking tech that can do the CPU-bound stuff in an appropriate manner, be it Java, Scala, C++, Erlang, whatever. You don't write a graph database in node.js unless you are masochistic, just as you hopefully don't write an async server in assembly.

From my usage perspective node.js is great for my async needs and the low resource usage means I need less hardware to scale my typical web app.


True, but wouldn't just making IO async be enough?


Yes, it would, and in fact most language platforms have async I/O. It's just harder to keep yourself from introducing blocking calls in your event loop, as they come with mostly blocking libs :)


Well, there is plenty of point avoiding blocking on IO if blocking on IO is your bottleneck, and blocking on CPU is not. For some reason, a lot of people have focused on that case for many years.


How is running 40 instances of node processes different from running 40 instances of a single threaded web server? How is running 40 single-threaded event loops better than running 10 processes with 4 threads each? If you had a choice, would you not rather have lightweight processes (threads) than actual processes, because of lighter memory requirements? Threads are too hard to program with? Try STM. I'm not buying the notion that node's event model has any advantage over anything whatsoever. Try Erlang if you want a properly engineered, event-driven development environment.


> How is running 40 instances of node processes different from running 40 instances of a single threaded web server?

40 instances of a single-threaded webserver can block on I/O. If your webserver is 50% CPU bound, this means your CPU utilization is lower than in the case of a perfectly async system, and you will serve fewer requests per second than an async framework. This is where the rule of thumb "no. of threads = 2x no. of cores" originates. Of course, this rule won't work well for heavily I/O-bound servers with high-latency I/O. With node.js, the latency and I/O percentage won't matter.
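A back-of-the-envelope version of that rule of thumb (hypothetical numbers; real sizing needs measurement under load):

```python
def suggested_threads(cores, cpu_fraction):
    # If each request spends cpu_fraction of its wall time on the CPU,
    # you need roughly cores / cpu_fraction threads to keep every core
    # busy while the remaining threads sit in blocking I/O.
    return round(cores / cpu_fraction)

print(suggested_threads(4, 0.5))   # 8  -> the "2x cores" rule at 50% CPU
print(suggested_threads(4, 0.05))  # 80 -> heavily I/O-bound needs far more
```

Note how the required thread count blows up as the workload gets more I/O-bound, which is exactly why the rule breaks down for high-latency I/O.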

> If you had a choice would you not rather have light-weight processes (threads) rather than actual processes because of lighter memory requirements?

It would be great to have a multithreaded async framework. However, a multithreaded environment eventually ends up introducing several blocking I/O functions, which Dahl wanted to avoid. Hence the choice of Javascript.

node.js is one of a 100 possible solutions. Nobody insists that you use it. In fact I haven't even written a single line of node code. However I have done enough systems work to know the benefits of async programming.

> Threads are too hard to program to? Try STM?

STM can only handle scenarios that do not involve I/O. One of my colleagues was in the group at Microsoft that tried STM with I/O, and it fell flat on its face. Sure, there are plenty of approaches -- threads, actors, STM. Async programming is one such approach. If you want to write an async web server, right now node.js is the only solution. I think it might be possible to do a pure async web server in Haskell, as any IO gets captured in the type signature, but I don't know of any async Haskell webserver framework.

> I'm not buying the notion that node's event model has any advantage over anything whatsoever.

You are basically asserting that async programming has no advantage over any other approach whatsoever. Having dug into hard disk device drivers, filesystems and caching for the Windows CE kernel, I would have killed to have a proper async I/O framework in CE from the ground up. We had a gazillion locks in the kernel modules, for a gazillion data structures, when all we really wanted to do was perform I/O without grabbing a lock. The Linux epoll, BSD kqueue and Windows I/O completion ports are all async APIs added for high-performance systems. These APIs are, strictly speaking, not required if you have threads, but when you get into sufficiently advanced systems programming you cannot live without them. Trying to say that async programming is useless is equivalent to claiming that APIs such as epoll/kqueues are useless.


@zohebv there are several errors in deduction here:

> a multithreaded environment eventually ends up introducing several blocking I/O functions

Really? Multi-threading and non-blocking I/O are two separate tools, and one doesn't have to choose one of them exclusively. They can be intertwined: boost::asio allows you to run a single async event processor on as many threads as you want, all without doing any explicit locking. If you're averse to locking (which, based on your experience, seems to be the case), you're taking things too far by avoiding threads completely.

> You are basically asserting that async programming has no advantage over any other approach whatsoever

I don't think anyone's asserting that. The original argument made was: if all you have in one process is a tight loop dispatching non-blocking I/O handlers, then you can't handle computationally intensive tasks. Of course, you can spawn 40 other processes -- but I don't like a model where that's your /only/ option. There are middle grounds, and any system that discounts them is short-sighted.

> Trying to say that async programming is useless

yet again. i don't see that being said anywhere.

> Node is an asynchronous programming framework bundled with a largely async library ... If you have done any kind of systems programming, you would know that availability of asynchronous I/O is a life saver

Now, if you'd indulge me with my own escapades in logic and word play:

Icing is sugar whipped in butter. If you have eaten any dessert, you'd know that sugar is nice and tastes very sweet. Thus, conclusively, irrevocably, icing is good and we shall eat nothing else.


> @zohebv there are several errors in deduction here:

@ajd I would disagree with this statement very easily and very confidently.

> really

yes

> multi-threading and non blocking io are two separate tools

asio is a way of minimizing the number of threads in a system and extracting maximum performance.

> i don't think anyone's asserting that

It follows by logical deduction. There are only 3 possible criticisms of node.js:

1. I need shared memory

2. I dislike Javascript

3. I dislike asio

Given that the cancer post doesn't make a big deal of 1 and 2, you are left with 3. And if you read his follow-up post, he is again complaining about "event-loop" programming. Yes, he is indeed complaining about asio; asio necessarily introduces event loops.

There is only one reason to use multiple threads over multiple pre-started processes: you need the shared memory. You should still be able to pull off shared memory in node, but it can be cumbersome. I have yet to see anyone complain about needing shared memory and hence disliking nodejs. It's either:

1. "threads are robust for cpu bound workloads"

- which can be solved by simply launching as many nodejs processes as Python processes, though if you are not interested in winning this argument you only need to launch as many nodejs instances as there are cores.

2. "Don't force this programming model on me"

It is not forcing any programming model on you, except that shared memory is harder. And yes, you cannot acquire a lock and go off and do io. If you want to do that, you can, by piling up pending events in a queue, but the ugliness of the code will stick out easily. You might as well switch to Python/Java threads or whatever.

> of course, you can spawn 40 other process - but i don't like a model where that's your /only/ option

Can you describe the other options you want to try?

Check this out: http://teddziuba.com/2011/10/straight-talk-on-event-loops.ht... He has broken down his own defense with his fancifully named "Theorem 2". If you do more IO than CPU, then "use more threads". Except he doesn't give you the number of threads, because he doesn't know how many. In fact, he cannot know. And this is why asio is a win.

Fact is, on an n-core system, if you launch n nodejs processes you are guaranteed that one of these 4 holds true:

1. You service all requests thrown at the system

2. You maximize CPU utilization

3. You maximize I/O utilization

4. You maximize RAM utilization (4 is somewhat pedantic)

i.e. it will extract the maximum possible performance from the hardware you throw at it. It is just a consequence of making sure all io is non blocking. With a threaded solution you will never get the number of threads right and you will end up with a server that cannot serve all requests even when it has spare CPU, spare I/O capacity and spare RAM. Maximizing CPU utilization implicitly assumes the absence of locks. If you use asio as well as locks CPU utilization will not be maximized. This is something that most of the "nodejs" critics don't understand or fail to appreciate.

Yes, shared memory is harder to do in node, but Erlang doesn't do shared memory, Python cannot do threads sensibly, and Java cannot do coroutines; why dump hate on nodejs because it isn't an ideal environment for a solution that requires shared memory? Sure, there are many problems that require shared memory. However, 95% of webservers don't fall into this category.

Lastly, ted's inexperience really shows here; he is only 27 years old and has a lot left to learn. To start with, he can stop trying to school Ryan Dahl, who certainly knows his Computer Science and is making a valuable contribution to the community.

Lastly, as a practical exercise, try to build (at least as a thought experiment) some web service that outperforms node using respected platforms such as Python and Ruby. asio is a technique for IO-bound loads, but node will do better than Ruby/Python even on CPU-bound loads thanks to V8: an order of magnitude faster. I have said that you need n processes for n cores, but in practice just one process turns out to be enough, as loads tend to be io bound and V8 is typically 10 times faster than Ruby. And if you have coroutines, event-based programming is not hard at all. So yes, people have discovered that one nodejs process has replaced their two dozen Ruby processes and is serving twice as many requests from the same box, and they are impressed. And they don't care what Ted thinks.

As a footnote, of course, you can do shared memory within a single nodejs process(trivially true), but for pure CPU workloads Java/C++ would probably be a better option.


>>> a multithreaded environment eventually ends up introducing several blocking I/O functions

>> really?

> yes

ok.

> Can you describe the other options you want to try?

http://www.boost.org/doc/libs/1_47_0/doc/html/boost_asio/exa.... See HTTP Server 1 to 4.

my issue with node is independent of the programming language. i came here neither to bury ted nor to praise him. i don't care what he has written now. i found his original article funny and thought that it had some truth in it. all the other "logical deductions" you're making require axioms that you have and i don't. i don't know this ryan guy either, but if he's made a system so many people (including yourself, whom i admire sincerely) think so strongly about, he must be an all-around great guy. good for him.

as for me and this discussion, if node=asio and threads=avoid in your world, so be it, i'm ok with that :)

> one nodejs process has replaced their two dozen Ruby processes and is serving out twice as many requests from the same box and they are impressed

that says a lot about Ruby (with which i have absolutely no experience)


As it so happens, I am now implementing a load balancing solution on the JVM using the hybrid technique you recommend: lots of threads, with some asio coming soon. However, asio support on the JVM is poor. You are correct that using asio does not preclude you from using threads and vice versa, while node.js does not let you use threads. Your argument is valid, but this is very different from the argument ted is making, viz. that a CPU-bound task will make the server useless. This is demonstrably false, and it ignores the fact that typical web workloads are io bound.

The C++ solution you posted is probably a hybrid solution as you described, but it has its own set of restrictions. The environment does not guarantee runtime safety, functional programming support is poor, and coroutine support is non-existent or weak (I noticed there is an unfinished Boost.Coroutine library that must depend heavily on #include <functional>). While node shuts out threads, it enables other solutions more suitable for a heavily I/O-bound server. I only want to draw attention to the fact that almost all solutions available today involve some kind of compromise, and hybrid solutions are possible in almost all of them.

As for the "deductions", I thought they were obvious. You need threads rather than processes so that you can share memory and avoid the serialization/deserialization complexity of communicating between processes. Or, less often, you need a more elaborate synchronization mechanism such as reader/writer locks. As for asio extracting maximum performance, I think I will be better off writing a blog post about it.


> If you use Python/Perl you will never really know the number of instances of the process to run

Eh? You could do async IO in both Perl & Python (and C and Erlang ...) years before node came along.


ted you're so cool be my friend pleaseeeee


how dare you?

and why are you at the top?

really disappointed


I'd really like to know why people like to joke about cancer and downvote people who might be hurt by these stupid titles.


dboza is pretty awesome; beck didn't really address the underlying issue. If you have any real computation, node.js is not your solution. I figured this out about event-driven IO back when I was in school 15 years ago.



