Hacker News new | past | comments | ask | show | jobs | submit login
Tuple Space (2014) (c2.com)
108 points by paulgb on April 28, 2023 | hide | past | favorite | 53 comments



Tuple spaces implementations and their efficiency - 2016

https://arxiv.org/pdf/1612.02979.pdf

> SUMMARY

> Among the paradigms for parallel and distributed computing, the one popularized with Linda, and based on tuple spaces, is one of the least used, despite the fact of being intuitive, easy to understand and to use. A tuple space is a repository, where processes can add, withdraw or read tuples by means of atomic operations. Tuples may contain different values, and processes can inspect their content via pattern matching. The lack of a reference implementation for this paradigm has prevented its widespread. In this paper, first we perform an extensive analysis of a number of actual implementations of the tuple space paradigm and summarise their main features. Then, we select four such implementations and compare their performances on four different case studies that aim at stressing different aspects of computing such as communication, data manipulation, and cpu usage. After reasoning on strengths and weaknesses of the four implementations, we conclude with some recommendations for future work towards building an effective implementation of the tuple space paradigm.


For a long time I toyed with the idea of implementing a tuple space as a pub/sub mechanism for system alerts.

I tried to get Java Jini, specifically JavaSpaces which was inspired by Linda, up and running a long time ago as a hobby project, but it seemed like it had been long-neglected even then and my limited knowledge of the JVM world left me stranded.


I implemented something like this! The Dr. Lojekyll [1] datalog system was built around the idea of inputs as receiving messages over time, and publishing outputs (also as messages) or responding to queries (using materialized views). It worked quite well, but it's differential nature ended up being the downfall for the usecase we had in mind.

Specifically, it had no way to "extend" its consistency model out to consumers of its messages, and so if consumers were also producers, then this distributed system as a whole could enter into a state where parts of it are lying to itself!

Otherwise, there were some really fun things you could do with it. For example, we could have per-client databases that would bring themselves up-to-date given the differential outputs of the server database. This would let you engage in a kind of manual sharding of data.

[1] https://www.petergoodman.me/docs/dr-lojekyll.pdf


One of the inventors of the original Tuple Spaces idea was David Gelernter. He was actually targeted by the Unabomber, and survived but was heavily hurt (https://www.nytimes.com/1995/05/21/magazine/the-unabomber-an...). David was a brilliant researcher, with lots of great ideas including Mirror Worlds (https://en.wikipedia.org/wiki/Mirror_Worlds). I sometimes wonder if what the Unabomber did to him in early 90s slowed David down, and made it harder for David's idea to get out there.


Mirror Worlds (the book) is a great read. I have it on my shelf next to his other book, Machine Beauty. I didn't know he was a victim of the Unabomber, that's unfortunate.


"This site uses features not available in older browsers."

It is difficult to express the degree of my frustration with C2 being rewritten from a gunmetal HTML page that worked everywhere into an inaccessible heavyweight SPA mess with a frankly bizarre UI.


C2 is such a good resource. It's a shame a simple, plain readable and most accessible website has been turned into a bloat.


Is C2 doing some sort of A-B testing? Because when I go to the link I see the exact same interface that I have seen there for the past quarter century or so!


Click on any of the links and instead of taking you to a new page it's a pop-up page on top of the existing one (plain left click, not middle-click or whatever you have registered to "open in new tab"). Happened a few years back, very annoying. Performance has improved but I remember a few years ago getting a noticeable delay on some pages.

If you don't follow any links (or you open them in new tabs) it does pretty much look like it always has.


It does look like just another bloated SPA, but Federated Wiki was a geniune advance when Ward was developing it a decade ago. Still is. Shame it hasn't caught on.

I agree it was a mistake to move the existing c2 content into it. Maybe a better try was to start a new wiki of some kind. But the temptation to leverage the existing site to launch Federated Wiki is understandable.


No offense, but how is it an advance? Who needs a wiki that can't be edited? (If I'm mistaken about this, maybe someone can educate me)


If you know git: a federated wiki works like that. To make a federated wiki work like a traditional wiki, you can think of there as being a "canonical copy" which you can clone, make edits, and then use a 'pull request' process for incorporating those edits back into the original. But you also don't need to have a single source of truth: you can, for example, have a group of people—say, students studying for a class—who are building their own wikis, and among each other they can copy in pages, make changes, copy changes back, and so forth, all building a web of things. In the same way that git can replicate a subversion-like workflow but also introduces the possibility of different workflows, a federated wiki can replicate a traditional wiki but also has a number of workflows it can accomplish.


Thanks! I'm aware of the abstract technical idea, but is there any evidence that Ward's wiki ever worked like that? Did he ever accept a "pull request" or are there any notable forks which do?


The federated wiki concept is that you run your own wiki, and replicate, or "fork" content from other federated wikis to yours.


I don't get the qualm.

It works like the Wikipedia mouse-over popups. (Besides that you need to click.)

That's actually quite convenient.

The only real issue is that it does strange things to the browser back button. (Seems like a bug to not pop stuff form the history stack when closing the "popups".)


Let them know.


I did (last year). Ward Cunningham basically told me to GTFO (not quite so directly, but that's the impression I got from him).


That's interesting. I just looked again and I had exactly the same problem for this same site and raised it a couple of years ago as well. The same guy, Ward Cunningham, sent me the text of the page but was otherwise not receptive to my plaint (but wasn't rude). I guess if enough people (more than two anyway) start complaining maybe he'll listen.

I do share your frustration with clever shit over basic usability. Flipside is, I guess, Cunningham isn't your average guy. But I think lesser mortals are still allowed to disagree.

Anyway, thanks for actually complaining instead of just moaning and doing nothing like so many others on the web.


Another interesting technology that was inspired by tuple spaces is DDS which tries to be a distributed tuple space for communication between embedded/field devices.

https://en.wikipedia.org/wiki/Data_Distribution_Service


I hadn't heard of DDS until an hour ago, when I did some preliminary research into GM's just open sourced Eclipse uProtocol, their car IoT system built on Protobuf & CloudEvents.

Among some usual suspect transports (http2, mqtt, android binder), they listed DDS.

https://news.ycombinator.com/item?id=35742650


It's also used in ROS the robot operating system.


what makes it a "tuple space" vs a "message queue" system?


tuples don't have an implicit ordering and are pattern matched on contents instead of 'head'


I remember when tuple spaces were the new hotness and going to a presentation from GigaSpaces at my local JUG.

And now I can't even remember the problem they solve or any potential use cases. Weird.


Primary problem: Distribute work on heterogenous data across a cluster of heterogenous processes. Add new processes (either additional workers or additional capabilities) by attaching them to the existing shared tuple space. Add new data by wrapping it into a tuple and tossing it into the pile. Add a processor for the new data by having it search for a pattern that picks up the new tuple.

Similar concept behind message queue based systems, service oriented architecture, and many, many others. Though more pull-based than push-based. You wouldn't explicitly launch a process, you have a process that sits around waiting to for a tuple to show up and then takes/copies it and does the work.


> Add new data by wrapping it into a tuple and tossing it into the pile

In David Gelernter's original work, it's not just data. In Linda, it's possible to put Objects in, which are little executable (`eval` in Linda terms) which a processor can manipulate through whatever interface the tuple type exposes.


That does not sound very secure.

Also, what's to stop another processor "finding" the Object before the intended recipient?


1. Just bare like that, it's not. Unless, of course, the object requires a key to allow a processor to execute it. See for example Capability-Based Access Control

2. In TupleWorld, there's no concept of "the" intended recipient. The tuple's properties determine what kind of processor would match. If those properties happen to only match a single processor, you'd get that 1:1 mapping you want, but that's missing the point of tuple spaces.


Just coordinating the 1:1 mapping I might want is fraught with challenges. Network delays, out-of-order execution, stampeding herd situations, poison objects, processors crashing (or being rebooted) at random times... I'm not seeing the benefit of a "bag" of objects to be processed whenever by whoever or whatever.

Perhaps this is attempting to solve a problem I've not come across. While it sounds interesting, I'm not sure I see the benefit? (Probably me just being short-sighted).


It's a way of doing massively parallel computations in a scatter/gather system. I suggest looking up the paper "Linda in Context. I'd provide a link but I'm on my phone.


Here you go (not on mobile):

https://dl.acm.org/doi/10.1145/63334.63337 (free access)


Also worth reading are the "Linda Letters" from Communications of the ACM, October 1989, vol 32, no 10, p 1244. I found an excerpt at https://web.archive.org/web/20160901000000*/https://www.cyph..., starting on page 4 of the PDF


Also, on "poison objects". We execute Javascript all the time in our browsers. Sandboxing and other measures keep the executable objects from exploiting our computers. It's imperfect, of course, and there are still exploits, but that's the general approach. Things like out-of-order execution and so forth require careful parallel design, but that's pretty much the bread-and-butter of things like SETI@Home


Somehow this model works to build complex organisms in embryology...


Still have a problem with viruses, though...


This exactly. Sadly the idea went through a low pass filter and just became a way to pass around dumb data objects.


It's not unlike the Blackboard system pattern (https://en.wikipedia.org/wiki/Blackboard_system#Metaphor) in that respect, but Gelertner had something more ambitious in mind.


Honestly, I suspect the other, fancier parts of the idea were more like a solution in search of a problem.


The way I think about it is as the ultimate decoupling between producer and consumer. Bundling producer and consumer together into one unit spatially couples them. But even if the producer runs as a separate service that waits for requests from the consumer the interaction is still temporally coupled.

By having the producer gradually update the responses to all possible queries to a tuple space, completely on its own schedule, there is not even temporal coupling between it and the consumer.


DynamicLand (Bret Victor's 50yr research project) uses a tuplespace model, except instead of basic pattern matching on content and type it uses a datalog-like query language (the wishes and claims of RealTalk).


"that is, TupleSpace won't work on the Internet, precisely because of its underlying "distributed shared memory" idea". anyone know why?


Given the bare-bones API and semantics, probably because it describes absolutely no fault tolerance at all.

Lose a processor or a message and it's just lost. Try to address that with copy-put-take and you now have duplicates. Try to address that and you now have byzantine failures with no resolution or supporting tooling. So it's wonderfully simple until you need to do something outside of a single process and then it's useless.

Internet-friendly stuff generally has a hard requirement on being fault tolerant.


Some further notes are here: http://wiki.c2.com/?TupleSpaceScalability


near the beginning of the Scientific Computing Associates work (?) they implemented an extended model that supported multiple spaces - to scope the number of machines involved in a given update.


I always thought Tuple Spaces were a super cool idea and am a bit bummed out they never took off. JavaSpaces was based on the idea but had a somewhat limited release.


having ported Yale's Gelernter and Carriero (sp?) Tuple Spaces to Alliant FX-2800 in 1988, so familiar for a long time, you just gave rise to a question that I can not answer.

Is NoSQL close enough, so MongoDB etc displaced tuple spaces?


There's similar ideas almost in things like Apache Ignite, Hazelcast, Datomic.


[flagged]


Heh, nice ChatGPT reply... it's more eloquent than what I was going to reply with, and doesn't start with "Grad school was a long time ago, but from what I recall..."

> To mitigate this, tuple spaces may use locking or transactional mechanisms to ensure consistency and prevent conflicts.

I think I played with some kind of TupleSpace implementation around 2009 or so (as part of a "paradigms of distributed systems" course) and I'm pretty sure it was some kind of hybrid STM (software-transactional memory) / TupleSpace thing that addressed that issue. It was pretty cool... again from vague memories here, I believe it only had tuple-level locking, but you could build a bunch of really neat higher-order synchronization primitives on top of just tuple-level locks.


Your comment was flagged, I assume for the ChatGPT dump, but that seemed unreasonable to me so I vouched for it.


Thank you :) Upon reflection perhaps this is fair enough, all Hacker News discussions might descend into ~50% ChatGPT content. I'll endeavour to put what ChatGPT produces through my messy human filter. Humans before bots? Guys before AIs. Ho(mo sapien)s before b(r)o ... doesn't quite work.


[flagged]


While I agree that the current rendition is unfortunate, it's a bit amusing hearing someone say that the literal inventor of the wiki is a twisted ego that should be completely ignored.


I stand by my statement. Terrible people can build good things.


Both of your comments in this thread broke the site guidelines badly. Can you please review https://news.ycombinator.com/newsguidelines.html and use the site as intended?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: