Hacker News new | past | comments | ask | show | jobs | submit login

It's actually sets that are underrated. All languages should come with a reference implementation.

Hashtables aka Maps aka Dictionaries aka Associative arrays are just fine.




Though once you have hash/maps/dicts set are with reach by making sets from hashes were keys are the element and values are just true or 1 or something like that.

But I think you probably meant having and using set operations effectively in day to day tasks, as in "make 2 sets and do a set different operation" instead of "do a for loop on first hash check if it is in the second, then put results in an accumulator.

Another thing is to think about set of sets. Can that be useful sometimes? Implementing that is slightly trickier. You'd need to be able to get a hash of a set. Python has frozenset https://docs.python.org/3/library/stdtypes.html#frozenset. I've used those on occasion.

Then of course there is Erlang sofs (sets of sets) module. Stumbled on it by accident. Oh my, it comes complete with an introduction to set theory and relational algebra:

http://erlang.org/doc/man/sofs.html

It just struck me as so out of place with the rest of the standard library modules. Would like to know its history


Yes, while a set can be derived or mimicked if you have a hashtable for example, I'm specifically referring to all the common set operations you'd typically want to have.

Using a map/hashtable as a set is just the tip of the iceberg.

In many cases you don't need a key and a value and you can get rid of a lot of loopy and complicated code if you had a simple set to utilize with its useful operations.


>"Though once you have hash/maps/dicts set are with reach by making sets from hashes were keys are the element and values are just true or 1 or something like that."

Can you elaborate on how you can derive a set from hashes by using values of True or 1? Might you have a link? Thanks.


> Can you elaborate on how you can derive a set from hashes by using values of True or 1? Might you have a link? Thanks.

Sure. Np!

I mean that a simplified set is just a hash where the elements of the set are keys of the hash table and the values can be anything. I used 1 or True as example.

As in adding an element would be:

   my_dict[element] = 1
Then membership check is:

   if element in my_dict
Then removal is deleting:

   del my_dict[element]
and so on.

In other words, the reason sets are sometimes not explicitly there is because they are easy to implement on top of existing data structures.

Basic operations like union, difference, intersection between two sets can be done with a few simple for loops.

But like I mentioned in other comment, there is one interesting aspect to set (and hashes) in that the element now have to be hash-able. That kind of depends how mutability and identity works in the particular language.


Thanks for the clear explanation, this makes total sense. Cheers.


Ruby's standard library does this. Here's the code: https://github.com/ruby/ruby/blob/trunk/lib/set.rb


Python essentially does this as well.

The idea is that an item is in the set if it's in the hash table. You can add, remove, or test for membership in constant average time.

Set operations still need to be built on top of this.


I would argue that Hashtables are a specific implementation of the abstract datatype "map" or "set".


Sets are just like hashes where the value is always "true" for each key.


While that's true you could implement one that way, it's very nice to have the set operations implemented. Intersection, union, subtraction, etc. And having a uniform set type makes API signatures more consistent.

Plus you may be able to optimize Set<T> to use less space than Map<T, bool>.


> While that's true you could implement one that way, it's very nice to have the set operations implemented

Unrelated, but does anyone know why the new JavaScript set implementation is so limited? Why didn't they bother doing this right?


I think I remember reading a claim that they pushed for a small API surface in order to make sure it got through. Now that it's part of the language, anyone else can work on getting those extra features in.

You can implement most basic functionality easily enough, MDN even has an example [0]. Although I agree that it should really be part of the language.

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...


Because it is JavaScript, and there's some kind of unspoken rule about not doing things properly and instead releasing broken things.


This exactly, I hear over and over: just use a hashtable...totally misses the point.


Does it really matter what the 'value' is? A set is surely implementable as a supertype of a hash, where the value is totally arbitrary; it only matters that the entry exists.

With:

    {'A': true, 'B': false}
you seem to be suggesting `B` is 'not in the set'. What's `C`?


Not in the set. You're describing the same thing as the parent comment, but you're saying "arbitrary value" which they substituted for "True"


They're not really the same thing then. In OJFord's implementation, there's a one-to-one correspondence between states: the element is a member of the set iff it exists as a key in the hash table.

In the true/false implementation, two distinct hash table states (false or key not set) map to one set state (not a member). The program must check whether the key is set, and then check its value.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: