Hacker News new | past | comments | ask | show | jobs | submit login
Interesting but lesser known data structures (stackoverflow.com)
161 points by mak120 on March 24, 2011 | hide | past | favorite | 23 comments



I feel like there isn't a good excuse for programmers to not know about a lot of these structures. Yes, some of the examples are somewhat unknown (I didn't know about the 2008 paper on "Left-Leaning Red-Black Trees" for example), but many of these are something that would be at least mentioned in basic college algorithms or data-structures courses. (I have a feeling the voting is skewing my perception, and people are voting up what they've heard of and think are lesser known, and ignoring what they don't recognize at all).

I don't have a CS degree, nor am I solely a programmer, but I make sure I'm aware of the underlying fundamentals of my profession. Even without taking courses, one could easily take some time to read through the related wikipedia articles - http://en.wikipedia.org/wiki/List_of_data_structures

P.S. I was trying to think up a good analogy to another technical trade, but I can't think of one where you can be completely unqualified, and work without licensing or certification.


I think perhaps you are looking at basic college data structures courses through rose-tinted glasses.

A typical first course in data structures is where novice students learn about everyday tools like hash tables and balanced binary trees. It probably mentions a few slightly more amibitous things, but not many.

I would hope a second course at least introduced topics like purely functional and lock-free data structures and their applications, but obviously you're not going to get Okasaki and everything that's developed since at that level.

Most people doing a straightforward CS/SE degree aren't going to do a lot more than two general data structures courses, and maybe encounter a few of the more specialist tools in applied courses on topics like 3D graphics or databases/information storage.


IMO, 99.9% of the time you don't need the more obscure data structures. And I speak from experience working in companies on software requiring high performance (runs taking 6 hours to two days are not uncommon).

That said, I was disappointed nobody had added the corner stitched data structure - I found that to be a very intriguing idea (and in my world, it turned out to be very useful).


> IMO, 99.9% of the time you don't need the more obscure data structures

Absolutely, and familiarity with these data structures is what's required to prevent these mistakes. I think a good understanding of these is more likely to prevent one from using them, or in some cases, "inventing" them.


huge point. Nothing like seeing someone hand-implement (poorly) a balanced tree.. and usually be very proud of it.


That's a fair point.

Though another way to avoid making the mistakes is to not invent data structures. That's what libraries are for, and anyone not using a library data structure should be able to defend why they're not. Odds are, they've got no chance of writing something better than what the libraries provide (or they'd be a library writer and already know these things).


I've heard someone joke that "real world" software uses just two data structures: arrays and hash tables.

1. An array can implement a stack or queue and is often cache-friendlier than a linked list of nodes scattered throughout the heap.

2. A hash table can implement a map or set for fast lookups of unordered data.


On the absence of qualifications, licensing and certification for programmers--I think it doesnt exist because its not yet an effective indicator.

People who can pass the tests can easily add no value (or negative value) to a project

People who've never bothered with the tests can quickly build great things, rescue projects, and fix the bad code/designs/patterns of test-passers.

If certifications/licensing/qualifications becomes and indicator of value in the future, I'm sure it will be adopted in the market then.


I agree. I was thinking about trying to compare it to a profession like structural engineer, pilot, doctor, etc., but each of those has rigorous training and certification.


Those professions are characterized by the value of replicating previous work. The "problem space" involved in removing an appendix or flying a Cessna on a specific route is just much, much smaller and simpler than the problem spaces software developers are expected to navigate. When you have larger problem spaces -- Boston's "Big Dig", experimental surgery, etc., that certification and rigor becomes less relevant, and the results less predictable, just like in software.


Previously on HN: http://news.ycombinator.com/item?id=1370847

Certainly worth re-thinking nonetheless.


Also: binary decision diagrams (BDDs), Prolog's difference lists, and the persistent variants of common data structures. I like skip lists, too, though they seem to be fairly well-known among programmers who don't just naively use linked lists and generic hash tables for everything.


BDDs are already on there (I'd never heard of them before Knuth's fasicle, but I guess he's sort of popularized them), as are a lot of persistent variants.


Noob question: I hacked together my little mobile website and am looking to cut down on search time for a "similar search" I built.

Short Description: Each "unit" has x attributes in the form of strings (tags). Searching queries a database using a standard mysql regexp search for each tag across all "units" and sorts responses by # of tags matching Desc.

I know this isn't StackOverflow so this isn't really a tech question. What I want to know is if the problem is just that I'm a noob (just learned to code/do web development in the past few months) and need to learn more about data structures or if that's just the nature of php/mysql regexp queries (if its a database structure issue, let's not even get into it lol)


You probably got downvoted for being off-topic - it'd be better to ask that kind of question on Stack Overflow. Leave out the subjective "is it because I'm a noob?" parts, and stick with the more objective "is there a better way to do this?", and you'll probably be able to figure out from that where any holes in your knowledge/experience might be.


I appreciate the constructive response. I'm going to do that. Just joined this community so just learning the rules now. Thanks.


Not sure one could call it a data structure but who remembers the so-called Sneaker Network or Tackie Network, which consisted of someone copying information onto a disk and physically walking to the other machine to copy the information. Networks were so slow/buggy that it was often quicker to transmit information using this "data structure"


I think that is a transport layer, not a data structure.


Sneaker neat is to Transport Layer as Filing Cabinet is to Data Structure.


Frisbee-net is more efficient then sneaker-net in open-plan environments. A trained professional can deliver a 3.5" floppy to its destination, and catch a return packet, with remarkable precision.

(though ohyes is right: these are transport layer methods/algorithms, not data structures)


Yes the Floppy disk is the Data Structure, which is interesting in itself, but in order for this data structure to work it must be both loosely and tightly coupled to the transport layer....

edit: To the downvoters - is a sense of humour not allowed on HN?


If you're going to try to be funny, at least include it as part of an otherwise worthwhile comment, otherwise it just adds noise to the discussion (and encourages more of same). I know HN seems kind of neurotic about this, but we're trying really hard to keep away trite "first post", "[citation needed]", etc. comments, and given HN's age, it actually seems to work.


Then I don't get the joke - it just sounds like you don't know what a data structure is. (I don't have downvote-superpowers, just offering an explanation)




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: