jsomers's comments | Hacker News

Hi, author here. This article was many years in the making. It's ostensibly a story about reading minds but really it's about the unreasonable power of high-dimensional vector spaces.

That made it pretty tough to write: how do you explain dimensionality reduction, PCA, word2vec, etc., and the wonders of high-dimensional "embeddings" (of the sort you find in deep neural nets) when a lot—or all—of these ideas might be new to the reader? I'm not sure—but this was my attempt!


Hey I liked all the examples.

Thought you might like this one: A Geometric Analysis of Five Moral Principles (OUP 2017)

Ethics using vectors. From a description of the technique: The geometric approach derives its normative force from the Aristotelian dictum that we should “treat like cases alike.” The more similar a pair of cases are, the more reason do we have to treat the cases alike. These similarity relations can be analyzed and represented geometrically. In such a geometric representation, the distance in moral space between cases reflects their degree of similarity. The more similar a pair of cases are from a moral point of view, the shorter is the distance between them.
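
A toy illustration of the geometric idea in code (the coordinates are entirely made up, not from the book): each case is a point in "moral space", and pairwise distance is read as dissimilarity.

    import numpy as np

    cases = {
        "case_A": np.array([0.9, 0.1, 0.3]),   # hypothetical feature coordinates
        "case_B": np.array([0.8, 0.2, 0.3]),
        "case_C": np.array([0.1, 0.9, 0.7]),
    }
    names = sorted(cases)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = np.linalg.norm(cases[a] - cases[b])
            print(a, b, round(float(d), 3))    # A and B come out close; C is far from both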


That was a fascinating article, I liked how you covered the human element in how it helps the paralyzed and the intuitions/visuals of the researchers in the field.

Future applications can be good or bad, but of course that makes it even more important to record the early history of the field, and these kinds of articles will also help start the ethics discussion at an earlier stage.


Great article! I think you’ve done a great job of introducing these difficult concepts in simple language. I saved it to Pocket and it’s already got a Best Of label there.

PCA (KLT) can be introduced as a generalization of the Fourier Transform. This can follow from using a cocktail mix analogy to Fourier Series. When I was a TA this was the approach I took with students, which seemed to make things easier for them.
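
A quick numerical way to see the connection (just a sketch, assuming numpy): for a stationary signal the covariance matrix is circulant, so its eigenvectors, i.e. the KLT/PCA basis, are exactly the Fourier basis; for general data the basis is learned from the covariance instead of being fixed.

    import numpy as np

    n = 64
    lags = np.arange(n)
    r = 0.9 ** np.minimum(lags, n - lags)            # a circular autocorrelation
    C = np.array([np.roll(r, k) for k in range(n)])  # circulant covariance matrix

    F = np.fft.fft(np.eye(n), norm="ortho")          # unitary DFT matrix

    D = F.conj().T @ C @ F                           # diagonal up to rounding error
    print(np.abs(D - np.diag(np.diag(D))).max())     # ~1e-13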

An introductory post on PCA vs FA is here: https://towardsdatascience.com/what-is-the-difference-betwee...

Personal note: Susan Dumais, mentioned in the article, also did great early work in text summarization, just after she joined Microsoft. I tried using some of her approaches for video summarization in my PhD in the early 2000s. How time flies.


I don’t think explaining it as a generalization of the Fourier transform is going to help very much with New Yorker readers.


PCA applies an orthogonal linear transformation, while FA uses a series of coefficients to scale a sequence of functions, which are then integrated. They are similar in use but very different in method. Calling one a generalization of the other seems misguided?


Ummm… the Fourier Transform is an orthonormal linear transformation.
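
A quick check with numpy (a small sketch, using the unitary "ortho" normalization):

    import numpy as np

    n = 8
    F = np.fft.fft(np.eye(n), norm="ortho")                      # DFT as a matrix
    print(np.allclose(F.conj().T @ F, np.eye(n)))                # True: columns are orthonormal

    x = np.random.randn(n)
    print(np.isclose(np.linalg.norm(x), np.linalg.norm(F @ x)))  # True: lengths are preserved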


Great article! I’m developing Machine learning systems and my partner is working on psychiatric use of Deep-Brain stimulation, so a rare moment that we can share.

Very minor point: King - male + female = Queen is a good example, but it's widely decried as not true by specialists. I don’t have a much better example (I haven’t been able to tell if Paris - France + England = London works, for instance), but if you reuse that story, it makes sense to investigate that myth. There’s a lot there too.
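
For anyone who wants to poke at this themselves, here is roughly how the analogy is usually tested (a sketch using gensim and a small pretrained GloVe model of my choosing, not anything from the article):

    import gensim.downloader as api

    vectors = api.load("glove-wiki-gigaword-50")   # small pretrained model, ~66 MB download
    # note: most_similar drops the query words from the results, which flatters the analogy
    print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
    print(vectors.most_similar(positive=["paris", "england"], negative=["france"], topn=3))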


I think you did a very good job - it captured the feeling that it's almost sorcery which still hits me any time I successfully apply it, without getting bogged down in technicalities. I think it's OK to be superficial as long as you give people enough information to look up and learn more about it. Mentioning word2vec will certainly give interested readers a head start.


I find the simplest way to explain PCA to a general audience is to draw an ellipse of points, off center and tilted in 3D space, and then draw plots for x, y, and z. Then center the ellipse and rotate the axes to match the major and minor axes of the ellipse, and show how it can be drawn in just x and y, and how those x and y plots are far easier to interpret. Done.
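
In code, a minimal version of that picture might look like this (a sketch with numpy and scikit-learn; the numbers are made up):

    import numpy as np
    from sklearn.decomposition import PCA

    t = np.linspace(0, 2 * np.pi, 200)
    flat = np.column_stack([3 * np.cos(t), np.sin(t), np.zeros_like(t)])  # ellipse in a plane

    rng = np.random.default_rng(0)
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))       # a random 3D rotation
    points = flat @ Q.T + np.array([5.0, -2.0, 7.0])   # tilted and pushed off center

    pca = PCA(n_components=2)
    xy = pca.fit_transform(points)            # centered and rotated to the ellipse's own axes
    print(pca.explained_variance_ratio_)      # ~[0.9, 0.1]: two axes carry everything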


It’s cool to see you here! Fascinating article. I had no idea this was being pursued in an applied way; assumed it was all theoretical. Exciting!


Why did you write an article glorifying people who are working as hard as they can toward dystopia?


why are you asking a loaded question?


I'm not sure if you're reading the same Hacker News that I've been, but mine's mostly been about the confluence of surveillance capitalism, laissez-faire treatment of vulnerabilities in the tech stacks that power it (or even eg absentmindedly putting customer data into an unsecured S3 bucket), and the inability/unwillingness of governments or regulatory bodies to do anything about any of it. In light of these modern realities, I have difficulty believing in a positive final form of this technology. "Mobile pocket telephones" have evolved into "expensive powerful swiftly-obsolescent general-purpose computers mostly used for providing telemetry on the user to unaccountable corporations". Even if the HN crowd end up being able to opt out of the worst aspects of this, like one can with a modern smartphone via GrapheneOS or whatever, we still have to live alongside everyone else who can't.

I can think of lots of nefarious uses for this sort of thing, and I'm just some asshole who's read some science fiction. The real nefarious uses will be architected by people much smarter than me, whose moral difficulties will be dismissed by The Profit Motive, psychopathy, or both.


And here I am thinking about the potential benefits to para- and quadriplegics of circumventing a damaged spinal cord if only we could reliably interpret signal from the brain.

For all the bad you've listed, there's a reason people voluntarily choose to carry those surveillance devices in their pockets: the boons outweigh the ills by an order of magnitude. They're rarely dwelt upon because they're ubiquitous... much like nobody bothers to extol the virtues of fire.

We talk here about what's wrong because there is room for improvement, not because we should halt progress.


[flagged]


Maybe so, but please don't post unsubstantive comments here.


"unsubstantive"? This technology will be used against people. I just wanted to point it out in the most unflattering way possible. Like graphic warning and danger signs, it makes people stop and think.

And strange to hear from you for a second time in the last week or so.


Of course it's unsubstantive. It's not just an internet cliché, it's nowhere near the top of the internet cliché barrel. If you have something interesting to say, please say it explicitly, without tedious tropes, and without flamebait.

> And strange to hear from you for a second time in the last week or so.

I had no idea I'd replied to you repeatedly, but if so, the simplest explanation is that you've been breaking the site guidelines repeatedly. It would be helpful if you'd review them and stick to them from now on: https://news.ycombinator.com/newsguidelines.html.


Good question! I would feel free to use the doc itself as the forum—maybe just create a section at the bottom for general comments.


It’s hard to think about Wolfram. His tone is so off-putting — he’s constantly discovering the glorious and singular capabilities of his own products — but for instance he did basically invent the now-ubiquitous idea of the computational notebook.

Sadly I don’t know a single person who has used Wolfram’s software extensively. I haven’t used it myself extensively. Is that because it’s not what it’s cracked up to be, or because it’s a closed world?


Frankly, I'm really, really tired of this being brought up in every single Wolfram-related thread.

Mathematica is used a lot in academia, and you'll find it cited in Methods sections often. I personally know a lot of incredibly talented and prolific physicists who swear by it and have used it to get very good work done.


It'll be easier to stop when the posts stop insisting on coining "new kinds of X".

The only innovation I see here is an unusual attention to constructing visual representations and using them as identifiers in code. Which is cool! But insisting it's new demeans the work of thousands of engineers who do similar and prerequisite work, and who make an effort to situate their work in the literature and industry rather than insisting on some kind of exceptionalism.

Arrogance should be called out in every instance, because it's not actually a route to the greatest impact.


I didn’t mean to say that nobody uses the software, just that I don’t know anybody who does. Which just makes it hard for me to get a handle on its value.

The question I’d ask those physicists is whether Mathematica does the kinds of things Wolfram claims for it / does things other programming languages can’t do. I wouldn’t be surprised if the answer was “yes,” I’m just not sure.


For some mathematical work, going from Mathematica to Python/Julia/Octave/Scilab would be a significant step backwards.

This is coming from a Pythonista, btw. To each their own. Python is a better scripting language, but Mathematica is a better analysis language for a lot of things. Python is catching up in a lot of ways, though, with NumPy and TensorFlow. I'd say Python's TensorFlow is a lot more mature than Mathematica's neural nets, but Mathematica's symbolic math seems to be world class.


I used Mathematica as a graduate physics student. It is an incredibly powerful equation solver and visualizer. I do not think I could easily do in another language what I was able to do in Mathematica. That said, I don't use it anymore.


Even Trump is less self-aggrandizing than Wolfram. It’s just hard to take him seriously.

> It’s only recently that I’ve begun to properly internalize just how broad the implications of having a computational language really are—even though, ironically, I’ve spent much of my life engaged precisely in the consuming task of building the world’s only large-scale computational language.

I guess the rest of us will just need to muddle through with the (apparently) small-scale (non?)-computational languages that we use.


"I only now realize how great my contribution is"? Gag me.


>Even Trump is less self-aggrandizing than Wolfram.

I think that might be a little unfair. Wolfram's certainly more articulate in his egotism, but nobody matches The Donald for sheer volume:

"I understand things, I comprehend very well. Ok? Better than, I think, almost anybody."

https://www.youtube.com/watch?v=5GqJna9hpTE


Python is used more than Mathematica, but you never read a blog by Guido van Rossum claiming that he changed the world. It's the messiah complex that people find off-putting, not that he has put together a good product.

Mathematica/Wolfram Language is a perfectly fine programming language. It has an amazing standard library. It's definitely worth the money (if you're working in a field that benefits from it). But that's it. Don't call it a computational language. It isn't a new way of thinking. It isn't revolutionary, there is literally nothing in Mathematica that can't be done in another language just as easily (except the CAS engine is world leading). But somehow these mundane blog posts by Stephen Wolfram make it to the front page of HN every few months, so someone must be upvoting them.

So maybe his hype machine is working as intended: he is in charge of a successful company and I'm hiding anonymously behind a keyboard. So what if a few people don't like his attitude; he's more successful than most of us.


Technically that honor goes to a lesser-known program called Mathcad, which had "worksheets" (basically notebooks) in 1986.


That may be true, but in case anyone thinks the two are comparable:

I used Mathcad a bit in undergrad and my wife (a civil engineer) used it extensively for all her engineering courses for 4 years. One of her main professors wrote an entire civil engineering textbook in Mathcad, and that was why they pushed it. It was a decent and cheap math tool, but Mathematica is orders of magnitude more mature in language, functionality, graphics...really it beats Mathcad in everything except cost. The gulf between the two is like Windows 95 and Windows 10. Mathematica has support for neural networks, time series data, blockchain, 3D printing, running on Arduino, transpiling to C, natural language processing, insanely detailed graphics primitives, web crawling...etc., etc.

Matlab and Maple have done a better job keeping up, but Mathematica beats those easily as well in my opinion although they are somewhat different products.


> His tone is so off-putting

His writing does come across that way. He is self-confident, certainly. I have met him in person. He is actually a kind and generous person. He is also very likely to be the smartest person in the room most places he goes. I met him at a place where there were a lot of other math PhD's in the room (not me by any stretch) so maybe he felt he was among his people and was mixing more naturally. But in any case, my approach was to shut up and listen because I was more likely to learn something that way.


I've used Mathematica extensively, and it's very nice. It's better at symbolic algebra than Matlab and much more intuitive than Maxima. Its primary strength is computing purely symbolically with no numerical methods, for when you have no tolerance for numerical error.


I've used it a bit and it is pretty powerful to say the least as they've been shoving functions into it to cover nearly every domain of computing for decades in a consistent manner.

The notebooks are great, but I find Jupyter notebooks good enough, even though they aren't as good in many ways, particularly if you aren't skilled in markup.

One of their senior scientists (Matt Trout) has some insane blog posts that show off the power of the language. He has one on using Laplace transforms to hide an image of a goat; it is basically pages of math. I bet it would take two or three times as much code in Python.

Annoyingly, I wouldn't really use it in production as deployment looks painful and it limits the number of cores you can use. I usually use it as a super powerful prototyping tool.


When Nassim Nicholas Taleb posts stats stuff on Twitter, it is almost always Mathematica.


I recently built a little trivia game for my friend, and not wanting to spin up a database, I used Google Sheets as the backend.

My friend can edit the sheet at will, and the app pulls the data in via JavaScript. Google already sets the right CORS headers.


I tried this a few years ago and ran into some speed issues ( http://www.mooreds.com/wordpress/archives/1359 ). Is that something you've encountered?


There are a few interesting hacks you can do; calls to an individual cell are slow (I think around 50-100 ms), so if you need arbitrary cells, you can use a sheet formula to put the desired result in a single cell.

For example, I had a sheet with hundreds of columns whose headers would change over time. Instead of reading each cell, I just used CONCATENATE(), read a single cell, and split it in my program after pulling it.
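
In Python it comes out to something like this (a sketch with gspread, which isn't necessarily what I used; the sheet name is made up, and it assumes a formula such as TEXTJOIN("|", TRUE, A1:Y1) has already packed the headers into Z1):

    import gspread

    gc = gspread.service_account()        # assumes service-account credentials are configured
    ws = gc.open("trivia-data").sheet1    # hypothetical sheet name

    headers = ws.acell("Z1").value.split("|")   # one round trip instead of hundreds
    print(headers)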


I started it somewhat arbitrarily in 1900.



Ahh, I was about to publish something in BC just to see what happens ;)


The Wheel: A More Efficient Method for the Production of Round Ceramic Wares (3500 B.C.)


But more seriously, I remembered the complaint tablet to Ea-nasir[1], which was on HN a few months ago[2]. Written in 1750 BCE.

[1] https://en.wikipedia.org/wiki/Complaint_tablet_to_Ea-nasir

[2] https://news.ycombinator.com/item?id=15669759


Occasionally you see articles on HN with a date in the title, like “(1998)” — and over the years I’ve noticed that these tend to be some of the best posts.

It makes sense: on a site devoted to news, an article posted so long after it was published has to be especially good.

So I hacked together this page, which links to every HN post with a date in its title earning more than 40 votes. It’s sorted in chronological order to encourage wandering.
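
A rough sketch of the kind of query involved, using the public Algolia HN API (not necessarily how the page itself is built):

    import re
    import requests

    dated = []
    for page in range(3):                                # a few pages, just for illustration
        resp = requests.get(
            "https://hn.algolia.com/api/v1/search_by_date",
            params={"tags": "story", "numericFilters": "points>40",
                    "hitsPerPage": 100, "page": page},
            timeout=30,
        )
        for hit in resp.json()["hits"]:
            m = re.search(r"\((1[89]\d\d|20\d\d)\)\s*$", hit["title"])
            if m:
                dated.append((int(m.group(1)), hit["title"], hit["objectID"]))

    for year, title, item_id in sorted(dated):           # chronological by year in the title
        print(year, title, "https://news.ycombinator.com/item?id=" + item_id)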


Really cool!

Out of the 2k+ posts you list, it's clear that some people are good at finding good-quality classics!

The top 10 posters here combined account for more than 10% of all the posts:

    USER         POST COUNT
    Tomte        66
    luu          42
    tosh         33
    ColinWright  27
    adamnemecek  26
    vezzy-fnord  26
    brudgers     25
    pmoriarty    24
    networked    23
    shawndumas   22


Unfortunately, I have so far failed to get a (1538) submission to stick.

Well, let's try again: https://news.ycombinator.com/item?id=16444460


I think it doesn't stick because I can't understand anything from the title.


Well at least you know that it's by Andreas.


LOOOOOOOL :D


Looking at the source code, it only considers titles from 1900 to 2010 as classics.


Tomte is Swedish for "Santa", which seems fitting.


Tomte is not really Santa. It's more of a gnome-like creature that normally lives near farms and protects the family and animals. The connection with Christmas is quite new and seems more connected with its other name, Nisse, or Julenisse in particular. "The Tomten" by Astrid Lindgren seems to give a good overview of the folklore.


Why? Does he know when you are naughty? Do you also feel like watching out and not crying when he comes to town?


If I recall correctly, the story of Tomte is one where he is alone on Christmas night, pondering what life and death are, and decides that the art of giving is the best virtue. He then knocks on everyone's door with a pig beside him and hands out presents to everyone who opens it for him.


We need more Tomtes in this imperfect world of ours.


For links that are now defunct/gone (such as #2 on the chronological list), can you pull the web archive at the closest date to the submission?

(dead) http://yorktownhistory.org/homepages/1900_predictions.htm ===> https://web.archive.org/web/20100108205037/http://yorktownhi... (archive from 6 days later)

I don't think this can be done programmatically though... thanks for putting this together. I enjoy the old posts a lot, too.


The URL can be found via the archive's API, and you can specify a timestamp.

http://archive.org/wayback/available?url=http://yorktownhist...

It returns JSON with, notably, the closest archived page to the given timestamp.

https://archive.org/help/wayback_api.php
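
For example, roughly, in Python (a sketch using requests and the dead link from upthread; the timestamp is approximately the submission date):

    import requests

    resp = requests.get(
        "https://archive.org/wayback/available",
        params={"url": "http://yorktownhistory.org/homepages/1900_predictions.htm",
                "timestamp": "20100102"},
        timeout=30,
    )
    closest = resp.json().get("archived_snapshots", {}).get("closest")
    if closest:
        print(closest["url"], closest["timestamp"])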


That's way more fantastic than I thought their API would be! Thanks. Hope OP sees it.


FWIW, #4 "Two 4000 ft plumb bobs hung down a mine shaft, with baffling results (1901)" can be found at http://www.lockhaven.edu/~dsimanek/hollow/tamarack.htm


I stumbled upon this as well. The article was archived; I wonder why it's not redirecting. Now I'm finally enjoying this post too.

http://yorktownhistory.org/wp-content/archives/homepages/190...


Very cool! Reminds me of the Ezra Pound quote, "Literature is news that stays news."

Ezra was a... complicated man, but he had keen insights.


He was also a fascist, to say the least. I would refrain from using any quote of his.


can "bad people" not have good ideas?


You are absolutely right: some articles are indeed timeless, yet it's quite easy to miss them. This is a very good idea. A personal favorite of mine that gets posted here often is the Golden Rules for Making Money by P.T. Barnum (1880).



This is a clever and simple way to dredge up some great posts.

I wonder if you could ensure fewer false negatives (i.e. find even more great stuff) by doing the opposite: attempting to filter out every post whose link is to a page that came into existence within a month of the post's submission date.

This would likely require scraping the source links (unless you can get that from the https://cloud.google.com/bigquery/public-data/hacker-news dataset or somesuch), but it might be worth it anyway. It'd literally be "Hacker News, minus anything that looks like News."


I wonder how you would determine when a page came into existence.


Someone mentioned the archive.org/wayback API. You could check that the oldest archive.org/wayback snapshot is over a certain age.
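
A sketch of that check, using the Wayback CDX endpoint (my substitution, since it returns captures oldest-first) rather than the /available endpoint mentioned above; example.com is a placeholder:

    import requests

    resp = requests.get(
        "http://web.archive.org/cdx/search/cdx",
        params={"url": "example.com", "output": "json", "limit": 1},
        timeout=30,
    )
    rows = resp.json()           # first row is the header, second is the oldest capture
    if len(rows) > 1:
        print(rows[1][1])        # timestamp of the first snapshot, e.g. "20020120142510"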


google certainly has some approximation of that, for example


It helps that we readers have had a lot of time since those articles were published to evaluate them.

I probably wouldn't understand most of them nearly as well if they weren't proven out by recent history ;)


hi james, i miss you


merci, mr. j.


That's a cool device, and I thought about getting one -- but for me, I wanted the satisfaction of putting ink to paper.


"Whoever wrote this needs somebody to take the fall. And that's Phreak, and that's Joey, and that's us." –Hackers (1995)


I had to resist the urge to add HACK THE PLANET to my comment.


The way I think about it is less as an encyclopedia of sequences per se than it is a kind of projection of mathematics onto sequence-space. The point being that an integer sequence, because it often has so many mathematical interpretations, acts in some sense like the intersection of those interpretations. There are of course many ways in which two or more mathematical concepts are linked, but a shared integer sequence is among the most useful, precisely because it can be browsed and searched like an encyclopedia.
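
For example, searching by the terms themselves via the JSON endpoint (a rough sketch; the terms are my choice, and the response shape has varied over time, so this handles both forms):

    import requests

    terms = "1,1,2,5,14,42"      # Catalan numbers, which index many different structures
    resp = requests.get("https://oeis.org/search",
                        params={"q": terms, "fmt": "json"}, timeout=30)
    data = resp.json()
    results = data["results"] if isinstance(data, dict) else data
    for entry in results[:3]:
        print("A%06d" % entry["number"], entry["name"])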


That's a pretty thought


you're on your honor to actually guess :), i.e., to type enough of the name that there won't be a false positive

