StackOverflow Importer – Import code from Stack Overflow as Python modules (github.com/drathier)
301 points by drathier on Nov 3, 2016 | 84 comments



Relevant xkcd: https://xkcd.com/1185/

"StackSort connects to StackOverflow, searches for 'sort a list', and downloads and runs code snippets until the list is sorted."


And https://gkoberger.github.io/stacksort/ is an actual implementation of that sort.


  "StackSort connects to StackOverflow, searches for 'sort a
  list', and downloads and runs code snippets until the list is
  sorted."
If still unable to sort a list, it'll emit an arbitrary sorted list as the result.


This might be a joke but everybody does it manually for real. Whenever I copy-paste some tricky function from SO I like to add a code comment with the url pointing to the stackoverflow.com answer. This allows future maintainers to refer back to SO and see if a better answer has surfaced or read comments about any issues.
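
A minimal sketch of what such an attribution comment could look like in Python. The helper `chunked` is a made-up example, and the URL and author fields are placeholders to be filled in with the real answer's details.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items from `items`."""
    # Adapted from a Stack Overflow answer (placeholder, not a real link):
    #   https://stackoverflow.com/a/<answer-id>
    # Author: <author-name>
    # Revisit the link when touching this code: a better answer may have
    # surfaced, or the comments may describe problems with this approach.
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


print(list(chunked([1, 2, 3, 4, 5], 2)))
```

Keeping the link next to the function, rather than in a commit message, means it survives refactoring and file moves.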


Almost 3 years ago Visual Studio demoed a feature that would copy-paste code from StackOverflow, while automatically renaming the variables to fit your code:

https://blogs.msdn.microsoft.com/visualstudio/2014/02/17/int...

The tool is available here:

https://visualstudiogallery.msdn.microsoft.com/a1166718-a2d9...


I never, ever copy paste code, unless it's really long. I always end up rewriting everything myself, so I have the time to think about what's happening.


Hmmmm. I'm the other way round. I copy and paste short snippets because I'm lazy and I can see if they are ok at a glance. Longer code I'll go through line by line and try and 'touch' each line to force me to think about it.


I do it for that reason, but also for license attribution and to indicate that I didn't write it.


> everybody does it manually for real

I've been programming professionally for much longer than there's been a Stack Overflow. I have never, ever, cut and pasted code from Stack Overflow.


I've worked with people who did it with pride. When I pointed out we had no license for the code, they proudly told me anything without an explicit license was public domain.

I quit not long after that.


StackOverflow code is under a license: CC BY-SA 3.0. They were planning on clarifying this by moving to the MIT license, but there was some pushback on that and they are holding off for now at least: https://meta.stackexchange.com/questions/272956/a-new-code-l...

Besides that, most snippets are small and trivial enough that using them would easily fall under fair use.

Obviously, your case showed a lack of care on their part, but you can use SO code.


How do you know how to loop over an array for (int i=0;... ? You didn't invent that code. You read it somewhere many years ago and have copied it over and over many times since then. Do you have a license for that code? The author you learned it from did not invent it either, was he violating a license? It wasn't even invented by Dennis Ritchie since he copied the idea from earlier languages.

My point is that trying to copyright basic snippets of code is just ridiculous. It's something lawyers have tried to force on programmers, but it makes no sense. We would be completely unable to program at all if we listened to lawyers all the time.


Well, actually I just typed [] in my browser's console, expanded it, then expanded the prototype and found a method called "forEach" which sounded like what I wanted. Fiddled with it a bit until it worked.

Clean room usage~


Reminds me of stacksort[1]

[1] https://gkoberger.github.io/stacksort/


Very funny. I hadn't seen this before. I particularly like:

>>Is it safe?

>>Uh… it evals both user input and random code, unchecked, from an external site. This is what security-minded folks would refer to as Very Bad™.


I do my best to make it safe(ish), though :) It only runs code samples that were posted before I made StackSort.


What if someone edits an answer posted before you made it?
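
One way to cover that case, sketched in Python: require that the last edit, not just the original post, predates the cutoff. The field names `creation_date` and `last_edit_date` follow the Stack Exchange API's answer objects (Unix epoch seconds, with `last_edit_date` absent for never-edited answers). The cutoff and the sample data are made up, and this is not a description of how StackSort actually filters.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List

# Hypothetical cutoff: only trust content that has not changed since then.
CUTOFF = int(datetime(2013, 3, 1, tzinfo=timezone.utc).timestamp())


def unchanged_since_cutoff(answer: Dict[str, Any], cutoff: int = CUTOFF) -> bool:
    """True if the answer was both posted and last edited before `cutoff`."""
    created = answer["creation_date"]
    # Never-edited answers carry no last_edit_date; fall back to creation time.
    last_edit = answer.get("last_edit_date", created)
    return created < cutoff and last_edit < cutoff


def filter_answers(answers: List[Dict[str, Any]], cutoff: int = CUTOFF) -> List[Dict[str, Any]]:
    """Keep only the answers whose content predates the cutoff."""
    return [a for a in answers if unchanged_since_cutoff(a, cutoff)]


# Made-up sample data for illustration only.
sample = [
    {"answer_id": 1, "creation_date": CUTOFF - 1000},
    {"answer_id": 2, "creation_date": CUTOFF - 1000, "last_edit_date": CUTOFF + 50},
    {"answer_id": 3, "creation_date": CUTOFF + 10},
]
print([a["answer_id"] for a in filter_answers(sample)])
```

This only narrows the window: it says nothing about whether the old, unedited code was safe to begin with.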


Thank you for implementing stacksort!


It's not hard to imagine that stacksort works, but seeing it in action is really something different. It's scary and awesome at the same time.


Awesome. Now we just need to integrate upwork with our IDE so I can just write the method signature and a bid price and my method will be magically implemented.


It's interesting to imagine adding something like this to an editor's autocomplete, or as some kind of search function to automate functions you've coded a million times.

I.e. maybe you type: "repeat a function over a list python" in a special format and your editor pastes in the accepted answer from stackoverflow for you to choose whether that's acceptable / whether you want to use it as a template. Then maybe you could get into a better flow without ever having to stop what you're doing to Google something ... It would all just show up magically in your editor.

You could even have libraries of common answers written in such a way that you could write expected input and output formats and then get a code fragment that would fit that description. Something like a more organized test driven development. Though that would really only work for very specific things. Improve these ideas enough and make them more general - and maybe you could improve programmer productivity by quite a large margin.

Disclaimer: I know the OP was a joke, but I see no problem with reusing code if you understand what it's doing and the costs / trade-offs for doing so.
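
The "expected input and output" idea above can be sketched in a few lines of Python. Everything here is hypothetical: the `LIBRARY` of candidates stands in for a curated, trusted collection of snippets, and `first_match` simply returns the first candidate that reproduces every example. Running untrusted candidates this way would be unsafe.

```python
from typing import Any, Callable, Optional, Sequence, Tuple

# An example is (positional arguments, expected result).
Example = Tuple[tuple, Any]


def passes(candidate: Callable[..., Any], examples: Sequence[Example]) -> bool:
    """True if the candidate reproduces every example without raising."""
    for args, expected in examples:
        try:
            if candidate(*args) != expected:
                return False
        except Exception:
            return False
    return True


def first_match(
    candidates: Sequence[Callable[..., Any]], examples: Sequence[Example]
) -> Optional[Callable[..., Any]]:
    """Return the first candidate that satisfies all examples, or None."""
    for candidate in candidates:
        if passes(candidate, examples):
            return candidate
    return None


# Hypothetical library of trusted, locally stored snippets.
def reverse_list(xs):
    return xs[::-1]


def sort_list(xs):
    return sorted(xs)


def double_each(xs):
    return [x * 2 for x in xs]


LIBRARY = [reverse_list, sort_list, double_each]

examples = [(([3, 1, 2],), [1, 2, 3]), (([],), [])]
chosen = first_match(LIBRARY, examples)
print(chosen.__name__ if chosen else "no match")
```

A handful of examples rarely pins down a unique behavior, so the result is best treated as a template to review rather than a finished answer.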


There is something like that for JavaScript https://emilschutte.com/stackoverflow-autocomplete/


Visual Studio has an extension by Microsoft called Bing Code Search which does exactly what you want. It's really good.


It brings new meaning to the term "full-stackoverflow developer."


Err.. did you mean "full-stack(overflow) developer"?


I like the description: "Import arbitrary code from Stack Overflow as Python modules."

(Emphasis mine.)

--

Now Stack Overflow should be even more concerned about the availability of their service.



It's hard to imagine a better argument for vendoring your dependencies in case they change.


because executing arbitrary unknown code is always a good idea


On the other hand, at least the code has gone through some review.

Compare that to NPM or PyPI, which are full of crap. Some of the modules in PyPI were merely ads for putlocker, without even a single line of Python.


>Some of the modules in PyPI were merely ads for putlocker

Seriously? Are they still there?


I think that's the point of this parody.


Sure, but even playing with this joke on a machine that has your personal or work data is a very stupid thing to do.


Do not do that then.


It's all arbitrary and unknown unless you verify the whole thing yourself down to the kernel.

And then you better hope that the processor dies are as advertised.


Also, this is a joke.


Am I dumb? How is this a joke?


It seems pretty apparent from the readme that it's a joke along the lines of "If you're going to just copy and paste the first thing you find on Stack Overflow, you may as well do this."


> Do you ever feel like all you're doing is copy/pasting from Stack Overflow?

I don't really feel like anyone is taking this very seriously.


I think the commenter was being facetious haha


Job security for us, I guess...


Based on some of the comments here, a sarcasm disclaimer might be necessary on the repo!


(OT) HEY! I finally noticed your Desert Combat comment and responded to it. If only HN had a better commenting/messaging system...


I often find that the accepted answer with the most votes is the oldest and often outdated, while a newer answer, also with a lot of votes, is more up to date and frequently better.


Stackoverflow-driven development (SDD):

https://goo.gl/images/xXCZrx


Did I just render that book obsolete?


Not sure what you mean..


Can I be all code review mode and ask you to lowercase the first f? It's sticking out like a sore thumb.


And the relevant book for people who prefer reading material. https://tra38.gitbooks.io/essential-copying-and-pasting-from...


I did something similar with a project that allowed you to import code from anywhere[0]. StackOverflow would be a nice addition.

[0] https://github.com/libeclipse/import-from


Cool joke. Post-left-pad, it needs a disclaimer lest someone actually do this for real.


I'm kind of shocked how many people here are talking as though they would actually include this in a codebase.


What if, somehow, there was a guarantee that the code can't harm your machine -- would you use it then?

This isn't a joke.


It's not just about harming your machine - it can harm your* data, or data passing through your* system!

* Where "your" can mean you, your employer, your customer, another company, etc.


What if it can't harm your data, or data passing through your system?

What if it can't harm anything.


How could a system that executed arbitrary code ever make such guarantees? I mean sure, if there were a magical thing that only ever produced correct code for what you wanted to do I guess people would use it.


It's amazing how little code it takes to implement this (aside from dependencies).



If I have editing powers could I hijack the answer that gets its code imported?


Edits get reviewed before being published.


No they don't. From http://stackoverflow.com/help/privileges/edit

> We believe in the power of community editing. That means once you've generated enough reputation, we trust you to edit anything in the system without it going through peer review. Not just your posts—anyone's posts!


If you've been on Stack Overflow enough, you might have noticed that crap is sometimes approved even after peer-review. Many doing the review do it like robots (to earn internet reputation and badges).


That doesn't guarantee the folks doing the reviewing know what they are reviewing.


This is a really stupid idea, even as a joke to play with. Someone can just edit the top answer to make it malicious and steal your personal or work data from your workstation.


Someone stupid enough to actually use this in a serious environment arguably deserves it.


and then own your network, your DVR, your toaster, your cat.


Please leave the cat!


This is a pretty cool idea! Concerns for running unknown code aside, it seems like the results of this type of thing would be prone to potentially frequent change.

I wonder if there'd be a way to maintain consistency, something like a requirements.txt file that got spit out to describe what the results of the search terms mapped to when the code was run.

It could just contain the search term and link to the chosen question. Although you'd still run the risk of the answer itself being edited. The code itself would have to be cached in whatever this file was.
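
A rough sketch of such a lock file in Python, using only the standard library. The layout (search term, question link, cached code, SHA-256 of the code) is an assumption made for illustration, not something the project provides, and the URL is a placeholder.

```python
import hashlib
import json
from typing import Dict, List


def sha256_hex(code: str) -> str:
    """Hex digest of the snippet's UTF-8 bytes."""
    return hashlib.sha256(code.encode("utf-8")).hexdigest()


def lock_entry(search_term: str, question_url: str, code: str) -> Dict[str, str]:
    """Record what a search term resolved to, including the code itself."""
    return {
        "search_term": search_term,
        "question_url": question_url,
        "sha256": sha256_hex(code),
        "code": code,
    }


def dump_lock(entries: List[Dict[str, str]]) -> str:
    """Serialize the entries as stable, diff-friendly JSON."""
    return json.dumps(entries, indent=2, sort_keys=True)


def load_verified(lock_text: str, search_term: str) -> str:
    """Return the cached code for a term, refusing it if the checksum differs."""
    for entry in json.loads(lock_text):
        if entry["search_term"] == search_term:
            if sha256_hex(entry["code"]) != entry["sha256"]:
                raise ValueError("cached code does not match its recorded checksum")
            return entry["code"]
    raise KeyError(search_term)


snippet = "def quick_sort(xs):\n    return sorted(xs)\n"
lock_text = dump_lock([
    # Placeholder URL; a real entry would link the question that was chosen.
    lock_entry("quick sort", "https://stackoverflow.com/questions/<question-id>", snippet)
])
print(load_verified(lock_text, "quick sort") == snippet)
```

Because the code is stored alongside its checksum, later runs never need to hit the network, and an edited answer upstream cannot silently change behavior.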


What a fascinating concept! I think that with some kind of review built in, this kind of dynamic loading would be really useful.

I think if you do this, you'd want external tests, or some way to pull the code in and "freeze" it so that you wouldn't be at risk of external compromises. (For example, a high-ranked SO user might be able to change a highly ranked answer. They might not do it, but someone suborning their account might.)

But really, this is an optimization of a lot of code that is currently written.


I don't see this being useful for real programs, but it could be amazing for interactive use in a shell. I use IPython for all sorts of quick data manipulation daily, and being able to just quickly type an idea and get it would be amazing.


You're right, maybe it is better for prototyping, but as I've experienced, for better or worse what you prototype sometimes becomes production.


I was thinking the same thing about the freeze.

Something like the equivalent of `pip freeze`


Agreed. It'd be cool if snippets had a checksum api.


code is data; lispy!


Anyone else tinkering with / armchair philosophizing about ways of making stuff like this practical for regular use?


This is license hell. Whatever you do, don't use it if you care about your code not being owned by other people.


Importing and running code from an external site without verification or code review. What could possibly go wrong?


What a time to be alive. Edit: this is officially the best thing to happen to me this week, heck, month.


This is so funny. Kudos to the author!


I look forward to using this in a future job interview and seeing what happens.


This is awesome. But how is `time_delay` used?


The code snippet it finds imports sleep and calls it directly. Fire it up in a repl and check __author__ or _code yourself :)
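
For anyone wondering how a snippet can "call it directly" just by being imported, here is a small Python illustration. It is a sketch under assumptions, not the project's actual implementation: the snippet text, the module name, and the author string are invented, and only the attribute names `__author__` and `_code` come from the comment above.

```python
import types

# Invented stand-in for a downloaded snippet: it imports sleep and calls it
# at the top level, so the call happens when the module body is executed.
SNIPPET = """
from time import sleep

sleep(0)
ran = True
"""


def module_from_source(name: str, source: str, author: str) -> types.ModuleType:
    """Build a module object from source text and attach metadata to it."""
    module = types.ModuleType(name)
    module.__author__ = author
    module._code = source
    # Executing the source in the module's namespace is what "importing" it
    # amounts to here. Only do this with text you have read and trust.
    exec(compile(source, f"<{name}>", "exec"), module.__dict__)
    return module


mod = module_from_source("demo_snippet", SNIPPET, "hypothetical-author")
print(mod.__author__)
print(mod._code)
```

Nothing needs to be called on the resulting module: the top-level statements have already run by the time it is returned.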


Makes me think of Kite[1]. Kite is a "real" thing that one can use though.

[1] https://kite.com




