
Well, `LIMIT 1000` would of course be the simplest case, so closing it isn't really an achievement. What if I want to list all information about files with some mode? `ls` will still send 10^6 lines, carefully formatted, and almost all of them will be discarded by the simplest filter, which it could have applied before reading all the additional information from disk.



> What if I want to list all information about files with some mode? `ls` will still send 10^6 lines, carefully formatted, and almost all of them will be discarded by the simplest filter, which it could have applied before reading all the additional information from disk.

Indeed. That's probably one of the reasons PowerShell was designed to execute commands in-process: the objects being piped are just object references.

There's still the problem of using native indexes. That hasn't been solved elegantly yet. Reading all file names when wildcards could have eliminated 99% of them using file system metadata seems a waste. Which is probably why "ls" in PowerShell still accepts an -Include filter and an -Exclude filter that take wildcards.
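The pushdown idea can be sketched in plain shell too (a minimal demo using a hypothetical scratch directory, assuming GNU find): the name pattern is matched against directory entries inside the traversal itself, so non-matching files are eliminated before any per-file work or output formatting happens.

```shell
# Hypothetical demo directory; '*.log' is matched against directory
# entries inside find itself, before anything is printed.
tmp=$(mktemp -d)
touch "$tmp/keep.log" "$tmp/skip.txt"
find "$tmp" -maxdepth 1 -name '*.log'   # prints only .../keep.log
rm -rf "$tmp"
```

This is still a linear scan of the directory, not an index lookup, but the filter runs where the data lives instead of downstream of a fully formatted listing.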


To execute this automatically and efficiently, we need a query planner that can look at the whole pipeline and decide to use a different set of primitives if the naive/implied ones aren't sufficient. What you're talking about is implemented in relational database management systems, but it requires the query planner to know about the whole system - that is, it's the opposite of Unix: there is one piece of the system that needs to know about the whole system.

As far as Unix is concerned, the whole list is never stored in memory. Pipes are buffered streams (effectively iterators with a buffer attached, which makes them more efficient, not less).
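This is easy to demonstrate from the shell: in the pipeline below, `seq` could emit a million lines, but `head` exits after five; once the pipe's buffer fills, `seq` receives SIGPIPE and stops long before reaching the end, so the full list never exists in memory.

```shell
# head reads five lines and exits; the pipe's buffer fills,
# seq gets SIGPIPE, and the remaining ~10^6 lines are never produced.
seq 1000000 | head -n 5
# prints:
# 1
# 2
# 3
# 4
# 5
```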


> What you're talking about is implemented in relational database management systems, but it requires the query planner to know about the whole system - that is, it's the opposite of Unix

And that's exactly my point: this line of thinking about how to make these things right will lead to an overcomplicated, bloated system.


Speaking of bloat (from another comment in this thread):

> I am really curious about how much parsing and formatting occupy *nix source code. (IIRC 30% of ls),


If you've got a lot of files in a directory and you only want information about a subset of them then you're better off using a command like find:

    find . -maxdepth 1 -perm /g=w -exec ls -ald {} \;
This finds all group-writeable files in the current directory and lists them using ls.
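One tweak worth knowing (assuming GNU find): terminating `-exec` with `+` instead of `\;` batches the matched files onto as few `ls` command lines as possible, instead of forking a new `ls` process for every single file.

```shell
# Same query as above, but ls is invoked once over the whole batch
# of matches rather than once per matching file.
find . -maxdepth 1 -perm /g=w -exec ls -ald {} +
```

For a directory with 10^6 matches, that is a handful of `ls` invocations instead of a million forks.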





