
Sounds good, but what uniquely identifies a file? Right now it's path + name.

If I have two files with the same name, one tagged A and the other tagged A and B, are they the same file or not? What if I add a tag of B to the first one?

A directory hierarchy makes this unambiguous.




I think we should use several mechanisms at once to identify files.

Tags. The default mechanism for sorting and searching files. The assumption is that most files are passive data. When sharing a file, its tags should be sent along with it, so the receiving system can propose them by default to its user. Note that one may want to categorize tags themselves (meta-tags?). I'm not sure, but it may be necessary if a given system uses many tags.

Descriptive names. This is the user-facing name of the file. No need for it to be unique. Like tags, a file's descriptive name should be sent along with it.

Locations. It may be important to know where a given file is physically located. It is cool to transparently access more files when you plug your thumb drive in. It is less cool to forget to actually copy those you need.

Unique keys. Devised by the system and not directly accessible by the user. When a search yields several files with the same descriptive name, or when two files share tags and name and location, the system can be explicit about the ambiguity.

Unique names. Devised by the user. The system checks uniqueness (or, more likely, uniqueness by location). They follow a directory structure convention. Discouraged by default. Their primary usefulness would probably be for system and configuration files, which need to be accessed automatically and unambiguously by programs. May be implemented on top of descriptive names (the system could treat descriptive names that begin with "/" as unique, and enforce that uniqueness).

There. End users would primarily use tags, descriptive names, and locations. With the right defaults, users may actually be able to avoid making a mess of their data. To prevent unwanted access to sensitive system files, the system can by default exclude those files from search results. Typically, those would be the files both tagged "system" and located on the main drive. Unique names would be for programs, power users, and those who want to fiddle with their system settings (either directly or through a friendly interface). Unique keys belong to the kernel.
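To make this a bit more concrete, here is a rough Python sketch of how the pieces could fit together. Everything in it (FileEntry, FileStore, the counter standing in for kernel-assigned keys, the "main" location label) is made up for illustration; it is only one way the idea might be modelled, not a design.

```python
from dataclasses import dataclass, field
from itertools import count

_next_key = count(1)  # hypothetical stand-in for kernel-assigned unique keys


@dataclass
class FileEntry:
    name: str                               # descriptive name, need not be unique
    tags: set = field(default_factory=set)
    location: str = "main"                  # assumed labels, e.g. "main", "thumb-drive"
    key: int = field(default_factory=lambda: next(_next_key))


class FileStore:
    def __init__(self):
        self.files = []

    def add(self, name, tags=(), location="main"):
        # Names starting with "/" are treated as unique names, checked per location.
        if name.startswith("/") and any(
            f.name == name and f.location == location for f in self.files
        ):
            raise ValueError(f"unique name already taken: {name}")
        entry = FileEntry(name, set(tags), location)
        self.files.append(entry)
        return entry

    def search(self, name=None, tags=(), location=None, include_system=False):
        wanted = set(tags)
        results = []
        for f in self.files:
            if name is not None and f.name != name:
                continue
            if not wanted <= f.tags:        # every requested tag must be present
                continue
            if location is not None and f.location != location:
                continue
            # Default: hide files tagged "system" that live on the main drive.
            if not include_system and "system" in f.tags and f.location == "main":
                continue
            results.append(f)
        return results


store = FileStore()
store.add("bird.jpg", {"photos"})
store.add("bird.jpg", {"photos", "holiday"})
store.add("/etc/hosts", {"system"})

hits = store.search(name="bird.jpg")
print([(f.key, sorted(f.tags)) for f in hits])
```

The search for "bird.jpg" returns both files. They share a descriptive name, so the only thing that tells them apart for the system is the key, and that is where it can be explicit about the ambiguity.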

So, how does that sound?


Hard to say. It may be brilliant, and may be the Future Of Files, for all I know.

My first reaction, though, is that it sounds a bit confusing to me, and very confusing for novice users.

Right now, Mom understands that "C:\My Documents\bird.jpg" is not the same as "C:\My Documents\My Pictures\bird.jpg". The rule is simple: unique names per folder.

What's the new rule?


This is kind of a paradigmatic change. Right now, the default when dealing with files is to point to them. What I envisioned in the grandparent was to make search the default. Tags, descriptive names, and locations are all search criteria.

In a way, it is more complicated: instead of 0 or 1 file, you now get a whole list. On the other hand, everyone understands search. My hope is, the initial extra complexity would be dwarfed by the ease of sorting and finding your files. Because right now, one or the other is difficult: it's hard (or at least bothersome) to properly sort one's data in a directory tree, but it's even harder to find it if your disk is messy.

Now there are two snags we might hit. First, I'd like to do away with unique names, because they get us back to the old, difficult-to-manage directory tree. Second, to have good tags, you have to internationalize them. For music, for instance, French-speaking folks would like to use "musique", while English-speaking ones will use "music". It has to work transparently when they exchange files, or else it would defeat the purpose of default tags. I can think of solutions such as aliases, normalization at download time, or standard tag names that can be translated by the system, but I'm not sure that's really feasible or usable.
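The alias idea could look something like the Python sketch below. The ALIASES and LABELS tables and the normalize/display helpers are hypothetical; a real system would need a much larger, shared vocabulary, and that is exactly the part I'm unsure about.

```python
# Hypothetical alias table: localized tag -> canonical tag.
ALIASES = {"musique": "music", "musik": "music"}

# Hypothetical display table: (canonical tag, language) -> localized label.
LABELS = {("music", "fr"): "musique", ("music", "en"): "music"}


def normalize(tag):
    """Map an incoming tag to its canonical form (done at download time)."""
    tag = tag.strip().lower()
    return ALIASES.get(tag, tag)


def display(tag, lang):
    """Show a canonical tag in the user's language, falling back to the tag itself."""
    return LABELS.get((tag, lang), tag)


# Tags arriving with a file from a French-speaking sender.
received = {normalize(t) for t in ["Musique", "jazz"]}
print(sorted(received))
print(display("music", "fr"))
```

Tags with no known alias (like "jazz" here) just pass through unchanged, so the scheme degrades to plain tags when the tables don't know a word.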


I think that all these different types of identifiers might make security a challenge.


Access rights should of course not be tied to identifiers, but to files themselves.
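As a tiny illustration in Python, assuming rights are stored against the system's unique key (the acl and files tables and the can_read helper are invented for the example):

```python
# Hypothetical access table keyed by the system-assigned unique key.
acl = {1: {"alice"}, 2: {"alice", "bob"}}

# Identifiers are only ways of finding a key; they carry no rights themselves.
files = {
    1: {"name": "taxes.pdf", "tags": {"finance"}},
    2: {"name": "bird.jpg", "tags": {"photos"}},
}


def can_read(user, key):
    return user in acl.get(key, set())


# Renaming or retagging changes how the file is found, not who may read it.
files[1]["name"] = "taxes-old.pdf"
files[1]["tags"].add("archive")
print(can_read("bob", 1), can_read("bob", 2))
```

However many names, tags, or locations end up pointing at a file, the check only ever looks at the key.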



