This demonstrates the dangers of loose path resolution rules. Traditionally, con...

wglb · on Sept 10, 2014

We should be passing around string lists, e.g. ["foo", "bar", "baz"] instead of "foo/bar/baz".

But that doesn't in and of itself solve the problem, because "foo/bar//baz" would map to ["foo" "bar" "" "baz"] without any additional convention.

This is actually not that unusual. This site does not treat two consecutive slashes as a single slash. There are likely other implementation differences.

Certainly in POSIX consecutive slashes count as one for file paths, but URL query strings are not file paths.
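
To make the empty-segment point concrete, here is a minimal Python sketch. The helper names `split_segments` and `collapse_segments` are hypothetical, made up for illustration; only `str.split` and `posixpath.normpath` are standard library calls.

```python
import posixpath

def split_segments(path):
    """Split on "/" literally, keeping any empty segments."""
    return path.split("/")

def collapse_segments(path):
    """Split on "/" and drop empty segments, i.e. treat "//" like "/"."""
    return [seg for seg in path.split("/") if seg]

print(split_segments("foo/bar//baz"))      # ['foo', 'bar', '', 'baz']
print(collapse_segments("foo/bar//baz"))   # ['foo', 'bar', 'baz']

# POSIX-style normalization of a relative file path collapses the double slash.
print(posixpath.normpath("foo/bar//baz"))  # foo/bar/baz
```

Which of the two list forms is "right" depends on the convention the consumer applies, which is the point being made above.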

daveloyall · on Sept 11, 2014

... "foo/bar//baz" would map to ["foo" "bar" "" "baz"] ...

No, I think it'd be more like proto://host/thing?foo&bar&baz (put an =1 on each of those if you like).

Yeah, I'm employing a convention, but so too is the concept of a list of strings that the commenter invoked.
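
As a rough sketch of that query-string convention in Python: the helper `query_keys` is a hypothetical name, and it assumes the bare-key form `foo&bar&baz` should be read as a list of keys. `urllib.parse.parse_qsl` drops bare keys unless `keep_blank_values=True` is passed.

```python
from urllib.parse import parse_qsl

def query_keys(query):
    """Return the keys of a query string in order, keeping bare keys."""
    # keep_blank_values=True keeps keys such as "foo" that have no "=value".
    return [key for key, _ in parse_qsl(query, keep_blank_values=True)]

print(query_keys("foo&bar&baz"))        # ['foo', 'bar', 'baz']
print(query_keys("foo=1&bar=1&baz=1"))  # ['foo', 'bar', 'baz']
```

Either spelling yields the same list here, but only because this sketch chooses to ignore the values; that choice is itself a convention.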

rlpb · on Sept 11, 2014

Does the HTTP standard or robots.txt specification mandate the collapse of consecutive slashes, though? I agree that it is common, but if it is a server-side implementation detail, then a correct implementation of robots.txt should not collapse them, as they might mean different things to a particular server.

kentonv · on Sept 11, 2014

I agree. If there's a bug here, it's in the server which collapses slashes seen in request paths, not in the indexer's interpretation of robots.txt.
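
A small Python sketch of the mismatch being discussed, under loud assumptions: `is_disallowed` and `server_collapses` are hypothetical helpers, the matcher does plain prefix comparison only (no wildcards, no Allow rules, no user-agent groups), and the rule `/private/` is an invented example.

```python
import re

def is_disallowed(path, disallow_prefixes):
    """Literal prefix match: the path is compared as-is, slashes are not collapsed."""
    return any(path.startswith(prefix) for prefix in disallow_prefixes if prefix)

def server_collapses(path):
    """The path a slash-collapsing server would effectively resolve."""
    return re.sub(r"/{2,}", "/", path)

rules = ["/private/"]  # hypothetical Disallow rule

print(is_disallowed("/private/page", rules))                     # True
print(is_disallowed("//private/page", rules))                    # False
print(is_disallowed(server_collapses("//private/page"), rules))  # True
```

The literal matcher lets `//private/page` through, while a server that collapses slashes would serve the same content as `/private/page`; the two sides simply disagree on what the path means.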