
Here's a real-world example of using C# to create a set of file paths, aiming to eliminate duplicates. I'd've written this in Java as:

    Set<File> s = new HashSet<File>();

    s.add( new File( "c:/temp" ) );
    s.add( new File( "c:/temp" ) );

    System.out.println( s.size() );

This prints 1.

The C# code:

    ISet<FileInfo> s = new HashSet<FileInfo>();

    s.Add( new FileInfo( "c:/temp" ) );
    s.Add( new FileInfo( "c:/temp" ) );

    Console.WriteLine( s.Count );

This prints 2.

They are line-for-line identical. IMO, Java prints the correct result. C# may have reasons for this behaviour, but when you're looking at the code, it's not obvious that two FileInfo objects for the same path are not equal.

When I wrote KeenWrite[0], I chose Java because of all the reasons you mentioned, plus JavaFX. JavaFX is one of the richest cross-platform GUI libraries available for desktop applications, although it does have quirks.

[0]: https://github.com/DaveJarvis/keenwrite




I don't understand the point you're trying to make with your example. Is the concern here that in C# `FileInfo` does not override `Equals` and because this is inconsistent with other standard lib objects this is a trap for developers? If that's the case, well, Java has its own share of weird and inconsistent behaviour that developers simply need to be aware of.

---

On a tangent, is there a reason why FileInfo does not override `Equals`?


FileInfo is a stateful reference type, so overriding Equals and GetHashCode to use value equality would be a Bad Idea™.

E.g.:

    var f = new FileInfo("image.jpg");
    var set = new HashSet<FileInfo> { f };
    f.MoveTo("image2.jpg");
    Console.WriteLine(set.Contains(f)); // Would be false (sometimes)
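
The same hazard can be sketched in Java. This is only an illustration: `MutablePathKey` is a hypothetical stand-in for a file-info object that uses value equality on its path, and it touches no real files.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a file-info object with value equality on its path.
final class MutablePathKey {
    private String path;

    MutablePathKey(String path) { this.path = path; }

    // Simulates a rename: the field that equals/hashCode depend on changes.
    void moveTo(String newPath) { this.path = newPath; }

    @Override public boolean equals(Object o) {
        return o instanceof MutablePathKey && ((MutablePathKey) o).path.equals(path);
    }

    @Override public int hashCode() { return path.hashCode(); }
}

class MutableKeyDemo {
    // Returns whether the set still finds the key after it was mutated in place.
    static boolean foundAfterMove() {
        MutablePathKey f = new MutablePathKey("a.jpg");
        Set<MutablePathKey> set = new HashSet<>();
        set.add(f);          // stored under the hash of "a.jpg"
        f.moveTo("b.jpg");   // hash changes while the object sits in the set
        return set.contains(f);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMove()); // false
    }
}
```

The object is still in the set, but the lookup uses the new hash and misses it. That is the usual argument for keeping value equality off mutable types.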


Then FileInfo should not be hashable, so a Set of it could not even be declared.


But the current implementation using reference equality is perfectly valid, and often quite useful.


https://en.wikipedia.org/wiki/Principle_of_least_astonishmen...

Here's another:

    string path = "/tmp/filename.txt";
    FileInfo f = new FileInfo( path );

    File.CreateText( path ).Dispose();
    Assert.True( f.Exists );

    f.Delete();
    Assert.False( f.Exists );

Astonishingly, the test fails because FileInfo caches the result of Exists until Refresh() is called. ¯\_(ツ)_/¯

My expectation is that if I put two things that are identical---for all intents and purposes---into a set, only one copy is added to the set. Likewise, upon deleting a file, I expect that testing for its existence always returns false. Both of these expectations are based on my mental models for math and file systems.

I haven't tested the behaviour of caching file states in Java, but I suspect it will return false if the file is deleted. Point being, in my limited time with C#, I've found it to violate the principle of least astonishment in ways that Java does not.
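
A rough way to check the Java side, as a sketch only: as far as I know, `java.io.File.exists()` asks the file system on every call rather than caching. The class and method names here are made up for the example, and it writes one temp file.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

class ExistsDemo {
    // Creates a temp file, deletes it through the same File object,
    // and reports { exists before, delete succeeded, exists after }.
    static boolean[] existsBeforeAndAfterDelete() {
        try {
            File f = Files.createTempFile("exists-demo", ".txt").toFile();
            boolean before = f.exists();
            boolean deleted = f.delete();
            boolean after = f.exists();
            return new boolean[] { before, deleted, after };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = existsBeforeAndAfterDelete();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```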


C# FileInfo uses reference equality, because that's the default for objects and they didn't override it. Why? I suppose they thought of it as a helper to get information about a file (or open it), rather than being a file handle itself. Or perhaps it wasn't clear what value equality meant for FileInfo. If two FileInfo objects point to the same file, but use different paths (relative paths, case insensitivity, hard links, etc.), should they compare as equal? Java's File just compares the two path strings, but that's not what I'd want it to do. (I wouldn't want reference equality either. I'm not a big fan of reference equality, personally, and C# has gotten more value-equality oriented over time.)
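
To make the "different paths, same file" case concrete, here is a small Java sketch. The relative path `temp/./a.txt` is invented for the example and nothing is read from disk; it only compares how `java.io.File` and a normalized `java.nio.file.Path` treat a redundant `.` segment.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

class PathEqualityDemo {
    // File.equals compares the pathname strings, so the "." segment matters.
    static boolean fileEquals() {
        return new File("temp/./a.txt").equals(new File("temp/a.txt"));
    }

    // Path.normalize() removes the redundant "." before comparing.
    static boolean normalizedPathEquals() {
        Path p = Paths.get("temp/./a.txt").normalize();
        return p.equals(Paths.get("temp/a.txt"));
    }

    public static void main(String[] args) {
        System.out.println(fileEquals());           // false
        System.out.println(normalizedPathEquals()); // true
    }
}
```

Neither comparison says anything about hard links or symlinks, which is part of why "equal files" is hard to define.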

Doesn't Java also have things that use reference equality? Arrays?
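
A quick sketch of the array case, with names made up for the example: Java arrays inherit `equals` and `hashCode` from `Object`, so a `HashSet` sees two arrays with the same contents as different elements, while `List` compares by contents.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ArrayEqualityDemo {
    // Arrays use reference equality, so both arrays are kept.
    static int arraySetSize() {
        Set<int[]> s = new HashSet<>();
        s.add(new int[] { 1, 2, 3 });
        s.add(new int[] { 1, 2, 3 });
        return s.size();
    }

    // Lists use value equality, so the duplicate is dropped.
    static int listSetSize() {
        Set<List<Integer>> s = new HashSet<>();
        s.add(Arrays.asList(1, 2, 3));
        s.add(Arrays.asList(1, 2, 3));
        return s.size();
    }

    public static void main(String[] args) {
        System.out.println(arraySetSize()); // 2
        System.out.println(listSetSize());  // 1
    }
}
```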


>Doesn't Java also have things that use reference equality? Arrays?

Yes. There are inconsistencies all over the place. In fact, OP's example of java.io.File is ironic, because that class had a number of issues and was somewhat replaced by java.nio.file.Path [1].

Even outside of that, file equality in general is fraught with a lot of subtlety. For example, should java.io.File equality be case sensitive or insensitive?

[1]: https://docs.oracle.com/javase/tutorial/essential/io/legacy....


Perfectly intuitive, just like how `"foo" + "bar" == "foobar"` is only sometimes true.
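
For anyone who hasn't hit this, a minimal Java sketch (method names are made up for the example): `==` on strings compares references, and only compile-time constant expressions are guaranteed to share the interned literal.

```java
class StringIdentityDemo {
    // Both operands are literals, so the compiler folds this into the
    // single interned constant "foobar".
    static boolean constantConcat() {
        return ("foo" + "bar") == "foobar";
    }

    // The prefix is only known at run time, so concatenation builds a new String.
    static boolean runtimeConcat(String prefix) {
        return (prefix + "bar") == "foobar";
    }

    // equals() compares contents and is what you normally want.
    static boolean runtimeConcatEquals(String prefix) {
        return (prefix + "bar").equals("foobar");
    }

    public static void main(String[] args) {
        System.out.println(constantConcat());           // true
        System.out.println(runtimeConcat("foo"));       // false
        System.out.println(runtimeConcatEquals("foo")); // true
    }
}
```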


I aimed to eliminate duplicates of URLs using that approach in Java. Do they resolve to the same IP? Same URL.


Yeah, URL is from the first release of Java and should be avoided. You want to use URI instead.
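
A sketch of deduplicating with `URI` instead, assuming plain textual identity is what you want: `URI.equals` compares the parsed components and does no name lookup. The `distinctCount` helper and the example.com addresses are made up for illustration.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

class UriDedupDemo {
    // Counts distinct addresses using URI's component-wise equality.
    static int distinctCount(String... addresses) {
        Set<URI> seen = new HashSet<>();
        for (String address : addresses) {
            seen.add(URI.create(address));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount("http://example.com/a", "http://example.com/a")); // 1
        System.out.println(distinctCount("http://example.com/a", "http://example.org/a")); // 2
    }
}
```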



