
SMB has an extension mechanism, and SMB 1 has had support for Unix extensions for over 15 years - I was the author of the original Unix extensions spec. You can get full Unix semantics using them (links etc).

The predominant form of extension is an "info level". Somewhat analogous to a data structure like that returned from stat, the numeric info level controls what structure is returned (or supplied). Microsoft had a tendency to add new info levels that correspond to whatever the in-kernel data structures were in a particular release rather than longer term good design.
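
To make the info level idea concrete, here is a minimal sketch in Python. The level numbers, field names and layouts are hypothetical and are not taken from the SMB specification; the only point is that a numeric level selects which structure goes over the wire.

```python
import struct

# Hypothetical info levels (NOT real SMB level numbers or layouts):
# each level maps to a struct format plus the fields it carries.
INFO_LEVELS = {
    1: ("<Q", ("size",)),                    # size only
    2: ("<QQI", ("size", "mtime", "mode")),  # size, modification time, mode bits
}

def pack_info(level, attrs):
    """Server side: build the structure selected by the numeric level."""
    fmt, fields = INFO_LEVELS[level]
    return struct.pack(fmt, *(attrs[name] for name in fields))

def unpack_info(level, payload):
    """Client side: decode the payload using the same numeric level."""
    fmt, fields = INFO_LEVELS[level]
    return dict(zip(fields, struct.unpack(fmt, payload)))
```

Adding a new level in this sketch is just another table entry, which is roughly why it is tempting to mirror whatever structure happens to exist in the kernel at the time.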

The general chattiness comes from their terrible clients like Windows Explorer (akin to Finder for Mac folk). I once did a test opening a zip file using Explorer. If you hand-crafted the requests it would have 5 of them - open the file, get the size, read the zip directory from the end of the file, close it. Windows XP sent 1,500 requests and waited synchronously for each one to finish. Windows Vista sent 3,000 but the majority were asynchronous so the total elapsed time was similar.
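
A rough sketch of the hand-crafted sequence, in Python against an already-open local file object rather than SMB. The function name `list_zip_names` and the 64KB tail size are assumptions for illustration; the record layouts follow the standard zip format (end-of-central-directory record and central directory file headers). Zip64 archives are not handled.

```python
import io
import struct
import zipfile

EOCD = struct.Struct("<4sHHHHIIH")   # end-of-central-directory record, 22 bytes
CDFH = struct.Struct("<4s6H3I5H2I")  # central directory file header, 46 bytes

def list_zip_names(f, tail_size=64 * 1024):
    """Return (names, ops), where ops records each I/O step performed."""
    ops = []
    size = f.seek(0, io.SEEK_END)
    ops.append("get size")
    start = max(0, size - tail_size)
    f.seek(start)
    tail = f.read(size - start)
    ops.append("read tail")
    pos = tail.rfind(b"PK\x05\x06")  # EOCD signature, searched from the end
    if pos < 0:
        raise ValueError("end-of-central-directory record not found")
    _, _, _, _, total, cd_size, cd_offset, _ = EOCD.unpack_from(tail, pos)
    if cd_offset >= start:
        # The directory is already inside the tail we read.
        cd = tail[cd_offset - start:cd_offset - start + cd_size]
    else:
        f.seek(cd_offset)
        cd = f.read(cd_size)
        ops.append("read directory")
    names, p = [], 0
    for _ in range(total):
        fields = CDFH.unpack_from(cd, p)
        name_len, extra_len, comment_len = fields[10], fields[11], fields[12]
        p += CDFH.size
        names.append(cd[p:p + name_len].decode("utf-8"))
        p += name_len + extra_len + comment_len
    return names, ops

# Usage: build a small archive in memory and list it.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("a.txt", "hello")
    z.writestr("dir/b.txt", "world" * 100)
names, ops = list_zip_names(buf)
```

Adding the open and the close around this gives a handful of operations in total, which is the order of magnitude being compared against 1,500 and 3,000.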

I worked on WAN accelerators for a while where you can cache, read ahead and write behind, in order to provide LAN performance despite going over WAN links. In one example a 75KB Word memo was opened over a simulated link between Indonesia and California. It took over two minutes - while instantaneous with a WAN accelerator. The I/O block size with SMB is 64KB so they could have got the entire file in two reads, but didn't.
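
The arithmetic can be sketched as follows. The 200 ms round-trip time is purely an assumption for illustration (the simulated link's latency was not stated), so the implied round-trip count is only indicative.

```python
import math

def reads_needed(file_kb, block_kb=64):
    """Number of block-sized reads needed to fetch the whole file."""
    return math.ceil(file_kb / block_kb)

def sequential_round_trips(elapsed_ms, rtt_ms):
    """How many back-to-back round trips fit into the elapsed time."""
    return elapsed_ms // rtt_ms

ASSUMED_RTT_MS = 200  # hypothetical latency, not a measured figure

memo_reads = reads_needed(75)                                    # 75KB file, 64KB blocks
implied_trips = sequential_round_trips(120_000, ASSUMED_RTT_MS)  # two minutes elapsed
```

Under that assumed latency, two minutes corresponds to hundreds of sequential round trips, against the two reads the data itself requires.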

If anyone is curious about what it was like writing a SMB server in the second half of the nineties I wrote about it at http://www.rogerbinns.com/visionfs.html




Do you know the cause of the 3k requests which Vista made? Do you have a sane theory why these were occurring? Also, do you have any suggestions for better clients to use?


> Do you know the cause of the 3k requests which Vista made? Do you have a sane theory why these were occurring?

Backwards compatibility and layers of indirection.

Microsoft has always made great efforts for backwards compatibility - Raymond Chen's blog is a good source of stories. Quite simply if you upgraded Windows and apps stopped working then you'd blame Windows. Of course it is almost always the apps relying on undocumented behaviour, ignoring documentation, relying on implementation artifacts etc. This means a lot of code to detect and work around problems in other components. For a networked filesystem client the simplest way is sending lots of requests and picking results of interest based on what comes back. Networked filesystem servers also work around client problems in various ways - eg they may return smaller block sizes than the client requested because it is known to have occasional problems. All of this builds up layers and layers of workarounds, workarounds to workarounds, having to test against OS/2 etc. SMB2 was an attempt to wipe the slate clean (no more OS/2!) but of course the crud starts building up again.

Explorer isn't a program that displays files and directories despite appearances. There are layers and layers of abstractions, parts provided by COM etc. The code that knows it wants to display the listing of a zip file is many layers away from the code that generates network requests. It is always easier to write code that does more than strictly needed than the absolute minimum necessary.


Isn't Riverbed's entire company[1] founded on Microsoft protocol inefficiency?

[1] http://www.riverbed.com/


Time to air some dirty laundry. I worked for one of their competitors - Riverbed was set up after we were successful with the intention of beating us. (They eventually did mainly because we were acquired by a big company who essentially threw away the $300m they spent on us.)

But Riverbed's SMB implementation was done by people who didn't understand it, and who had a dangerous attitude. Essentially a WAN optimizer is looking at commands and responses going by and doing a beneficial man in the middle attack based on that data. One technical issue is to decide how you handle the unknown - eg a client or server speaking a dialect you haven't tested, or a command you haven't seen before/developed support for. Our attitude was always that it invalidated any caches, and worst case would disable acceleration on that connection. Riverbed just let it fly by.
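
The difference between the two attitudes can be sketched as a toy caching proxy in Python. Everything here is hypothetical: the command names (`READ`, `WRITE`, `ZERO`), the `FakeServer` and the `CachingProxy` class are stand-ins for illustration, not either product's design.

```python
class FakeServer:
    """Stand-in for the real file server; holds whole files as bytes."""
    def __init__(self):
        self.files = {}

    def execute(self, command, name, data=None):
        if command == "READ":
            return self.files.get(name, b"")
        if command == "WRITE":
            self.files[name] = data
            return None
        if command == "ZERO":  # a command the proxy below has never heard of
            self.files[name] = bytes(len(self.files.get(name, b"")))
            return None
        raise ValueError("unsupported command: " + command)

class CachingProxy:
    KNOWN_COMMANDS = {"READ", "WRITE"}  # hypothetical set of understood commands

    def __init__(self, server, conservative=True):
        self.server = server
        self.conservative = conservative
        self.cache = {}

    def handle(self, command, name, data=None):
        if command not in self.KNOWN_COMMANDS:
            if self.conservative:
                self.cache.clear()  # unknown command: assume anything may have changed
            return self.server.execute(command, name, data)  # always pass it through
        if command == "READ":
            if name not in self.cache:
                self.cache[name] = self.server.execute("READ", name)
            return self.cache[name]
        self.cache[name] = data  # WRITE: keep the cache in step with the server
        return self.server.execute("WRITE", name, data)

def read_after_unknown(conservative):
    proxy = CachingProxy(FakeServer(), conservative)
    proxy.handle("WRITE", "f", b"abc")
    proxy.handle("READ", "f")
    proxy.handle("ZERO", "f")  # unrecognised by the proxy
    return proxy.handle("READ", "f")
```

With `conservative=True` the final read reflects what the server actually holds; with `conservative=False` the proxy hands back the stale cached bytes. A fuller version would also disable acceleration for the connection entirely on an untested dialect.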

An example of how that breaks things is that there is something similar to an ioctl to set ranges of a file to be zeroed out. Riverbed didn't know about that, and would keep returning the old cached contents. Similarly they didn't know about alternate data streams, and especially how they are named which breaks a naive filename caching implementation. At one point I sat down and came up with 5 separate demonstrations of how Riverbed corrupt data (ie 5 different areas of the protocol they messed up). The first one got published and Riverbed threatened to sue, as there was some Oracle inspired clause in their legal agreements! Our lawyers were chickens and that was the last of it.

My own view is that customer data is sacrosanct and I made sure we always did the right thing. They played fast and loose. However most people would blame Microsoft if there are issues rather than realising it was Riverbed's attitude causing corruption.

Riverbed did many other things right. They didn't get acquired like most in the industry, so they didn't have to deal with being squelched by an acquirer. Their marketing focussed a lot on the low end - when people already have two devices they are likely to buy more of the same (sunk cost fallacy). And they did TCP only (we did IP and TCP). TCP only makes it far easier to configure, load balance and do auto-discovery.


> Our lawyers were chickens and that was the last of it.

Should've leaked it. Data corruption is a bitch of a problem for anyone hit by it.


If it was up to me I would have publicised the threat to sue - Riverbed's correct reaction should have been to acknowledge the issue and fix it, although they didn't know I had a whole bunch more lined up and only stopped because I got bored.


Thank you for taking the time to write these comments. I've found them quite informative.



