Hacker News
65535 interfaces ought to be enough for anybody (aakinshin.net)
131 points by kiyanwang on Feb 14, 2017 | 79 comments



Android is worse: it limits the number of methods to 65535 (but has hacks to work around it) https://developer.android.com/studio/build/multidex.html


A coworker mentioned that multidexing also takes ages and uses huge amounts of memory.


Why do people always talk about multidexing like it's such a nightmare?

https://developer.android.com/studio/build/multidex.html

It's really not that bad. It only gets a little squirrely if you're trying to support super old Android versions.


Our app's build time ballooned by 2-3X or more when we went to multidex. We've had some luck reducing it, but we'd still like to axe multidex.

I definitely consider a 3X build time increase (not 30 seconds to 90 seconds, but 1 or 2 min to 5-8) to be a significant problem, bordering on nightmare... especially since we just included a few more libraries that put us past the 65k mark.

Is that what other people have seen?


How is your dev pipeline set up such that an 8 minute release build becomes such a problem?


I guess I'm thinking of builds on my machine taking that long, not even release builds.


Surely that's just a release build, though...? How would this bottleneck productivity for a team?


When you deploy release-like builds to your QA team (as you should), a tripled build time is a bit problematic if you're trying to iterate quickly.


Who needs to "iterate" so quickly that several minutes added to your build time becomes a huge issue?


Me? I work on embedded Android installations that have over a thousand combined years of running with 0 software crashes. Once I get to a certain point in developing an app for our installations, I start profiling for the slightest memory leaks by watching changes in memory usage profiles for different user flows.

And I can only do that with release builds if I want useful numbers. We've even split up apps over build times because of multidex (for example, most of our AWS-centric code lives in a separate app that exposes specific functions via an AIDL interface, because the SDK was bringing in too many methods).


Are there any legitimate use cases for that many methods?


It sounds like a ridiculous number of methods until you realize just importing Guava brings in 20k methods, using half the Amazon SDKs would probably bring another 20k-40k, Apache's main utils bring about 10-20k, etc.


It means you can't pull in Scala. Well, you can, but pull in enough bloated third-party Java libraries and you might hit that limit pretty quickly. Google broke Google Play Services out into a large number of small dependencies once they were past the 15k method mark. Mind you, certain compilation settings will totally strip out unused methods, but it's still annoying.


For example, the Amazon SDK jar alone has 40,000 methods or something insane. You can hit the limit if you're using a few large libraries.

That said, you can use ProGuard to strip your unused methods, and it's usually fine even for large applications.


Java's crippled generics, lack of tuple support, and lack of default parameters encourage interface bloat in utility libraries.

Given how bloody good Java IDEs are (by which I mean: how good IntelliJ is), the cognitive cost of this interface bloat is very low... until you start developing for Android.


Wouldn't Java's crippled generics actually help reduce the method count since due to type erasure they're all the same method at the VM level anyway?


On the one hand yes; on the other hand, Java also generates bridge methods when you implement a generic interface. A MyType implementing Comparable<MyType> will contain a compareTo(MyType) method and a generated compareTo(Object) bridge method, since Comparable<T>'s type parameter erases to Object.

It is more likely that generated classes, countless getters and setters, and a tendency toward small methods have a higher impact.
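The bridge method is easy to observe via reflection. A minimal sketch (class names here are illustrative, not from the article):

```java
import java.lang.reflect.Method;

// Implementing a generic interface produces a synthetic bridge method:
// Comparable<T>'s compareTo(T) erases to compareTo(Object), so the
// compiler emits compareTo(Object) that delegates to compareTo(MyType).
class MyType implements Comparable<MyType> {
    public int compareTo(MyType other) { return 0; }
}

public class BridgeDemo {
    public static void main(String[] args) {
        for (Method m : MyType.class.getDeclaredMethods()) {
            // Prints both compareTo(MyType) and the synthetic bridge,
            // flagged via Method.isBridge()
            System.out.println(m + (m.isBridge() ? "  [bridge]" : ""));
        }
    }
}
```

So each generic interface implementation really does cost an extra method toward the 65535 limit.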


For reference types probably yes, but Java also often needs extra methods or overloads for handling primitive types, which would get unnecessarily boxed when used in generic methods.


IIRC the Facebook app had many more than that at one point, and had to use multi-dexing. Not sure if that's still the case.

I've also heard that Scala apps on Android can sometimes hit the limit, because pulling in the Scala stdlib greatly bloats the method count.


Wasn't that the initial reason behind splitting off Messenger into a separate app?


I don't know, but I would doubt it. Making such a huge product decision based on an engineering limitation (that has a workaround) seems like a poor idea.


Because of the "single responsibility principle", it's better if classes and methods do one thing. That usually leads to lots of small classes and small methods.


Ad SDKs have tons of methods.


Also:

65535 tcp-ports ought to be enough for anybody [1]

65534 hardlinks ought to be enough for anybody [2]

[1] http://stackoverflow.com/questions/113224/what-is-the-larges...

[2] http://unix.stackexchange.com/questions/5629/is-there-a-limi...


I find it extremely awkward that we actually use numbers for ports. Port 80 is typically used for HTTP, but there's nothing preventing another application from using port 80. Why not call the port "HTTP" instead? Better yet, why not give it an integer range of ports to go along with the naming, e.g. MyApp[1], MyApp[2].

If you want to hide a port from being probed then give it a GUID as a name. No more port scans.


At the time TCP/UDP were invented, keeping packet size small was very important. A string or GUID as addressing information would have been out of the question.

(If there wasn't a need to distinguish multiple clients on the same machine, we might have only had 255 ports!)

A standard port for HTTP servers is needed, as most HTTP clients don't support DNS SRV records.

That said, in the IPv6 world there's no technical reason you can't just let every service bind a different IP address.


> in the IPv6 world there's no technical reason you can't just let every service bind a different IP address.

I'm not a networking expert, but is it true that you can assign multiple IPs to a single device using IPv6? Sounds like it would end up very messy.


You can do that, provided that the network routes multiple IPs (or a whole prefix) to the device. It tends to be pretty messy keeping track of all the addresses in the OS configuration, but that's an opportunity for OSes to get better.


> It tends to be pretty messy keeping track of all the addresses in the OS configuration, but that's an opportunity for OSes to get better.

Seems like an odd thing to do, create bizarre (unnecessary?) complexities simply as an opportunity for OSes to get better.

Are there any advantages to doing it that way?


Packet size is important. If your ports are strings, every packet gets that much bigger. Nobody wants slow internet for programming convenience.


It's not even convenient. Strings are one of the biggest scourges of any programming language out there. Giant mess in basically every language until Java got it somewhat right, and even then don't forget your equals...

C# (.NET) could be the forerunner if only they hadn't gone with UTF-16, but then it's only with hindsight that we can point to UTF-8.


Strings are only a problem because they are being treated as text. In this case, they would simply be treated as binary identifiers, so there is no problem. The raw bytes would do just as well for this kind of thing.


Only if they are fixed length. If you used variable length strings in packet headers things would get complicated.

But if you're fine with fixed-length strings, you could just as well use all those bits to represent a number instead.


The name could be used for connection establishment, then a random port is assigned, just like the client port.


Portmap[1] lets you use names instead of port numbers. A server just grabs an arbitrary port number and then registers itself with the portmap server. When you want to connect from a client, you first ask the portmap server for the port number.

[1] https://en.wikipedia.org/wiki/Portmap


As an alternative, predating Portmap by around 700 RFCs: tcpmux[0]. Everything connects to port 1 and asks for services by name.

[0]: https://tools.ietf.org/html/rfc1078


I have an experimental protocol that uses tcpmux. Works very well for that purpose.

However, it sort of requires your service to be started by inetd. It doesn't work for services that want to manage their own listen queues.

It also complicates firewalls. It makes it much harder to have rules that vary per service.


Easy to do with DNS SRV records. Just add "_http._tcp.hostname SRV priority weight port target" and you can use any available port you like.

Of course, current TCP has 16-bit ports hard coded. But IPv6 has almost unlimited addresses.

For some reason, SRV records are not in widespread use.


> For some reason, SRV records are not in widespread use.

No legacy tooling supports it. Poor support in the products already out there. Huge numbers of firewalls are hard-coded to 80/443, so alternate ports would be blocked.

When you make a poor choice on the internet, it lasts forever.


> Why not call the port "HTTP" instead

This is what /etc/services is for:

telnet localhost http ...

But unfortunately, very few applications resolve port names to port numbers the way they resolve hostnames to IP addresses.
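Java's standard library has no getservbyname equivalent, but a rough sketch of what the /etc/services lookup looks like (assuming a Unix-like machine where that file exists; the class name is mine):

```java
import java.nio.file.*;
import java.util.stream.*;

// Resolve a service name like "http" to its port by scanning
// /etc/services, whose lines look like: "http  80/tcp  www  # comment"
public class ServicePort {
    static int lookup(String name, String proto) throws Exception {
        try (Stream<String> lines = Files.lines(Paths.get("/etc/services"))) {
            return lines.map(l -> l.replaceAll("#.*", "").trim())
                    .filter(l -> !l.isEmpty())
                    .map(l -> l.split("\\s+"))
                    .filter(f -> f.length >= 2 && f[0].equals(name)
                              && f[1].endsWith("/" + proto))
                    .map(f -> Integer.parseInt(f[1].split("/")[0]))
                    .findFirst().orElse(-1);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(lookup("http", "tcp")); // typically 80
    }
}
```

C's getservbyname(3) and Python's socket.getservbyname do essentially this for you, which is exactly the "port name to port number" resolution the parent is talking about.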


More practically, it's very nearly become a moot point due to firewalls which block all traffic except to a very narrow set of ports. Opening holes on enterprise firewalls is a gargantuan battle. Which is why we've seen what are effectively new services implemented as port 80 features: Twitter, Reddit, Facebook, Gmail, Dropbox just off the top of my head (irc, Usenet, finger, email, ftp, arguably, in their basic implementations).

See Meredith L. Patterson's "On Port 80" https://medium.com/@maradydd/on-port-80-d8d6d3443d9a


Is it possible to file an RFC to IETF for these kinds of solutions? I see lot of great ideas but never see anybody proposing them anywhere, why?


This particular idea is bad.


IIRC, AppleTalk used service names, not ports.


Which is why when Apple moved to TCP/IP, multicast DNS (Bonjour, zeroconf) was such a crucial part, because it let them implement a similar model on top of port numbers.


The hard links issue often reared its head on Linux, because a sub-directory has a hard-link to its parent directory called "..". So you could only mkdir 32767 directories in any given directory, which hits you fairly fast if you try to do something like sorting numbered files into "000000/00.txt" etc
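The ".." entries show up directly in a directory's link count, which you can watch grow as you add subdirectories. A small sketch (exact counts are filesystem-dependent; e.g. btrfs reports 1 for directories):

```java
import java.nio.file.*;

public class NlinkDemo {
    public static void main(String[] args) throws Exception {
        Path d = Files.createTempDirectory("nlink-demo");
        // On traditional Unix filesystems a directory starts with 2 links
        // (its name in the parent, plus its own "."), and gains one more
        // for every subdirectory's ".." entry pointing back at it.
        System.out.println(Files.getAttribute(d, "unix:nlink"));
        Files.createDirectory(d.resolve("sub"));
        System.out.println(Files.getAttribute(d, "unix:nlink"));
    }
}
```

When the link count is stored in a 16-bit field, those ".." back-links are what cap the number of subdirectories.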



After yesterday's posting of the list of incorrect assumptions programmers make about the world, I put it into my daily web routine to visit the page and read one list per day till done.


Do you have a link to this? I missed it and would like to read it.

EDIT: Think I found it? https://news.ycombinator.com/item?id=13640409



File this under 'premature optimization bites after 12 years'.


What makes you say it is "premature"? Isn't it possible that they hit that limit because the project is a monolithic monster? Or because it uses libraries that waste those resources?


I'd argue the opposite. I mean, I think you're referring to malloc when you cite premature optimisation, but this seems more an issue of failing to optimise for the future or generalise beyond the initial needs of the developer.


Is it still "premature optimization" if it bites after 12 years?


Any time you have a 16-bit count, it won't be enough. It's a dangerous size...


You're succumbing to selection bias[1]. You never notice all of the 16-bit counts in all of the software you use that don't overflow. For all we know, there could be a thousand of them for every case where 16 bits is too low.

[1]: https://en.wikipedia.org/wiki/Selection_bias


The same is true of 32 bit numbers. 64 bits might finally be enough. crosses fingers


Well, for counts 32 bits is enough, unless you count something really small, like every individual byte in something.

And for counts 64 bits should be enough, since it's 20 billion billion.


Assuming you managed to accumulate enough items to fill a 64-bit counter, if you got 1 core of your super-fast PC to increment a 64-bit value for each one, it would take over 100 years to count them all. (Assuming you don't just decide to pop -1 in the counter. Presumably you'd want to make sure you haven't miscounted during the accumulation phase.)


Well, for counts 16 bits is enough, unless you count something really small, like every individual byte in something. And for counts 32 bits should be enough, since it's 4 billion.

someone somewhere 20 years ago.


For counts of things you have in memory, 32 bits is usually enough -- even if each thing is just 8 bytes, 2^32 of those objects would require 32 GB.


32GB isn't an exceptional amount of memory these days.


Sure -- but it is an exceptional amount of memory to use for tiny objects like those. If you're working with four billion objects at a time, they're probably more substantial than eight bytes.


Or you spent a lot of effort to get them to 8 bytes or even smaller, to fit as many of them as possible in memory. See use-cases like time-series/analytics databases, point-clouds, simulations with many elements...


You mean 4GB.

And yes, 32-bit programs can't use more than 4GB.


No, I mean 32 GB -- 2^32 x 8 bytes. We're discussing 32-bit integers, not pointers.


16 bits can represent 65536 different values, not 65535.


0 Interfaces
1 Interface
...
65535 Interfaces


I can't tell if you're joking, but when you're counting a quantity, you don't start from 0.


Why not? Is having zero interfaces not distinct from having one interface?


It is. But when you have one interface, that interface can be given the number 0. So you have a total of 1 interface, even though you've only counted up to 0.

Edit: I misunderstood. I'll leave the comment up. But I originally interpreted the story as meaning that each interface needed to be assigned an unsigned 16-bit id, which allows for a total of 65536. That was just inference on my part though. It literally says that more than 65535 are not allowed.
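The distinction this subthread is circling, spelled out:

```java
public class SixteenBits {
    public static void main(String[] args) {
        // 16 bits distinguish 65536 values. If interface ids start at 0,
        // the largest id is 65535, but that still means 65536 interfaces.
        int values = 1 << 16;
        System.out.println(values);     // 65536
        System.out.println(values - 1); // 65535
    }
}
```

Whether the runtime's real limit is 65535 or 65536 depends on whether one value (say 0 or 0xFFFF) is reserved, which the article's claim of "more than 65535 not allowed" leaves open.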


Quantity as in great amount? Either way, it sounds like the worst kind of premature optimization: saving one bit's worth at the cost of ever being able to represent a quantity of zero.


I suspect it is a play on the 0, 1, infinity rule.


This sounds familiar. Mono worked for me until it didn't. And it would be some corner case like timing and sockets, or HTTP connection issues. By that time we had already bought into it, so I was running around opening a huge black box trying to figure out the internals.


Remember how we laughed about visual basic which supports just 65536 different variable names?


How many of these would fit into 640 kB of RAM, anyway?


You need 16 bits per name, so 2^20 bits total. That fits nicely into about 20% of your 640 kB of RAM.


It's almost as if programming languages should support integers, rather than just machine words. Or, put differently: Fixed-Width "Integers" Usually Harmful.


> A short fact about .NET: if you have an interface IFoo<T>, the runtime generates a separate method table per each IFoo<int>, IFoo<double>, IFoo<bool>, and so on.

In Java you have to do the same manually: https://docs.oracle.com/javase/8/docs/api/java/util/stream/S... wow such polymorphism much ad-hoc



