Over many years of reading various essays on "*types*", this is the list of syno...

yarvin9 · on Oct 18, 2015

Urbit actually wound up with three different concepts which all want the word "type," and can't all have it. Worse, none of these quite matches the PL-theory meaning. So none of them wound up with the word.

One easy way to state a type is to state it as the range of a function. The function accepts all values, and produces some subset of all values; ideally, all values in its range are fixpoints, f(x) == f(f(x)). This is kind of nice because you've defined both a constraint and a validation function. We often want to validate untyped data and make it typed.

On the other hand, this range (Urbit's "span") is not the same concept as the function itself (Urbit's "mold"). Moreover, when you send data over the network, you don't want to send the range or even the function; you want to send the name of the function, which hopefully is bound to something compatible on the other side.

This namespace becomes a third level of type (Urbit's "mark"): like a MIME type, if a MIME type was an executable specification. One of the problems with most PL type systems is that they predate ubiquitous networking; so they're not designed to address the problem of describing external data, eg with MIME types, XML DTDs, protobuf schemas or whatever.

astrobe_ · on Oct 18, 2015

> One of the problems with most PL type systems is that they predate ubiquitous networking; so they're not designed to address the problem of describing external data, eg with MIME types, XML DTDs, protobuf schemas or whatever.

The problem existed before "ubiquitous networking" if you realise that networking is just a specific kind of I/O. But I/O from a PL perspective is nothing more than sending or receiving bytes. PLs usually do not impose a format and a meaning on I/O because it would restrict the capabilities of the applications and libraries.

One could argue that PLs should impose such restrictions just like some PLs impose restrictions on memory access ("managed languages"). However, a huge difference between memory and I/O - and especially networking - is that they are unreliable by default: transmission errors, disconnections, lags, etc.

But even if you solve that, your PL can only speak to itself, which quite significantly restricts its practical usefulness. As far as I know, only languages specialized in distributed computing do that.

But for a general purpose language, high level networking is usually not something you want to be carved in stone (in the language's specs). You'd rather want to leave that in libraries, that can be replaced and enhanced.

yarvin9 · on Oct 18, 2015

Not to be all Winnebago Man here, but all the code in Urbit (protocols and language alike) can be replaced and enhanced, except Nock. In fact, the normal way to drive this process is just by merging the new version of the code. It's certainly true, however, that you don't want locked-down components.

Urbit is very happy to be an HTTP client and server. (Alas, HTTP has become the "everything protocol.") It also has its own network protocol, which it prefers for conversations between urbits.

The unreliability of network packet delivery (or more exactly, the unreasonable effectiveness of unreliable networking) is basically the greatest technical idea of the 20th century as far as I'm concerned.

There is actually a significant difference between processing unordered, unreliable packet streams, and ordered, reliable, message streams, with the "universal design." Neat thing about packets: you can drop them. If a packet sends the server into an infinite loop? Or at least, a timer detects an infinite loop? Drop the packet. From the network's perspective, the CPU is simply detecting congestion.

You can refuse to process a message, but that's not a stateless and passive operation; you have to return a failure report. But dropping a bad packet is the most wonderful thing in the world. By definition it doesn't change your server's state. It also doesn't leak information to the sender...

wyager · on Oct 18, 2015

No offense, but this is exactly the sort of nonsense I associate with urbit.

>ideally, all values in its range are fixpoints, f(x) == f(f(x)).

This doesn't make any sense whatsoever. The only function for which this is true is identity, and I assure you it takes more than just the identity function to make a useful program. This isn't even possible unless the function has the type "a -> a".

The only other thing I can see you trying to express here is some butchery of set notation, where by f(x) you actually mean {f(x) | forall x of the correct type}. Even then, what you're saying is still completely nonsensical.

I honestly cannot tell if urbit is some sort of coordinated trolling effort.

tel · on Oct 18, 2015

Identity is not the only function for which that holds. As an immediate example, constant functions have this property as well. More interesting examples exist as well.

Also what was stated was not an equation for any old fixed point but instead idempotence—a specific kind of fixed point. In particular, this describes precisely the shape of a "validation" function so long as it either succeeds on all inputs or has a notion of failure in either its computational context or it's range ("span" I guess).

dllthomas · on Oct 18, 2015

'a "validation" function'

Or "normalization"

yarvin9 · on Oct 18, 2015

Yeah, that's exactly right. I usually say "normalization."

eru · on Oct 18, 2015

> The only function for which this is true is identity, [...]

Any projection suffices.

geocar · on Oct 18, 2015

> The only function for which this is true is identity

If you want to clamp a value to some arbitrary range (a hint to the name), then any function that does so has this property.

    unsigned char f(unsigned long x) { return x & 255; }
    long f(long x) { return x < 0 ? x : -x; }

lmm · on Oct 18, 2015

The return type isn't the same as the input type there. It only looks that way because of C's limitations.

Put it this way: f(f(x)) never makes sense for those definitions of f; it would always indicate a programming error. And so the type system ought to prevent it.

yarvin9 · on Oct 18, 2015

To be slightly clearer, the values here are Urbit "nouns."

A noun is an atom or a cell. An atom is an unsigned integer of any size. A cell is an ordered pair of nouns.

I am fairly confident that many useful sets can be specified as the fixpoint set of a function in this space.