It has distinct .chars, .codes, and .bytes that you can use depending on the use case. And if you try to use .length, it complains and asks you to use one of the other options to clarify your intent.
my \emoji = "\c[FACE PALM]\c[EMOJI MODIFIER FITZPATRICK TYPE-3]\c[ZERO WIDTH JOINER]\c[MALE SIGN]\c[VARIATION SELECTOR-16]";
say emoji; #Will print the character
say emoji.chars; # 1 because it is one character (grapheme)
say emoji.codes; # 5 because five code points
say emoji.encode('UTF8').bytes; # 17 bytes when encoded as UTF-8
say emoji.encode('UTF16').bytes; # 14 bytes when encoded as UTF-16
https://arxiv.org/pdf/2303.12712.pdf This paper discusses (among other things) how a GPT-4 model navigated between rooms in a text adventure game and was able to draw a map of them afterward, literally building a model of the world as it navigated.
I mean, just like you can create a 1-line Python script that claims "I am an AGI" and have that claim be false, you can have ChatGPT tell you it has no explicit world model while it exhibits behaviors that can only really be explained by it having some sort of model of the world inside it.
Fine-tuning is like a PR agent teaching someone what sorts of things not to mention on TV even though they may be true.
That wasn’t the case at all for fusion. For fusion to be viable it needed separate scientific breakthroughs, in particle physics and in lasers, especially to reach the point where the recent demo showed its viability. If you read up on it you’ll see that there were a lot of important steps that led up to the short demo that was achieved. That is my point, though: if you read the article, even past the ellipses, it wasn’t a monetary issue.
This chart shows how much people in the 1970s estimated would need to be invested to have fusion by 1990, how much to have it by the 2000s, and the funding level at which we would "never" get it.
We ended up spending below the "never" amount on research over four decades, so of course fusion never happened, exactly as predicted.
I think the main difference is that no one was interested in investing in fusion back then, while everyone is interested in investing in AGI now.
This paper from late last year shows that LLMs are not "just" stochastic parrots, but they actually build an internal model of the "world" that is not programmed in, just from trying to predict the next token.
Raku seems to be more correct (DWIM) in this regard than all the examples given in the post...
my \emoji = "\c[FACE PALM]\c[EMOJI MODIFIER FITZPATRICK TYPE-3]\c[ZERO WIDTH JOINER]\c[MALE SIGN]\c[VARIATION SELECTOR-16]";
#one character
say emoji.chars; # 1
#Five code points
say emoji.codes; # 5
#If I want to know how many bytes that takes up in various encodings...
say emoji.encode('UTF8').bytes; # 17 bytes
say emoji.encode('UTF16').bytes; # 14 bytes
Edit: Updated to use the names of each code point since HN cannot display the emoji
The whole schtick of GPT-3 is the insight that we do not need to come up with a better algorithm than GPT-2.
If we dramatically increase the number of parameters without changing the architecture/algorithm, its capabilities also dramatically increase, instead of reaching the plateau that some expected.
"To the surprise of most (including myself), this vast increase in size did not run into diminishing or negative returns, as many expected, but the benefits of scale continued to happen as forecasted by OpenAI."
There is no indication that it was "the judiciary's decision" to store it "near a major population center".
That's a story floated by the head of customs to try and shift the blame to the judiciary. But there is no evidence for it, and there is much evidence against it.
Source: The court documents released by journalists Riad Kobeisyi and Dima Sadek.
I think the storage was one of convenience: the ship was at the docks, so store it at the docks. Who would have thought it would take six years? The next problem is the 2,750 tons of material. You're talking around 70 semi loads to carry it all off. That has to be funded by someone, and who would that be?
I mean, at the bare minimum someone must have been willing to take this stuff off the state's hands at least for free, right? A Google search says $500 a ton for it.
Very few people need explosive-grade ammonium nitrate in this quantity, also prefer a source that has no provenance, and would pay some fraction of the 1.3 million list price rather than go through their established supply chain.
Off the top of my head:
Concurrency and parallelism using high-level and low-level APIs. There is no GIL. (A small concurrency sketch follows the typing example below.)
Grammars, which are like regular expressions on steroids, for parsing (admittedly still not optimized for speed). (A toy grammar sketch is also below.)
Gradual typing: you can go from no types at all for short one-liners, to using built-in types, to defining your own complex types for big programs.
subset Positive of Int where { $^number > 0 }; # create a new type called Positive
multi factorial(1) { 1 } #multi subroutines that dispatch on type
multi factorial(Positive \n) { n * factorial(n-1) }
#say factorial(0);   # error: no matching candidate (0 is neither 1 nor Positive)
#say factorial(5.5); # error: no matching candidate (5.5 is not an Int)
say factorial(5); #120
hyper for (1 .. 1000) -> $n { # hyper marks the for loop as safe to run in parallel on all available CPU cores
    say "Factorial of $n is { factorial($n) }"; # results stay exact because Raku Ints are arbitrary precision
}
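As a minimal sketch of the high-level concurrency API mentioned above (my own illustrative example, not from the post): start runs a block on the thread pool and returns a Promise, and await collects the results.

# start runs each block on the thread pool and returns a Promise
my @promises = (1..4).map(-> $n { start { $n * $n } });
# await blocks until every Promise is kept and returns the results in order
say await @promises; # (1 4 9 16)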
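And a toy grammar sketch (again my own example, just to show the shape of the feature): it parses simple key=value lines into a structured match tree.

grammar KeyValue {
    token TOP   { <pair>+ %% \n }     # one or more pairs separated by newlines
    token pair  { <key> '=' <value> }
    token key   { \w+ }
    token value { \S+ }
}
my $parsed = KeyValue.parse("name=Camelia\nlang=Raku");
say $parsed<pair>[0]<value>; # 「Camelia」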