Hacker News | patrickas's comments

That is why I like the way Raku handles it.

It has distinct .chars, .codes and .bytes methods that you can use depending on the use case. And if you try to use .length it complains, asking you to use one of the other options to clarify your intent.

  my \emoji = "\c[FACE PALM]\c[EMOJI MODIFIER FITZPATRICK TYPE-3]\c[ZERO WIDTH JOINER]\c[MALE SIGN]\c[VARIATION SELECTOR-16]";
  say emoji; #Will print the character
  say emoji.chars; # 1 because one character
  say emoji.codes; # 5 because five code points
  say emoji.encode('UTF8').bytes; # 17 bytes when encoded as UTF-8
  say emoji.encode('UTF16').bytes; # 14 bytes when encoded as UTF-16
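
For comparison, here is a quick sketch of the same counts in Python (my addition, not from the thread): the stdlib's len() counts code points and encode() gives bytes, but there is no built-in call for grapheme clusters, which is exactly the ambiguity Raku forces you to spell out.

```python
# Same five code points as the Raku example, via Python's named escapes.
emoji = ("\N{FACE PALM}"
         "\N{EMOJI MODIFIER FITZPATRICK TYPE-3}"
         "\N{ZERO WIDTH JOINER}"
         "\N{MALE SIGN}"
         "\N{VARIATION SELECTOR-16}")

print(len(emoji))                      # 5  -- len() counts code points
print(len(emoji.encode('utf-8')))      # 17 -- bytes when encoded as UTF-8
print(len(emoji.encode('utf-16-le')))  # 14 -- bytes as UTF-16 (no BOM)
# There is no stdlib equivalent of Raku's .chars (grapheme clusters);
# third-party packages such as `grapheme` or `regex` (\X) handle that.
```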


LLMs do have explicit world models that can even be manipulated. There are many recent papers on the subject.


ChatGPT itself tells me it has no explicit world model.


It is not about what the model tells you.

This paper shows an emergent world model in an LLM that was trained on Othello moves: https://ar5iv.labs.arxiv.org/html/2210.13382

https://arxiv.org/pdf/2303.12712.pdf This paper discusses (among other things) how a GPT-4 model navigated between rooms in a text-adventure game and was able to draw a map of them afterward, literally building a model of the world as it navigated.


I mean, just like you can create a 1-line Python script that claims "I am an AGI" and have that claim be false, you can have ChatGPT tell you it has no explicit world model while it exhibits behaviors that can only really be explained by it having some sort of model of the world inside it.

Fine-tuning is like a PR agent teaching someone what sorts of things not to mention on TV even though they may be true.


What are some of these behaviours?


Patrickas responded with some examples: https://news.ycombinator.com/item?id=35966307


Please see my reply to parent.

> "fusion by 1990 instead of 2000..."

Those three dots omit the most important part of the issue: if we spend that much extra money on R&D.

In the fusion case no one was willing to spend the money; in AGI's case, everyone seems willing to spend it.

I personally hope they won't, but that is a crucial point not to be overlooked.


That wasn’t the case at all for fusion. To be viable, it required separate scientific breakthroughs in particle physics and in lasers, especially to reach the point where the recent demo showed its viability. If you read up on it you’ll see that there were a lot of important steps that led up to the short demo that was achieved. That is my point, though: if you read the article, even after the ellipses it wasn’t a monetary issue.


I don't think this contradicts the AGI prediction, though (nor the "fusion by 1990 with a bit of extra investment" prediction, for that matter).

https://external-preview.redd.it/LkKBNe1NW51Wh-8nLSTRdQtTha2...

This chart shows how much people in the 1970s estimated would need to be invested to have fusion by 1990, to have it by the 2000s, and to "never" have it. We ended up spending below the "never" amount on research over four decades, so of course fusion never happened, exactly as predicted.

I think the main difference is that no one was interested in investing in fusion back then, while everyone is interested in investing in AGI now.


In addition to the funding aspect, I think it could be argued that a fusion reactor is actually much harder than people thought it was in 1970.


This paper from late last year shows that LLMs are not "just" stochastic parrots: they actually build an internal model of the "world" that was never programmed in, purely from trying to predict the next token.

https://ar5iv.labs.arxiv.org/html/2210.13382

PS: More research since then has confirmed and strengthened the conclusion.


Your comment reminded me of this article:

Humans Who Are Not Concentrating Are Not General Intelligences

https://www.lesswrong.com/posts/4AHXDwcGab5PhKhHT/humans-who...


Raku seems to be more correct (DWIM) in this regard than all the examples given in the post...

  my \emoji = "\c[FACE PALM]\c[EMOJI MODIFIER FITZPATRICK TYPE-3]\c[ZERO WIDTH JOINER]\c[MALE SIGN]\c[VARIATION SELECTOR-16]";

  #one character
  say emoji.chars; # 1 
  #Five code points
  say emoji.codes; # 5

  #If I want to know how many bytes that takes up in various encodings...
  say emoji.encode('UTF8').bytes; # 17 bytes 
  say emoji.encode('UTF16').bytes; # 14 bytes 

Edit: Updated to use the names of each code point since HN cannot display the emoji


And if you try to say emoji.length, you'll get an error:

No such method 'length' for invocant of type 'Str'. Did you mean any of these: 'codes', 'chars'?

Because as the article points out, the "length" of a string is an ambiguous concept these days.


You can represent it as a sequence of escapes. If Raku handles this the same way as Perl5, it should be:

    $a = "\N{FACE PALM}\N{EMOJI MODIFIER FITZPATRICK TYPE-3}\N{ZERO WIDTH JOINER}\N{MALE SIGN}\N{VARIATION SELECTOR-16}";


now do it in YAML


As far as I understand, in this specific case, yes.

The whole schtick of GPT-3 is the insight that we do not need to come up with a better algorithm than GPT-2: if we dramatically increase the number of parameters without changing the architecture/algorithm, its capabilities dramatically increase instead of reaching a plateau, as some expected.

Edit: Source https://www.gwern.net/newsletter/2020/05#gpt-3

"To the surprise of most (including myself), this vast increase in size did not run into diminishing or negative returns, as many expected, but the benefits of scale continued to happen as forecasted by OpenAI."


There is no indication that it was "the judiciary's decision" to store it "near a major population center".

That's a story floated by the head of customs to try and shift the blame to the judiciary. But there is no evidence for it, and there is much evidence against it.

Source: The court documents released by journalists Riad Kobeisyi and Dima Sadek.


I think the storage choice was one of convenience: the ship was at the docks, so store it at the docks. Who would have thought it would sit there for 6 years? The next problem is the 2,750 tons of material; that is around 70 semi loads to carry it all off. Who would fund that?


I mean, at the bare minimum, someone must have been willing to take this stuff off the state's hands, at least for free, right? A Google search says $500 a ton for it.


Very few people need explosive-grade ammonium nitrate in this quantity, would prefer a source that has no provenance, and would pay some fraction of the $1.3 million list price instead of using their established supply chain.


I’m sure they could have destroyed the fireworks and sold the fertilizer to some farmers if they really wanted to. That would quickly pay for itself.


I still can't wrap my head around this: even if you can't sell it, you can still find ways to get rid of both.


And in fact, the judge ordered the government to either sell it off or put it in safe storage... about 6 years ago.


Off the top of my head: concurrency and parallelism using both high-level and low-level APIs. There is no GIL.

Grammars, which are like regular expressions on steroids, for parsing (admittedly still not optimized for speed).

Gradual typing: you can go from no types at all for short one-liners, to using built-in types, to defining your own complex types for big programs.

  subset Positive of Int where { $^number > 0; } #create a new type called Positive 
  multi factorial(1) { 1 } #multi subroutines that dispatch on type
  multi factorial(Positive \n) { n * factorial(n-1) } 


  #say factorial(0); #Error type mismatch
  #say factorial(5.5); #Error type mismatch
  say factorial(5); #120

  hyper for (1 .. 1000) -> $n { # hyper indicates this loop can run in parallel on all available CPUs
    say "Factorial of $n is { factorial($n) }"; #Gives correct results by automatically upgrading to big int when needed
  }
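
For contrast, here is a rough Python analog of that hyper loop (my sketch, not from the thread). Python's ints are likewise arbitrary precision, so big factorials just work, but the GIL keeps CPU-bound threads from actually running in parallel, which is exactly the Raku advantage mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor
from math import factorial

# Map factorial over 1..10 on a thread pool; results come back in order.
with ThreadPoolExecutor() as pool:
    for n, f in zip(range(1, 11), pool.map(factorial, range(1, 11))):
        print(f"Factorial of {n} is {f}")

# Caveat: unlike Raku's hyper, the GIL serializes CPU-bound threads;
# ProcessPoolExecutor would sidestep that at the cost of pickling overhead.
```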

