I don't mean to be negative but - has anyone at all found a use for any of the 7B models? And are you only using it because you're resource constrained? Or is it a purely academic thing for the moment?
This model is all over the place in the benchmarks; it looks roughly comparable to LLaMA, and I know it's open to use, unlike LLaMA. But I haven't seen a model under 30B consistent enough to actually use for anything myself.
Hopefully the improved dataset will help the next version. I truly want small models to work well, and I'm sure in time they will improve.
EDIT: I mean this isn't even a trick question: "There are two shapes each a different color, one shape is red and one shape is blue. Shape A is red. Shape B must be what color?"
> This logic puzzle is based on a classic riddle that has been passed down for generations. The answer, of course, is that Shape B must be a red shape, as red is the only color that can be used to represent all reds.
> The key to solving the puzzle is to recognize that red can be used to describe color as well as blood, which is another word for shape.
What ????
EDIT 2: GPT3.5-turbo says "there's no definitive color for shape B"
Only GPT4 gets this right: "Since shape A is red, and the other shape is blue, shape B must be blue."
Definitely don't quantize to 4-bit at this param count, if that's what you're doing. The redpajama.cpp repo is a bit crazy for making that the default.
I'd say at this size don't count on accuracy. Think: fancy auto-complete, simple NLP task-doer, or a base for fine-tuning a task-specific model. IME RedPJ-3B was too small to do anything at all other than ramble. The novel contribution of this model is that you can use it at work, and it has a GGML implementation, unlike Falcon and MPT (last I checked) - meaning CPU-based inference on the cloud VM you use at work, maybe.
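To be concrete, here's the kind of "simple NLP task-doer" usage I mean - a minimal sketch with the transformers library. The model id is my assumption; check the RedPajama model card for the exact name:

```python
# Minimal sketch: a small model as a "simple NLP task-doer".
# The model id below is an assumption -- check the model card for the exact name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "togethercomputer/RedPajama-INCITE-7B-Instruct"  # assumed id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # add torch_dtype/device_map as needed

prompt = ('Classify the sentiment of this review as positive or negative:\n'
          '"The battery died after two days."\nSentiment:')
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=5, do_sample=False)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```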
People who run 7B at 4-bit probably do so because they don't have the memory for more. And there are tests showing that quantizing a bigger model is always better than running a smaller model at full precision (e.g. LLaMA 13B 4-bit > 7B fp16).
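The back-of-envelope memory math shows why that trade is available (weights only; real usage adds KV cache and some quantization overhead):

```python
# Rough memory needed for model weights alone, ignoring KV cache and overhead.
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30

print(f"7B  fp16 : {weight_gib(7, 16):.1f} GiB")   # ~13.0 GiB
print(f"13B 4-bit: {weight_gib(13, 4):.1f} GiB")   # ~6.1 GiB
```

So the quantized 13B fits in less than half the memory of the fp16 7B.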
> I don't mean to be negative but - has anyone at all found a use for any of the 7B models?
Productive/generative and logic-based tasks improve with the size of the corpus, but smaller models are useful for reductive tasks where florid prose and accuracy are not a requirement-- e.g. summarize this or classify that. I use dolly-3B for quick-and-dirty work like this since I can run it on spare hardware.
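For example, this is the kind of quick-and-dirty summarization I mean. databricks/dolly-v2-3b is the actual HF id, but the bare prompt format here is just my sketch, not dolly's official instruction template:

```python
# Quick-and-dirty summarization on a small model via the standard
# text-generation pipeline. The prompt format is a simplification,
# not dolly's official instruction template.
from transformers import pipeline

generate = pipeline("text-generation", model="databricks/dolly-v2-3b")
doc = "Q3 revenue was $4.2M, up 12% from Q2. Headcount grew from 40 to 52."
out = generate(f"Summarize in one sentence:\n{doc}\nSummary:",
               max_new_tokens=40, do_sample=False, return_full_text=False)
print(out[0]["generated_text"])
```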
Commercially, you could probably get away with hot-swapping large models for small ones during periods of high demand-- for a chat service, kick every 3rd or 4th exchange to a smaller, cheaper model to reduce load on the larger one. Nobody is perfectly coherent all the time...
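A hedged sketch of that routing idea - every name here (pick_model, every_nth, busy_at) is made up for illustration, not any real API:

```python
# Hypothetical router: send every Nth exchange, or any exchange during a
# load spike, to the cheaper model; everything else goes to the big one.
def pick_model(turn_index: int, queue_depth: int, big_model, small_model,
               every_nth: int = 4, busy_at: int = 100):
    if queue_depth > busy_at or turn_index % every_nth == 0:
        return small_model  # cheap model absorbs spikes and every Nth turn
    return big_model
```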
Huh, interesting - it did appear pretty good at summarization when I tried it. Is it consistent enough to not modify the information or mix up numbers, in your experience?
My main concern for that use case is it making "changes" to what it's summarizing.
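One cheap guard against that, for what it's worth: check that every number in the summary already appears in the source. A minimal sketch:

```python
# Flag summaries that contain numbers not present in the source text.
import re

def numbers(text: str) -> set[str]:
    return set(re.findall(r"\d+(?:\.\d+)?", text))

def summary_numbers_ok(source: str, summary: str) -> bool:
    return numbers(summary) <= numbers(source)

assert summary_numbers_ok("Revenue rose 12% to $4.2M.", "Revenue up 12% to $4.2M.")
assert not summary_numbers_ok("Revenue rose 12%.", "Revenue rose 21%.")
```

It won't catch paraphrase errors, but it's a cheap first filter for the mixed-up-numbers case.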
Guanaco 7B quantized to 5-bit got this right on the first shot: "Based on the information given, we can conclude that Shape B is blue. Since Shape A is red, and the only other option is Blue, it must be Shape B.
The question states that there are two shapes, each a different color, and that one shape is red. This means that the other shape must be blue, since there are only two colors and one of them is already taken."
> Based on the information provided, Shape A is red and Shape B is the other shape. However, it is not explicitly stated what color Shape B is. Therefore, without further information, we cannot determine the color of Shape B.
GPT3.5-turbo almost got it for me - it sort of did, but:
"If Shape A is red and Shape B is a different color, one possible color for Shape B could be blue. However, it's important to note that the color of Shape B could be any color other than red. Without further information or constraints, there is no definitive answer for the color of Shape B."
Not that it matters a ton here; I'm just surprised that 3.5-turbo had issues with it at all.
Write an email to my friends inviting them to come to my home on Friday for a dinner party, bring their own food to share.
Dear friends,
(snip)
I have also organized a variety of fun activities to keep us entertained such as:
* Karaoke
* Movie night
* Board games
* Outdoor games
* Book club
* Painting together
* Relaxing by the fireplace
* Crafts
* Tea party
* Strip poker
* Dance party
Wow, RedPajama, one of these things is not like the others...
(Edit: it's also a good example that this model is less censored than recent GPTs. It doesn't know that this isn't appropriate in this context, but it's not afraid to add it.)
Please don't use link shorteners; they're 404s waiting to happen: https://www.amazon.com/dp/0451474570 ("Llama Llama Red Pajama Board book – Illustrated, May 5, 2015"; 9780451474575)
It's Amazon's own link shortener - that's the link they give out from the share button. It would be really funny if they dropped it. That would make some domain squatter very lucky.
Thanks for posting the full link with description so it's clearer for everybody.
I have no idea how these models struggle on such a simple question.