Hacker News

It’s remarkable we’ve hit a threshold where so much can be done with synthetic data. The reasoning race seems an utterly solvable problem now (thanks mostly to the verifiability of results). I guess the challenge then becomes non-reasoning domains, where qualitative and truly creative results are desired.





It seems like we need an evaluation model for creativity. I'm curious, is there research on this -- for example, can one score a random painting and output how creative/good a given population is likely to find it?

How do you account for the impact of culture/lived experience of the specific population viewing the painting? Intuitively it seems like that would be the biggest factor, rather than the objective attributes of the painting, no?

All art is subjective. Any attempt to "verify" a piece of art would be entirely dependent on cultural and personal sensitivities. Art isn't a math problem with a solution.

But you can dissect it into concepts and see if it is something truly new to the model - if the output contains things which aren’t there in the weights, you have a nice specimen to study and, crucially, a recipe to get a bunch of matrices to output untrained things.

This is like saying: All cooks are equally good, even the most disgusting slop (e.g. water/flour soup) isn't any better than a dish from a cook with several Michelin stars. Of course the latter is better. And if it is better, it is objectively better. Even if 0.001% of people prefer flour soup.

> culture/lived experience of the specific population viewing the painting

Isn't this lived experience baked into LLM language bases? It's certainly very hard to target all possible populations at once. And art doesn't need that, doesn't do that. Only rare marketing sometimes attempts to do that and only in very limited ways, such as a brand name acceptable all over the world.


There are two kinds of creativity at play here. One is mashing together combinations of learned things - it’s kinda like shuffling a deck of cards where basically every shuffle gets you a deck that has never been seen and won’t be seen again, but it’s still the same 52 cards every time. The other kind is going outside of the box and inventing truly new, unseen/untrained concepts. This one is hard, but I don’t think it’s impossible - the <think> slop stirring the learned concepts with a bit of randomness should make progress here.
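To put a number on the deck-of-cards analogy, here is a minimal sketch. The function name `collision_probability` and the shuffle count are illustrative assumptions, not anything from the thread. A 52-card deck has 52! orderings, roughly 8.07e67, so repeats are vanishingly unlikely even though the cards never change.

```python
import math

# Number of distinct orderings of a standard 52-card deck (52!).
orderings = math.factorial(52)

def collision_probability(num_shuffles: int, num_orderings: int = orderings) -> float:
    """Upper bound (union bound over pairs) on any two fair shuffles matching."""
    pairs = num_shuffles * (num_shuffles - 1) // 2
    return pairs / num_orderings

# Hypothetical scale: 10**20 shuffles, far more than have ever been dealt.
print(f"orderings: {orderings:.3e}")
print(f"chance of any repeat: {collision_probability(10**20):.3e}")
```

Every output is "new" in the trivial sense, which is why novelty of the combination alone says little about the second kind of creativity.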

A new "AI challenge" -- can an AI make a hit movie (even if just for Netflix) in each of the Documentary, Action, Thriller, Comedy, and Drama genres? This isn't art like the "Mona Lisa", but more like the ability to make "art" that appeals to some segment of the public. I think if an AI can do that, I'll be pretty impressed.

The prompt: "Create a feature length [Action/Comedy/etc...] film that can borrow elements from existing films, but would generally not be considered a copy of any given film."


You can train a supervised model, taking into account the properties of the rater as well as the artwork, and tease out the factors that make it rated so.
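A rough sketch of what that could look like, under heavy assumptions: the traits (`likes_abstract`, `is_abstract`), the toy ratings, and the helper names `features`, `fit`, and `predict` are all made up for illustration. The point is only that rater-by-artwork interaction terms let one model express "this kind of rater likes this kind of work".

```python
def features(rater, artwork):
    """Bias, rater traits, artwork traits, and all rater-by-artwork products."""
    interactions = [r * a for r in rater for a in artwork]
    return [1.0] + list(rater) + list(artwork) + interactions

def fit(samples, epochs=2000, lr=0.05):
    """Fit a linear model by stochastic gradient descent on squared error."""
    dim = len(features(*samples[0][:2]))
    w = [0.0] * dim
    for _ in range(epochs):
        for rater, artwork, rating in samples:
            x = features(rater, artwork)
            err = sum(wi * xi for wi, xi in zip(w, x)) - rating
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

def predict(w, rater, artwork):
    """Predicted rating for one rater/artwork pair."""
    return sum(wi * xi for wi, xi in zip(w, features(rater, artwork)))

# Made-up data: (rater traits, artwork traits, rating).
# rater = (likes_abstract,), artwork = (is_abstract,); rating is 1 on a match.
samples = [
    ((0,), (0,), 1.0),
    ((1,), (0,), 0.0),
    ((0,), (1,), 0.0),
    ((1,), (1,), 1.0),
]
w = fit(samples)
print(round(predict(w, (1,), (1,)), 2), round(predict(w, (1,), (0,)), 2))
```

Without the interaction term, no linear model fits this toy data, which is the "culture/lived experience" objection in miniature: the rating depends on the pairing, not on the artwork alone.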

You can probably cluster raters and the artwork they rate highly, but probably not in large quantities. The same may hold for raters being willing to tell you why (and how! most love to do that), but again not in very large quantities. There is the added issue that raters' own accounts of why they love or hate something are unlikely to be entirely accurate or self-aware.

You could use a larger corpus, like auction house files and art magazines. But then your ratings are confounded with celebrity - a large ingredient in art prices.


> can one score a random painting

You can get very mechanical in scoring an image, if you want to or if your instructor or audience wants you to. Ask any art student. For example "fits rule of thirds?": yes is a point toward common attraction, no is a point toward unexpected, at the risk of outsider-ness. You can do that for color, composition, and recognizing objects and fitting them to memes, associations, or non-associations. Too many points in "unexpected" amount to meta points in "unpleasant chaos", and so a strong downgrade in common attraction. You can match all this to images in a library (see how copyright or song recognition operates in the music category) and get out of that some kind of familiarity vs edge score (where too much edge goes against common attraction).
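A minimal sketch of that kind of checklist scoring. Everything here is an assumption for illustration: the function names, the normalized focal-point input, and the `chaos_threshold` and `chaos_penalty` values are invented, and finding the focal point in a real image is left out entirely.

```python
import math

def rule_of_thirds_score(focal_x, focal_y):
    """1.0 when the focal point sits on a thirds intersection, 0.0 at a corner.

    Coordinates are normalized to [0, 1] on both axes.
    """
    intersections = [(x, y) for x in (1 / 3, 2 / 3) for y in (1 / 3, 2 / 3)]
    nearest = min(math.hypot(focal_x - x, focal_y - y) for x, y in intersections)
    # The frame corners are the points farthest from any intersection.
    worst = math.hypot(1 / 3, 1 / 3)
    return max(0.0, 1.0 - nearest / worst)

def common_attraction(checks, chaos_threshold=0.5, chaos_penalty=0.5):
    """Tally yes/no checks; True counts as conventional, False as unexpected.

    If the unexpected share exceeds chaos_threshold, the score is scaled
    down by chaos_penalty (the "unpleasant chaos" downgrade).
    """
    total = len(checks)
    conventional = sum(1 for ok in checks.values() if ok)
    unexpected_share = (total - conventional) / total
    score = conventional / total
    if unexpected_share > chaos_threshold:
        score *= chaos_penalty
    return {"common_attraction": score, "unexpected": unexpected_share}

# Hypothetical checklist for one image.
checks = {
    "rule_of_thirds": rule_of_thirds_score(0.35, 0.30) > 0.8,
    "harmonious_palette": True,
    "recognizable_subject": True,
    "familiar_association": False,
}
print(common_attraction(checks))
```

The hard part in practice is the feature extraction behind each check, not the tally; the tally is only where the "familiarity vs edge" trade-off gets made explicit.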

I would expect you could get better than most humans at recognizing shapes in an image and drawing associations from that. Such associations are a plus in unexpected / surprise if they are rare in the culture, or a plus in common attraction if they are common.

After that, to be cynical about it, you can randomize and second-guess yourself so your audience doesn't catch on to the first-level mimicry.

Creativity is not normally used as an absolute with a unique measure. It's not "length". And you only need to please part of the audience to be successful - sometimes a very small part, some of which loves surprise and some hates it, etc. Someone elsewhere objected on the grounds that creativity or attractiveness is culture-based - yeah, so? If you were to please much of just one whole culture, you would have an insane hit on your hands.

Sounds feasible to me.


It's still reasoning based on pattern matching, which should go only so far. But "only so far" could be plenty for lots of applications.

Tuning for qualitative outcomes is pretty much solved via RLHF/DPO (what this post calls "preference tuning"). Right?
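For reference, a minimal sketch of the per-pair DPO objective, assuming you already have summed log-probabilities of the preferred ("chosen") and dispreferred ("rejected") responses under the policy and under a frozen reference model. The function and argument names are illustrative, and `beta=0.1` is just a placeholder value.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """-log(sigmoid(beta * (chosen log-ratio - rejected log-ratio))) for one pair."""
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    z = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(z)) == log(1 + exp(-z)), computed in a numerically stable way.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))

# Policy identical to the reference: loss is log(2).
print(dpo_loss(-1.0, -2.0, -1.0, -2.0))
# Policy has shifted probability toward the chosen response: loss drops.
print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```

Note that the loss only encodes which of two outputs a rater preferred. Whether that counts as "solved" for qualitative outcomes depends on how good the preference data is, which is the same rater problem discussed upthread.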


