And now it's ruined, as so many submissions are just a solid color. It's really too bad, because if you go through the submissions there are some really amazing ones in there... particularly for the raccoon driving a tractor.
Yeah, it's a shame. I get that people try to beat the AI, but you'd think it's less fun to draw the eleventh solid color when the top ten are already filled with them.
This is really fun to do. I'm currently ranked 36th for the raccoon drawing and 15th for the giraffe (so top 5 if you ignore the solid colors), but I should probably do some work right now.
This is an interesting observation. The cool part about CLIP is that we can ask it any prompt we want. So “fixing” this loophole is just a matter of prompt engineering.
We already added a penalty to the score for text by subtracting similarity to “handwritten text”.
Any ideas what these solid color blocks would rank highly for that an actual drawing wouldn't? I can update the prompts and regenerate the leaderboards this morning.
Hey everyone! We made this as a hackathon project over the weekend to experiment with OpenAI’s new CLIP model. https://openai.com/blog/clip/
Glad you’re enjoying playing with it and discovering failure modes! Yesterday we tweaked the prompts to penalize people writing text when we discovered CLIP can read. Looks like we should add something to prevent it from falling in love with solid blocks of color. Any ideas for how to describe that? (Someone below suggested a bonus for “complexity” but I’m concerned that’d make it fall in love with scribbles.)
Update: "An illustration of an empty canvas filled with color" seems to work pretty well to extract the solid blocks of colors from the actual drawings.
We're adding that to our prompts now and then will recalculate the leaderboards. Please surface any other failure modes you find! It's pretty neat to learn about what trips it up; it's brand new so you all are amongst the first to poke at it.
Update 2: appears this was actually a bug in my code that applies a text penalty :blush:
These blocks of color weren't scoring highly for the prompt... they were scoring really low for similarity to text... which ended up giving them a sort of anti-text bonus relative to other submissions. The fix should be up shortly.
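A guess at the shape of that bug (the actual scoring code isn't shown here, and the threshold value below is made up): if the text penalty is applied as a raw subtraction of similarity, then an image that is unusually *dissimilar* to text effectively earns a bonus relative to everything else. Clamping the penalty at zero avoids that:

```python
def buggy_score(prompt_sim: float, text_sim: float) -> float:
    # Raw subtraction: a very low text_sim *raises* the score
    # relative to other submissions (the "anti-text bonus").
    return prompt_sim - text_sim

def fixed_score(prompt_sim: float, text_sim: float, threshold: float = 0.2) -> float:
    # Only penalize when the image actually resembles text;
    # below the threshold the penalty is zero, never a bonus.
    return prompt_sim - max(0.0, text_sim - threshold)
```

With the buggy version, a solid color block with near-zero text similarity outranks a real drawing with ordinary text similarity, even at the same prompt similarity; the clamped version scores them equally.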
Update 3: leaderboard should be mostly fixed!
Update 4: leaderboard fix fully implemented. And they're looking good! There are some artists amongst you. https://paint.wtf/leaderboard
It runs each image through a variety of prompts including screens for racist symbols, nsfw drawings, and nudity. If you send me a link to yours I can let you know what tripped it up.
(One of our heuristics is "is it closer to nudity than to the prompt" so if it wasn't a good representation of the prompt it could be more that it was far away from the target vs being particularly close to the nsfw target.)
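That "closer to the screen than to the prompt" heuristic could be sketched roughly like this (the similarity values and the dict shape are illustrative; in the real system they'd come from CLIP):

```python
def failed_screens(sims: dict[str, float]) -> list[str]:
    """sims maps prompt text -> CLIP similarity for one image,
    with the drawing prompt stored under the key "target".
    A submission fails a screen (nudity, racist symbols, ...)
    if that screen is a closer match than the target prompt."""
    target = sims["target"]
    return [name for name, sim in sims.items()
            if name != "target" and sim > target]
```

This also shows why a bad drawing can get flagged: a low target similarity makes it easier for any screen prompt to win the comparison, even if the image isn't especially close to the screen.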
> I sort of love that the AI prefers solid gray over any sort of creativity or complexity
Well that's this specific AI...
> A preview of... gray goo!
We are the gray goo, or rather: our ecosystem is. A flood of self-replicating machines that took over the planet a long time ago, eating everything. If we introduced a new kind of nanite-based life that actually managed to devour our ecosystem (which is probably more challenging than it sounds, given that there are real thermal limits on how fast such molecular disassemblers can work), I'm "optimistic" that they would differentiate into another complex ecosystem over time as well.
Reminds me of the time I added neural networks to my asteroids-like game (with gravity and walls). I recorded myself playing the game, added sensors to detect walls, enemies, and powerups around the player's ship, and the output was any combination of "throttle, rotate left, rotate right, shoot".
The AI analyzed hours of my recorded games and decided the best strategy is to constantly hold the throttle and do nothing else :)
It's already ruined. All this shows to me is that the best way to train an AI using people is to not tell the people they're training an AI.
Edit: I just imagined that it would post submissions to reddit instead and have people vote up and down. No mention of AI, just voting up and down on other people's art.
Then that could train the AI, right? I have no idea, never worked with ML.
Of course perfection only comes with persistent practice, my natural talent was not enough to project me to such a hallowed height of artistic accomplishment.
The AI tends to prefer solid blocks of color over actual drawings, and a particular shade of pink over other colors. However, I think this is the case because the drawings are largely terrible. For one of the challenges that I saw, I couldn't beat a very well-drawn giraffe with a solid color. Of course, it's possible that for that particular challenge a different color from the standard pink would have been more highly rated.
This is a really creative use of what I assume is CLIP. Seems like there are really simple hacks though, like choosing the right solid background color.
This made me so happy. I got a high rank for just drawing green on a green background. And I always thought my drawing skills sucked :)
Joking aside, this is a cool idea! With some tweaking so it doesn't rank pure-color images highly, it would be more fun for the artists among us!
Nice, is your drawing the green upside down long neck dinosaur?
I drew almost the same thing, except I didn't color mine, only drew the outline, and mine was facing the other way. It was also smaller, probably because I drew it on my phone. Guessing that you drew on a laptop or desktop, or that you zoomed out, if zooming out is possible.
I got 97th place at the time of my submission, and now I am currently 108th place out of 388 total.
I got 1st place for making a purple square with decreased saturation. Purple works better than the other colors for some reason; it would be really interesting to see the training data set.
No, I did only the outline also, in black, with one eye. Currently 11 out of 604 submissions. I used the mouse on a desktop computer and used 2 strokes to do the whole thing.
I had a look at the scoring. It looks like it penalized you pretty heavily for all the "H" letters on the honey.
We added a text penalty because it turns out the model can read, and so all the top results were people writing the prompt. Not sure how to tell the model to only penalize bad text but let OK text through.
You must have caught us in the middle of the leaderboard update job. If you go back via the homepage it should link directly to your submission for that prompt so you can see your rank.
The neat thing is: there is no training set! The CLIP model[1] takes arbitrary text prompts. So for these we're just sending eg "An illustration of an upside down dinosaur" and it returns a similarity value between the image's encoding and the text's encoding.
To filter out things we don't want (eg text and nudity) we penalize the score based on the similarity for eg "A nsfw illustration".
AI is only as smart as its programming (which can be pretty darned smart, but this AI wasn't trained to deal with people who are trying to break it).
On a side note, Roboflow is freaking amazing.
https://paint.wtf/ranking/5qms/pORkuKYlFhXWjn6SWD4M