Instead of returning the mean RGB values, a better idea is to cluster the colors (e.g. using the mean-shift algorithm [1]) in a perceptually uniform color space [2] (such as CIELAB).
I heartily agree. For example, searching kiwi should result in roughly the color of the inside part. Instead it's a light muted green, probably taking into account a white background that it's usually featured on. Taking the mode by clustering instead of the mean would improve results.
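For anyone who wants to try the clustering idea, here is a rough sketch of it in plain Python: convert each pixel to CIELAB, run a flat-kernel mean shift, and return the input pixel nearest the densest mode. The function names, the `bandwidth` default and the choice to return an existing pixel (rather than converting the mode back to RGB) are my own assumptions, not anything the site does. It is quadratic in the number of pixels, so it only suits a small sample.

```python
import math

def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIELAB (D65 white point)."""
    def to_linear(c):
        c /= 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (to_linear(c) for c in rgb)
    # Linear sRGB -> XYZ, normalised by the D65 reference white.
    x = (0.4124564 * r + 0.3575761 * g + 0.1804375 * b) / 0.95047
    y = (0.2126729 * r + 0.7151522 * g + 0.0721750 * b) / 1.00000
    z = (0.0193339 * r + 0.1191920 * g + 0.9503041 * b) / 1.08883
    def f(t):
        return t ** (1 / 3) if t > 216 / 24389 else (24389 / 27 * t + 16) / 116
    fx, fy, fz = f(x), f(y), f(z)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))

def dominant_color(pixels, bandwidth=15.0, max_iter=50):
    """Return the input pixel closest to the densest mean-shift mode in Lab."""
    labs = [srgb_to_lab(p) for p in pixels]

    def neighbours(center):
        return [q for q in labs if math.dist(center, q) <= bandwidth]

    best_mode, best_support = None, -1
    for start in labs:
        mode = start
        for _ in range(max_iter):
            near = neighbours(mode)
            if not near:  # guard against floating-point edge cases
                break
            # Shift the window to the mean of the points it currently covers.
            shifted = tuple(sum(axis) / len(near) for axis in zip(*near))
            moved = math.dist(mode, shifted)
            mode = shifted
            if moved < 1e-3:
                break
        support = len(neighbours(mode))
        if support > best_support:
            best_mode, best_support = mode, support
    nearest = min(range(len(labs)), key=lambda i: math.dist(labs[i], best_mode))
    return pixels[nearest]
```

With a handful of green pixels and a few white background pixels, this picks one of the greens, where a plain mean would land on a washed-out green.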
Really interesting idea! I was thinking this might be a neat color name-to-rgb tool, but the resulting color deviates substantially from expected colors (which, given the algorithm, is to be expected):
By the website's definition, which is right on the main page, it is not wrong. The website uses the "average color across the approximately 25 image results".
And for that purpose this site works fantastically, and I would say it's a pretty hilarious idea.
For example searching for Dolphin doesn't return grey but instead a shade of blue, because most dolphin pictures are dolphins in water.
I think the website takes our expectations and instead shows us the reality. And I think it does a fantastic job.
What I would like to see is an option to increase the number of images averaged, maybe to 50, 100, etc. I would also love to see the source released so I can run an instance locally.
I agree. The problem has to do with not recognizing the background of the image. That'd explain why ketchup and solo cup are pink -- the first 25 pictures all have white backgrounds.
I feel like it would only be right for this kind of project to not "stoop" to background-removal, but instead do something more ridiculous by throwing even more ML at the problem:
1. classify the image
2. use WordNet to figure out what "container" word is most closely associated with the image's classification
3. search the container word, and pull out its dominant colors
4. return the colors that are in the image but not in its classificatory container. Unless that set is empty (e.g. a polar bear in snow), in which case return the colors you'd have given without all these extra steps ;)
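Steps 1 to 3 need a classifier, WordNet and a second image search, so I won't pretend to sketch those. Step 4 on its own is small, though. The snippet below assumes both palettes have already been extracted as lists of RGB tuples; `distinctive_colors` and the `min_distance` threshold are made-up names and values, and plain RGB distance is just the simplest stand-in for "the same color".

```python
import math

def distinctive_colors(image_colors, container_colors, min_distance=60.0):
    """Step 4: keep image colors that are far from every container color."""
    def far_from_container(color):
        return all(math.dist(color, c) >= min_distance for c in container_colors)

    kept = [c for c in image_colors if far_from_container(c)]
    # Fallback from step 4: if nothing survives (a polar bear in snow),
    # return the image's own colors unchanged.
    return kept or list(image_colors)
```

For a dolphin palette of grey plus ocean blue and an "ocean" container palette of blue, only the grey survives.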
I (and probably a few others) searched for "Hacker News" and it comes up with a really nice colour that is probably a cross between the top bar and the background, but not a colour I'd associate with the site in general.
I would contend that if the suggested color is more than X away from the traditionally named color (as defined in rgb.txt), then it needs to be reconsidered to see if it's "right" or not.
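A minimal sketch of that check, assuming plain Euclidean distance in RGB and a hand-copied table of two CSS/X11 names instead of the full rgb.txt. The threshold of 50 is an arbitrary stand-in for "X", and `needs_review` is a hypothetical helper name.

```python
import math

# Two entries from the CSS/X11 named colors; a real check would load all of rgb.txt.
REFERENCE = {"tomato": (255, 99, 71), "coral": (255, 127, 80)}

def hex_to_rgb(value):
    """Parse '#RRGGBB' (leading '#' optional) into an (r, g, b) tuple."""
    value = value.lstrip("#")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))

def needs_review(name, suggested_hex, threshold=50.0):
    """True if the suggested color is more than `threshold` from the reference.

    Names missing from the table are never flagged.
    """
    reference = REFERENCE.get(name.lower())
    if reference is None:
        return False
    return math.dist(hex_to_rgb(suggested_hex), reference) > threshold
```

The #DC9385 that someone in this thread got for "tomato" sits roughly 86 units from the CSS tomato, so it would be flagged at this threshold. A distance in a perceptual space such as CIELAB would probably be a fairer measure than raw RGB.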
In this case, it could be reasonably argued that the CSS/X11 definition of “coral” is just as wrong, and a quick image search provides all the proof we need. ;)
This site is an image search query tool, not a CSS color reference. We knew that before querying. It would be reasonable, and a great idea even, to return multiple results, and include the CSS color if the query matches that name. But maybe I really do want to query for pictures of coral and not the CSS color names. The CSS “coral” was someone’s arbitrary choice, and not the only right answer. Don’t make the mistake of assuming that CSS color names should be considered the authority on color names, to the exclusion of any other use of the word, outside of HTML or X11 contexts, or even within those contexts.
For people who are interested in this kind of idea, there is some research on estimating Word-Color associations.
A popular crowd-sourced dataset for this is [1] which contains average color association for 14,000 words. There is a demo available at [2].
There's also a recent work[3] trying to estimate such association using image data from Google, similar to the OP project, but a bit more sophisticated than just taking an average.
1: Colourful Language: Measuring Word-Colour Associations, Saif Mohammad, ACL 2011 Workshop on Cognitive Modeling and Computational Linguistics (CMCL). https://www.aclweb.org/anthology/W11-0611/
I wonder if it would be best to return multiple colors when something appears to have multiple primary ones - e.g. black and white for a zebra, green and red for a watermelon.
Separating the subject from the background would also be useful - as simple as weighting the center of the image more than the edges?
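Center weighting really can be that simple. Here is one way it might look, using a Gaussian falloff from the image center; the `sigma` default and the nested-list image format are assumptions for the sake of a self-contained example.

```python
import math

def center_weighted_mean(image, sigma=0.25):
    """Mean color of `image` (rows of (r, g, b) tuples), favouring the center.

    sigma is the Gaussian width as a fraction of the image size.
    """
    height, width = len(image), len(image[0])
    totals, weight_sum = [0.0, 0.0, 0.0], 0.0
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            # Normalised offset of the pixel center from the image center.
            dx = (x + 0.5) / width - 0.5
            dy = (y + 0.5) / height - 0.5
            w = math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))
            weight_sum += w
            for i in range(3):
                totals[i] += w * pixel[i]
    return tuple(round(t / weight_sum) for t in totals)
```

This only helps when the subject is actually centered, which product-style photos on white backgrounds usually are.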
I've done this exercise and generally what you want is a combination of your suggestion (which is elaborated by 'polyphora[0]) and 'p1necone's suggestion [1]
Polyphora mentioned CIELAB as just one example, and it's a good example. I believe state of the art these days is Oklab[2], talked about here[3]. I'd like to pull out a comment from 'jiggawatts in that discussion:
> This is a tour de force of colour theory, and should be mandatory reading for anyone serious about computer colour!
I completely agree.
With regard to 'p1necone's suggestion, k-means clustering is one simple and relatively easy way to separate the colors into bins. I've only done this on a single image, but with multiple images maybe you could also bin the resulting colors from each image with k-means and only return bins which have multiple members.
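A sketch of the multi-image version, under some assumptions: each image has already been reduced to one dominant RGB color, the binning is done with k-means (the usual clustering algorithm for splitting points into k bins), and the initial centroids are chosen by a deterministic farthest-point rule so the result is repeatable. `shared_colors`, `k` and `min_members` are hypothetical names and defaults.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means with farthest-point initialisation.

    Returns (centroids, labels).
    """
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from all seeds chosen so far.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # leave an empty cluster's centroid where it was
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels

def shared_colors(per_image_colors, k=3, min_members=2):
    """Bin one color per image and keep only bins backed by several images."""
    centroids, labels = kmeans(per_image_colors, k)
    return [
        tuple(round(v) for v in centroids[j])
        for j in range(k)
        if labels.count(j) >= min_members
    ]
```

For a zebra-like input of two near-white images, two near-black images and one stray red, this returns a white bin and a black bin and drops the red.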
I had so much fun with this. I actually like that it uses the mean rather than some clustered mode in a lot of the cases I tried out. For example, “space” is pleasantly brighter than the dark black/blue hue that would come from that approach. It’s a little warmer and brighter than that, and feels like a truer representation of the concept in my mind’s eye. Obviously it will end up muddy for some things with several dominant colors. The mean approach also allows for some fun experiments like “black and white”, which returned a rewardingly exact gray for me. That led me to “newspaper”, which is gray with most weight to the red and some to blue, and looks just right. I just spent about 15 minutes on this. Simple, fun, and useful. I am totally going to use this for something, thanks!
I also get shades of brown. If you take the mean of any image with many similarly distributed colors, it will be brown. Mean is clearly not the best function for this. It should maybe calculate the dominant color instead.
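To make the difference concrete, here is a small comparison between the mean and a crude "dominant color" taken from a quantised histogram. The bucket width of 32 and the function names are arbitrary choices for illustration.

```python
from collections import Counter

def mean_color(pixels):
    """Per-channel arithmetic mean of a list of (r, g, b) tuples."""
    n = len(pixels)
    return tuple(round(sum(p[i] for p in pixels) / n) for i in range(3))

def dominant_color_by_histogram(pixels, bucket=32):
    """Most common color after quantising each channel into `bucket`-wide bins."""
    def bin_of(pixel):
        return tuple(c // bucket for c in pixel)

    counts = Counter(bin_of(p) for p in pixels)
    top_bin = counts.most_common(1)[0][0]
    # Average only the pixels that fell into the winning bin.
    return mean_color([p for p in pixels if bin_of(p) == top_bin])
```

With four reddish pixels and three each of green and blue, the mean comes out as a dull (104, 74, 80), while the histogram version returns the red.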
These don't do the same thing, but I wrote this proc in tcl/tk a couple of days ago, in case anyone wants to compare this site's results to Tk's built-in color names.
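proc color_name_to_hex {name} {
    # winfo rgb returns three 16-bit channel values (0-65535);
    # shift each down to 8 bits before formatting as two hex digits.
    set hex "#"
    foreach color [winfo rgb . $name] {
        append hex [format %02x [expr {$color >> 8}]]
    }
    return $hex
}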
I like this project. It's not accurate[1], but it's still very cool and I don't think there's a great way to make it better without ruining the fun and simplicity.
I wonder if that specific bad result is due to safe search?
Searching for animals seems to be hit or miss too - e.g. I would expect 'Dolphin' or 'Shark' to return grey, but of course all the images of them have blue ocean too, and so you get a shade of blue instead.
This is neat, coincidentally I was wondering about a superficially similar problem recently:
Given a list of arbitrary strings, generate a consistent mapping from that list to a color palette of distinguishable colors.
This is distinct from deterministically mapping a single string to a single color (easy) because the intention is to provide a reasonably distinct set of colors, which I think can't be guaranteed when looking at single values.
Bonus: if [str1, str2, str3] -> [rgb1, rgb2, rgb3], have the map update incrementally, so that [str1, str2, str3, str4] -> [rgb1, rgb2, rgb3, rgb4].
I assume I'm restating a version of some existing algorithm, but I don't know what.
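One simple approach that has the incremental property is to ignore the string contents entirely and give the n-th distinct string the n-th hue in a golden-ratio sequence. The sketch below assumes that is acceptable; `PaletteMap` and its defaults are made up for illustration. The mapping depends on the order in which strings first appear, and hue spacing alone is not perceptually uniform, so the colors will start to crowd once there are many strings.

```python
import colorsys

GOLDEN_RATIO_CONJUGATE = 0.618033988749895

class PaletteMap:
    """Assigns each new string the next hue in a golden-ratio sequence."""

    def __init__(self, saturation=0.65, value=0.9):
        self._colors = {}
        self._saturation = saturation
        self._value = value

    def color_for(self, name):
        if name not in self._colors:
            # The n-th distinct string gets hue n * phi mod 1, so earlier
            # assignments never change when new strings arrive.
            hue = (len(self._colors) * GOLDEN_RATIO_CONJUGATE) % 1.0
            r, g, b = colorsys.hsv_to_rgb(hue, self._saturation, self._value)
            self._colors[name] = (round(r * 255), round(g * 255), round(b * 255))
        return self._colors[name]

    def colors_for(self, names):
        return [self.color_for(n) for n in names]
```

Calling `colors_for(["str1", "str2", "str3"])` and later `colors_for(["str1", "str2", "str3", "str4"])` on the same map leaves the first three colors unchanged and appends a fourth.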
It's a neat idea, but the colors seem off. For example, I searched for tomato and got #DC9385, which is a light rust color(?). Maybe my idea of what the color tomato should be is just different: I would think a vibrant red. I know tomatoes come in different colors, but that is the color that immediately jumps to mind. Cool idea though.
I like how the result text switches from white to black depending on the brightness of the selected background color. Does anyone know how this is done? I could use this algorithm in other projects.
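I don't know how this particular site does it, but a common approach is to compute the background's relative luminance as defined in WCAG and pick whichever of black or white gives the higher contrast ratio. A minimal sketch, with function names of my own choosing:

```python
def relative_luminance(rgb):
    """WCAG-style relative luminance of an 8-bit sRGB color, in [0, 1]."""
    def to_linear(c):
        c /= 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (to_linear(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def text_color_for(background):
    """Pick black or white text, whichever has the higher contrast ratio."""
    lum = relative_luminance(background)
    # Contrast ratio is (lighter + 0.05) / (darker + 0.05);
    # white has luminance 1 and black has luminance 0.
    contrast_with_white = 1.05 / (lum + 0.05)
    contrast_with_black = (lum + 0.05) / 0.05
    return "#000000" if contrast_with_black >= contrast_with_white else "#FFFFFF"
```

For example, `text_color_for((0, 0, 128))` picks white text, while a light background like (220, 147, 133) gets black.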
Cool idea, but as expected, any word/phrase that isn't approximately one-to-one with a particular color or region in the color space turns up nearly gray.
This reminds me of Randall's xkcd colors[0], which came from the reverse experiment: given a color, what would be the most appropriate name? The 954 names made it into Python's matplotlib.
[1] https://en.wikipedia.org/wiki/Mean_shift
[2] https://programmingdesignsystems.com/color/perceptually-unif...