Colorize (alexbeals.com)
155 points by didizaja on Jan 19, 2021 | 77 comments



Instead of returning the mean RGB values, a better idea is to cluster the colors (e.g. using the mean-shift algorithm [1]) in a perceptually uniform color space [2] (such as CIELAB).

[1] https://en.wikipedia.org/wiki/Mean_shift [2] https://programmingdesignsystems.com/color/perceptually-unif...
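
A minimal sketch of that suggestion, assuming the image has already been reduced to a small list of 8-bit sRGB pixels. The function names, the `bandwidth` value and the flat kernel are illustrative choices, not anything the site uses. The conversion uses the standard sRGB (D65) constants rounded to four decimals.

```python
import math

def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIELAB (D65 white point)."""
    def lin(c):
        c /= 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (lin(c) for c in rgb)
    x = (0.4124 * r + 0.3576 * g + 0.1805 * b) / 0.95047
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    z = (0.0193 * r + 0.1192 * g + 0.9505 * b) / 1.08883
    def f(t):
        return t ** (1 / 3) if t > 0.008856 else 7.787 * t + 16 / 116
    fx, fy, fz = f(x), f(y), f(z)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))

def dominant_color(pixels, bandwidth=15.0, iters=20):
    """Flat-kernel mean shift in Lab; returns the input pixel nearest the best-supported mode."""
    labs = [srgb_to_lab(p) for p in pixels]
    modes = []
    for start in labs:
        m = start
        for _ in range(iters):
            # Shift to the mean of all points within `bandwidth` of the current estimate.
            near = [q for q in labs if math.dist(m, q) <= bandwidth]
            m = tuple(sum(c) / len(near) for c in zip(*near))
        modes.append(m)
    clusters = []  # each entry: [mode, number of points that converged to it]
    for m in modes:
        for cl in clusters:
            if math.dist(cl[0], m) <= bandwidth / 2:
                cl[1] += 1
                break
        else:
            clusters.append([m, 1])
    best = max(clusters, key=lambda cl: cl[1])[0]
    i = min(range(len(labs)), key=lambda k: math.dist(labs[k], best))
    return pixels[i]
```

Each iteration compares every point with every other point, so this is only practical on a downsampled image.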


Neat suggestion. I too am seeing shades of brown for most queries. Maybe this would help.


I heartily agree. For example, searching kiwi should result in roughly the color of the inside part. Instead it's a light muted green, probably taking into account a white background that it's usually featured on. Taking the mode by clustering instead of the mean would improve results.


Agreed. A flower should not be represented by a brown color.


Yes, this would be much better. Additionally it could return a palette of N colors instead of just one.


Using a better algorithm makes sense, though a single color seems to be the point of the "product" in this case.


There's a paper which does something similar including an analysis of different statistical models on top of a CIELAB color space.

Estimating Color-Concept Associations from Image Statistics. Ragini Rathore et al., IEEE VIS InfoVis 2019. https://arxiv.org/pdf/1908.00220.pdf


I wrote a tool that explores different ways to cluster in the RGB space - https://github.com/mattnedrich/palette-maker

It compares k-means, median-cut, and simple RGB space quantization.

Mean-shift is great, but it is unbelievably slow (O(n^2) per iteration in the naive form).
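
For comparison, the simplest of the three approaches, plain RGB-space quantization, can be sketched in a few lines. This is an illustration only and is not taken from the palette-maker repository; `quantized_dominant` and `levels` are made-up names, and `levels` is assumed to divide 256 evenly.

```python
from collections import Counter

def quantized_dominant(pixels, levels=4):
    """Bucket each channel into `levels` bins and return the centre of the fullest bucket."""
    step = 256 // levels  # assumes `levels` divides 256
    counts = Counter(tuple(c // step for c in p) for p in pixels)
    bucket, _ = counts.most_common(1)[0]
    return tuple(b * step + step // 2 for b in bucket)
```

It runs in a single pass over the pixels, at the cost of bucket boundaries that can split one perceived color across two bins.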


I thought you can make mean shift go faster with the Fast Multipole Method.

See: https://home.cscamm.umd.edu/programs/fam04/fgt_duraiswami_fa... (PDF file)


Really interesting idea! I was thinking this might be a neat color name-to-rgb tool, but the resulting color deviates substantially from expected colors (which, given the algorithm, is to be expected):

https://alexbeals.com/projects/colorize/search.php?q=coral

https://alexbeals.com/projects/colorize/search.php?q=goldenr...

https://alexbeals.com/projects/colorize/search.php?q=eggplan...

I don't believe just switching from average to dominant color will actually fix this.

(Edit: s/more often wrong than not/<diplomatic verbiage>/)


>> more often wrong than not

By the website's definition, which is right on the main page, it is not wrong. The website uses the "average color across the approximately 25 image results".

And for that purpose this site works fantastically, and I would say it's a pretty hilarious idea.

For example searching for Dolphin doesn't return grey but instead a shade of blue, because most dolphin pictures are dolphins in water.

I think the website takes our expectations and instead shows us the reality. And I think it does a fantastic job.

What I would like to see is an option to increase the number of images to average from, maybe to 50, 100, etc. I would also love to see the source released so I can run this instance locally.


I agree. The problem has to do with not recognizing the background of the image. That'd explain why ketchup and solo cup are pink -- the first 25 pictures all have white backgrounds.

https://alexbeals.com/projects/colorize/search.php?q=ketchup https://alexbeals.com/projects/colorize/search.php?q=solo+cu...

The results would definitely be closer to human expectations if run through something like https://www.remove.bg/ first.


I feel like it would only be right for this kind of project to not "stoop" to background-removal, but instead do something more ridiculous by throwing even more ML at the problem:

1. classify the image

2. use WordNet to figure out what "container" word is most closely associated with the image's classification

3. search the container word, and pull out its dominant colors

4. return the colors that are in the image but not in its classificatory container. Unless that set is empty (e.g. a polar bear in snow), in which case return the colors you'd have given without all these extra steps ;)
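
Step 4 on its own can be sketched as a filtered set difference with a fallback. This assumes steps 1 to 3 have already produced two lists of RGB colors; `subject_colors`, `tolerance` and the use of plain RGB distance are hypothetical choices for illustration.

```python
import math

def subject_colors(image_colors, container_colors, tolerance=40.0):
    """Keep image colors that are not close to any container color.

    Falls back to the unfiltered image colors when nothing survives
    (the polar-bear-in-snow case).
    """
    kept = [c for c in image_colors
            if all(math.dist(c, k) > tolerance for k in container_colors)]
    return kept or list(image_colors)
```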


I agree. Otherwise it would try to strip the background from an ocean or sky image.


I (and probably a few others) searched for "Hacker News" and it comes up with a really nice colour that is probably a cross between the top bar and the background, but not a colour I'd associate with the site in general.


> because most dolphin pictures are dolphins in water

that is one possible reason, but it's a guess.

it could also be that it's weighted by the extremely high incidence of the miami dolphin team colors.




In that context, this sure was unexpected: https://alexbeals.com/projects/colorize/search.php?q=medium


What kind of result would you expect with the query "coral", which doesn't have a dominant color, or a correct answer?


https://en.wikipedia.org/wiki/Coral_(color)

I would contend that if the suggested color is more than X away from the traditionally named color (as defined in rgb.txt), then it needs to be reconsidered to see if it's "right" or not.
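
A rough sketch of such a check, assuming plain Euclidean distance in RGB and an arbitrary threshold for X. The two named colors are the rgb.txt values for coral and tomato; `needs_review` and `max_distance` are made-up names, and a perceptual color space would likely be a better choice than raw RGB.

```python
import math

# Two entries from X11's rgb.txt; a real check would load the whole file.
NAMED = {"coral": (255, 127, 80), "tomato": (255, 99, 71)}

def hex_to_rgb(h):
    """Parse '#RRGGBB' (or 'RRGGBB') into an (r, g, b) tuple."""
    h = h.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))

def needs_review(suggested_hex, name, max_distance=50.0):
    """True if the suggested color is farther than max_distance from the named color."""
    return math.dist(hex_to_rgb(suggested_hex), NAMED[name]) > max_distance
```

With this threshold, the #DC9385 that another commenter in this thread reports for "tomato" would be flagged.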


In this case, it could be reasonably argued that the CSS/X11 definition of “coral” is just as wrong, and a quick image search provides all the proof we need. ;)

This site is an image search query tool, not a CSS color reference. We knew that before querying. It would be reasonable, and a great idea even, to return multiple results, and include the CSS color if the query matches that name. But maybe I really do want to query for pictures of coral and not the CSS color names. The CSS “coral” was someone’s arbitrary choice, and not the only right answer. Don’t make the mistake of assuming that CSS color names should be considered the authority on color names (to the exclusion of any other use of the word) outside of HTML or X11 contexts, or even within those contexts.


https://commons.wikimedia.org/wiki/File:Corail_algérien.jpg

https://www.futurity.org/coral-hiv-aids-1307772-2/

https://www.forbes.com/sites/linhanhcat/2019/11/27/corals-en...

Part of the challenge is that saying "coral" is like saying "rainbow" in that app... which also gives you a brown color.

There are coral-colored corals, there are green, blue... but it's more in the pinks and reds than other colors.

https://www.ba-bamail.com/content.aspx?emailid=7375

The color https://alexbeals.com/projects/colorize/search.php?q=coral is most likely the color of the environment that the coral is in rather than the color of the coral itself.

Unless it's based on https://www.pnas.org/content/117/5/2232 and similar images... in which case, that's rather depressing.


Blood is similarly weird. Very flesh tone tinged a little pink.

https://alexbeals.com/projects/colorize/search.php?q=blood


Makes sense: blood images are often blood from a wound in someone’s skin/flesh!


I thought so too but the first page of google image results mostly doesn't feature skin in the images.


This is really fun! I started plugging in US states. It makes a pretty interesting palette for the map.

https://gist.github.com/riebschlager/2f1a3c0391bc42c9ec9bdc6...


For people who are interested in this kind of idea, there is some research on estimating Word-Color associations.

A popular crowd-sourced dataset for this is [1] which contains average color association for 14,000 words. There is a demo available at [2].

There's also a recent work[3] trying to estimate such association using image data from Google, similar to the OP project, but a bit more sophisticated than just taking an average.

1: Colourful Language: Measuring Word-Colour Associations, Saif Mohammad, ACL 2011 Workshop on Cognitive Modeling and Computational Linguistics (CMCL). https://www.aclweb.org/anthology/W11-0611/

2: http://www.lexichrome.com/

3: Estimating Color-Concept Associations from Image Statistics. Ragini Rathore et al., IEEE VIS InfoVis 2019. https://arxiv.org/pdf/1908.00220.pdf


"Truth" gives me a nearly perfect 50% gray, which is kind of beautiful when you think about it.




Watermelon.

"Average color" is not really what I want.


I wonder if it would be best to return multiple colors when something appears to have multiple primary ones - e.g. black and white for a zebra, green and red for a watermelon.

Separating the subject from the background would also be useful - as simple as weighting the center of the image more than the edges?
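
The centre-weighting idea could look something like the sketch below. It assumes the image is a list of rows of (r, g, b) tuples and uses a simple linear falloff towards the edges; both the falloff shape and the name `center_weighted_mean` are illustrative.

```python
def center_weighted_mean(image):
    """Average color where pixels near the centre count more than pixels near the edges."""
    h, w = len(image), len(image[0])
    total = [0.0, 0.0, 0.0]
    weight_sum = 0.0
    for y, row in enumerate(image):
        # 1.0 at the vertical centre, falling linearly towards the top and bottom.
        wy = 1 - abs(2 * (y + 0.5) / h - 1)
        for x, px in enumerate(row):
            wgt = wy * (1 - abs(2 * (x + 0.5) / w - 1))
            weight_sum += wgt
            for i in range(3):
                total[i] += wgt * px[i]
    return tuple(round(t / weight_sum) for t in total)
```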



Depends on exactly what "average" means in this context.

The most naive of interpretations (e.g. averaging each of the R/G/B values) is generally not what we want.

Would be super cool to add some different averaging schemes to this.
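
One alternative scheme is to average in linear light instead of on the gamma-encoded sRGB values. A small sketch, using the standard sRGB transfer function; the function names are made up for this example.

```python
def srgb_to_linear(c):
    """8-bit sRGB channel -> linear-light value in [0, 1]."""
    c /= 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def linear_to_srgb(v):
    """Linear-light value in [0, 1] -> 8-bit sRGB channel."""
    c = v * 12.92 if v <= 0.0031308 else 1.055 * v ** (1 / 2.4) - 0.055
    return round(c * 255)

def naive_mean(pixels):
    """Per-channel mean of the encoded values (the naive interpretation)."""
    return tuple(round(sum(ch) / len(pixels)) for ch in zip(*pixels))

def linear_mean(pixels):
    """Per-channel mean computed in linear light, then re-encoded to sRGB."""
    return tuple(
        linear_to_srgb(sum(srgb_to_linear(c) for c in ch) / len(pixels))
        for ch in zip(*pixels)
    )
```

For an equal mix of black and white, the linear-light mean comes out noticeably lighter than the naive mid-gray.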


I've done this exercise and generally what you want is a combination of your suggestion (which is elaborated by 'polyphora[0]) and 'p1necone's suggestion [1]

Polyphora mentioned CIELAB as just one example, and it's a good example. I believe state of the art these days is Oklab[2], talked about here[3]. I'd like to pull out a comment from 'jiggawatts in that discussion:

> This is a tour de force of colour theory, and should be mandatory reading for anyone serious about computer colour!

I completely agree.

With regards to 'p1necone's suggestion, k-nearest neighbors is one simple and relatively easy way to separate the colors into bins. I've only done this on a single image, but with multiple images maybe you could also k-nn bin the resulting colors from each image and only return bins which have multiple members.

0: https://news.ycombinator.com/item?id=25828733

1: https://news.ycombinator.com/item?id=25828773

2: https://bottosson.github.io/posts/oklab/

3: https://news.ycombinator.com/item?id=25525726
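
A sketch of the multi-image idea, under several assumptions: each image has already been reduced to one representative RGB color, the grouping is a simple greedy threshold in Oklab (not k-NN), and `shared_bins` and `radius` are hypothetical names. The Oklab coefficients are reproduced from memory, so check them against [2] before relying on them.

```python
import math

def srgb_to_oklab(rgb):
    """8-bit sRGB -> Oklab (L, a, b)."""
    def lin(c):
        c /= 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (lin(c) for c in rgb)
    l = 0.4122214708 * r + 0.5363325363 * g + 0.0514459929 * b
    m = 0.2119034982 * r + 0.6806995451 * g + 0.1073969566 * b
    s = 0.0883024619 * r + 0.2817188376 * g + 0.6299787005 * b
    l_, m_, s_ = (v ** (1 / 3) for v in (l, m, s))
    return (0.2104542553 * l_ + 0.7936177850 * m_ - 0.0040720468 * s_,
            1.9779984951 * l_ - 2.4285922050 * m_ + 0.4505937099 * s_,
            0.0259040371 * l_ + 0.7827717662 * m_ - 0.8086757660 * s_)

def shared_bins(per_image_colors, radius=0.1):
    """Greedily group colors within `radius` in Oklab; keep groups with 2+ members."""
    bins = []  # each bin is a list of RGB members; the first member is the anchor
    for rgb in per_image_colors:
        lab = srgb_to_oklab(rgb)
        for members in bins:
            if math.dist(lab, srgb_to_oklab(members[0])) <= radius:
                members.append(rgb)
                break
        else:
            bins.append([rgb])
    return [members for members in bins if len(members) >= 2]
```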


I had so much fun with this. I actually like that it uses the mean rather than some clustered mode in a lot of the cases I tried out. For example, “space” is pleasantly brighter than the dark black/blue hue that would come from that approach. It’s a little warmer and brighter than that, and feels like a truer representation of the concept in my mind’s eye. Obviously it will end up muddy for some things with several dominant colors. The mean approach also allows for some fun experiments like “black and white” which returned a rewardingly exact gray for me. Which led me to “newspaper” which is gray with most weight to the red and some to blue, and looks just right. I just spent about 15 minutes on this. Simple fun and useful, I am totally going to use this for something, thanks!


Good idea, except I'm getting shades of brown for almost all queries.


I also get shades of brown. If you take the mean of any image with many similarly distributed colors, it will be brown. Mean is clearly not the best function for this. It should maybe calculate the dominant color instead.


These don't do the same thing, but I wrote this proc in Tcl/Tk a couple of days ago, in case anyone wants to compare this site to Tk's internal color names.

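  # Convert a Tk color name to "#rrggbb".
  # winfo rgb returns 16-bit channel values (0-65535).
  proc color_name_to_hex {name} {
    set hex "#"
    foreach color [winfo rgb . $name] {
      # Zero-pad to 4 hex digits so the first two are always the high byte.
      append hex [string range [format %04x $color] 0 1]
    }
    return $hex
  }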


I like this project. It's not accurate[1], but it's still very cool and I don't think there's a great way to make it better without ruining the fun and simplicity.

[1] https://alexbeals.com/projects/colorize/search.php?q=piss


I wonder if that specific bad result is due to safe search?

Searching for animals seems to be hit or miss too - e.g. I would expect 'Dolphin' or 'Shark' to return grey but of course all the images of them have blue ocean too and so you get a shade of blue instead.



Fuchsia too, but pink.. wow.


Maybe. I got some pretty accurate results in situations I would expect, like "beer."

Piss, unlike dolphin (for reasons you note), I would expect to return pretty close to the correct color, but I didn't go check the google results.


Results are similarly inaccurate for other common fluids like milk, cream, oil, orange juice, or antifreeze.



This is neat, coincidentally I was wondering about a superficially similar problem recently:

Given a list of arbitrary strings, generate a consistent mapping from that list to a color palette of distinguishable colors.

This is distinct from deterministically mapping a single string to a single color (easy) because the intention is to provide a reasonably distinct set of colors, which I think can't be guaranteed when looking at single values.

Bonus: if [str1, str2, str3] -> [rgb1, rgb2, rgb3], have the map update incrementally, so that [str1, str2, str3, str4] -> [rgb1, rgb2, rgb3, rgb4].

I assume I'm restating a version of some existing algorithm, but I don't know what.
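
One possible approach, sketched under assumptions: colors are assigned by order of first appearance (so the mapping is stable for a given sequence, not derived from the string contents), and hues are stepped by the golden-ratio conjugate so that consecutive entries land far apart on the hue wheel. `PaletteMap` and its parameters are made-up names.

```python
import colorsys

GOLDEN_RATIO_CONJUGATE = 0.618033988749895

class PaletteMap:
    """Assigns each new key the next hue in a golden-ratio sequence."""

    def __init__(self, saturation=0.65, value=0.95):
        self._colors = {}
        self._s, self._v = saturation, value

    def color_for(self, key):
        # Existing keys keep their color; only unseen keys get a new hue.
        if key not in self._colors:
            hue = (len(self._colors) * GOLDEN_RATIO_CONJUGATE) % 1.0
            r, g, b = colorsys.hsv_to_rgb(hue, self._s, self._v)
            self._colors[key] = (round(r * 255), round(g * 255), round(b * 255))
        return self._colors[key]
```

Because earlier assignments never change, appending str4 leaves rgb1 to rgb3 untouched. Distinctness degrades as the list grows, since only hue varies.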


Searching for "pink" results in a muddy brown:

https://alexbeals.com/projects/colorize/search.php?q=pink



It's a neat idea, but the colors seem off. For example, I search for tomato and I get #DC9385 which is a light rust color(?). Maybe my idea of what the color tomato should be is just different. I would think a vibrant red. I know tomatoes come in different colors, but to me that is the color that immediately jumps to mind. Cool idea though.


I tried wine and cabernet and got a similar color to the background on this site. It did nail beer, though. Maybe not a wine lover?


Cool project, but can you add an API?

I coded this basic python script to show colors in terminal;

https://github.com/tuxys/python/blob/main/color.py


Funnily enough, the word "BA0BAB" returns #B688AB, close to #BA0BAB but not exactly it


I like how the result text switches from white to black depending on the brightness of the selected background color. Does anyone know how this is done? I could use this algorithm in other projects.


It's called Luminance Contrast: https://stackoverflow.com/a/36888120
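
One common way to do this, which is not necessarily what the site or the linked answer does, is to compute the WCAG relative luminance of the background and pick whichever of black or white gives the higher contrast ratio. A small sketch with made-up function names:

```python
def relative_luminance(rgb):
    """WCAG-style relative luminance of an 8-bit sRGB color, in [0, 1]."""
    def lin(c):
        c /= 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (lin(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def text_color(background):
    """Return black or white, whichever contrasts more with the background."""
    lum = relative_luminance(background)
    contrast_white = 1.05 / (lum + 0.05)   # white has luminance 1.0
    contrast_black = (lum + 0.05) / 0.05   # black has luminance 0.0
    return "#000000" if contrast_black >= contrast_white else "#FFFFFF"
```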


ty!


Cool idea, but as expected, any word/phrase that isn't approximately one-to-one with a particular color or region in the color space turns up nearly gray.


You could probably improve the accuracy a lot by ignoring some percentage of the edge of images, to help eliminate averaging the backgrounds.
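
A sketch of that idea, assuming the image is a list of rows of (r, g, b) tuples and that `border` is a fraction below 0.5; `mean_without_border` is a hypothetical name.

```python
def mean_without_border(image, border=0.2):
    """Average color after discarding `border` (a fraction < 0.5) of each edge."""
    h, w = len(image), len(image[0])
    dy, dx = int(h * border), int(w * border)
    rows = image[dy:h - dy] or image  # fall back if the crop would be empty
    inner = [px for row in rows for px in (row[dx:w - dx] or row)]
    return tuple(round(sum(ch) / len(inner)) for ch in zip(*inner))
```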


Colors of Python and C++ are more pleasant to me, than colors of C# and Java. Oh, that's why, I guess :-D


All the vehicle manufacturers I tried gave varying shades of "grey".

Fire is also not what I expected :)



This is oddly addictive. Just typing in stuff and see what colors it gives me :)


Indeed -- I really like the color "blargh" :-)


This reminds me of Randall's xkcd colors[0], who did the reverse experiment: given a color, what would be the most appropriate name? The 954 names made it into Python's matplotlib.

[0] https://blog.xkcd.com/2010/05/03/color-survey-results/


I wish this wouldn't take over my browser history like that.



FWIW, I tried some other words like American or German and mostly got shades between pinkish grey and brown. European is a pale blue.


Entirely off topic, I know, but the title is one of the few words Canadian English spells uniquely.

'Colourize'.

In British English: Colourise. American: Colorize.

And we're very upset that spell checkers make us choose between only American and British options.


As an Aussie, I'm over it.

OED says that -ize is equally "correct" to -ise. Australian spelling guidance has dropped the U in "color", "honor" etc.

I keep the "e" in axe, I spell metric measures (and centre and theatre) with "-re"

I still differentiate between advise and advice, practise and practice, defense and defence.

And I'd spell "check" as "cheque" except no-one outside the US still uses those. /s


License and licence always throws me off. And I much prefer aging to ageing.


Wait, are you saying that Australian English uses 'Colorise'?


"pink" seems to fail miserably.


Just for fun, I tried this :-)

https://alexbeals.com/projects/colorize/search.php?q=Trump (#4F4252) https://alexbeals.com/projects/colorize/search.php?q=Biden (#433A4A)

Colors look very similar O_o Could this be a sign?


I am reminded of this by some of the seemingly bad results: https://www.youtube.com/watch?v=wh4aWZRtTwU


Hmm...

Socialist: pinkish Communist: bright red Capitalist: light brown Fascist: dark brown


Neat!

It would be a nicer experience if the search bar would autofocus.



