
When you say European, except for the fact that Tolkien wrote the story, what part of the show thus far has given you an explicitly European feel? Did the Harfoots resemble British subjects, or does the race of men have to be from the European continent because of their skin color? Which city did Lindon look like? If we're already in a fantasy setting, why is there such a focus on relating the setting to our own European history rather than on telling a good fantasy story? Can you not relate to the story if it isn't explicitly describing a historically accurate proto-Europe?

If the argument is that Middle Earth should reflect a proto-European history, should we just accept that the peoples of the east are pictured as orcs: mindless, ugly, uncivilized brutes? Is everyone from Africa part of the Haradrim or the Easterlings? The comparison to a proto-Europe only works as long as you wilfully neglect the horribly racist parts of the comparison. Or are you okay with those parts too? If the addition of a black Harfoot, black elves, or Durin's wife being black is your critique of the show, then you should reflect on why you feel that way, and why you're able to so easily excuse the clear racism in favor of defending "authenticity".

It's legit WILD to read some of the remarks about the show on reddit and IMDb. It's horribly racist rhetoric disguised as a defence of authenticity and staying true to Tolkien's work. If portraying the racist parts of Tolkien's work is *that* important, maybe we shouldn't make media based upon it?


> When you say European, except for the fact that Tolkien wrote the story, what part of the show thus far has given you an explicit European feel?

"Middle Earth" was an old phrase used to refer to Western Europe in several Scandinavian languages. Tolkien, being a linguist, would have known this and chosen that turn of phrase deliberately.


You are missing my argument completely. What I'm saying is that yes, Tolkien may have envisioned Middle Earth as proto-European fantasy, but when you're arguing for "historical accuracy", you can't pick and choose. You're arguing about skin color as if no people of color lived in Europe at the time, and you're completely avoiding touching on the origin of Tolkien's evil races and nations, in particular the orcs. If hobbits, men, and elves need to be white, then orcs necessarily need to be black and brown people, no? They need to be uncivilised brutes who only want to destroy, with no mind of their own? That is a simplified depiction of Tolkien's works, but do you think that those prejudices hold, or that they should be depicted in modern media? If yes, then you are a bigot and a racist, and if no, it's WEIRD how the inclusion of people of color is where you're putting your foot down.

PS: read "you" as in the collective you, not you specifically.


> You're arguing about skin color as if a) no people of color lived in Europe at the time, and b) you're completely avoiding touching on the origin of Tolkien's evil races and nations, and in particular orcs. If hobbits, men, and elves need to be white, then orcs necessarily need to be black and brown people, no?

The problem with this line of thought is that the men of Lossarnach, who mustered to Minas Tirith during the siege of Gondor, were described as swarthy (i.e. dark skinned).

Also, the Easterlings were poignantly humanized by Sam.

"It was Sam's first view of a battle of Men against Men, and he did not like it much. He was glad that he could not see the dead face. He wondered what the man's name was and where he came from; and if he was really evil of heart, or what lies or threats had led him on the long march from his home; and if he would not really rather have stayed there in peace all in a flash of thought which was quickly driven from his mind."

All of this is a far cry from the stark "dark people bad" racism you've posted about a couple of times now.

Finally, the orcs also had slanted eyes; did Tolkien also hate Asians? Are orcs supposed to be African or Asian? Pegging orcs as "dark people" simply doesn't fit, as orcs don't closely match any ethnic group in particular. Orcs aren't even a single race, for that matter. In any case, dark skin = bad sounds a lot like allegory, which Tolkien found distasteful. It is my theory that, due to the time we all live in, you are (most likely subconsciously) hammering a square racism peg into a round hole.


Would you mind explaining how - even in this fantasy world - people living in closed societies would get wildly different features? Somehow elves are different from humans, who are different from gnomes - yet there are 2-3 Afro elves or Afro-hobbits? That's what's jarring about the "per quota" insertion of minorities - it contradicts our experience. One can make the Southrons look Persian or Indian, but having a token minority is just laughable. And throwing around "virtue signaling" is not helping your argument.


I'm pretty sure Shakespeare knew what a woman was, but women weren't cast in any of his plays. I doubt Tolkien would give a shit whether actors with dark skin played any of his characters.


OP here. My argument is looking more at fantasy TV/movies in general rather than at The Lord of the Rings in particular. But it is part of a wider discussion on representation in the European fantasy genre. As you know, what I am arguing is that it would be good to continue to have some shows which are 'diverse' and other shows which follow a more traditional casting. Do both. My concern is that there is political/media pressure to always push for diverse castings in all European fantasy shows. And this could go against good storytelling.

If I'm watching a Japanese fantasy story, it will feel less immersive if you introduce a blonde character. Likewise with an Indian or African story. Everyone would think this reasonable. What some people get upset about is if we also say this about European fantasy stories. And I don't agree with this (the point that a tiny, tiny fraction of people living in Europe may have been nonwhite in the 1400s and earlier I find moot, quite frankly). I also find it... not considerate to call someone racist for making this point. And quite frankly it's a great example of where we are right now: film and TV studios terrified in their casting decisions and feeling like they have to please the media etc. Nor am I convinced that global audiences want to see themselves in European fantasy stories. I don't want to see a blonde white guy (or black or Indian guy) in a Japanese fantasy story etc. etc.

I do feel that including a multicultural, diverse population in a European fantasy story carries a large risk of reduced immersion, and I don't agree that's 'racist'. You can do it, but it's a different world and a different experience. What I'm arguing, again, is: fine, do that. But not everything fantasy driven needs to be like that.

The Lord of the Rings trilogy was not diverse and quite frankly was astounding. Game of Thrones was not diverse, and was astounding up to the final season.

The latter was savagely attacked by woke groups for not being diverse, and I just so disagree with that viewpoint.

Let's have diverse stories and let's have traditional ones too. There are lots of good fantasy stories by Western authors that do include diversity (The Fifth Season, Ursula Le Guin's A Wizard of Earthsea, etc.)


My bet is that the only reason to have "harfoots" is to avoid paying for the use of the copyrighted word "hobbits" in the near future.

It sounds like a fart. It definitely lacks Tolkien's talent for finding the surgically precise word for each term, but I can understand the legal aspects of the need-for-control part.


Tolkien also named and described the Harfoots, so why would they be any differently copyrighted?


I thought it was interesting from a linguistic perspective. These events are 3000-6000 years earlier than those of the Lord of the Rings. The languages would have changed in all that time. I spent time wondering how Harfoot might have evolved into Hobbit or how Hobbit might have arisen and replaced Harfoot.


If you want to hear some strange sounding names, you can look up the actual Westron names of the hobbits - Frodo Baggins and the other Shire names being an "English translation" from the Red Book of Westmarch. Sam's name is actually "Banazîr Galbasi", so maybe he's Turkish.


You honestly deny that Middle Earth is based on Europe? Everything Tolkien wrote was drenched in that reference frame.


Middle-Earth most definitely is not based on Europe and claiming such is unsupportable. The geography and map of Middle-Earth is absolutely nothing like Europe, no similarities whatsoever, and no geological process known could make it similar. On the other hand, Middle-Earth is strikingly similar to a mirror image of North America.


I didn't say based IN, I said based ON. Those are similar sounding but completely different. The former implies it's actually set in the same country and lands as Europe, whereas the latter implies it's merely inspired by them to some degree.


Tolkien borrowed quite heavily from Scandinavian mythology, especially the hero Väinämöinen from the Finnish epic Kalevala, and seems to have been obsessed with Odin from Norse mythology. Is that what you mean? Because claiming he based his fictional myths and his fantasy fiction on European myths and history is so vague and technically inaccurate that it must be false. Though Italy is firmly in Europe, describing lasagna as European food is at best misleading and at worst false.

Not for nothing, Europe is a continent. Whether you believe Tolkien based his works on or in Europe, both claims are false on their face. It doesn't even make sense, so please try to better articulate what you mean. You literally argued that Tolkien based the continent of Middle-Earth on the continent of Europe, then you waffled and changed your argument from in to on. Either way is nonsense. What you must have meant is that Tolkien based his stories - not Middle-Earth, the setting, but the stories themselves - on the history and mythology of the various peoples of Europe.

But, in fact, other than Scandinavian epics and myths, and specifically Finnish and Norse epics and myths, that is false.

If you can support your claim, you'd be more convincing. An example of some very similar non-Scandinavian European story found in Tolkien's work would drive your point home. But I am unaware of any example of, say, Italian or Romanian or Polish or Swiss or Danish folk stories or myths being borrowed by Tolkien. And we need not be so vague. Europe was never a single culture, but always many. And Tolkien was not writing to give Europe a history and mythology; that purpose was only for England.


If you don't believe Italy, or the Nordic countries are European I will not be able to convince you. Our world view is simply too different to come to an understanding.


Why specify Europe? Why not just say the Solar System? Or the Western Spiral Arm of the Milky Way? Or the Local Group?

Inexplicably, you chose to be incredibly vague. Just because France is in Europe does not make Europe representative of France. You can say the Eiffel Tower is in the Milky Way, but this is overly broad and imprecise; it makes far more sense to say it is in France.

Similarly, Tolkien borrowed from very specific Scandinavian sources, and that is borrowed; his works are not based upon these sources nor upon ancient Scandinavian culture. To conflate Scandinavia with Europe is the same mistake as conflating France with the Milky Way.

Your claim that Tolkien based his works on Europe is incongruous because it is overly general and imprecise, and it is also a pretty good example of the vagueness fallacy.


It's quite simply because Tolkien's work shows Germanic, Finnish, Greek, Celtic and Slavic mythological influences. That spans almost the entirety of Europe. I don't feel you are arguing in good faith here. Either you are willingly ignoring its European heritage or you just haven't done the research on it.


> It's quite simply because Tolkien's work shows Germanic, Finnish, Greek, Celtic and Slavic mythological influences. That spans almost the entirety of Europe.

He was a gifted linguist, influenced by Germanic, Celtic, Finnish, Slavic, and Greek language and mythology.[1]

Yes, indeed, as the wiki you've drawn from states, Tolkien, the man, was influenced by his studies of various ancient cultures. But to create his fiction he drew from his own life, his Christianity, his experiences during WWI, and Norse and Finnish mythology. Tolkien did not draw on the entirety of the catalog of European mythos to build his world. If you can show me, say, how one of his characters draws from characteristics of a specific Greek hero or god, or likewise for any Germanic, Celtic, or Slavic stories, I'd really be genuinely interested.

I have already named the specific Finnish source that Tolkien borrowed from, and provided a specific example of Tolkien borrowing from Norse sagas, namely, using characteristics of Odin for a few of his characters. Please provide any specific example of Germanic, Greek, Celtic or Slavic influence in Tolkien's work, and name the source. Just one will do, so please take your pick.

[1] https://en.wikipedia.org/wiki/J._R._R._Tolkien%27s_influence...


> Though Italy is firmly in Europe, describing lasagna as European food is at best misleading and at worst false.

I can’t understand what you mean by this at all and it sounds like an absurd thing to say. Can you please explain?


You're not at all touching on my argument. Read my reply to maxk42.


Wasn’t the whole point of Lord of the Rings to provide a (made-up) mythology for Great Britain? Middle Earth is ancient Great Britain (or maybe ancient NW Europe) according to Tolkien.


Do you have a citation on this assertion? I'm pretty sure when he said anything about it being a fake-prehistory, he said it was global prehistory.

He drew heavily from the European sources he knew, but I don't recall any implication that it was meant to be Britain or Europe only.



According to this Wikipedia article,

""" In his 2004 chapter "A Mythology for Anglo-Saxon England", Michael Drout states that Tolkien never used the actual phrase, though commentators have found it appropriate as a description of much of his approach in creating Middle-earth. """

So this is critical interpretation and not something he literally said he intended. His quotes (also in the article) suggest he was drawing from English and Norse mythology, of course, but not that the cosmology of Middle Earth is solely English.


There is no such thing as a "medieval US" with Native Americans dueling with broadswords. Maybe in Las Vegas.


Sorry, I can't wrap my head around how this comment relates to mine. Can you clarify?


>> Sorry, I can't wrap my head around how this comment relates to mine. Can you clarify?

> I don't recall any implication that it was meant to be Britain or Europe only.

LOTR has a few clear allusions to America, but it is about the feelings of an English literature professor and ex-soldier watching the good old times, the gorgeous nature and the European mythology that he loved being replaced and crushed by industrial development and world war.

The themes are universal and could be adapted to other places and other mythologies, but the story would lose part of its charm in the process.

Under a disguise of epic fantasy, the book is basically a metaphor for twentieth-century Europe in wartime, and it is filled with details taken directly from his real war experiences and depicted metaphorically or directly. The kind of details that you can't invent or wouldn't notice unless you had experienced them first. Details like describing how the infantry, traveling long distances on foot towards the battlefield, step off the path and start walking on the fresh grass bordering the road to alleviate their sore feet.

That experience, plus his obsession with consistency, his religious background (Christian humanism), and his expertise in European myths, old languages and literature, all blend together into a complex story that conveys an incredible sense of realism and immersion rarely achieved by other epic fantasy books.

Tolkien doesn't need to be lectured about including strong women characters in his work, or about the need to talk more about ecology, compassion or racism. Those themes are exquisitely treated in a book that is also filled with a sense of adventure and a sublime love for nature (to the extent of mentioning how the rising sun on a foggy day illuminates the spiderwebs along the path).

About racism: this is not "Uncle Tom's Cabin", for Peter Jackson's sake! Most of the book describes different races allying to fight against evil and befriending each other, while organically accepting that the others have different cultures and interests. So... Europe in the war. The book is as anti-racist as you can get.

Tolkien didn't deserve that, but most of all, he didn't need to be "improved" like that.


You're not at all touching on my argument. Read my reply to maxk42.


This is just getting wilder by the day; it's spectacular how badly this move has backfired. As others have commented, at this point all you need is someone willing to sell you the CSAM hashes on the darknet, and this system is transparently broken.

Until that day, just send known CSAM to any person you'd like to get in trouble (make sure they have iCloud sync enabled), be it your neighbour or a political figure, and start a PR campaign accusing the person of being investigated for it. The whole concept is so inherently flawed it's crazy they haven't been sued yet.


The "send known CSAM" attack has existed for a while but never made sense. However, this technology enables a new class of attacks: "send legal porn, collided to match CSAM perceptual hashes".

With the previous status quo:

1. The attacker faces charges of possessing and distributing child pornography

2. The victim may be investigated and charged with child pornography if LEO is somehow alerted (which requires work, and can be traced to the attacker).

Poor risk/reward payoff, specifically the risk outweighs the reward. So it doesn't happen (often).

---

With the new status quo of lossy, on-device CSAM scanning and automated LEO alerting:

1. The attacker never sends CSAM, only material that collides with CSAM hashes. They will be looking at charges under the CFAA, plus extortion and blackmail.

2. The victim will be automatically investigated by law enforcement, due to Apple's "Safety Voucher" system. The victim will be investigated for possessing child pornography, particularly if the attacker collides legal pornography that may fool a reviewer inspecting a 'visual derivative'.

Great risk/reward payoff. The reward dramatically outweighs the risk, as you can get someone in trouble for CSAM without ever touching CSAM yourself.

If you think ransomware is bad, just imagine CSAM-collision ransomware. Your files will be replaced* with legal pornography that is designed specifically to collide with CSAM hashes and result in automated alerting to law enforcement. Pay X monero within the next 30 minutes, or quite literally, you may go to jail, and be charged with possessing child pornography, until you spend $XXX,XXX on lawyers and expert testimony that demonstrates your innocence.

* Another delivery mechanism for this is simply sending collided photos over WhatsApp, as WhatsApp allows for up to 30 media images in one message, and has settings that will automatically add these images to your iCloud photo library.
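
To make the collision step concrete: against a differentiable perceptual hash, the standard technique is plain gradient descent on the pixels of the source image. A minimal PyTorch sketch of the general idea, where `model` (a differentiable surrogate of the hash function) and `target` (the hash embedding you want to collide with) are hypothetical placeholders, not Apple's actual network or data:

  import torch

  def collide(model, source_img, target, steps=1000, lr=0.01, eps=0.05):
      # Perturb the legal image within a small L-infinity ball so the
      # result still looks like the original to a human reviewer.
      delta = torch.zeros_like(source_img, requires_grad=True)
      opt = torch.optim.Adam([delta], lr=lr)
      for _ in range(steps):
          adv = (source_img + delta).clamp(0, 1)
          # Drive the surrogate's output toward the target hash embedding.
          loss = torch.nn.functional.mse_loss(model(adv), target)
          opt.zero_grad()
          loss.backward()
          opt.step()
          with torch.no_grad():
              delta.clamp_(-eps, eps)  # keep the perturbation visually small
      return (source_img + delta).detach().clamp(0, 1)

This is the rough shape of the proofs of concept circulating now: the output is a normal-looking image whose perceptual hash lands on a value of the attacker's choosing.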


Before they make it to human review, photos in decrypted vouchers have to pass the CSAM match against a second classifier that Apple keeps to itself. Presumably, if it doesn’t match the same asset, it won’t be passed along. This is explained towards the end of the threat model document that Apple posted to its website. https://www.apple.com/child-safety/pdf/Security_Threat_Model...


What happens if someone leaks or guesses the weights on that "secret" classifier? The whole system is so ridiculous even before considering the amount of shenanigans the FBI could pull by putting in non-CSAM hashes.


For better or worse, opaque server-side CSAM models are the norm in the cloud photo hosting world. I imagine that the consequences would be roughly the same as if Google's, Facebook's or Microsoft's "secret classifiers" were leaked.


But in the cloud setting they have the plaintext of what was uploaded. The attack described above is about abusing the lack of information Apple has, so they will report an innocent user to the authorities.


The voucher that Apple can decrypt once enough positives have been received contains a scaled-down version of the original. How else would Apple be able to even run a second hash function on the same picture?


Can't they just make a new one and recompute the 2nd secret hash on the whole data set fairly easily?

Also, the whole point is that it's fairly easy to create a fake image that collides with one hash, but doing it for 2 is exponentially harder. It's hard to see how you could have an image that collides with both hashes (of the same image mind you).


Two hash models are functionally equivalent to a particular type of single, double-sized hash model. So it shouldn't be any harder to recompute against a second hash, if that second hash were public.

Of course, it won't be public (and if it ever became public they'd replace it with a different secret hash).
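
To illustrate the equivalence: an attacker who somehow had both functions could simply optimize against their concatenation as if it were a single wider hash. A sketch, with `h1` and `h2` standing in for the public NeuralHash and the (hypothetically leaked) server-side hash:

  def combined_hash(image, h1, h2):
      # Colliding h1 and h2 simultaneously is the same problem as
      # colliding this single, double-width hash function.
      return h1(image) + h2(image)  # concatenated bit sequences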


If you have both models it is easy. If Apple manages to keep the server model private then it is hard.


You don’t need to have the weights. “Transfer attack” is a thing.


You can still hack someone's phone and upload actual CSAM images. That exposes the attacker to additional charges, but they're already facing extortion and all that anyway. I don't understand the "golly gee whizz, they'd have to commit a severe felony first in order to launch that kind of attack" argument.

Don't know why this hasn't already been used on other cloud services, but maybe it will be now that it's been more widely publicized.


How, exactly, did they train that CSAM classifier? Seeing as that training data would be illegal to possess. I'd be most interested in an answer on that one. They are willing to make that training data set a matter of public record on the first trial, yes?

Or are we going to say secret evidence is just fine nowadays? Bloody mathwashing.


They didn't train a classifier, just a hashing function.


honestly asking — why is it illegal?


It may not be, so honestly I think my objection is best dismissed. Once I ran down the actual chain I mostly sorted things out with a cooler head.

However, the line of thinking was: if Apple has a secondary classifier to run against visual derivatives, the intent is that it can say "CSAM/Not CSAM". Since NeuralHash can collide, that means they'd need something to take in the visual derivatives and match them against an NN trained on actual CSAM. Not hashes. Actual.

Evidence, as far as I'm aware, is admitted to the public record, and a link needs to exist and be documented in a publicly auditable way. That to me implies any results of an NN would necessarily require that the initial training set be included for replicability, if we were really out to maintain the full integrity of the chain of evidence that is used as justification for locking someone away. That means a snapshot of the actual training source material, which means large CSAM dump snapshots being stored for each case using Apple's classifier as evidence. Even if you handwave the government being blessed to hold onto all that CSAM as fitting comfortably in the law-enforcement-action exclusions, it's still littering digital storage somewhere with a lotta CSAM. Also, Apple would have to update their model over time, which would require retraining, which would require sending that CSAM source material to somewhere other than NCMEC or the FBI (unless both those agencies now rent out ML training infrastructure for you to do your training on, leveraging their legal carve-out, and I've seen or come across no mention of that.)

Thereby, I feel that logistically speaking, someone is committing an illegal act somewhere, but no one wants to rock the boat enough to figure it out, because it's more important to catch pedophiles than to muck about with the blast craters created by legislation.

I need to go read the legislation more carefully, so just take my post as a grunt of frustration at how it seems like everyone just wants an excuse/means to punish pedophiles, but no one seems to be making a fuss over the devil in the details, which should really be the core issue in this type of thing, because it's always the parts nobody reads or bothers articulating that come back to haunt you in the end.


I did a bit of reading as well and came across this; you might find it useful or interesting: https://www.law.cornell.edu/uscode/text/18/2258A At the end (h1-4), it details that providers must preserve the information they submit and also take steps to limit access to only people who need it. In this sense, then, it's not illegal for companies to possess CSAM. It's not a big leap to then assume that storing CSAM for the development of detection software is legal (or at least has been thoroughly cleared with the courts, which is about the same). PhotoDNA was developed twelve years ago, and I can't find anything about Microsoft ever being charged with possession or distribution of CP.


Interesting!

Thank you, that was what I was looking for that closes the gap somewhat.


Somehow this didn't solidify my trust in Apple! By this standard you can probably mount a half-decent defence of "ignorance" if you are even caught sending the colliding material. Add this whole debacle on top of what's going on in the EU parliament, and 2021 has been WILD for privacy.


It seems like I'm not going to sleep tonight.

Sure, there is hyperbole in OP's comment (CSAM ransomware and automated law enforcement aren't a thing yet), but we're a few steps from that reality.

Even worse, how long will it take until other cloud storage services such as Dropbox, Amazon S3, Google Drive et al. implement the same features? Or worse, are required by law to do so?

This sounds like the start of an exodus from the cloud, at least in the non-developer consumer space.


Cloud services generally already do this, for example, here is Google's report:

https://transparencyreport.google.com/child-sexual-abuse-mat...


Yeah, I was speaking in hyperbole, but the possible attack vectors this system enables are so powerful I felt it was warranted. Under this system you are able to artificially DDoS the organizations that verify whether CP was sent, by sending legitimate, low-res porn whose hash has been modified. You can trigger legitimate investigations by sending CSAM through WhatsApp or through social engineering. You can also fuck with Apple by sending obvious spam.

* With regard to the legislative branch, they can even mandate changes to this system that they aren't allowed to disclose. Once this system is in place, what is stopping governments from forcing other sets of hashes for matching?


And this is just one step away from Apple and Microsoft building this scanning into the OS itself (into the kernel/filesystem code, why not?!). This is beyond insane. Stallman was right. Our devices aren't ours anymore.

Now, to be fair, there would be a secondary private hash algorithm running on Apple's servers to minimize the impact of hash collisions, but what's important is that once a file matches a hash locally, the file isn't yours anymore -- it will be uploaded unencrypted to Apple's servers and examined. How easy would it be to shift focus from CSAM into piracy to "protect intellectual property"? Or some other matter?


Yup. As others have pointed out, if Apple was willing to lie about the extent of this system and its inception date, why should we suddenly trust that they won't extend its functionality? They themselves explicitly state that the program will be extended, so if this is the starting point, I don't think I will be around for the ride.

It's a shame, as I really love some of their privacy-minded features (e.g. the fine-grained control over access to the phone's sensors and/or media).


> Even worse, how long will it take until other cloud storage services such as Dropbox, Amazon S3, Google Drive et al implement the same features? Or worse, required by law to do so

They already do this. Google and Facebook have even issued reports detailing their various success rates…


So, everyone is going to turn off their iCloud sync and they won’t be a target anymore?


Well, according to the reports that are generally the source of these collisions, the hashing code has been on the device since around December 2020 (iOS 14.3):

https://old.reddit.com/r/MachineLearning/comments/p6hsoh/p_a...

If Apple hasn't been honest about WHEN it was built and added to their code base, why would anyone take their word for HOW it's being used, or for many of the other statements they are putting in their documents, at least until they are verified?


It doesn't necessarily mean that it will stop them from being a target, because Apple says this[1]:

> This program is ambitious, and protecting children is an important responsibility. These efforts will evolve and expand over time.

[1] https://www.apple.com/child-safety/


> This program is ambitious, and protecting children is an important responsibility. These efforts will evolve and expand over time.

"Think of the children" is the most recognizable trope in TV and film. They couldn't have phrased that to be more Orwellian.


Yes, until they add local scanning to macOS / iOS / iPad OS.


The attacker faces no charges because the colliding image can be a harmless meme.


LEO is not alerted automatically, where’d you get that idea?


They'd more or less have to be. Well, not necessarily 'police', but NCMEC.

I did work in automating abuse detection years back, and the US govt clearly tells you that you are not to open/confirm suspected, reported, or happened-upon CP. There are a lot of other seemingly weird laws and rules around it.


Those laws don’t apply if it’s part of the reporting process. Apple’s stated that they do a manual review to decide whether to send a report to NCMEC or not, just like other companies do.


Of course they do. If they didn't, every seedy pedo would be in the process of making a "report." It's probably also why Apple is using 'visual derivatives' for confirmation, rather than the image, though I can't find info on exactly how low resolution 'visual derivatives' are.

It is of course possible that companies may get some special sign off from LE/NCMEC to do this kind of work - I won't argue with you on that as I truly don't know. I can just tell you my company did not, and was very harshly told how to proceed despite knowing the nature of what we were trying to accomplish. But, we weren't anywhere near Apple big.

I remember chatting with our legal team, who made it explicit that the laws didn't cover carve-outs - basically, 'seeing' was illegal. But as you can imagine, police didn't come busting down our doors for happening upon it and reporting it. If you have links to law where this is not the case, I'll gladly eat crow. I've never looked myself and relied on what the lawyers had said.


They will be if you collide a low-res image that resembles CSAM.

Why would the person doing the manual review risk their job if they're unsure? Naturally, they will just play it safe and report the images.


Not resembles. The adversarial image has to match a private perceptual hash function of the same CSAM image that the NeuralHash function matched before a human reviewer ever looks at it.


Do you have any material on this private function?


Not beyond the documents Apple has shared. Presumably it will be kept that way given it prevents an adversarial attack against it.


Why wait? Just send them the pictures on Facebook Messenger or Gmail or Dropbox today.


I can't tell if you are being sarcastic. In case you are not, isn't the act of sending those pictures completely illegal?


People here are proposing intentionally creating image assets which collide with perceptual hashes of known CSAM (ignoring whether that is legal or ethical) and sharing those assets to effectively SWAT unaware targets.


They still seem to be under the impression that a neuralhash collision would be enough to do this, which it isn’t.


Oh, I think I misunderstood you. I thought you meant: instead of "sending images that collide with perceptual hashes of known CSAM", why not "send actual CSAM on Facebook Messenger or Gmail or Dropbox", and since those services also use some other detection algorithm, it will also incriminate the receiver.


Those services will take your account through the same, if not more invasive, process if you are found with a hash match like the ones being proposed in these comments. Unlike Apple, they’ve built interfaces that surface all your account activity to reviewers.


> Unlike Apple, they’ve built interfaces that surface all your account activity to reviewers.

You can't know this without independent audits.


In some ways, you can start to see the value in Apple’s system which lets the device user inspect what is stored in the associated data for later review.


I haven't seen anyone proposing actually doing it, but I think a lot of people are rightly pointing out that bad actors, black hats and the Russian mob are going to have a field day with their ability to do so.


I’m not sure how you can conclude the speculation is “right” without engaging with the fact that this hypothetical is addressed directly in the threat model document and hasn’t been pulled off successfully against any of the other services which do similar scanning. Why can’t I buy kompromat-as-a-service for your Gmail account?


Nah that's so 2020, 2021 is all about low resolution legitimate porn being transformed to match CSAM. Get with the times!


Those will trip up 2020’s systems as well!


But why low resolution porn?


So that you are able to bypass the manual reviews. It still looks like CSAM, but it isn't.


Imagine being a parent who took pictures of their own children bathing naked in their own backyard.

I don't know about you, but my parents certainly have lots of embarrassing pictures of me in their photo album.

There will be so many false positives in that system, it's ridiculous. It doesn't necessarily have to be a falsely colliding hash, but legitimate use cases that - by definition - are impossible to train neural nets on unless the data is being used illegally by Apple.


That’s not how Apple’s system works. It’s not an image classifier. Only actual images that are derivatives of known CSAM images (a database of 250k images) will match. Random images of kids will not match those at any greater frequency than any other image.


Counter-question: At what point is child porn actually child porn, socially and statistically speaking?

If I share that picture of my child with my friends and loved ones on Facebook - at what "scale" is it considered to be added to that database as child porn?

1k shares? 10k? Who's the one eligible to decide that? The judiciary? I think this scenario is a constitutional crisis, because there's no good solution to it in terms of law and order.


I think you're underestimating the severity of child abuse by orders of magnitude. CSAM is a database of child rape, not child nudity.


For now. You don't know what will be added next.

China will demand it to include pictures of the Tiananmen massacre.


Well, that's a pretty orthogonal concern to the above comment, which was worried about getting flagged for sharing pics of their own kids.


idk what's in the database, whether it's rape or nudes or both. Although depictions of sexual acts versus simple nudity seems like a logical place to draw a line, all the lines on adult pornography are arbitrarily drawn based on "community standards", and we're only a few decades removed from state-level bans on any nudes as "porn" in the US, including artistic photos. (Not to mention anti-sodomy laws.)

Even if what's in the database is 100% violently criminal as you suggest, and even if it remains limited to that material, we already have a process in place that denies the accused even a look at the evidence against them if a hash matches. What a horrific, Orwellian situation: someone sends you hash matches, the police raid your house, and now you can't even see what they think they have, or prove your own innocence.


You would presumably have the 30+ images on your device or in iCloud to prove your innocence.

For you to get caught up in this dragnet, 30+ images have to match NeuralHashes of known illegal images, thumbnails of those images have to also produce a hit when run through a private hash function that only Apple has, and two levels of reviewers have to confirm the match as well.


What does Apple even do in this situation? That media won't match known CSAM, but if you modify childhood images so that their hashes match CSAM, what does Apple do? There are just SO MANY things that can and will go wrong as people try to exploit this system.


You can’t modify your childhood images so their hash matches csam because the visual derivative won’t match.


In the digital age, I certainly wouldn't be taking such pictures, let alone uploading them to cloud storage. Not because of any concerns about neural hashing, but simply because I wouldn't want such pictures of my children getting leaked / stolen / hacked.


Why would anyone save CSAM to their photo library?


A hash collision allows you to create material that matches CSAM signatures, without being CSAM. This opens up a new class of attacks.

Specifically, many criminal actors don't touch CSAM because it's wrong. But some of these criminal actors will happily abuse legal systems, e.g. SWATTing.


I would gladly have a mobile phone full of memes that have been modified to match, just for the lulz. I honestly think every meme should be put through just to have "illegal memes"


Illegal memes. Finally. Illegal Pepe will be the crowning jewel of my rare Pepe collection.


Holy cow. An Illegal Pepe is just too good not to have.


Maybe this is how 4chan finally demolishes itself.


> A hash collision allows you to create material that matches CSAM signatures, without being CSAM.

This is not correct. Hash collisions won’t match the visual derivative.


Sorry, this is not even wrong.

The visual derivative is just a resized, very-low-resolution version of the uploaded image. "Matching the visual derivative" is completely meaningless. The visual derivative is not matched against anything, and there is no "original" visual derivative to match against.

If enough signatures match, Apple employees can decrypt the visual derivatives and see if these extremely low resolution images look to the naked eye like they could come from CSAM. If so, they alert the authorities. Given a way to obtain hash collisions, generating non-CSAM images that pass the visual derivative inspection is completely trivial.


> Sorry, this is not even wrong.

Probably a mistake to say things like this, when the public documentation contradicts you.

> The visual derivative is not matched against anything, and there is no "original" visual derivative to match against.

Bullshit.

Here is the relevant paragraph from Apple’s documentation:

“as an additional safeguard, the visual derivatives themselves are matched to the known CSAM database by a second, independent perceptual hash. This independent hash is chosen to reject the unlikely possibility that the match threshold was exceeded due to non-CSAM images that were adversarially perturbed to cause false NeuralHash matches against the on-device encrypted CSAM database. If the CSAM finding is confirmed by this independent hash, the visual derivatives are provided to Apple human reviewers for final confirmation.”

https://www.apple.com/child-safety/pdf/Security_Threat_Model...


I just want to be clear that I understand this... many images can result in the same hash, but the hash can and will be reversible into one image? And that image is a low-res porn photo derived from the algorithm's guesswork? So once a hash matches, they don't check if there was a collision and the photo is completely unrelated, they just see the CG porn? If that's the case, then why even look at the derived image?


No, this is not what's going on at all. The employees never see the original photos in the government CSAM hash database. Apple doesn't even have these photos: it's precisely the kind of content that they don't want to store on their servers. If some conditions are satisfied, the employees gain access to the visual derivatives (low-resolution copies) of your photos, and they judge whether these look like they could plausibly be related to CSAM materials.

The exact details of the algorithm are not public, but based on the technical summary that Apple provided, it almost certainly goes something like this.

Your device generates a secret number X. This secret is split into multiple fragments using a threshold secret-sharing scheme. Your device uses this secret number every time you upload a photo to iCloud, as follows:

1. Your device hashes the photo using a (many-to-one, hence irreversible) perceptual hash.

2. Your device also generates a fixed-size low resolution version of your image (the "visual derivative"). The visual derivative is encrypted using the secret X.

3. Your device encrypts some of your personally identifying information (device ids, Apple account, phone number, etc.) using X.

4. The hash, the encrypted visual derivative, and the encrypted personally identifying information are combined into what Apple calls the "safety voucher". A fragment of your key is attached to the safety voucher, and the voucher is sent to Apple over the internet. The safety vouchers are sent in a "blinded" way (with another encryption key derived using a Private Set Intersection scheme detailed in the technical summary), so that Apple cannot link them to specific files, devices or user accounts unless there's a match.

5. Apple receives the safety voucher. If the hash in the received safety voucher matches that of known CSAM content in the government-provided hash database (as determined by the private set intersection scheme), the voucher is saved and stored by Apple, and the fragment of your secret key X is revealed and saved. (You'd assume that they filter out / discard your voucher if there's no match; but the technical summary doesn't explicitly confirm this; this means that they may store and use it in the future to run further scans).

6. If your account uploads a large number of matching vouchers, then Apple will gather enough fragments to reassemble your entire secret key X. Now that they know your secret key, they can use it to decrypt the "visual derivatives" stored in all your saved vouchers.

7. An Apple employee will then inspect the "visual derivatives", and if your photos look like CSAM (more precisely, this employee can't rule out by visual inspection that your photos are CSAM-related), they will proceed to use your secret key X (which they now know) to decrypt the personally revealing information contained in your safety voucher, and report you to the authorities.

Keep in mind that the employee looking at the visual derivative does not, and cannot, know what the original image is supposed to look like. The only judgment they get to make is whether the low-resolution visual derivative of your photo looks like it can plausibly be CSAM-related or not. Plainly speaking, they will check if a small, say 48x48 pixel, thumbnail of your photo looks vaguely like naked people or not.
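
For the curious, the fragment/reassembly mechanic in steps 4-6 is standard threshold secret sharing. A toy Shamir-style sketch in Python, illustrating the general t-of-n idea rather than Apple's actual construction (the threshold of 30 is an assumption based on numbers floated in this thread):

  import random

  P = 2**127 - 1  # a Mersenne prime; our toy finite field

  def make_shares(secret, threshold, n):
      # Random polynomial of degree threshold-1 with f(0) = secret.
      coeffs = [secret] + [random.randrange(P) for _ in range(threshold - 1)]
      def f(x):
          return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
      return [(x, f(x)) for x in range(1, n + 1)]

  def recover(shares):
      # Lagrange interpolation at x = 0: with >= threshold shares the
      # secret pops out; with fewer, every candidate secret is equally likely.
      secret = 0
      for i, (xi, yi) in enumerate(shares):
          num, den = 1, 1
          for j, (xj, _) in enumerate(shares):
              if i != j:
                  num = num * -xj % P
                  den = den * (xi - xj) % P
          secret = (secret + yi * num * pow(den, -1, P)) % P
      return secret

  shares = make_shares(secret=123456789, threshold=30, n=1000)
  assert recover(shares[:30]) == 123456789  # 30 "vouchers" reveal the key

Below the threshold, the server learns nothing about X, which is why Apple can claim it cannot decrypt any visual derivatives until enough matches accumulate.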


> The exact details of the algorithm are not public,

The relevant parts are.

> but based on the technical summary that Apple provided, it almost certainly goes something like this.

It doesn’t go like that. You are simply wrong.


Seems like that would rule out using the system to detect ‘tank man’ images.


That bit you quoted seems to be actually correct. It does not mention visual derivatives at all.

That said I think your statement is a bit too strong, but generally true. A hash collision is not going to inherently be visually confusing. However you claim that it is impossible for an image to be both visually confusing and a hash collision, which seems unlikely. The real question is going to be how much more effort it takes to do both.


I didn’t claim it was impossible, just that hash collisions won’t match both.

Also, the information needed to create a full match simply is not available.


Are those not the same statement?

Unless you're relying on it being computationally infeasible, but I'm not sure we know enough to consider that true at this point. Usually when we make statements on those grounds we do so with substantial proof. I don't think we know enough to do so here. I'm not even sure how feasible it is when you throw DL into the mix.


From the docs: “as an additional safeguard, the visual derivatives themselves are matched to the known CSAM database by a second, independent perceptual hash. This independent hash is chosen to reject the unlikely possibility that the match threshold was exceeded due to non-CSAM images that were adversarially perturbed to cause false NeuralHash matches against the on-device encrypted CSAM database.”


> Are those not the same statement?

No.


Most people wouldn't, of course. In this scenario you'd get someone to download the CSAM unknowingly. If they have iCloud sync enabled, it automatically uploads to iCloud, thereby triggering the system. At that point the authorities will be alerted by Apple, and you can inform media outlets. They in turn will ask law enforcement, who will confirm the investigation, and the reputation of the person investigated will be tarnished.


Also, as dannyw pointed out, you don't even have to send CSAM to trigger the system. If they found you, you would still be charged, but not with possession of CSAM.


What exactly would you be charged with? Why would law enforcement even be involved in a case of false positives?


The sender would of course be charged with wasting police resources, attempted defamation, and more. In the case of false positives the receiver of course wouldn't be charged; it's more about the fact that this system can be manipulated with too much ease. Even if you're not charged, an investigation takes time away from already limited law enforcement resources. I'm also not interested in buying products from a company that blatantly spies on me. Today it's CSAM, but as others have pointed out, the hashes can be changed to look for anything.


Do you mean charging the sender of the trick images or the receiver?


Well, that depends on the situation. Regardless, the sender would be charged if found, but if they were able to get legitimate CSAM onto the receiver's phone, the receiver could possibly be charged too, or at least investigated. Just the idea of getting investigated in these kinds of attacks, much less being exposed publicly as being under investigation, is a horrible thought.


You specifically said someone would be sent known CSAM. How would that get added to their photo library?


I meant *could*. My point is that social engineering is a clear weak link in this system. They can also be sent regular photos whose hash matches the database, or you can use this repo to transform a regular pornographic photo's hash, making manual confirmation on Apple's part hard.


What kind of social engineering would lead an innocent person to save known CSAM to their photo library?


None needed. You could just send a photo to the target through WhatsApp, and the photo would be automatically synced with iCloud.


Wouldn't the photo be scanned for CSAM by WhatsApp first?


WhatsApp messages are E2E encrypted, so no.


What kind of social engineering would lead an innocent person to install malware on their devices? Or do you think people like that want to take part in an illegal DDoS botnet?


I think there’s a difference between “I’ll click this totally legit button to protect my computer from viruses” and “I’ll save this picture of a child being raped to my photo library.”

A lot of people may not know how to avoid malware. But I don’t think very many of them would be so inept as to accidentally long press on child porn and tap “Add to Photos”.


... and "I'll save this picture of an hilarious kitten to my photo collection"...

Fixed it for you.

The image to be saved doesn't have to be disturbing at all to trigger a hash collision.

The linked repo has code to modify an image to generate a hash collision with another unrelated image.

That's the whole point.


If some commenters can be believed about their experience with the database, there are a bunch of completely innocuous images in it because they're from the same photosets or distributed alongside CSAM.

Is that enough to cause an investigation? Maybe, maybe not, but I wouldn't want it to be a risk.


Photos in the database are classified for their content. Only images classified as A1 (A: prepubescent minor, 1: sex act) are being included in the hash set on iOS. So this doesn't even include A2 (2: lascivious exhibition), B1 or B2 (B: pubescent minor) let alone images which are in the database and aren't classified as any of A1, A2, B1 or B2.

While I've no doubt that there's a lot of "before and after" images (which are still technically CSAM even if they're not strictly child porn) and possibly many innocuous images, they would not have been flagged as "A1".

I'm sure there's probably still a few images flagged as A1 which shouldn't be in the database at all, but that number is going to be small. How many of these incorrectly flagged images are going to make their way into your photo library? One? Two?

You need 30 in order for your account to be flagged.


If someone is deliberately targeting you with them, 30 isn't very hard to reach.


I think it’s implausible that someone can become aware of 30 images which are miscategorised as A1 CSAM. How would this malicious entity discover them? What’s the likelihood that this random array of odd images could make it into a target’s photo library?

And what’s the likelihood that a human reviewer will see these 30 odd images and press the “yep it’s CSAM” button?

More likely, as soon as Apple's human reviewers see these oddball images, they're going to investigate, mark those hashes as invalid, then contact their upstream data supplier, who will fix their data, and now those implausible images are useless.


Lending your phone to someone for a call, then a quick AirDrop. Legitimate-looking emails with buttons. There's probably a list somewhere of proven attack vectors.


I posted another comment that was misunderstood as well. Folks, no one is proposing to download actual CSAM images to your photo lib. You could be duped thinking you downloaded an image of a beautiful sunset which was carefully manipulated to match the hash of an actual CSAM image.


The even worse part here is that not only could it impact an image of a beautiful sunset, which would fail the human check, it could impact a low quality version of legal porn, which could easily pass the human check and get passed on to law enforcement.

A sufficiently advanced catfishing attack could probably take advantage of this to get someone raided and have all their electronics confiscated.

Just send someone a zip of photos and let them extract it...


This is the really scary part. Getting someone to download actual blobs that correlate to CSAM would be one thing, but downloading regular photos that have nefarious hashes is a trend /pol/ could start in an afternoon.


The parent was proposing to “just send known CSAM”.

But OK, say someone sends you a sunset that fools the hasher. Then what? Of course one match won’t do anything, so you’d need to download however many matching sunsets. Then what? The Apple reviewer would see they’re sunsets and you’d challenge the flag saying they’re sunsets. And if somehow NCMEC got involved, they’d see they’re just sunsets. And if law enforcement got involved, they’d see they’re just sunsets.

These proofs of concept might seem interesting from a ML pov, but all they do is just highlight why Apple put so many layers of checking into this.


> But OK, say someone sends you a sunset that fools the hasher. Then what? Of course one match won’t do anything, so you’d need to download however many matching sunsets. Then what?

A real attack would be to take legal porn images and make them collide with illegal images, so that when a human goes to review the scaled-down derivative images, those images may very well look like they could be CSAM. Since there are many of them, they'd get sent to law enforcement. Then law enforcement would raid the victim's home and take all of their electronic devices in order to determine if they can be charged with a crime or not.


This is where the "fog of war" kicks in. What with doors being busted down, police departments making press releases, etc., I can easily imagine that the victim could be prosecuted, convicted and sent away because no one understood the subtlety that their legal porn was not in fact CSAM.


The fog of war is largely in the realm of post-puberty minors, photos of which are not being included in Apple's corpus of hashes. I find it difficult to believe that anyone could mistake or otherwise "fog of war" a photograph of an adult and a prepubescent minor.

And that's assuming someone develops a hash collision which doesn't substantially mangle the photograph like the example offered on Github.

Specifically, only images categorised as "A1" are being included in the hash set on iOS. The category definitions are:

  A = prepubescent minor
  B = pubescent minor
  1 = sex act
  2 = "lascivious exhibition"
The categories are described in further detail (ugh) in this PDF, page 22: https://www.prosecutingattorneys.org/wp-content/uploads/Pres...


> Specifically, only images categorised as "A1" are being included in the hash set on iOS.

Do we know that for sure?

Apple has changed their mind enough times in the last week and a half that I'm convinced they're in full on defensive "wing it and say whatever will get people off our backs!" mode.

You can't read the threat modeling PDF and conclude that it was run through the normal Apple document review process. It reads nothing like a standard Apple document - it reads like a bunch of sleep deprived people were told to whip it up and publish it.


That document is over six years old. It has nothing to do with Apple.


I don't really want to do the research, so I'll take your word for it.

But by fog of war I was thinking more like the victim already has some sleazy (though marginally legal) stuff on their computer, or a search led to a find of pot in their house, or they lied to try and get out of the rap, or perhaps the FBI offered them a deal and they took it because they saw no way out, or perhaps they were simply an unlikable individual who the jury took a dislike to.

Basically that things are not always clear cut, and they come out of the wrong side of things, in a situation created by Apple's surveillance.


Even if I grant all of the above, I don't see how any of that is impacted by the distinction between on-cloud scanning and on-device scanning of photos which are being uploaded to the cloud.

Surveillance is surveillance. It's a bit more obnoxious that a CPU which I paid money for is being used to compute the hashes instead of some CPU in a server farm somewhere (which I indirectly paid for) but the outcome is the same. The risk of being SWAT-ed is the same.


It would still be mentally draining to be accused of CP. Can you imagine how terrified one would be if they saw a warning message with a blurred sunset? I don't know exactly how the system works, but from Apple's press release, it hides the image and gives a warning to the user. This would not go over well on social media.


Remember, while you are refuting all this to each party, you are actually in the process of defending yourself against one of the worst criminal accusations possible. Your life will be investigated, your devices will be investigated - the amount of stress and reputational harm this causes is insane.


The point isn't to trick NCMEC, but rather to create a DoS attack so no actual triggers can get through the noise.


I thought the point was to SWAT some innocent person? The goal keeps changing.


But who would want that?

We all want privacy but it seems odd to try to DoS this, with high risk for yourself and very little to gain.

Might be useful when the system turns into mass political surveillance tho.


As I've commented elsewhere, DoS can be easily mitigated by implementing another layer with basic object recognition to filter out false positive collisions.


> You could be duped thinking you downloaded an image of a beautiful sunset

If it was anything like the image used to demonstrate this technique on Github, it's unlikely that anyone would describe that sunset as "beautiful". They'd be more likely to describe it as "bugger, this JPEG file is corrupted."


Attacks never get worse over time.

It was quite literally less than 24h from "Oh, hey, I can collide this grey blob with a dog!" to "Hey, this thing that looks like a cat hashes to the same thing as this dog!"

You really think this is going to end at this proof of concept stage?


Of course it will get better. But it's not going to end at "Hey, this photograph of a sunset is visually unchanged" while now matching CSAM. That's just not plausible. It's not how these classifiers work.

Regardless, this whole thing is moot because there are two classifiers, only one of which has been made public. Before any matches can make it to human review, photos in decrypted vouchers have to pass the CSAM match against a second classifier that Apple keeps to itself.


Match the first classifier, and your file gets uploaded unencrypted to Apple. Which is fine if it's probable CSAM. But what if they switch efforts to combat, say, piracy?


So your concern is that Apple will start doing something evil at any moment without your consent. That's been true of any computer platform since the advent of software updates. You can construct such hypotheticals with any company you like.


That’s not how the technology works. The files are never decrypted. Instead, if enough hashes match, a “visual derivative” is revealed. What a “visual derivative” is hasn’t been explained, but most people seem to think it’s a low-res version of the file.


Yes but that would be harmless because the visual derivative wouldn’t match.


Except that it isn’t. The hashes don’t enable an attack.


I really hope it won't be another Saigon, the videos coming out of the final days are harrowing. A lot of other countries are also closing their embassies for now, and seem to be more willing to bring home Afghans who've helped. In Norway we're also bringing their families.


Perception >>> Reality

Rest assured, there won't be another Saigon. The videos coming out of the Afghan countryside and Kabul will be erased by Big Tech for some corporate-speak reason or another. The masses will never see them. 'Covid Misinformation' (TM) is just the dry run.


> I really hope it won't be another Saigon, the videos coming out of the final days are harrowing.

Unless Biden changes course, it's going to be another Saigon, whether the pictures show it or not (e.g. they may get the Americans out before the Taliban is too close, and avoid pictures of throngs of desperate allies mobbing the embassy hoping to get evacuated with the last Americans, but most of those allies will still be stranded).


Yeah, I've already seen images that look exactly like the ones from that evacuation.


Can someone please explain to me how this comparison works? It seems trivial to alter any image containing CP slightly, so that its hash no longer matches.


Confusingly, it's a completely different use of the word "hash".

See here: https://en.wikipedia.org/wiki/Perceptual_hashing

The goal of a perceptual hash is to generate a number that will be the same for all "similar" looking images.

Think like what Shazam does, but in the visual domain.


Thanks for the response, I was super confused by this part!


Perceptual hashes are not related to byte-level hashes.


Yeah, I must have ignored the "perceptual" part when I read it over.



Thanks for this! :)


It's the same excuse the EU is currently using to infringe upon its citizens' privacy and require messaging application providers to install backdoors. It's an appeal to emotion, and since we have to assume that these legislators are intelligent, it's a disgusting overreach.


They are indeed overreaches, but even a valuable child protection law or service might presumably be pushed with an appeal to emotion. It’s better to familiarize yourself with the situation before dismissing it out of hand. While preserving a healthy dose of skepticism, of course.


I agree that one should look into what is being proposed and its implementation, but in both these instances backdoors are being introduced. Once backdoors are in place, any government can petition for access, malicious actors have attack vectors, and all of this for one proposed quasi-legitimate use case.


We are in agreement here.


Nice :)


The language in those scam articles is actually perfect, first time I've seen that.


Wow, the Norwegian on those scam web sites is actually perfect. Never seen that before.


That's because it's real content that they have stolen and republished. In SEO circles one likes to say that original content is king. Well, not so much after all.

