
YouTube generates several minutes of video per wall-clock second. Now, many of those videos are innocuous, but one must assume that occasionally someone uploads a video of a street fight or something more grotesque that is of interest to law enforcement or intelligence apparatuses.

That's public. You can analyze all of those. The NSA is free to pull them just as much as you and I.

And they don't as far as we can tell. Is it the cost of analyzing that much content? Is it that the NSA doesn't care? Is there something difficult about stripping audio off a video for keyword spotting?

Well I have a theory, and the theory is based off what little comes out of that side of the community. The theory is that the NSA can't meaningfully process the data it ingests. There's too much, it's too hard to query and they hit the same roadblocks of telling the difference between an actual crime and a videogame or fiction story.

So then we must ask, why do they want more? They have more data than they can analyze, why even bother ingesting more? It's not because it helps their mission, it's not because there's some value to it.

Well, why do we see regular businesses fall into this trap? A billion points of analytics data that they can't make sense of. When I see it, it's because it's easier to blame a lack of data than to explain the difficulty of the problem. You can always say "Well, I just don't have enough data" but it's much harder to explain that a bunch of crappy error-filled data isn't good for anything except wild goose chases. Adding more bad data doesn't improve the quality of your data, it just adds more of it.




I think it would feel pretty good to have a database of potentially incriminating evidence against a wide swath of the population that could be used if a person became a high-profile target. For example, if you're in one of those videos and then run for public office 10 years later you better hope the intelligence agencies like your positions and don't want to tank your chances.

So, no, they can't process all of it. But they can more easily trawl it for specific data they need. Especially 10 years from now.


It's a fine supposition, but these suppositions often get passed around as if they're true and self-evident. The reality is you don't have distinct information about what the government is collecting. Instead, what you have is information about what's probably possible.

From that standpoint it makes sense to err on the side of caution, and assume it's all being collected. But, while this is an effective risk calculus, it's different from having access to the ground truth.


> And they don't as far as we can tell.

That's wishful thinking which you have no evidence for. But let's assume that you're correct - eventually they will have a way to analyse it en masse.

There are, then, two things we need to bear in mind:

- is the time horizon likely to be close enough that data currently collected will be relevant then?
- if we allow the collection now, will it be easy to roll back that collection later when the threat is on the horizon?

The answer to both of those questions is yes. Similarly, we use high-strength encryption now, even if we think 128-bit is fine, because in time it won't be.

The above is theoretical. The next bit isn't - they will _always_ be able to decide that agent A should look at video B from N years ago.

They can't do that for a letter on the hypothetical table, or a message stored with strong encryption that stands the test of time - it won't exist in N years.


> The above is theoretical. The next bit isn't - they will _always_ be able to decide that agent A should look at video B from N years ago.

Unless they didn't collect and store it, as the parent suggests.


Why in the world would the NSA care about a street fight?

I have the opposite opinion: it is trivial and inexpensive to create and store an indexed archive of text from speech in audio, and to run image recognition models on video and pictures. There's value in having that data archived, so that they can go back and go through it should whoever created the data become a target in the future.

However, I doubt the NSA would waste resources investigating a street fight, but I'm pretty sure the video would be mined of any valuable data that could be gleaned from it.


>So then we must ask, why do they want more?

How do you know they want more?

[edit] Or was this meant as a rhetorical question? i.e., "who would want more in this case?"



