Artificial Intelligence Is Setting Up the Internet for a Huge Clash with Europe

Isamu · on July 12, 2016

How to read the title:

"Artificial Intelligence" => algorithms using lots of user data

"the Internet" => companies using this data like Facebook, Google

"Europe" => recent EU regulations

"Huge Clash" => a bit of a dust-up

jzebedee · on July 12, 2016

"Algorithms using lots of user data are setting up companies using this data like Facebook, Google for a bit of a dust-up with recent EU regulations" just doesn't have the same ring to it.

nedsma · on July 12, 2016

Love this, brilliant.

forgottenpass · on July 12, 2016

As the real-world consequences of various types of software grow, society's expectations for them to be carefully built and well understood will grow. [0]

Bowing down to the techno-gods and expecting them to make better decisions than we do has serious ramifications when the decision making is literally inscrutable. Right now the conversation seems to be hovering at the level of "shut up you luddites, and accept it." That doesn't mean the system should be trusted. In fact, it shows the opposite by illustrating those who would use the systems have nothing but contempt for those they would use the system on.

A simple example of how an inscrutable system could be considered benign is making it well-constrained and doing a failure modes analysis. But the question becomes: is the analysis preformed? thorough? and are the results acceptable? For things like self-driving cars: of course. It'd have to be to get on the road. When the failure modes aren't both blindingly obvious and unquestionably unacceptable to the consumer? "lulz, move fast and brake things!"

I fear the only way to improve the systems thinking around machine learning at fly-by-the-seat-of-your-pants companies are things like the EU regulation forcing the issue. That's a bit depressing, isn't it? Hopefully the level of discourse can be improved, even though the providers of such technology are incentivized to paint resistance as simpletons that just don't get it.

[0] Bruce Schneier explains better than I can, but with focus on security https://www.youtube.com/watch?v=KWD2v0SJAM8&feature=youtu.be... through 30:30ish

anexprogrammer · on July 12, 2016

> But the question becomes: are they preformed? thorough? and the results acceptable?

Also, given that such code bases tend to be masses of closed code, who performed the tests or audited them?

Isamu · on July 12, 2016

From the linked paper (http://arxiv.org/pdf/1606.08813v1.pdf)

EU regulations on algorithmic decision-making and a “right to explanation”

Bryce Goodman, Seth Flaxman

We summarize the potential impact that the European Union’s new General Data Protection Regulation will have on the routine use of machine learning algorithms. Slated to take effect as law across the EU in 2018, it will restrict automated individual decision-making (that is, algorithms that make decisions based on user-level predictors) which “significantly affect” users. The law will also create a “right to explanation,” whereby a user can ask for an explanation of an algorithmic decision that was made about them. We argue that while this law will pose large challenges for industry, it highlights opportunities for machine learning researchers to take the lead in designing algorithms and evaluation frameworks which avoid discrimination.

From the regulation:

Article 11. Automated individual decision making

1. Member States shall provide for a decision based solely on automated processing, including profiling, which produces an adverse legal effect concerning the data subject or significantly affects him or her, to be prohibited unless authorised by Union or Member State law to which the controller is subject and which provides appropriate safeguards for the rights and freedoms of the data subject, at least the right to obtain human intervention on the part of the controller.

2. Decisions referred to in paragraph 1 of this Article shall not be based on special categories of personal data referred to in Article 10, unless suitable measures to safeguard the data subject’s rights and freedoms and legitimate interests are in place.

3. Profiling that results in discrimination against natural persons on the basis of special categories of personal data referred to in Article 10 shall be prohibited, in accordance with Union law.

robert_tweed · on July 12, 2016

The paper was itself posted to HN a few days ago. Discussion here:

https://news.ycombinator.com/item?id=12048223

norswap · on July 12, 2016

Personally, I see this as entirely positive. Note the law doesn't preclude automated decision making, it requires explanation of the decision process (which then has to be based on lawful criteria). This is what "fighting against evil IAs" looks like in the real world.

nraynaud · on July 12, 2016

I think there is something very good in that, we are training neural nets to be racist, maybe demanding what's in the box is a good way to limit that.

iraphael · on July 12, 2016

I'm pretty sure these kinds of regulations have existed in the US for a while too. According to [1], having an algorithm be interpretable (which NNs aren't) is a legal requirement in any financial decisions subject to anti-discrimination laws.

In this case, it doesn't necessarily mean there will be a HUGE CLASH. We will either turn to more interpretable AI, or continue developing more and more interpretability techniques for deep learning, which has been an area of very active research in the recent years.

[1] https://www.cs.princeton.edu/picasso/mats/Book_Schapire.pdf page 664

Zak · on July 12, 2016

I'm not sure if anti-discrimination laws require that the outcome be explainable as long as none of the inputs involve legally protected classes.

dragonwriter · on July 12, 2016

> I'm not sure if anti-discrimination laws require that the outcome be explainable as long as none of the inputs involve legally protected classes.

Yes, in general anti-discrimination law in the US, a disparate impact along a protected axis usually means that a decision must be justified on some neutral grounds. If there is no disparate impact attached to -- not just no express input related to -- some protected axis of discrimination, then, no, the outcome doesn't need to be explained.

mvitorino · on July 12, 2016

I think the article is drawing conclusions that may not really be the aim of the regulation. Fail to see how one could claim that the decision to show a particular Facebook Ad (or almost any Ad) would "significantly affect" the person. Just because something uses said "AI", does it mean that it would automatically be covered by this legislation.

Now, a self-driving car probably would be...

nitrogen · on July 12, 2016

Echo chambers significantly affect people on a broad scale.

mvitorino · on July 12, 2016

Agree, but you can't really prove someone was affected by an echo chamber, can you? I mean, it only really matters if you can sue someone for damages of some sort, right?

andrewclunn · on July 12, 2016

If the underlying code behind the neural network is open source, and a user has rights to access their own data, then doesn't that comply in every meaningful way? Considering that Facebook has already embraced open source AI and Google has given users near complete control in viewing and purging their history, I think that these companies are already gearing up for compliance.

darkr · on July 12, 2016

They would also need access to the data set the the neural network was trained with.

iwwr · on July 12, 2016

Part of that data may be the personal data of other people, equally bad to hand over. The whole neural net and would have to be deleted, along with any software component, database, cache etc. resulting from that which can personally identify people.

Neural nets trained on personal data may be illegal wholesale, now or very soon.

mring33621 · on July 12, 2016

Illegal? Setting us up for a nice WAR ON INFORMATION, huh?

Illicit NNs will then become VERY valuable.

jbooth · on July 12, 2016

Which then violates the other N-1 people's privacy, right? Why should Guilliame get access to Wilhelm or Guillermo's data?

I bet this conundrum never crossed the minds of any of the people writing the regulation.

IanCal · on July 12, 2016

> I bet this conundrum never crossed the minds of any of the people writing the regulation.

I expect it has, but they feel the ability for people to have an explanation for why they were denied insurance/credit/a job is more important than allowing companies to use processes they can't explain to make important decisions about your life.

blahi · on July 12, 2016

And I bet you haven't trained a model in your life.

You don't need the data. You need the model. And regulations like these are already in effect in the US. In banking, for example, it is prohibited to use protected classes (like race for example) or attributes that highly correlate with them, in your models.

jbooth · on July 12, 2016

What is it with hacker news commenters and assuming other people don't know things?

The comment I was responding to was talking about training sets, I riffed on that. Depending on your definition of "full explanatory power", the model itself might very well not be enough, especially in the case of neural networks. Could you take a set of weights in a 5-deep neural network, look at an input vector, and have any kind of intuition about the output? It could be that there's some really interesting research that I'm unaware of here, please let me know if there's something I'm missing.

You'll need additional descriptive statistics at the very least, combined with some stochastic meta-models to relate a given record to the "whys" of their output. It's doable without access to training sets, but convincing to us and convincing to a lawyer reading the letter of the law regarding "all inputs to the decision" are 2 different things.

blahi · on July 12, 2016

Where ANNs make life-affecting decisions? I can't say with certainty there aren't companies doing it, but if there are just regulate them out and require transparent models.

Speaking from experience, there is a massive shift away from ANNs in decision making. Companies (esp. in marketing) drank the Kool-Aid of some data scientists but figured out that they aren't all that useful. Turns out people care a lot more about inference and are willing to take a hit on prediction in order to be able to interact with the model. I'm not too familiar with ANNs but from the talks I've seen, I'm not even convinced they perform any better than other models.

So from what I've seen, ANNs have very limited application area which does not overlap with decision making, so I don't think there is a problem with them. And if there is, just ban them in decision making. That's not being luddite. We ban all sort of technologies because they are not appropriate.

jbooth · on July 12, 2016

I'm not sure what the boundary is for "life-affecting", if you're talking about loan origination or anything, then like you said, there's already a boatload of regulations there.

For marketing optimization, FWIW, my experience is the opposite. For a big enough application, any 0.5% improvement is hugely welcome, as long as it works reliably and you're not just gaming the metrics (or, to be honest, even if you are, yay corporate).

blahi · on July 12, 2016

Are you by any chance taking into consideration only adtech? Because there's much more to marketing than ads...

dharma1 · on July 12, 2016

Current ANN's basically do correlation, not causality. I'm not sure how you distill correlations from tens or hundreds of thousands of data points into natural a language explanation about a decision.

j2kun · on July 12, 2016

Why don't these regulations rule out _most_ uses of software to do anything? Map directions are based on location, search results are based on keywords that correlate with user-level predictors, autocomplete is based on what you typed in the past. It seems like every aspect of technology violates these regulations.

IanCal · on July 12, 2016

A good guide to follow is that if something seems just so obviously absurd, it's worth considering that it may be your interpretation that's invalid.

As Isamu quotes from the paper: https://news.ycombinator.com/item?id=12079343

> We summarize the potential impact that the European Union’s new General Data Protection Regulation will have on the routine use of machine learning algorithms. Slated to take effect as law across the EU in 2018, it will restrict automated individual decision-making (that is, algorithms that make decisions based on user-level predictors) which “significantly affect” users. The law will also create a “right to explanation,” whereby a user can ask for an explanation of an algorithmic decision that was made about them. We argue that while this law will pose large challenges for industry, it highlights opportunities for machine learning researchers to take the lead in designing algorithms and evaluation frameworks which avoid discrimination.

Autocompleting a field is not something that feels like it would fall under this. Getting rejected by all your banks for a mortgage because they all use the same prediction company would be.