You can usually come up with an explanation for why you did something. You don't have to "get to the bottom" of your own thought processes to do this: you just need to be able to reconstruct the symbolic manipulation part. This seems like a good thing for an AI to be able to do -- especially a truly "hard" AI that you'd trust to run things at a high level.
Humans often come up with an explanation for why they did something that is simple, consistent, and completely untrue. They sometimes even believe their own explanations. We may run into the same problem with an AI: intentionally or not, it may provide us with untrue explanations.
> participants fail to notice mismatches between their intended choice and the outcome they are presented with, while nevertheless offering introspectively derived reasons for why they chose the way they did
I, for one, think that would be a much more fascinating problem than implementing neural nets where the representation of the underlying data is heavily obscured.
If only it were possible. The only sufficiently powerful model for reasoning that comes to mind is Bayesian networks, and those suffer from the same problems as neural nets: nodes, edges, and values may not correspond to any meaningful symbolic content that would be useful to report. Again, this seems in line with the human mind; we invent symbols to describe groups of similar concepts, with borders that are naturally fuzzy and fluid.
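To make that concrete, here is a minimal sketch in plain Python (the numbers are made up and the variable names X0/X1 are deliberately arbitrary): a tiny Bayesian network answers a query perfectly well, yet nothing about its nodes, edges, or conditional probabilities has to correspond to a concept a human could name.

```python
# A tiny hand-built Bayesian network over two binary variables, X0 -> X1.
# The probabilities are invented; the point is that inference works fine
# even though nothing ties X0 or X1 to a human-nameable concept.
P_X0 = {True: 0.3, False: 0.7}                    # prior over the parent node
P_X1_given_X0 = {True:  {True: 0.9, False: 0.1},  # CPT: P(X1 | X0)
                 False: {True: 0.2, False: 0.8}}

def posterior_x0(x1_observed: bool) -> float:
    """P(X0=True | X1=x1_observed), by enumeration and normalization."""
    joint = {x0: P_X0[x0] * P_X1_given_X0[x0][x1_observed]
             for x0 in (True, False)}
    return joint[True] / sum(joint.values())

print(posterior_x0(True))  # ~0.66 -- a correct answer about... what, exactly?
```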
If the AI tells us: "I did it because #:G042 and #:G4285 belong to the same #:G3346 #:G4216, while #:G1556 and #:G48592 #:G4499 #:G22461 #:G48118", I don't think we'll have learned anything useful.
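Here is a small sketch of that failure mode in plain Python (the `gensym` helper and the example labels "dog", "wolf", etc. are made up for illustration): the same "reason" rendered twice, once over freshly minted machine identifiers and once over symbols we already share.

```python
import itertools

# Mint fresh, uninterned-style identifiers, much like Lisp's gensym
# (#:G042, #:G4285, ...): unique and internally consistent, but
# meaningless to a human reader.
_ids = itertools.count(42)

def gensym(prefix: str = "G") -> str:
    return f"#:{prefix}{next(_ids)}"

def explain(a: str, b: str, category: str, kind: str) -> str:
    """Render the model's 'reason' over whatever symbols it happens to have."""
    return f"I did it because {a} and {b} belong to the same {category} {kind}."

# Internal vocabulary: a private symbol per concept the model has formed.
internal = {name: gensym() for name in ("dog", "wolf", "taxonomic", "family")}

print(explain(*internal.values()))
# I did it because #:G42 and #:G43 belong to the same #:G44 #:G45.

print(explain("dog", "wolf", "taxonomic", "family"))
# I did it because dog and wolf belong to the same taxonomic family.
```

The structure of the two outputs is identical; the only difference is whether the symbols are grounded in concepts we already share, which is exactly what the gensym-style report is missing.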