Adversarial Examples, Optical Illusions, and Strange Artificial Intelligence

Abstract

The standard interpretation of adversarial examples deployed against image-recognition algorithms, as odd demonstrations of erroneous behavior that embarrassingly expose the limitations of these developing technologies, has recently been called into question by the controversial “feature, not a bug” view advanced by Andrew Ilyas and colleagues. Their research suggests a very different interpretation of the phenomenon: rather than a mere mistake to be engineered away, at least some adversarial examples appear to convey genuine but “non-robust” information, invisible to humans, to which these networks respond intelligibly in some sense. In this study, I discuss the further philosophical implications of this possibility for artificial intelligence, drawing a parallel between adversarial examples and the optical illusions to which humans are susceptible. I argue that this suggests our conception of intelligent behavior is drawn too narrowly around predictably “human” behavior, and that if we are serious about defining and developing general artificial intelligence, we must make conceptual space for, and prepare ourselves to encounter, systems that look far more alien than we have so far considered: what I call “strange artificial intelligence.”

Presenters

Kendra Chilson
PhD Student, University of California, Riverside, California, United States

Details

Presentation Type

Paper Presentation in a Themed Session

Theme

2023 Special Focus: Whose Intelligence? The Corporeality of Thinking Machines

Keywords

Philosophy of AI, Rationality, Neural Networks, Ethics of AI, Intelligence
