Artificial Intelligence (AI) systems, often referred to as the “alien minds” among us, are increasingly integrated into our daily lives. They power facial recognition in smartphones, determine credit scores, and even craft poetry. However, their unpredictable nature and the mystery surrounding their inner workings make them difficult to trust.
The unpredictability of AI
Trust is fundamentally built on predictability. When someone or something behaves in a way we anticipate, our trust in them grows. Conversely, unexpected behavior erodes that trust. AI systems, especially those built on deep learning neural networks, are notoriously unpredictable.
These networks, loosely modeled on the human brain, consist of interconnected “neurons” joined by connections of varying strengths (weights). As a network is exposed to training data, it adjusts those weights and “learns” to classify new, unseen data. But with millions, billions, or even trillions of parameters influencing each decision, understanding the exact reasoning behind any given output is nearly impossible. This is the AI explainability problem: the decision-making process is a black box.
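To make the black box concrete, here is a minimal sketch of such a network: a toy two-layer model with made-up weights standing in for anything actually learned (the applicant features, weights, and output are all hypothetical). Even with full access to every parameter, the numbers never translate into a human-readable justification; real systems simply have billions or trillions of them.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Connections of varying strengths": weight matrices that a real model
# would learn from data; here they are just illustrative random values.
W1 = rng.normal(size=(4, 8))   # input features -> hidden layer
W2 = rng.normal(size=8)        # hidden layer -> output score

def predict(x: np.ndarray) -> float:
    """Forward pass: nothing in here resembles a human-readable rule."""
    hidden = np.maximum(0.0, x @ W1)               # ReLU activation
    score = 1.0 / (1.0 + np.exp(-(hidden @ W2)))   # sigmoid -> probability
    return float(score)

applicant = np.array([0.2, 0.9, 0.4, 0.7])  # hypothetical normalized features
print(predict(applicant))  # a probability -- but *why* that value? The weights alone don't say.
```

Inspecting W1 and W2 tells you what the network will compute, not why that computation is the right one; the “reason” is smeared across every weight at once.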
AI vs. human decision-making
Consider a scenario where an AI-driven self-driving car faces the ethical dilemma of saving its passengers or avoiding a child on the road. A human driver can justify such a decision afterward in terms of ethics and societal norms; an AI cannot, because it has no way to rationalize its choices. And if we cannot ask it why, we also cannot anticipate what it will do in the next novel situation. This is the AI predictability problem, and it leaves a critical gap in trust.
AI behavior and human expectations
Unlike humans, who adjust their behavior based on perception, ethical norms, and social dynamics, AI operates with a static representation of the world derived from its training data. It cannot adapt to evolving ethical standards or societal expectations. This disconnect between AI’s unchanging model and dynamic human interactions makes it hard to keep AI behavior aligned with human expectations: the AI alignment problem.
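As a rough, hypothetical illustration of this static world model (the credit-scoring framing and the cutoff values are invented, not taken from any real system): a rule absorbed at training time keeps being applied long after the surrounding expectation has shifted, until someone retrains the model.

```python
# Hypothetical sketch: a norm "learned" from a past training snapshot stays
# frozen even after human expectations change. None of these numbers are real.

TRAINED_CUTOFF = 700   # cutoff the model absorbed from its training data

def frozen_model(score: int) -> str:
    # The model's "world view" is whatever the training snapshot contained.
    return "approve" if score >= TRAINED_CUTOFF else "deny"

# Later, expectations shift (say, 650 is now considered sufficient), but the
# deployed model keeps enforcing the old norm until it is retrained.
print(frozen_model(660))   # -> "deny", even though humans now expect "approve"
```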
The AI alignment challenge
The AI alignment problem becomes apparent when AI behaves in ways contrary to human instincts. For instance, a self-driving car might continue straight and strike a child rather than swerve to avoid it, a choice most human drivers would instinctively reject. Aligning AI behavior with human values and expectations remains a formidable challenge.
Critical systems and trust in AI
To bolster trust in AI, the U.S. Department of Defense mandates human involvement in AI decision-making, either in the loop (a person must approve each decision before it is carried out) or on the loop (the system acts on its own while a person monitors and can intervene). However, as AI adoption grows, nested AI systems may leave fewer opportunities for human intervention. Resolving the explainability and alignment problems before that point is essential, particularly in critical systems like electric grids and military systems, where trust is paramount.
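The difference between the two arrangements can be sketched in a few lines. Everything below (the function names, the “situation” strings, the idea of an override flag) is a hypothetical stand-in, not any real Department of Defense system or API; it only shows where the human sits relative to the decision.

```python
def model_decide(situation: str) -> str:
    # Stand-in for an opaque AI recommendation.
    return f"act-on:{situation}"

def human_in_the_loop(situation: str, human_approves: bool) -> str:
    """In the loop: nothing happens until a human signs off on each decision."""
    proposed = model_decide(situation)
    return proposed if human_approves else "hold for review"

def human_on_the_loop(situation: str, human_override: bool = False) -> str:
    """On the loop: the system acts autonomously; a human monitors and may step in."""
    action = model_decide(situation)
    return "abort" if human_override else action

print(human_in_the_loop("sensor-alert", human_approves=False))  # -> "hold for review"
print(human_on_the_loop("sensor-alert"))                        # -> acts immediately
```

The point is structural: in the first pattern a human is a gate in front of every action, while in the second the system moves first and the human can only react, which is why stacking AI systems inside one another tends to squeeze out that second pattern’s room for intervention.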
The alien nature of AI
AI is fundamentally different from human intelligence. Humans are largely predictable to one another because they share a common human experience. AI, by contrast, is alien: a human creation whose inner workings we have little insight into. Trustworthiness rests on predictability and conformity to shared norms, qualities AI currently lacks. Research continues to address these challenges, but the predictability and normative grounding required for trust in AI have yet to be fully realized.
Trusting AI remains a complex and evolving challenge. Its unpredictability, lack of explainability, and misalignment with human expectations pose significant hurdles. As AI spreads into critical systems, resolving these issues becomes ever more urgent. AI has already become an integral part of our lives, but earning trust in the AI systems of the future remains a profound quest that hinges on understanding, predictability, and alignment with human values.