A brief history of artificial intelligence

Despite huge advances in machine learning models, AI challenges remain much the same today as 60 years ago


In the early days of artificial intelligence, computer scientists attempted to recreate aspects of the human mind in the computer. This is the type of intelligence that is the stuff of science fiction—machines that think, more or less, like us. This type of intelligence is called, unsurprisingly, intelligibility. A computer with intelligibility can be used to explore how we reason, learn, judge, perceive, and execute mental actions.

Early research on intelligibility focused on modeling parts of the real world and the mind (from the realm of cognitive science) in the computer. It is remarkable when you consider that these experiments took place nearly 60 years ago.

Early models of intelligence focused on deductive reasoning to arrive at conclusions. One of the earliest and best known AI programs of this type was the Logic Theorist, written in 1956 to mimic the problem-solving skills of a human being. The Logic Theorist soon proved 38 of the first 52 theorems in chapter two of the Principia Mathematica, even finding a more elegant proof for one of them in the process. For the first time, it was clearly demonstrated that a machine could perform tasks that, until this point, were considered to require intelligence and creativity.

Soon research turned toward a different type of thinking, inductive reasoning. Inductive reasoning is what a scientist uses when examining data and trying to come up with a hypothesis to explain it. To study inductive reasoning, researchers created a cognitive model of the scientists working in a NASA laboratory, helping them to identify organic molecules using their knowledge of organic chemistry. The resulting Dendral program was the first real example of the second feature of artificial intelligence, instrumentality: a set of techniques or algorithms to accomplish an inductive reasoning task, in this case molecule identification.

Dendral was unique because it also included the first knowledge base, a set of if/then rules that captured the knowledge of the scientists, to use alongside the cognitive model. A program built around this form of knowledge would later be called an expert system. Having both kinds of “intelligence” available in a single program allowed computer scientists to ask, “What makes certain scientists so much better than others? Do they have superior cognitive skills, or greater knowledge?”
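To make the idea concrete, here is a minimal sketch of a knowledge base of if/then rules applied by forward chaining. The facts and rules are invented for illustration; they are not taken from Dendral's actual knowledge base.

```python
# Minimal forward-chaining sketch over a knowledge base of if/then rules.
# The facts and rules below are hypothetical illustrations, not Dendral's.

facts = {"has_oxygen", "has_carbonyl_peak"}

# Each rule: (set of required facts, fact to conclude)
rules = [
    ({"has_oxygen", "has_carbonyl_peak"}, "contains_carbonyl_group"),
    ({"contains_carbonyl_group", "no_terminal_ch"}, "contains_ketone"),
]

def forward_chain(facts, rules):
    """Fire any rule whose conditions are all satisfied until no new facts appear."""
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

print(forward_chain(set(facts), rules))
# The first rule fires and adds 'contains_carbonyl_group'; the second does not,
# because 'no_terminal_ch' is not among the known facts.
```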

By the late 1960s the answer was clear. The performance of Dendral was almost completely a function of the amount and quality of knowledge obtained from the experts. The cognitive model was only weakly related to improvements in performance.

This realization led to a major paradigm shift in the artificial intelligence community. Knowledge engineering emerged as a discipline to model specific domains of human expertise using expert systems. And the expert systems they created often exceeded the performance of any single human decision maker. This remarkable success sparked great enthusiasm for expert systems within the artificial intelligence community, the military, industry, investors, and the popular press.

As expert systems became commercially successful, researchers turned their attention to techniques for modeling these systems and making them more flexible across problem domains. It was during this period that object-oriented design and hierarchical ontologies were developed by the AI community and adopted by the broader computing community. Today hierarchical ontologies are at the heart of knowledge graphs, which have seen a resurgence in recent years.

As researchers settled on a form of knowledge representation known as “production rules,” a form of first-order predicate logic, they discovered that the systems could learn automatically; i.e., the systems could write or rewrite the rules themselves to improve performance based on additional data. Dendral was modified and given the ability to learn the rules of mass spectrometry based on empirical data from experiments.

As good as these expert systems were, they did have limitations. They were generally restricted to a particular problem domain, and could not distinguish among multiple plausible alternatives or make use of knowledge about structure or statistical correlation. To address some of these issues, researchers added certainty factors, numerical values that indicated how likely it was that a particular fact was true.
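As a rough illustration, here is the MYCIN-style rule for combining two positive certainty factors that each partially support the same conclusion. The hypothesis and the values are invented; this is a sketch of the idea, not the exact calculus of any particular system.

```python
# Combining certainty factors, MYCIN style: two independent rules each lend
# partial support to the same hypothesis, and the combined confidence grows
# toward (but never exceeds) 1.0. The factors below are made up for illustration.

def combine_positive(cf1, cf2):
    """MYCIN combination rule for two positive certainty factors."""
    return cf1 + cf2 * (1 - cf1)

cf_rule_a = 0.6   # support from rule A
cf_rule_b = 0.5   # support from rule B

print(round(combine_positive(cf_rule_a, cf_rule_b), 2))  # 0.8
```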

The second paradigm shift in AI began when researchers realized that certainty factors could be wrapped into statistical models. Statistics and Bayesian inference could be used to model domain expertise from empirical data. From this point forward, artificial intelligence would be increasingly dominated by machine learning.
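The shift can be illustrated with a single application of Bayes' rule, updating belief in a hypothesis after observing one piece of evidence. The probabilities below are invented purely for illustration.

```python
# Bayes' rule: P(H | E) = P(E | H) * P(H) / P(E), with P(E) expanded over H and not-H.
# All numbers are invented for illustration.

prior = 0.01            # P(H): hypothesis is true before seeing evidence
likelihood = 0.90       # P(E | H): probability of the evidence if H is true
false_positive = 0.05   # P(E | not H): probability of the evidence if H is false

evidence = likelihood * prior + false_positive * (1 - prior)
posterior = likelihood * prior / evidence

print(round(posterior, 3))  # 0.154 -- strong evidence, but the rare hypothesis is still unlikely
```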

There is a problem, though. Although machine learning techniques such as random forests, neural networks, and gradient boosted trees (GBTs) produce accurate results, they are nearly impenetrable black boxes. Without intelligible output, machine learning models are less useful than traditional models in several respects. For example, with a traditional AI model, a practitioner can ask and answer questions such as:

  • Why did the model make this mistake?
  • Is the model biased?
  • Can we demonstrate regulatory compliance?
  • Why does the model disagree with a domain expert?

The lack of intelligibility has training implications as well. When a model breaks and cannot explain why, it is harder to fix. Do we add more examples? What kind of examples? Although there are some simple trade-offs we can make in the interim, such as accepting less accurate predictions in exchange for intelligibility, the ability to explain machine learning models has emerged as one of the next big milestones to be achieved in AI.
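One way to see that trade-off is to compare a small, readable decision tree against a black-box ensemble on the same data. The sketch below assumes scikit-learn is installed and uses its bundled breast-cancer dataset only as a convenient example; the exact scores will vary.

```python
# Sketch of the accuracy-vs-intelligibility trade-off using scikit-learn
# (assumes scikit-learn is installed; the dataset is just a convenient example).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.ensemble import GradientBoostingClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A shallow decision tree: usually a bit less accurate, but its rules can be read directly.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

# A gradient boosted ensemble: usually more accurate, but effectively a black box.
gbt = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

print("tree accuracy:", tree.score(X_test, y_test))
print("gbt accuracy: ", gbt.score(X_test, y_test))
print(export_text(tree))  # the tree's decisions, printed as readable if/then splits
```

The ensemble will typically score a little higher, but only the tree's decisions can be printed and inspected rule by rule, which is exactly the kind of intelligibility the early expert systems offered by construction.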

They say that history repeats itself. Early AI research, like that of today, focused on modeling human reasoning and cognitive models. The three main issues facing early AI researchers—knowledge, explanation, and flexibility—also remain central to contemporary discussions of machine learning systems.

Knowledge now takes the form of data, and the need for flexibility can be seen in the brittleness of neural networks, where slight perturbations of data produce dramatically different results. Explainability too has emerged as a top priority for AI researchers. It is somewhat ironic how, 60 years later, we have moved from trying to replicate human thinking to asking the machines how they think.

Copyright © 2020 IDG Communications, Inc.