History of Artificial Intelligence

8 min briefing · March 14, 2026 · 15 sources

In 2012, a computer vision competition became the turning point for artificial intelligence. Alex Krizhevsky and his collaborators built a neural network called AlexNet that crushed the ImageNet competition, showcasing the raw power of deep learning for image classification [1]. This wasn't just a win.



Transcript

In 2012, a neural network named AlexNet shattered expectations at the ImageNet competition, revolutionizing the field of AI [1]. Created by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, AlexNet didn't just win — it demolished previous records for image recognition [1]. This wasn't a marginal improvement. The field had been stuck for years. What changed?

The breakthrough came from three converging forces: massive datasets like ImageNet itself, GPU computing power, and innovative training techniques such as ReLU activations and dropout [3]. Before AlexNet, these pieces existed separately. But bringing them together created something unexpected: a threshold moment where deep learning shifted from academic curiosity to genuine capability. AlexNet's victory is now considered a "Big Bang" moment for the field, dramatically improving image classification and sparking widespread interest in deep learning [4].

That spark ignited something bigger. Following AlexNet's success, major technology firms began investing heavily in deep learning, establishing dedicated AI laboratories and competing fiercely for top talent [6]. The momentum was undeniable. But momentum alone doesn't determine the future of a field; the underlying innovation kept accelerating too.

In 2014, just two years later, Generative Adversarial Networks — or GANs — were introduced [5]. This was a conceptual leap. Until then, deep learning excelled at recognition: looking at images and identifying what was in them. GANs flipped the problem. They enabled AI to generate realistic, novel data and images from scratch [5]. Instead of analyzing the world, AI could now create it. That distinction matters profoundly because generation requires a deeper kind of understanding.
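
To make that flip concrete: a GAN pairs a generator, which turns random noise into candidate samples, with a discriminator, which tries to tell those samples apart from real data, and the two networks train against each other. Below is a minimal sketch of one adversarial training step in PyTorch; the tiny MLPs, the synthetic "real" data, and the hyperparameters are illustrative assumptions, not details from the 2014 paper.

```python
import torch
import torch.nn as nn

latent_dim, data_dim, batch = 16, 2, 64

# Generator: noise -> candidate sample. Discriminator: sample -> P(real).
G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

real = torch.randn(batch, data_dim) + 3.0   # stand-in "real" data distribution
z = torch.randn(batch, latent_dim)          # random noise fed to the generator

# Discriminator step: push D(real) toward 1 and D(G(z)) toward 0.
fake = G(z).detach()                        # detach: don't update G on this step
loss_D = bce(D(real), torch.ones(batch, 1)) + bce(D(fake), torch.zeros(batch, 1))
opt_D.zero_grad(); loss_D.backward(); opt_D.step()

# Generator step: push D(G(z)) toward 1, i.e. fool the discriminator.
loss_G = bce(D(G(z)), torch.ones(batch, 1))
opt_G.zero_grad(); loss_G.backward(); opt_G.step()
```

The generator never sees real data directly; it improves only through the discriminator's judgment, which is exactly the sense in which generation demands a deeper model of the data than recognition does.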

But the next transformation came from an unexpected direction. The architecture that would power the modern AI boom emerged from research focused not on images, but on language. The 2017 paper "Attention Is All You Need" introduced the Transformer architecture, a radically different approach to how neural networks process information: instead of reading a sequence step by step, every token attends directly to every other token. This wasn't incremental tinkering with existing designs. Transformers became the foundational technology behind modern large language models.
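
The heart of that approach is scaled dot-product attention: every token computes a similarity score against every other token and takes a weighted average of their representations. Here is a minimal self-attention sketch, with illustrative shapes and without the masking or multi-head machinery of a full Transformer:

```python
import math
import torch

def attention(Q, K, V):
    # Each row of the output is a weighted mix of the value vectors,
    # weighted by query-key similarity (scaled to keep the softmax stable).
    scores = Q @ K.transpose(-2, -1) / math.sqrt(Q.size(-1))
    weights = torch.softmax(scores, dim=-1)
    return weights @ V

x = torch.randn(1, 5, 8)   # a batch of 5 tokens, each an 8-dim embedding
out = attention(x, x, x)   # self-attention: the sequence attends to itself
print(out.shape)           # torch.Size([1, 5, 8])
```

Because every token reaches every other token in a single step, the computation parallelizes cleanly on GPUs, which is a large part of why these systems scaled so readily.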

The chain reaction had begun. Once researchers understood how to scale these systems, growth accelerated dramatically. The story of how that played out — and what emerges when you push these systems to their limits — is where the real surprises lie.


And that acceleration wasn't just academic. The field shifted from theoretical pursuit to industrial priority. What had seemed like specialized computer science suddenly looked like the future of entire industries.

But here's what's crucial: deep learning didn't emerge in a vacuum. To understand why neural networks suddenly became powerful, you have to go back to where the whole field began—and why it nearly died.

In 1950, Alan Turing proposed the "Turing Test" as an early standard for evaluating machine intelligence [8]. Rather than asking "Can machines think?" directly, Turing suggested a simpler experiment: if a machine could fool a human into believing it was human through conversation alone, then it deserved to be called intelligent. This wasn't just philosophy. It was a concrete proposal for what machine intelligence might look like, and it shaped how researchers framed their ambitions for decades.

That ambition crystallized six years later. The term "Artificial Intelligence" was coined and the field was formally established at the 1956 Dartmouth workshop by pioneers including John McCarthy and Marvin Minsky [9]. This wasn't a quiet academic moment. Researchers gathered convinced they were on the threshold of something transformative—that machines might soon match or exceed human reasoning. The optimism was real.

What followed was the era of Symbolic AI, which dominated the 1960s and 1970s and focused on logic and formal reasoning [10]. The bet was straightforward: if you could encode knowledge as symbols and rules, machines could manipulate those symbols to solve problems. One striking example: in 1961, James Slagle developed a symbolic AI program called SAINT for his dissertation at MIT [11]; it could solve freshman-level calculus integration problems by applying symbolic rules. These were genuine achievements. But they masked a fundamental limitation. Symbolic systems couldn't learn. They couldn't adapt to situations their designers hadn't anticipated. And they needed exhaustive, hand-coded knowledge to do anything at all.
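
To see what "encode knowledge as symbols and rules" means in practice, here is a toy rule-based differentiator in the same spirit: each calculus rule is hand-coded by the programmer, and the program applies them by pattern-matching on an expression tree. This is an illustration of the approach, not SAINT itself (SAINT tackled symbolic integration, a harder problem), and the tuple encoding is an assumption chosen for brevity.

```python
# Expressions are numbers, variable names, or tuples like ('+', a, b).
def diff(expr, var):
    if isinstance(expr, (int, float)):      # rule: c' = 0
        return 0
    if expr == var:                         # rule: x' = 1
        return 1
    op, a, b = expr
    if op == '+':                           # rule: (a + b)' = a' + b'
        return ('+', diff(a, var), diff(b, var))
    if op == '*':                           # rule: (a * b)' = a'b + ab'
        return ('+', ('*', diff(a, var), b), ('*', a, diff(b, var)))
    raise ValueError(f"no rule for operator {op!r}")

# d/dx (x * x + 3) -> a tree encoding 1*x + x*1 + 0, left unsimplified
print(diff(('+', ('*', 'x', 'x'), 3), 'x'))
```

The limitation described above falls straight out of this design: the program handles only the operators someone wrote a rule for, and nothing in it learns from examples.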

Reality collided with expectation. In 1974, criticism from the Lighthill Report and pressure from the US Congress led the US and British governments to stop funding undirected AI research [12]. Promises had been made. They weren't kept. The result was the first "AI Winter," a period of reduced funding and interest that lasted from the mid-1970s into the 1980s, driven by unmet promises and computational limits [13]. Funding evaporated. Researchers scattered. It seemed the dream was over.

Then something unexpected happened. A second period of high investment occurred in the 1980s, when business investment in AI grew from a few million dollars in 1980 to billions by 1988, driven by expert systems [15]. Companies saw practical value in capturing human expertise in rule-based programs. The field was alive again—though still shackled by its symbolic past. That constraint wouldn't break until neural networks learned to scale.

Thanks for listening to this VocaCast briefing. Until next time.

Sources

  [1] Understanding AlexNet: The 2012 Breakthrough That Changed AI ...
  [2] History of artificial neural networks
  [3] AlexNet Source Code Release: Preserving a Historic AI Milestone
  [4] History of Deep Learning & AI: Key Milestones
  [5] History and Development of Neural Networks in AI - Codewave
  [6] Part V: The Deep Learning Era and the Modern AI Revolution (2012 ...
  [7] Part V: The Deep Learning Era and the Modern AI Revolution (2012 ...
  [8] The History of Artificial Intelligence - Swiss Cyber Institute
  [9] Unknown
  [10] Unknown
  [11] Unknown
  [12] History of artificial intelligence - Wikipedia
  [13] Unknown
  [14] The History of AI: A Timeline of Artificial Intelligence - Coursera
  [15] History of AI Winters