The Real History of AI, Part 1: Perceptrons, Symbolic AI, and the First Winter (1943–1980)
AI did not arrive with ChatGPT. The first mathematical neuron was described in 1943, a working perceptron was running in 1958, and in 1969 neural networks were buried for the next fifteen years. Where modern AI actually comes from.
Artificial intelligence as an academic field was named in summer 1956 at the Dartmouth conference. The first working learning neural network - Rosenblatt's perceptron - ran in 1958 on an IBM 704. By the time ChatGPT launched in November 2022, the idea of an artificial neuron was 79 years old. Most of the math underneath what people today call 'AI' was laid down between the 1940s and the 1980s.
Key facts
- 1943: Warren McCulloch and Walter Pitts formalized the first mathematical neuron - 13 years before the term 'artificial intelligence' even existed.
- 1958: Rosenblatt's Perceptron, running on an IBM 704, classified 20×20-pixel images at roughly 88% accuracy after training.
- 1965: Stanford's DENDRAL automated mass-spectrum chemical analysis - the first successful expert system.
- 1969: Minsky and Papert's book 'Perceptrons' proved a single-layer perceptron cannot compute XOR - neural network funding evaporated for nearly 15 years.
- 1973: The Lighthill report in the UK shut down government AI programs - the formal start of the first AI winter.
What Existed Before ChatGPT
When ChatGPT opened to the public in November 2022, much of the audience came away convinced AI had just been born. It had not. By that launch date the artificial neuron was 79 years old, the term "artificial intelligence" was 66 years old, the first working learning neural network was 64 years old, and the first commercial expert system - one saving its vendor tens of millions of dollars a year - was 40 years old.
This is part one of a five-part series on the actual history of AI - from the mathematical neuron of 1943 to the moment a chat interface was wrapped around mature technology and the world suddenly "discovered AI." This installment runs from the birth of the idea to the first AI winter of the 1970s.
1943: A Neuron on Paper Before the Computer Existed
In 1943, neurophysiologist Warren McCulloch and logician Walter Pitts published "A Logical Calculus of the Ideas Immanent in Nervous Activity." They formalized the biological neuron as a Boolean function: weighted inputs, a threshold, a binary output. This was pure mathematics - no working general-purpose computer existed yet (ENIAC would not run until 1945).
Their core claim: networks of these simplified neurons could in principle compute any logical function. The brain could be described as a logic machine, and a logic machine could - in principle - be built. This paper is the foundation of everything later called a neural network.
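To make the idea concrete, here is a McCulloch-Pitts neuron in a few lines of modern Python - a sketch for illustration only, with weights and thresholds chosen by hand (the 1943 paper contained no learning rule):

```python
# A McCulloch-Pitts neuron: weighted binary inputs, a fixed threshold,
# a binary output. The weights and thresholds below are hand-chosen
# for illustration; nothing here is learned.

def mcp_neuron(inputs, weights, threshold):
    """Fire (1) if the weighted sum of binary inputs reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# Logical AND: both inputs must be active to reach the threshold of 2.
def AND(a, b):
    return mcp_neuron([a, b], weights=[1, 1], threshold=2)

# Logical OR: a single active input is enough to reach the threshold of 1.
def OR(a, b):
    return mcp_neuron([a, b], weights=[1, 1], threshold=1)

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "AND:", AND(a, b), "OR:", OR(a, b))
```

Networks of such units, wired together, can compute any Boolean function - which is exactly the 1943 claim.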
1956: Dartmouth and the Birth of the Term
In the summer of 1956, John McCarthy, Marvin Minsky, Nathaniel Rochester, and Claude Shannon convened a workshop at Dartmouth College. Their 1955 funding proposal contained McCarthy's first written use of the phrase "artificial intelligence."
The stated goal sounds, by today's standards, both naive and ambitious: ten people, two summer months, machine intelligence as a problem they would meaningfully advance. The advance did not happen. What did happen was bigger: the field got a name, a community formed, and the Dartmouth attendees went on to set the agenda for the next twenty years of AI research.
1958: Rosenblatt's Perceptron - A Machine That Learned to See
In 1958, psychologist Frank Rosenblatt at the Cornell Aeronautical Laboratory unveiled the Perceptron - the first learning neural network implemented in hardware. The Mark I Perceptron took a 20×20-pixel image from a photocell array, ran it through a layer of artificial neurons, and classified it. The connection weights were stored in an array of physical potentiometers that motors, driven by the learning algorithm, literally turned.
Rosenblatt demonstrated a machine that learned from examples - not from hand-coded rules but from data. After a few hundred examples it reliably distinguished simple shapes. The New York Times in 1958 wrote that the Perceptron would learn "to walk, talk, see, write, reproduce itself, and be conscious of its existence."
This was the first public demonstration of learning by example. The modern philosophy of "data plus compute plus a simple update rule" is a direct descendant of the Perceptron.
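For the curious, here is that update rule as a minimal Python sketch. The rule is Rosenblatt's; everything else - the synthetic linearly separable dataset, the learning rate, the epoch count - is an illustrative stand-in, a long way from potentiometers and photocells:

```python
import random

# Perceptron learning rule, 1958: predict, compare with the label,
# nudge the weights by the error. The toy dataset below is synthetic
# and linearly separable; all hyperparameters are arbitrary choices.

def train_perceptron(samples, epochs=20, lr=0.1):
    dim = len(samples[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, target in samples:
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = target - pred          # -1, 0, or +1
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

# Toy task: label a point 1 if x + y > 1, else 0 (a single straight line
# separates the classes, so the perceptron can learn it).
random.seed(0)
data = []
for _ in range(200):
    x = (random.random(), random.random())
    data.append((x, 1 if x[0] + x[1] > 1 else 0))

w, b = train_perceptron(data)
hits = sum((1 if w[0] * p[0] + w[1] * p[1] + b > 0 else 0) == t
           for p, t in data)
print(f"weights={w}, bias={b:.2f}, accuracy={hits / len(data):.0%}")
```

The entire modern training loop - forward pass, error, weight update - is already visible in those dozen lines.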
1964–1972: Symbolic AI and the First Expert Systems
Parallel to connectionism, a more pragmatic branch grew: symbolic AI. Instead of neurons and weights, symbolic AI ran on rules: "if symptom X and lab result Y, then diagnosis Z with confidence 0.7."
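A toy sketch of what such a rule base looks like in code - the rules, facts, and certainty factors below are invented for illustration and are not from any real system:

```python
# A miniature forward-chaining rule engine. Each rule: if every condition
# is among the known facts, conclude something with a confidence value.
# All rules and numbers here are made up for illustration.

RULES = [
    ({"fever", "gram_negative_culture"}, "bacterial_infection", 0.7),
    ({"fever", "stiff_neck"}, "possible_meningitis", 0.6),
]

def infer(facts):
    """Fire every rule whose conditions are all present in the facts."""
    return [(conclusion, cf) for conditions, conclusion, cf in RULES
            if conditions <= facts]

print(infer({"fever", "gram_negative_culture", "cough"}))
# -> [('bacterial_infection', 0.7)]
```

Systems like MYCIN ran on hundreds of such rules, plus machinery for combining certainty factors - but the basic shape is this.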
Three landmark systems:
- ELIZA (Joseph Weizenbaum, MIT, 1964–1966) - a rule-based imitation of a Rogerian psychotherapist. Many users genuinely believed they were talking to a person - the first documented "ELIZA effect."
- DENDRAL (Stanford, 1965 onward) - automatic analysis of mass spectra to determine molecular structure. The first expert system that worked on real laboratory data.
- MYCIN (Stanford, 1972–1974) - diagnosis of bacterial infections and antibiotic selection. Blind tests showed MYCIN matching infectious-disease specialists.
Symbolic AI was commercially successful long before deep learning. XCON at DEC (early 1980s) automatically configured VAX computer orders and saved the company about $40 million a year. By the mid-1980s the expert-systems market was estimated in the billions.
1969: The Book That Froze Neural Networks for 15 Years
In 1969 Marvin Minsky and Seymour Papert published "Perceptrons." They proved mathematically that a single-layer Rosenblatt perceptron could not, in principle, compute XOR - an elementary logical operation. The book was sharply written and persuasive.
The proof was technically correct for single-layer networks. Multilayer networks with non-linear activations could solve XOR just fine, but no stable training algorithm for them existed in 1969 (backpropagation would not be popularized until 1986). Funding for neural-network research collapsed on both sides of the Atlantic. Rosenblatt died in a boating accident in 1971; the field had almost no defenders left.
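To see how thin the wall actually was, here is XOR solved by a two-layer threshold network with hand-set weights - hand-set because that was precisely the 1969 predicament: weights like these work, but no algorithm of the day could find them automatically:

```python
# XOR via one hidden layer of threshold units, weights set by hand.
# h1 computes OR, h2 computes AND; the output fires on "h1 AND NOT h2",
# which is exactly XOR - something no single-layer perceptron can do.

def step(z):
    return 1 if z > 0 else 0

def xor(a, b):
    h1 = step(a + b - 0.5)           # OR: at least one input active
    h2 = step(a + b - 1.5)           # AND: both inputs active
    return step(h1 - 2 * h2 - 0.5)   # h1 AND NOT h2

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", xor(a, b))     # prints 0, 1, 1, 0
```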
It was a textbook case of technically correct mathematics effectively killing a sound engineering idea for fifteen years. Symbolic AI took almost all of the field's resources.
1973–1980: The First AI Winter
By the early 1970s the field had stacked up systemic failures:
- Machine translation never reached commercial quality. The famous anecdote about translating "the spirit is willing, but the flesh is weak" into "the vodka is good but the meat is rotten" is undocumented, but it captures the late-1960s mood exactly.
- General intelligence did not emerge in 10 or 20 years, as the Dartmouth optimists had promised.
- The Lighthill report (1973), commissioned by the British government, concluded that AI research was failing to deliver and funding should be cut.
- DARPA in the US slashed speech-recognition programs after the failure of the Speech Understanding Research project.
From roughly 1974 to 1980, government money for AI nearly stopped. This is the first AI winter. Researchers migrated to adjacent fields - statistics, computational linguistics, knowledge bases - but the ideas never disappeared. They were simply waiting for hardware and better algorithms.
A Personal Anecdote: My First Intelligent Opponent
My first conscious encounter with what I later learned was "artificial intelligence" happened in the early 1990s. A neighbor had a ZX Spectrum running a chess program - I think it was Cyrus. I was a kid; the program reliably beat me; I was convinced something genuinely smart lived inside the machine.
Years later I learned what actually lived inside: the minimax algorithm with alpha-beta pruning - a 1950s idea whose behavior Donald Knuth and Ronald Moore analyzed rigorously in 1975. No neural network, no learning. Pure tree search over move sequences with a heuristic position evaluator. And that simple 1950s technique, running on an 8-bit processor in a 1982-vintage ZX Spectrum, was enough to make ten-year-old me believe in machine intelligence.
That, I think, is the central lesson of this part of the story. What we call "machine intelligence" rarely tracks how new the technology actually is. It tracks how well the technology is dressed up as a counterpart. The 1958 perceptron, ELIZA in 1966, and a chess program on a 1980s Spectrum produced exactly the same "magic" reaction in their contemporaries as ChatGPT did in 2022. The difference is that ChatGPT has nearly eighty years of accumulated mathematics standing behind it.
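Before moving on: for readers who want to see the trick itself, here is a generic sketch of minimax with alpha-beta pruning - the abstract technique, not the Cyrus program. The tiny game tree, the `children` function, and the `evaluate` scores are stand-ins for a real engine's move generator and heuristic position evaluator:

```python
# Minimax with alpha-beta pruning over an abstract game tree.
# `children(node)` yields successor positions; `evaluate(node)` is the
# heuristic scorer. Both are toy stand-ins here, not chess logic.

def alphabeta(node, depth, alpha, beta, maximizing, children, evaluate):
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)
    if maximizing:
        best = float("-inf")
        for child in kids:
            best = max(best, alphabeta(child, depth - 1, alpha, beta,
                                       False, children, evaluate))
            alpha = max(alpha, best)
            if alpha >= beta:   # the opponent will avoid this line: prune
                break
    else:
        best = float("inf")
        for child in kids:
            best = min(best, alphabeta(child, depth - 1, alpha, beta,
                                       True, children, evaluate))
            beta = min(beta, best)
            if alpha >= beta:
                break
    return best

# A hand-built two-ply tree: leaves carry their heuristic scores.
tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
scores = {"a1": 3, "a2": 5, "b1": 2, "b2": 9}
print(alphabeta("root", 2, float("-inf"), float("inf"), True,
                children=lambda n: tree.get(n, []),
                evaluate=lambda n: scores[n]))
# -> 3: the maximizer picks "a", expecting the minimizer's reply a1.
# Note that leaf b2 is never evaluated - that is the pruning at work.
```

That is the whole engine; everything else in a chess program is move generation and the evaluation heuristic.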
What to Take From This Era
Three claims that matter for everything that follows:
- By 1969, neural networks already learned from data. They did not work well - but the paradigm (inputs → weights → output → adjust by error) was already in place. Everything that happened in the 2010s was scaling that paradigm onto hardware that did not exist in 1969.
- Symbolic AI was commercially successful before neural networks were. Expert systems made real money in the 1970s and 1980s, when neural nets were an academic niche. Worth remembering whenever someone today claims "commercial AI began with GPT."
- AI winters happen not because ideas run out, but because results fall short of promises. The 1973 funding cuts came not because neural networks had failed but because researchers had promised "general intelligence in 10 years" and shipped ELIZA. The same pattern would repeat in the late 1980s. It may be waiting for us today.
In Part 2: the 1980s and 1990s - the return of neural networks via backpropagation, the SVM era, the OCR systems that quietly read your mail at the post office for decades, and why nobody called any of it AI.
Frequently Asked Questions
When did AI actually start?
The phrase 'artificial intelligence' was coined by John McCarthy in 1955 in the proposal for the 1956 Dartmouth conference. But the McCulloch-Pitts mathematical neuron was published in 1943, and Rosenblatt's working learning perceptron ran in 1958. By the time ChatGPT launched in November 2022, the trainable neural network as an idea was over sixty years old.
What was the first AI winter and why did it happen?
The first AI winter ran roughly 1974–1980, when DARPA in the US and UK government programs sharply cut research funding. Three causes: the perceptron underperformed its hype (Minsky and Papert's 1969 critique), machine translation never reached commercial quality, and the 1973 Lighthill report flatly called general AI goals unachievable. The money left, the researchers scattered into adjacent fields.
If the 1958 perceptron was so weak, why did we come back to neural networks?
Rosenblatt's perceptron was single-layer and genuinely could not solve linearly inseparable problems like XOR. But multilayer networks with backpropagation (1986) removed that limit, and GPU-scale training in the 2010s supplied the compute Rosenblatt never had. The idea was right; two pieces were missing - a learning algorithm for deep nets, and the hardware to run it.
Was symbolic AI ever commercially successful?
Yes, very. The XCON expert system at DEC (early 1980s) saved the manufacturer roughly $40 million a year configuring VAX orders. MYCIN matched infectious-disease physicians on antibiotic recommendations. By the late 1980s the expert-systems market was measured in billions - the first big commercial wave of AI, long before deep learning.
Why is this history so rarely told?
Because it doesn't fit the marketing arc that AI was born in November 2022. Journalists, investors, and startups all benefit from a clean rupture narrative: nothing existed before ChatGPT. The actual line never broke: McCulloch-Pitts → Rosenblatt → Minsky → Hinton → Vaswani → OpenAI. Each layer sits on the previous one. Without the 1960s, there is no 2020s.
Keep reading
The Real History of AI, Part 2: Backprop, SVM, and the Second Winter (1980–2000)
In 1986 neural networks got a working learning algorithm - and most of the industry didn't notice. While the world watched expert systems collapse, OCR was already reading your mail at the post office, and SVMs were quietly winning every benchmark. The story of 'hidden AI' between the two winters.
The Real History of AI, Part 5: From the Transformer to ChatGPT (2017–2022) and a GPT-2 Case Study
ChatGPT is not the arrival of AI. It is the arrival of UX on top of a technology that had been growing for five years: BERT, GPT-1, GPT-2, GPT-3, InstructGPT. I know because in 2019 I built a commercial news-rewriting product on GPT-2 - three and a half years before the world 'discovered AI.'
The Real History of AI, Part 4: The Deep-Learning Big Bang (2012–2017)
On September 30, 2012, deep learning stopped being an academic niche. AlexNet won ImageNet by a margin nobody had ever seen in the contest. Between that day and the December 2017 paper 'Attention Is All You Need' fit five years that contain almost all of modern AI's architectural magic - from word2vec to AlphaGo to GANs.