New Learning’s Updates

Reinforcement Learning: From Behaviorism to Artificial Intelligence (and Back Again)

Media embedded July 14, 2025


This video provides a detailed historical overview of learning theories, primarily focusing on behaviorism, and connects them to the three distinct ages of artificial intelligence (AI) and their applications in education. The speakers, Bill Cope and Mary Kalantzis, argue that while behaviorism fell out of favor in psychology, its core principles have been resurrected and are now fundamental to how modern AI systems learn.

The History of Behaviorism
The video begins by tracing the origins and development of behaviorism, a school of thought that emphasizes observable behaviors over internal mental states.

  • Ivan Pavlov ([5:09]): A Russian physiologist who pioneered the concept of the conditioned reflex. His famous experiments showed that dogs could be conditioned to salivate at the sound of a bell, demonstrating that physiological responses could be learned through association. This introduced the idea of reinforcement ([6:59]).
  • Edward Thorndike ([7:46]): Coined the "law of effect," which posits that behaviors followed by successful outcomes are more likely to be repeated, while those followed by errors are less likely to recur.
  • John B. Watson ([8:16]): Advocated for a psychology that only studies observable behavior, famously stating he could train any infant to become any type of specialist ([9:35]). His controversial "Little Albert" experiment ([10:49]) demonstrated that fear could be a conditioned response. Watson later applied these principles to the advertising industry ([11:35]).
  • B.F. Skinner ([12:38]): Developed the concept of operant conditioning, using rewards and punishments to shape behavior in animals and humans. He created devices like the "Skinner box" for his experiments and later designed a "teaching machine" ([22:58]) that applied these principles to education through a system of questions and reinforcements.
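Skinner's operant conditioning maps almost directly onto modern reinforcement learning: an agent repeats actions that earn rewards. A minimal sketch of the law of effect as an update rule (the two "levers," their payoff probabilities, and the learning rate are illustrative assumptions, not details from the video):

```python
import random

def condition(values, action, reward, learning_rate=0.1):
    """Law of effect as arithmetic: nudge the value of an action
    toward the reward it just produced."""
    values[action] += learning_rate * (reward - values[action])

def choose(values):
    """Pick the action with the highest learned value (greedy choice)."""
    return max(values, key=values.get)

random.seed(0)
values = {"lever_a": 0.0, "lever_b": 0.0}       # the agent starts indifferent
reward_prob = {"lever_a": 0.2, "lever_b": 0.8}  # hypothetical payoff rates

# Explore both levers at random, reinforcing whichever one pays off.
for _ in range(200):
    action = random.choice(list(values))
    reward = 1.0 if random.random() < reward_prob[action] else 0.0
    condition(values, action, reward)

print(choose(values))  # the better-paying lever wins out
```

The same value-update idea, elaborated with states and long-term discounting, underlies the reinforcement learning used in machine systems today.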

The Cognitive Revolution
The speakers explain how behaviorism was eventually challenged and largely replaced by cognitivism, which brought the focus back to the mind.

  • Edward C. Tolman ([26:14]): His experiments showed that even rats in a maze develop mental maps, suggesting that learning involves more than just stimulus-response.
  • Noam Chomsky ([27:23]): Delivered a critical blow to behaviorism with his review of Skinner's book Verbal Behavior. Chomsky argued that the complexity of language could not be acquired through simple reinforcement alone and proposed that humans have an innate grammatical structure in their minds.

Philosophical Roots ([29:44]): The debate is framed by long-standing philosophical ideas: John Locke's "blank slate" (nurture) versus the ideas of René Descartes ("I think, therefore I am") and Immanuel Kant, who emphasized the mind's inherent structures (nature).

The Three Ages of AI in Education
The video then connects this history to the evolution of AI, arguing that what was once considered too mechanistic for humans is now perfect for machines.

  • First Age: Symbolic, Rule-Based AI

This era was defined by expert systems where humans programmed explicit rules for the machine to follow.

The PLATO System ([44:39]): Developed at the University of Illinois, this was the first computer-mediated learning system. It used programmed learning, a computerized version of Skinner's teaching machine, to deliver instruction ([47:18]).
Seymour Papert's Logo ([54:42]): As a counter-movement, Papert created Logo, a programming language that allowed children to control a "turtle" on the screen. This was a shift towards constructionism, where the learner programs the machine, rather than the other way around.
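Papert's constructionism is easiest to see in the commands themselves: the child issues instructions like FORWARD and RIGHT, and the turtle obeys. A minimal non-graphical sketch of that idea (the command names follow Logo; the Python class is an illustration, not Logo's actual implementation):

```python
import math

class Turtle:
    """A Logo-style turtle tracked as an (x, y) position and a heading in degrees."""
    def __init__(self):
        self.x, self.y, self.heading = 0.0, 0.0, 0.0  # start at origin, facing up

    def forward(self, distance):
        # Move along the current heading; 0 degrees points straight up.
        self.x += distance * math.sin(math.radians(self.heading))
        self.y += distance * math.cos(math.radians(self.heading))

    def right(self, degrees):
        self.heading = (self.heading + degrees) % 360

# The classic first Logo program: walk a square and end where you began.
t = Turtle()
for _ in range(4):
    t.forward(100)
    t.right(90)

print(round(t.x), round(t.y), t.heading)  # back at the origin, facing up
```

The point of the exercise, for Papert, was that the child debugs their own thinking: if the turtle does not come home, the program (not the learner) is "wrong," and the child revises it.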

  • Second Age: Statistical, Data-Driven AI

This paradigm shift moved away from explicit rules to statistical pattern recognition in large datasets.

ImageNet ([58:43]): A massive project led by Fei-Fei Li that involved labeling millions of images. The AI learned to identify objects by finding statistical patterns in the pixel data, a process known as supervised machine learning.
Automated Essay Grading ([1:02:18]): Systems like ETS's e-rater were trained on thousands of human-graded essays. The AI learned to assign a grade by identifying statistical patterns in the text, without understanding the content itself.
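The supervised-learning logic behind both examples can be caricatured in a few lines: learn which surface features co-occur with high grades in the training data, then score new inputs by those features alone, with no model of meaning. A toy sketch (the training essays, grades, and word-level features are invented for illustration; real systems like e-rater use far richer features):

```python
from collections import defaultdict

def train(graded_essays):
    """Associate each word with the average grade of the essays it appears in."""
    totals = defaultdict(list)
    for text, grade in graded_essays:
        for word in set(text.lower().split()):
            totals[word].append(grade)
    return {word: sum(g) / len(g) for word, g in totals.items()}

def predict(model, text):
    """Score a new essay as the mean of its word scores - pure pattern
    matching, with no understanding of what the essay argues."""
    scores = [model[w] for w in text.lower().split() if w in model]
    return sum(scores) / len(scores) if scores else 0.0

# Hypothetical training set: grades happen to track academic vocabulary.
training = [
    ("the evidence therefore suggests a nuanced conclusion", 6),
    ("furthermore the analysis demonstrates the hypothesis", 5),
    ("i think it is good", 2),
    ("it is nice and good", 1),
]
model = train(training)
print(predict(model, "the analysis suggests a conclusion"))  # scores high
print(predict(model, "it is good"))                          # scores low
```

A fluent but vacuous essay stuffed with "academic" words would fool this grader, which is exactly the criticism leveled at statistical grading systems.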

  • Third Age: Text-Semantic, Generative AI

This is the current age of large language models (LLMs) like GPT.

How it Learns ([1:08:26]): These models are pre-trained on a massive corpus of text and learn the probabilistic relationships between words. The video highlights five ways humans are involved in this learning process, from creating the initial data to providing ongoing feedback.
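"Probabilistic relationships between words" can be made concrete with the simplest possible language model, a bigram counter. Real LLMs use deep neural networks over vast corpora, but the principle of predicting the next token from observed frequencies is the same (the tiny corpus here is invented):

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Count how often each word follows each other word."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows

def next_word(follows, word):
    """Predict the most frequent continuation - no grammar, no meaning."""
    return follows[word].most_common(1)[0][0]

corpus = [
    "the dog chased the ball",
    "the dog ate the food",
    "the cat chased the dog",
]
model = train_bigrams(corpus)
print(next_word(model, "the"))     # 'dog' - the most common word after 'the'
print(next_word(model, "chased"))  # 'the'
```

Scaled up by many orders of magnitude, with context windows of thousands of tokens instead of one word, this frequency-driven prediction is the pre-training the speakers describe.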
Reinforcement Learning with Human Feedback (RLHF) ([1:17:13]): This is a key process where humans fine-tune the AI to align its responses with desired values, preventing it from generating harmful or inappropriate content.
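At its core, RLHF is learning from pairwise preferences: a human judges "response A is better than B," and a scoring function is nudged until it agrees. A minimal Bradley-Terry-style sketch (the features, data, and learning rate are invented for illustration; production RLHF additionally fine-tunes the language model against this learned reward):

```python
import math

def score(weights, features):
    """Scalar reward for a response, as a weighted feature sum."""
    return sum(w * f for w, f in zip(weights, features))

def update(weights, preferred, rejected, lr=0.5):
    """One gradient step on the Bradley-Terry preference loss:
    raise the preferred response's score relative to the rejected one."""
    # Probability the model currently assigns to the human's choice.
    p = 1 / (1 + math.exp(score(weights, rejected) - score(weights, preferred)))
    for i in range(len(weights)):
        weights[i] += lr * (1 - p) * (preferred[i] - rejected[i])

# Hypothetical two-dimensional features: [is_helpful, is_harmful]
helpful_reply = [1.0, 0.0]
harmful_reply = [0.0, 1.0]

weights = [0.0, 0.0]  # the untrained reward model is indifferent
for _ in range(20):   # repeated human feedback: helpful beats harmful
    update(weights, helpful_reply, harmful_reply)

print(score(weights, helpful_reply) > score(weights, harmful_reply))  # True
```

Note the historical irony the speakers draw out: the human tuner rewarding and withholding reward is playing precisely Skinner's role, with the machine in the position of the pigeon.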

The Future of Learning with AI
The speakers conclude by discussing the implications of these technologies for modern education. They caution that many current educational apps represent a return to simplistic behaviorism ([1:41:26]). As an alternative, they present their own platform, Cyber Scholar ([1:28:04]), which aims for deeper learning by:

  • Using a curated knowledge base to ensure the AI provides feedback based on reliable sources selected by the teacher.
  • Employing rubric agents to align AI feedback with specific learning goals.
  • Providing tools that prompt students to think critically rather than doing the work for them.
  • Tracking student progress and their interaction with the AI to promote agency and prevent cognitive offloading.

The central message is that AI will not replace teachers, but the role of the teacher will evolve. Educators must learn to harness these powerful tools to design creative and productive learning environments that go beyond simple reinforcement ([1:39:50]).