Why is learning effortful? Why do we struggle to learn calculus but easily learn our mother tongue? How can we make hard skills easier to learn? Cognitive load theory is a powerful framework from psychology for making sense of these questions.
Cognitive load theory, developed in the 1980s by psychologist John Sweller, has become a dominant paradigm for the design of teaching materials. In this essay, I explain the theory, some of its key predictions, and potential applications for your learning.
Why Is Most Learning Hard?
The central concept in cognitive load theory is that we have limited mental bandwidth for dealing with new information, but no such limitations when dealing with previously mastered material.
For example, the first time you saw an algebraic expression (e.g., 4 + x = 7), you might have been a bit confused by the “x.” The idea of moving statements probably seemed strange—before that, you just had to calculate what was on the other side of the equals sign.
However, notice what wasn’t confusing: You already knew the numbers. You knew what “+” meant. These things probably didn’t stand out at all since you already understood them. Imagine how much harder it would be to understand algebra if you didn’t already know these things.
This phenomenon explains why we can struggle with challenging classes. If we are missing foundational patterns in long-term memory, instruction may require us to juggle too many new pieces of information simultaneously. Some of these will slip out of working memory, and we’ll fail to learn.
Why Are Some Subjects Learned Effortlessly?
The working memory system is a form of conscious learning. But not all learning is conscious. Psychologists have long marveled at children’s ability to acquire perfect pronunciation in their first language or recognize faces. People socialize into cultures without always being able to articulate those cultures’ rules.
Cognitive load theorists argue that we’re evolutionarily predisposed to learn certain patterns of information. Some of these skills and subjects are acquired without effortful cognitive processing.1
Other skills (such as literacy and numeracy) have not been around long enough for us to have innate learning mechanisms. Instead, we learn these skills by relying on other, innate learning mechanisms (letter recognition seems to co-opt parts of the brain designed for recognizing faces) and more general-purpose learning mechanisms that involve conscious processing.2
This distinction helps explain why we learn some things effortlessly, while other subjects require years of specialized training.
Three Types of Cognitive Load
Cognitive load theory separates three different demands that learning puts on our limited working memory capacity:
- Intrinsic load. The combined attention that’s necessary to learn the pattern that will be put into long-term memory.
- Extraneous load. Unnecessary load that distracts from learning the pattern. Obvious distractions that eat up working memory, such as television in the background, make learning harder. But extraneous load also includes mental work that the materials demand even though it isn’t needed to learn the subject. Poorly organized study materials are a common source. Examples include having to flip between pages to understand a diagram, or making students figure out a pattern that could be taught explicitly.
- Germane load. Efforts that improve learning outcomes but are not strictly necessary to learn the pattern. Some forms of germane load include self-explanations and retrieval practice, both of which are effortful but increase the ability to recall a pattern later.3,4
Initially, I found germane load confusing. If excessive cognitive load impedes learning, isn’t the category of “germane” load just a sneaky way of saying sometimes it doesn’t?
Not quite. Working memory has a fixed capacity. If the intrinsic load fills the entire available space, any additional load will be harmful. However, if intrinsic load is not near the maximum, the “spare” capacity can be used for activities that deepen learning.
Consider variable practice, the idea of practicing a skill across a wider range of problems and in different contexts. It’s harder than practice that covers only a narrow range of problems. Yet, there’s evidence that variable practice leads to better learning and transfer.5
However, the learning benefit of variable practice only occurs when working memory isn’t overwhelmed. If it is, then simpler forms of practice become preferable.6
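The capacity argument above can be sketched as a toy budget calculation. This is only an illustration, assuming the three loads simply add up against a fixed capacity. The `Task` class, the helper functions, and every number here (including a capacity of 7 "slots") are hypothetical and not measurements from the research.

```python
from dataclasses import dataclass

# Hypothetical units: "slots" of working memory. The capacity value is
# an illustrative assumption, not a measured quantity.
CAPACITY = 7.0


@dataclass
class Task:
    name: str
    intrinsic: float   # load needed to learn the pattern itself
    extraneous: float  # load from distractions or poorly organized materials
    germane: float     # optional load that deepens learning (e.g. variable practice)


def spare_capacity(task: Task, capacity: float = CAPACITY) -> float:
    """Capacity left over after intrinsic and extraneous load are accounted for."""
    return capacity - task.intrinsic - task.extraneous


def is_overloaded(task: Task, capacity: float = CAPACITY) -> bool:
    """True when the total of all three loads exceeds capacity."""
    return task.intrinsic + task.extraneous + task.germane > capacity


def affordable_germane(task: Task, capacity: float = CAPACITY) -> float:
    """Germane load that fits: never negative, never more than requested."""
    return max(0.0, min(task.germane, spare_capacity(task, capacity)))


# Same amount of variable practice (germane load), two different starting points.
easy = Task("familiar material, varied problems", intrinsic=3, extraneous=1, germane=2)
hard = Task("brand-new material, varied problems", intrinsic=6, extraneous=1, germane=2)

print(is_overloaded(easy))       # False: 3 + 1 + 2 = 6, within the capacity of 7
print(is_overloaded(hard))       # True: 6 + 1 + 2 = 9, over the capacity of 7
print(affordable_germane(hard))  # 0.0: no spare room, so simpler practice is preferable
```

In this sketch the same two units of variable practice are harmless for the familiar material and overwhelming for the new material, which is the pattern the paragraph describes.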
Key Experiments in Cognitive Load Theory
Over the past few decades, cognitive load theory has amassed a lot of interesting experimental effects with catchy-sounding names. Here are a few:
1. The Worked-Example Effect
Traditionally, math education has focused on having students solve problems to get good at math. Sweller and Cooper pushed back against this idea, showing that studying worked examples (problems, along with detailed solutions) is often more efficient.7
Worked examples have since been shown to be powerful tools in many domains. The rationale is that problem solving is a cognitively demanding activity. This creates a lot of extraneous load, making it harder to abstract what the general solution procedure involves.
Sweller and Cooper, of course, agree that practice is helpful. But they argue in favor of presenting lots of examples first. In their model, practice should start with access to examples so students can emulate the pattern. Practice without the solutions available becomes helpful only later, once the material is learned well enough that the effort of retrieval is germane load rather than overload.
2. The Goal-Free Effect
One reason problem solving is difficult is that it requires you to keep in mind the goal you’re trying to reach, how far you are from the goal, and potential operations to move forward. This creates a lot of cognitive load that makes it harder to identify the solution procedure.
Removing an explicit goal can also reduce cognitive load. For example, a classic trigonometry problem might ask a student to find a particular angle. A “goal-free” way to present this would be to ask students to find as many angles as possible.
Research shows that early, goal-free problems result in greater learning, consistent with cognitive load theory.8
The downside of goal-free practice, however, is that if there are too many possible actions, most of those explored will be useless. Solving a trigonometry puzzle with several unknowns is helpful. But learning to program by randomly typing in commands is not. Worked examples tend to be a more general tool, since they enable useful patterns to be learned rather than guessed at.
3. The Split-Attention Effect
Cognitive load isn’t just found in problem solving. Badly designed instructional materials can increase cognitive load by requiring learners to move their attention around to understand them.9
Consider these two flashcards for learning Chinese characters. The first creates extra cognitive load since the pairing between sound and character requires more spatial manipulation. Learning is enhanced when instructional materials are organized so that information doesn’t require any manipulation to be understood.
4. The Expertise-Reversal Effect
Cognitive load theory predicts that for novices exposed to information for the first time, worked examples are better than problem solving. But, interestingly, this effect reverses as you gain more experience.10
One explanation for this is in terms of redundancy. If the solution pattern is already stored in your long-term memory, making sense of a worked example doesn’t help much. In this case, it is better to retrieve the answer directly from memory without distracting yourself with the example.
Another explanation is that if the problems are reasonably easy to solve, worked examples may not provoke deep enough processing. Solving a problem yourself is a kind of germane load akin to retrieval practice.
Applying Cognitive Load Theory to Your Studies
Cognitive load theory’s principal applications are in instructional design. How should a subject be taught so that students will efficiently master the patterns of knowledge it contains? Cognitive load theory favors direct instruction, quick feedback and plenty of practice.
However, as learners, we’re often just given instructional materials. What can we do to optimize cognitive load, given that perfect explanations and studying resources aren’t always available?
Here are a few suggestions:
1. Study examples before solving problems.
While some amount of “figuring things out” is often the only path available, this can make it harder to grasp the key concepts. There are a few tools you can apply, as a learner, to make this easier:
- Look for examples online. Khan Academy and many other websites offer detailed instructions and worked examples for common problems.
- Look for problem sets with solutions. This was a big part of my MIT Challenge. Copious problem sets with solutions let you shift between studying the steps of a worked solution and practicing it yourself. This approach tends to beat instructions that only talk about problem solving at a general level (and omit the specifics of a worked example). It also allows you to shift to solving problems yourself once you’ve gotten a good grasp of the material.
- Self-explain your homework when given feedback. In a traditional class, solutions often aren’t provided until long after the homework assignment. In this case, after you get the solutions, spend the time to thoroughly explain to yourself the solution to problems you found difficult. Self-explanations are a germane load that ensures your homework feedback is put to good use.
This approach applies to non-technical subjects as well. When learning to paint, I made heavy use of video tutorials where I worked on the same painting as the instructor. I’d usually watch the video through once, then work alongside the instructor on a second pass.
2. If a class confuses you, slow it down early.
In my experience, the Feynman Technique mainly works by slowing things down. A concept can be confusing in a lecture because critical assumptions aren’t made explicit or intervening steps are skipped. Walking through the explanation yourself lets you figure out exactly where you get lost.
A difficult class is one where cognitive load is near your maximum. Sometimes it will go too far, and you’ll get lost. Catching these moments early and fixing them is a big part of staying on top of your studies. Since omitted knowledge is often reused in later parts of the class, failing to understand something important in an early lecture can mean the rest of the class time is wasted.
3. Build your prerequisite knowledge and procedural fluency.
Cognitive load theory matters most in domains with high element interactivity. This means that many different pieces of information all need to be in place before you can understand the problem. In contrast, a subject might be difficult mainly because of its extent. In this case, there may be a large body of information to learn, but you rarely need all of it at once.
Math and science tend to have high element interactivity, which is why mastery of them is seen as a sign of intelligence. Working memory is associated with intelligence, and those with slightly more working memory can handle slightly greater element interactivity. While this creates only a modest advantage in the short term, greater ease in learning basic concepts can accumulate into a considerable advantage in the long run.
If you’re struggling in a subject with high element interactivity, the key is to go back and invest in more practice in the underlying skills. Doing this will make you more fluent in the component knowledge, which frees up more working memory for handling the new topics.
My Changing Views on Cognitive Load
I’ll confess, I didn’t fully appreciate cognitive load theory when I first encountered it. I tended to equate “problem solving” with “practice.” Since practice is essential for learning, I reasoned that problem solving must be equally important. Real life involves a lot of problem solving, so why shouldn’t you practice it?
There seem to be two good answers to my misconception:
- Problem solving isn’t a skill. The way we get good at solving problems is by having (a) knowledge that assists in solving the problem and (b) automatic procedural components that help in solving problems. There are probably no general problem solving methods that work for every domain. Heuristics for problem solving within a domain might exist. Still, the significance of these is overwhelmed by the power of having tons of learned patterns in memory. This explains why transfer is elusive and why expertise tends to be specific.
- Practice improves fluency, but it’s most efficient to have the right method first. This is especially critical for complex skills with many interacting parts. Figuring out what works through trial and error is inefficient. Worked examples, clear instructions, and background knowledge all help to put practice on the right track.
When I discussed these revelations with a friend, he asked how they might have changed my previous learning projects. I can think of a few places where I made mistakes:
- During my portrait drawing challenge, I initially focused on getting lots of practice with feedback. However, taking the class with Vitruvian Studios made the most significant difference. A good method can save countless hours of practice.
- Cognitive load theory helps me make sense of the optimal time to start immersion when learning a language. For Vat and me, the ~50 hours we spent on Spanish was enough to get going relatively smoothly. Yet even 100 hours in Chinese was still a bit of a grind for me when we first arrived. For Korean, we ended up doing most of the preparatory work in Seoul, which was a somewhat wasted opportunity. Cognitive load theory helps explain how the design choices Vat and I made on the trip made some parts more successful than others. (For instance, Google Translate was a great way to alleviate cognitive load in speaking situations that otherwise would have been above our level.)
- The cognitive load was too high in my quantum mechanics project. Part of this was the several-year gap since I had last used calculus and differential equations. The components weren’t as fresh, so I was relearning a little too much. But a bigger part was that I didn’t have as many problem sets with solutions as I would have liked. If I had more, I could have used the first batch as worked examples rather than needing to use them sparingly. In the future, I’d probably do some warm-up to refresh my prerequisite skills and seek out a textbook with tons of sample problems and solutions so I could study with a tighter feedback loop.
Even after fifteen years of obsessing about the topic, I’m always working to refine my learning process. As always, I’ll continue to share what I find with you.
1. David C. Geary, “Educating the Evolved Mind,” in Educating the Evolved Mind (2007): 1-99.
2. Paulo Ventura, “Let’s Face It: Reading Acquisition, Face and Word Processing,” Frontiers in Psychology 5 (2014): 787.
3. Michelene T. H. Chi, Nicholas De Leeuw, Mei-Hung Chiu, and Christian LaVancher, “Eliciting Self-Explanations Improves Understanding,” Cognitive Science 18, no. 3 (1994): 439-477.
4. Jeffrey D. Karpicke and Janell R. Blunt, “Retrieval Practice Produces More Learning than Elaborative Studying with Concept Mapping,” Science 331, no. 6018 (2011): 772-775.
5. Jeroen J. G. Van Merriënboer, Marcel B. M. de Croock, and Otto Jelsma, “The Transfer Paradox: Effects of Contextual Interference on Retention and Transfer Performance of a Complex Cognitive Skill,” Perceptual and Motor Skills 84, no. 3 (1997): 784-786.
6. Vicki Likourezos, Slava Kalyuga, and John Sweller, “The Variability Effect: When Instructional Variability is Advantageous,” Educational Psychology Review 31, no. 2 (2019): 479-497.
7. John Sweller and Graham A. Cooper, “The Use of Worked Examples as a Substitute for Problem Solving in Learning Algebra,” Cognition and Instruction 2, no. 1 (1985): 59-89.
8. Fred Paas and Femke Kirschner, “The Goal-Free Effect,” in Encyclopedia of the Sciences of Learning, ed. N. M. Seel (Boston: Springer, 2012). https://doi.org/10.1007/978-1-4419-1428-6_299
9. Paul Chandler and John Sweller, “The Split-Attention Effect as a Factor in the Design of Instruction,” British Journal of Educational Psychology 62, no. 2 (1992): 233-246.
10. Slava Kalyuga, “The Expertise Reversal Effect,” in Managing Cognitive Load in Adaptive Multimedia Learning (IGI Global, 2009), 58-80.