Say you need to learn a complicated skill: physics, French or computer programming. How much time should you spend building your background knowledge before you start practicing the actual skill?

Consider machine learning. One way to learn this field would be to first master the underlying math. Then, when you encounter the programming commands for certain mathematical functions, you’ll know what they are doing behind the scenes.

Another way to approach this field would be to make machine learning models for things that interest you. You could get reasonably far with this without knowing calculus and linear algebra. Perhaps then the math is best saved for later?

The question of how much prerequisite work is necessary before attempting the real task occurs in many spaces. How much time should you spend in school before doing real-world work? How long should you study a language before trying to speak it? How many books should you read before starting a business?

Let’s look to cognitive science for some guidance.

## What are the Prerequisites to Learning a Skill?

Obviously, if specific concepts are used to teach a subject, you need to learn those concepts first.

When I took MIT’s economics classes, they were taught using calculus. Not knowing calculus would have made solving the problems for the exam impossible. It also would have made many of the conceptual explanations harder to grok, since they were framed in terms of calculus.

However, this answer isn’t too satisfying because subjects can be taught at different levels of depth. For instance, when I took economics at my alma mater, they didn’t use calculus.

I feel like I understood economics better the MIT way. But that’s a bit of a silly argument because, of course, it’s easier to understand something when you have a deeper presentation. The question is whether it makes sense to ask everyone who wants to learn economics to first study calculus.

## Cognitive Skill Acquisition

At a basic level, we can contrast two ways of learning. One way is to memorize, by rote, the answer to every possible question in a domain. The other way is to learn a process for generating solutions in the domain.

Consider basic addition. The memorizing approach would involve learning every pair of one-digit addition facts by heart (e.g., 7+4=11, 3+6=9, etc.). The procedural approach might involve counting: start from the bigger number, then count up by one as many times as the smaller number (e.g., 7 + 4 = 8… 9… 10… 11!).
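The counting procedure can be written out in a few lines of Python. This is only a sketch of the idea described above; the function name `count_on` is my own label for illustration, not a term from the research.

```python
def count_on(a: int, b: int) -> int:
    """Add two non-negative integers by counting up from the larger one."""
    total = max(a, b)   # start from the bigger number
    steps = min(a, b)   # count up once per unit of the smaller number
    for _ in range(steps):
        total += 1      # 8... 9... 10... 11!
    return total


print(count_on(7, 4))  # 11
print(count_on(3, 6))  # 9
```

Notice how compact this is: one short rule covers every addition problem, whereas the memorizing approach needs a separate entry for each pair of numbers.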

Interestingly, children seem to do exactly this when learning arithmetic.[1] They begin with a procedure, like counting. As they gain experience, they memorize more and more of the exact answers. Eventually, they are able to solve most arithmetic problems through recall alone, and the procedure of counting fades away.

Four things are worth noting here:

- **Eventually, many answers are memorized**. This results in fast, reliable access and helps us perform complicated skills smoothly. You won’t perform well if you need to reason from first principles in everyday situations.
- **The procedure of counting is more compact than an array of memorized facts**. Thus the method is learned first, with fluently recalled answers coming only after much more experience.
- **The counting procedure can act as a backup**. Say you haven’t memorized 24 + 3. The counting procedure is slow, but it can help you answer the question (25… 26… 27!). If you have memorized other facts, you may use a different ad-hoc procedure (4+3 = 7, add 20).[2]
- **The choice to use memorization or a procedure to find an answer depends on the effort needed to perform the procedure, the reliability of the procedure, and incentives surrounding accuracy**. Children tend to choose low-effort tools like guessing and retrieval unless they are required to use a more effortful procedure.
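To make the interplay between recall and backup procedures concrete, here is a toy sketch in Python. It is my own simplification, not the model from the cited research: the names `memory`, `count_on` and `add` are hypothetical, and it assumes an answer is memorized after a single exposure, which real learners certainly don't manage. It also ignores the effort and accuracy trade-offs from the last point.

```python
# Memorized facts, keyed by the (smaller, larger) pair of addends.
memory: dict[tuple[int, int], int] = {}


def count_on(small: int, big: int) -> int:
    """Slow backup procedure: count up from the larger number."""
    total = big
    for _ in range(small):
        total += 1
    return total


def add(a: int, b: int) -> tuple[int, str]:
    """Return the sum and the method used to find it."""
    small, big = sorted((a, b))
    if (small, big) in memory:
        return memory[(small, big)], "recall"

    # Ad-hoc procedure: reuse a memorized fact about the ones digit,
    # then add the tens back on (4 + 3 = 7, add 20).
    tens, ones = divmod(big, 10)
    part = tuple(sorted((small, ones)))
    if tens > 0 and part in memory:
        answer, method = tens * 10 + memory[part], "decompose"
    else:
        answer, method = count_on(small, big), "count"

    memory[(small, big)] = answer  # assumption: memorized after one exposure
    return answer, method


print(add(4, 3))   # (7, 'count')
print(add(4, 3))   # (7, 'recall')
print(add(24, 3))  # (27, 'decompose')
print(add(24, 3))  # (27, 'recall')
```

The first time a problem shows up, the slow procedure does the work. After that, recall takes over, and partial memories speed up problems that have never been seen before.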

When performing skills, we use a variety of methods, from following a procedure to retrieving an answer from memory. With increased practice, the memory component becomes dominant for routine situations. Even when we can’t get the exact solution from memory, we may find parts of the answer which we can use to solve the problem faster.

This suggests that digging deeper has two benefits:

- It can provide a strategy to obtain the correct answer when memory fails. This backup is essential in the early phases of learning when many patterns haven’t been stored yet.
- It can assist in non-routine situations where no answer is known.

However, this analysis also shows a limit to background knowledge. Since fluent performance of a skill is mostly driven by recalling direct experiences and examples, deeper and deeper knowledge mostly helps in cases where direct experience is missing or insufficient. This becomes more and more important as you reach increasing levels of expertise, but it may not be helpful for routine performance.

## Should You Dig Deep or Dive Right In?

The evidence from skill acquisition paints a mixed picture. On the one hand, methods that directly assist with learning a domain are necessary prerequisites. Even if a brute-force approach might work, good methods are more reliable than trial-and-error.

On the other hand, routine performance is largely handled by drawing on direct experience—not working from first principles. Thus, if the principles don’t actually lay out the actions needed for routine situations, then they are mainly helpful in non-routine cases. These principles will probably only be relevant as your experience within a field grows.

What are your experiences? Do you prefer to step back and dig deep before trying to practice a skill? Or do you prefer diving in and practicing directly? I’m interested to hear your thoughts.

#### Footnotes

1. Robert S. Siegler and Christopher Shipley, “Variation, Selection, and Cognitive Change,” in *Developing Cognitive Competence: New Approaches to Process Modeling*, eds. T. J. Simon and G. S. Halford (Lawrence Erlbaum Associates, Inc., 1995), 31-76.
2. Something like this seems to be at the heart of the debate over teaching reading. Phonics works better than whole word methods because teaching sound-to-spelling correspondence is a powerful backup method for learning new words. However, it’s also clear that most fluent readers rarely spell out familiar words, recognizing entire words as a chunk from memory.