chunk-based learning

Quick summary

Chunks are built in long term memory by practising similar but different questions. 

The chunk, once built, holds two linked kinds of information about a process:

  • information about the trigger or "perceptual cue" which helps us decide which chunks might be useful,

  • information about the process itself, e.g. the order of the steps in the process.

More powerfully still, we can build chunks of chunks, and so on.

What experts do, through deliberate practice and meta-thinking, is build up chunks, chunks of chunks and so on. These let them use their limited working memory so well that they appear to have a huge working memory capacity. They don't have more working memory; they just make very good use of their chunks and are good at building new chunks, especially from existing ones. Experts make excellent use of "the more you know, the more you can learn".

timely practice wants low attaining learners to become more expert in their learning. But unlike the expert, who guides their own learning, we want the teacher to guide the learners' learning.

Gobet 2005

Here I summarise and quote from Gobet, F. (2005). Chunking models of expertise: Implications for education. Applied Cognitive Psychology, 19, 183-204.

When I first came across Gobet and his summary of many researchers' work over many years in machine-based learning, often applied to chess, I thought "how useful is that likely to be?" I was quite dismissive of using research about how best to teach computers to play chess to think about how best to teach maths to low attaining learners, but bear with me.

Gobet says

Without variation, schemata (or chunks) cannot be created. For example, in the case of elementary mathematics, presenting a narrow range of problems will hamper the acquisition of a sufficient variety of chunks and links connecting them, and, consequently, schemata are not likely to be formed. Chunk-based models actually warn us against any excess of optimism in the use of new technologies, as long as they do not help circumvent the key limiting constants of human cognition (i.e. attention, STM [= working memory], and learning rates).

Our limited understanding of the complex dynamics underlying expert behaviour does not belittle the fact that individual differences are important in instruction. At the least, it has been shown that feedback tailored to individual students provides better instruction than feedback given in a classroom (Bloom, 1984). What do chunk-based theories have to say about this question? We may mention two important implications for education.

First, while individual differences tend to be diluted by large amounts of practice, they play a large role in the early stages of studying a domain, which characterizes much of classroom instruction.

Second, as seen earlier, taking into account individual differences may lead to better instruction, because instruction can be optimized for each student, including feedback on progress, organization of material, and choice of learning strategies to be taught.

Presenting components of the right size and difficulty will help students direct attention to the important features of the material, and in turn help the acquisition of perceptual chunks that are appropriate, given the task at hand.

Hence timely practice's focus on a spiral of gently rising attainment.

Gobet goes on to talk about identifying the parts to teach and the order in which to teach them efficiently, which very much chimes with Ofsted's most recent obsession with "curriculum".

A curriculum is only as good as the embedded learning resulting from it. If learners retain only a small proportion of what is taught, they are not following our carefully crafted curriculum, and the curriculum they actually experience (with added holes) may well be making learning harder for them.

Another important role for teachers is to provide feedback, an obvious way to highlight the important features of a problem, and thus favour the acquisition of correct knowledge. Clearly, this is easier to do with private instruction than in the classroom (Bloom, 1984).

I’ve found these guidelines very useful, even though some of them feel as if they are talking about teaching chess.

Teach from the simple to the complex

Teach from the known to the unknown

The elements to be learnt should be clearly identified

Use an ‘improving spiral,’ where you come back to the same concepts and ideas and add increasingly more complex new information

Focus on a limited number of types of standard problem situations, and teach the various methods in these positions thoroughly

Repetition is necessary. Go over the same material several times, using varying points of view and a wide range of examples

At the beginning, don’t encourage students to carry out their own analysis of well-known problem situations, as they do not possess the key concepts yet

Encourage students to find a balance between rote learning and understanding

From Gobet, F. & Lane, P. (2012), Chunking Mechanisms and Learning

By 2012, chunk theory had been refined to become template theory.

The template theory proposes that frequently used chunks become “templates”, a type of schema. A template consists of a core, which contains constant information, and slots, where variable information can be stored. The presence of templates considerably expands experts’ memory capability.


A first implication of chunk-based theories is that acquiring a new chunk has a time cost, and therefore time at the task is essential, be it in mathematics or dancing. As documented by research into deliberate practice, practice must be tailored to the goal of improving performance. 


Templates are created when the context offers both constant and variable information. As a consequence, and as is well established in the educational literature, it is essential to have variability during learning if templates are to be created.


Finally, chunk-based theories are fairly open to the possibility of large individual differences in people’s cognitive abilities. In particular, while they postulate fixed parameters for short-term memory capacity and learning rates, it is plausible that these parameters vary between individuals. In addition, differences in knowledge will lead to individual differences in performance.


A clear prediction of chunk-based theories is that individual differences play a large role in the early stages of learning, as is typical of classroom instruction, but tend to be less important after large amounts of knowledge have been acquired through practice and study.

What Gobet and Lane say about individual differences and working memory is hugely pertinent to teaching maths to low attaining learners:

A clear prediction of chunk-based theories is that individual differences play a large role in the early stages of learning

Chunk-based models actually warn us against any excess of optimism in the use of new technologies, as long as they do not help circumvent the key limiting constants of human cognition (i.e. attention, STM [= working memory], and learning rates).

With timely practice, we offer a reliable way for low attaining learners to build chunks. Through the extra effort of reliable retrieval practice, we circumvent the problem of limited working memory capacity.