Introduction
The cognitive theory of multimedia learning was popularized by the work of Richard E. Mayer and other cognitive researchers who argue that multimedia supports the way that the human brain learns. They assert that people learn more deeply from words and pictures than from words alone, which is referred to as the multimedia principle (Mayer, 2005a). Multimedia researchers generally define multimedia as the combination of text and pictures and suggest that multimedia learning occurs when we build mental representations from these words and pictures (Mayer, 2005b). The words can be spoken or written, and the pictures can be any form of graphical imagery including illustrations, photos, animation, or video. Multimedia instructional design attempts to use cognitive research to combine words and pictures in ways that maximize learning effectiveness.
Task
Cognitive Theory of Multimedia Learning
The theoretical foundation for the cognitive theory of multimedia learning (CTML) draws from several cognitive theories including Baddeley’s model of working memory, Paivio’s dual coding theory, and Sweller’s cognitive load theory. As a cognitive theory of learning, it falls under the larger framework of cognitive science and the information-processing model of cognition. The information-processing model suggests several information stores (memory) that are governed by processes that convert stimuli to information (Moore, Burton & Myers, 2004). Cognitive science studies the nature of the brain and how it learns by drawing from research in a number of areas including psychology, neuroscience, artificial intelligence, computer science, linguistics, philosophy, and biology. The term cognitive refers to perceiving and knowing. Cognitive scientists seek to understand mental processes such as perceiving, thinking, remembering, understanding language, and learning (Stillings, Weisler, Chase, Feinstein, Garfield, & Rissland, 1995). As such, cognitive science can provide powerful insight into human nature, and, more importantly, the potential of humans to develop more efficient methods using instructional technology (Sorden, 2005).
Key Elements of the Theory
The cognitive theory of multimedia learning (CTML) centers on the idea that learners attempt to build meaningful connections between words and pictures and that they learn more deeply than they could have with words or pictures alone (Mayer, 2009). According to CTML, one of the principal aims of multimedia instruction is to encourage the learner to build a coherent mental representation from the presented material. The learner’s job is to make sense of the presented material as an active participant, ultimately constructing new knowledge.
According to Mayer and Moreno (1998) and Mayer (2003), CTML is based on three assumptions: the dual-channel assumption, the limited capacity assumption, and the active processing assumption. The dual-channel assumption is that working memory has auditory and visual channels based on Baddeley’s (1986) theory of working memory and Paivio’s (1986; Clark and Paivio, 1991) dual coding theory. Second, the limited capacity assumption is based on cognitive load theory (Sweller, 1988, 1994) and states that each subsystem of working memory has a limited capacity. The third assumption is the active processing assumption, which suggests that people construct knowledge in meaningful ways when they pay attention to the relevant material, organize it into a coherent mental structure, and integrate it with their prior knowledge (Mayer, 1996, 1999).
The Three Store Structure of Memory in CTML
CTML accepts a model that includes three memory stores known as sensory memory, working memory, and long-term memory. Sweller (2005) defines sensory memory as the cognitive structure that permits us to perceive new information, working memory as the cognitive structure in which we consciously process information, and long-term memory as the cognitive structure that stores our knowledge base. We are only conscious of information in long-term memory when it has been transferred to working memory. Mayer (2005a) states that sensory memory has a visual sensory memory that briefly holds pictures and printed text as visual images and an auditory sensory memory that briefly holds spoken words and sounds as auditory images. Schnotz (2005) refers to sensory memory as sensory registers or sensory channels and points out that although we tend to view the dual-channel sensors as eye-to-visual working memory and ear-to-auditory working memory, other sensory channels can also introduce information to working memory, such as “reading” with the fingers through Braille or a deaf person being able to “hear” by reading lips.
Working memory attends to, or selects, information from sensory memory for processing and integration. Sensory memory holds an exact sensory copy of what was presented for less than 0.25 of a second, while working memory holds a processed version of what was presented for generally less than thirty seconds and can process only a few pieces of material at any one time (Mayer, 2010a). Long-term memory holds the entire store of a person’s knowledge for an indefinite amount of time. Figure 1 is a representation of how memory works according to Mayer’s cognitive theory of multimedia learning.
Figure 1 Mayer’s Cognitive Theory of Multimedia Learning (Mayer 2010a)
Mayer (2005a) states that there are also five forms of representation of words and pictures that occur as information is processed by memory. Each form represents a particular stage of processing in the three memory stores model of multimedia learning. The first form of representation is the words and pictures in the multimedia presentation itself. The second form is the acoustic representation (sounds) and iconic representation (images) in sensory memory. The third form is the sounds and images in working memory. The fourth form of representation is the verbal and pictorial models which are also found in working memory. The fifth form is prior knowledge, or schemas, which are stored in long-term memory.
According to CTML, content knowledge is contained in schemas, which are cognitive constructs that organize information for storage in long-term memory. Schemas organize simpler elements that can then act as elements in higher-order schemas. As learning occurs, increasingly sophisticated schemas are developed and learned procedures are transferred from controlled to automatic processing. Automation frees capacity in working memory for other functions. This process of developing increasingly complicated schemas that build on each other is also similar to the explanation given by Chi, Glaser, and Rees (1982) for the transition from novice to expert in a domain.
The Development of the Theory of Working Memory
The current conception of working memory in CTML grew out of Atkinson & Shiffrin’s (1968) model of short-term memory. In the Atkinson & Shiffrin model, short-term memory was viewed primarily as a structure for temporarily storing information before it passed to long-term memory. Eventually, researchers began to question some of the assumptions of short-term memory and a few started to look for better explanations. Baddeley and Hitch (1974) subsequently proposed a more complex model of short-term memory which they called working memory. Their model for working memory was a system with subcomponents that not only held temporary information, but processed it so that several pieces of verbal or visual information could be stored and integrated.
Baddeley (1986, 1999) later proposed that there was an additional component in working memory called the central executive. According to the theory, the central executive controlled the two subcomponents of working memory, known as the visuo-spatial sketch pad and the phonological loop. The central executive was also responsible for controlling the overall system, engaging in problem-solving tasks, and focusing attention. Baddeley theorized that the central executive could transfer storage tasks to the two subcomponent systems in working memory, so that the central executive would continue to have capacity for performing more demanding selection and information processing tasks.
The visuo-spatial sketch pad is assumed to maintain and manipulate visual images. The phonological loop stores and rehearses verbal information. It has also been suggested that the phonological loop has an important function of facilitating the acquisition of language by maintaining a new word in working memory until it can be learned (Baddeley, Gathercole, & Papagno, 1998). Baddeley (2002) eventually proposed the addition of a third subsystem known as the episodic buffer, which has acquired some of the tasks that were originally attributed to the central executive (now seen as a purely attentional system). The episodic buffer functions as a storage structure which acts as a limited-capacity interface to integrate multiple sources of information from the other slave systems.
Sweller (2005) and Yuan, Steedle, Shavelson, Alonzo & Oppezo (2006) suggest that while there is strong evidence for the two main subcomponents in working memory, there is less evidence for a central executive that consciously attends to information in sensory memory. Rather, Sweller suggests that schemas which exist in long-term memory serve as the executive function, ultimately directing working memory to attend to information that fits pre-existing schemas. Schemas determine which information enters working memory because we tend to pay attention to information that fits the knowledge that we already have. This would support the idea that our paradigms cause us to focus on information that fits our existing beliefs, while ignoring information that does not fit neatly into our understanding of the world.
Meaningful Learning
Mayer (2010a) argues that meaningful learning from words and pictures happens when the learner engages in five cognitive processes:
1. selecting relevant words for processing in verbal working memory
2. selecting relevant images for processing in visual working memory
3. organizing selected words into a verbal model
4. organizing selected images into a pictorial model
5. integrating the verbal and pictorial representations with each other and with prior knowledge.
These cognitive processes in working memory determine which information is attended to or selected, which knowledge is retrieved from long-term memory and integrated with the new information to construct new knowledge, and ultimately, which bits of new knowledge are transferred to long-term memory. Knowledge that is constructed in working memory is transferred to long-term memory through the process of encoding (Mayer, 2008b). However, Dwyer & Dwyer (2006) caution that proper encoding requires rehearsal, and since rehearsal takes time, the multimedia lesson must allow an adequate period for incubation or it can be ineffective. Hasler, Kersten, & Sweller (2007) add that this is why learner control is important when using animation in multimedia learning.
Mayer (2009) distinguishes meaningful learning from “no learning” and “rote learning” and describes it as active learning where the learner constructs knowledge. Meaningful learning is demonstrated when the learner can apply what is presented in new situations, and students perform better on problem-solving transfer tests when they learn with words and pictures. Mayer (2008b) also identifies two types of transfer: transfer of learning and problem-solving transfer. Transfer of learning occurs when previous learning affects new learning. Problem-solving transfer occurs when previous learning affects the ability to solve new problems. Mayer defines learning as a “change in knowledge attributable to experience” (2009, p. 59). Learning is personal and cannot be directly observed because it happens within the learner’s cognitive system. It must be inferred through a change in behavior such as performance on a task or test.
Cognitive Load
The limited capacity assumption states that there is a limit to the amount of information that can be processed at one time by working memory. In other words, learning is hindered when cognitive overload occurs and working memory capacity is exceeded (De Jong, 2010). DeLeeuw & Mayer (2008) theorize that there are three types of cognitive processing (essential, extraneous, and generative) and place them in the triarchic model of cognitive load. Mayer (2009) made this model the organizing framework for the cognitive theory of multimedia learning and stated that a major goal of multimedia learning and instruction is to “manage essential processing, reduce extraneous processing and foster generative processing” (p. 57). The model is heavily based on Sweller’s cognitive load theory (Chandler & Sweller, 1991; Sweller, 1988, 1994).
According to Sweller, Van Merrienboer, and Paas (1998), there are three types of cognitive load: intrinsic, extraneous, and germane. Intrinsic cognitive load occurs during the interaction between the nature of the material being learned and the expertise of the learner. The second type, extraneous cognitive load, is caused by factors that aren’t central to the material to be learned, such as presentation methods or activities that split attention between multiple sources of information, and these should be minimized as much as possible. The third type of cognitive load, germane cognitive load, enhances learning and results in task resources being devoted to schema acquisition and automation. Intrinsic cognitive load cannot be manipulated, but extraneous and germane cognitive load can.
In the triarchic model of cognitive load, essential processing (intrinsic load) relates to the essential material or information to be learned. Extraneous processing (extraneous load) does not serve the instructional goal or purpose and reduces the chances that transfer of learning will occur. Generative processing (germane load) is aimed at making sense of the presented material. It is the activity of organizing and integrating information in working memory.
De Jong (2010) has called into question whether there is truly a distinction between intrinsic (essential) and germane (generative) cognitive load, writing that if “intrinsic load and germane load are defined in terms of relatively similar learning processes, the difference between the two seems to be very much a matter of degree, and possibly non-existent” (p. 111). DeLeeuw and Mayer (2008), however, did report finding that extraneous, essential, and generative processing can apparently be measured by different assessment instruments, suggesting that they are three distinct constructs.
The Science of Instruction
The previous sections describe what Mayer (2009) calls the science of learning, while this section explains what Mayer calls the science of instruction, which he defines as the “creation of evidence-based principles for helping people learn” (2009, p. 29), or more simply as the “scientific study of how to help people learn” (Mayer, 2010a, p. 543). Mayer insists that research on multimedia instruction must be theory-grounded and evidence-based. Theory-grounded means that each principle, method and concept is derived from a theory of multimedia learning. Evidence-based means that each principle, method and concept is supported by an empirical base of replicated findings from rigorous and appropriate research studies, which yields testable predictions. Mayer (2011a) subsequently adds the science of assessment to the sciences of learning and instruction to form what he calls the “Big Three” (p. 2).
As part of his evidence-seeking efforts for the science of instruction, Mayer (2009) identifies the following twelve multimedia instructional principles which were developed from nearly 100 studies over the past two decades:
• Coherence Principle – People learn better when extraneous material is excluded rather than included.
• Signaling Principle – People learn better when cues that highlight the organization of the essential material are added.
• Redundancy Principle – People learn better from graphics and narration than from graphics, narration, and printed text.
• Spatial Contiguity Principle – People learn better when corresponding words and pictures are placed near each other rather than far from each other on the page or screen.
• Temporal Contiguity Principle – People learn better when corresponding words and pictures are presented at the same time rather than in succession.
• Segmenting Principle – People learn better when a multimedia lesson is presented in user-paced segments rather than as a continuous unit.
• Pre-training Principle – People learn more deeply from a multimedia message when they receive pre-training in the names and characteristics of key components.
• Modality Principle – People learn better from graphics and narration than from graphics and printed text.
• Multimedia Principle – People learn better from words and pictures than from words alone.
• Personalization Principle – People learn better from a multimedia presentation when the words are in conversational style rather than in formal style.
• Voice Principle – People learn better when the words in a multimedia message are spoken by a friendly human voice rather than a machine voice.
• Image Principle – People do not necessarily learn more deeply from a multimedia presentation when the speaker’s image is on the screen rather than not on the screen.
As mentioned earlier, these twelve principles are grouped in a framework based on the three types of cognitive load (Mayer, 2009):
• reducing extraneous processing – coherence, signaling, redundancy, spatial contiguity, temporal contiguity
• managing essential processing – segmenting, pre-training, modality
• fostering generative processing – multimedia, personalization, voice, image
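The grouping above can also be written as a small lookup table, which may be convenient when checking a lesson design against each processing goal in turn. The sketch below is illustrative only: the names `PRINCIPLES_BY_GOAL` and `goal_for` are hypothetical and do not come from Mayer (2009); only the grouping itself is taken from the list above.

```python
# Hypothetical lookup table. The grouping mirrors the list above (Mayer, 2009);
# the variable and function names are invented for this illustration.
PRINCIPLES_BY_GOAL = {
    "reducing extraneous processing": [
        "coherence",
        "signaling",
        "redundancy",
        "spatial contiguity",
        "temporal contiguity",
    ],
    "managing essential processing": [
        "segmenting",
        "pre-training",
        "modality",
    ],
    "fostering generative processing": [
        "multimedia",
        "personalization",
        "voice",
        "image",
    ],
}


def goal_for(principle: str) -> str:
    """Return the processing goal under which a principle is grouped."""
    wanted = principle.strip().lower()
    for goal, principles in PRINCIPLES_BY_GOAL.items():
        if wanted in principles:
            return goal
    raise KeyError(f"unknown principle: {principle!r}")


if __name__ == "__main__":
    for name in ("Modality", "coherence", "voice"):
        print(f"{name}: {goal_for(name)}")
```

Keeping the goals as keys makes it easy to walk through a design review one goal at a time, while `goal_for` answers the reverse question of which goal a given principle serves.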
In addition to these instructional principles, Mayer (2009) includes boundary conditions that can determine the effectiveness of some of the principles. These boundary conditions are a recent addition to the theory, and they suggest that the instructional principles in CTML are not universal, absolute rules. Some have criticized the existence of boundary conditions in CTML as an indicator that the theory has inconsistencies (De Jong, 2010), but Mayer (2010b) views boundary conditions as a healthy evolution in CTML that allows the theory to continue to develop and be implemented realistically, rather than as a set of immutable rules that have to be followed in all situations.
One example of a boundary condition is that of individual differences, which states that some instructional methods or principles may be more effective for low-knowledge learners than for high-knowledge learners (Mayer, 2009; Schnotz and Bannert, 2003). Kalyuga, Ayres, Chandler & Sweller (2003) have called this the expertise-reversal effect. Paas, Renkl, & Sweller (2004, pp. 2-3) similarly stated this from a CLT point of view when they wrote: “A cognitive load that is germane for a novice may be extraneous for an expert. In other words, information that is relevant to the process of schema construction for a beginning learner may hinder this process for a more advanced learner.” Another example of a boundary condition is the complexity and pacing condition, which suggests that some of these methods may be more effective when the material of the lesson is complex or the pace of the presentation is fast. Each principle in CTML is subject to boundary conditions as illustrated by Mayer (2009).
Although they haven’t appeared in recent CTML literature, Mayer suggests several “advanced” principles for multimedia learning in his 2005 book, The Cambridge Handbook of Multimedia Learning, which are listed as chapters by various authors. These should be considered as possible areas for future CTML research and not necessarily evidence-based principles.
• Animation and interactivity principles – People don’t necessarily learn better from animation than from static diagrams.
• Cognitive aging principle – Instructional design principles that effectively expand the capacity of working memory are particularly helpful for older learners.
• Collaboration principle – People learn better when involved in collaborative online learning activities.
• Guided-discovery principle – People learn better when guidance is incorporated into discovery-based multimedia environments.
• Navigation principles – People learn better in environments where appropriate navigational aids are provided.
• Prior knowledge principle – Instructional principles that are effective in increasing multimedia learning for novices may have the opposite effect on more expert learners.
• Self-explanation principle – People learn better when they are encouraged to generate self-explanations during learning.
• Site map principle – People learn better in an online environment when presented with a map showing where they are in a lesson.
• Worked-out example principle – People learn better when worked-out examples are given in initial skill learning.
In addition to the twelve principles and the advanced principles listed in this chapter, Mayer (2011a) discusses several more principles that have appeared in CTML literature over the years. This demonstrates once again that the cognitive theory of multimedia learning is dynamic. Therefore, the twelve principles should not be taken as a rigid canon, but rather a starting point for discussion. Mayer (2011b), for example, only lists ten principles just two years after he published the twelve principles, having dropped the multimedia and image principles. In fact, this number seems to vary from publication to publication, so the focus should be on understanding what the latest research suggests about the effectiveness of the various instructional methods, rather than memorizing a codified set of twelve, or any other number of principles.
Development of the Theory
The evolution of CTML literature and research is evident in the body of work published by Mayer and his colleagues over the past twenty years (Mayer, 2005a). Mayer reminds us that even though the name has changed over the years, the underlying elements of the theory have not changed. In fact, the theory appears to have matured as it enters its third decade of active research and is finally reaching a consistently recognizable state.
See Moore, Burton, & Myers (2004) for an excellent overall accounting of the theoretical and research foundations of multimedia learning and Yuan et al. (2006) for the extensive history of working memory. The actual cognitive theory of multimedia learning first began to emerge as a distinct theory at the end of the 1980s when Mayer (1989) introduced the theory as the “model of meaningful learning” and then shortly thereafter as the “cognitive conditions for effective illustrations” (Mayer & Gallini, 1990). It has also been called the “dual-coding model” (Mayer & Anderson, 1991, 1992), “generative theory” (Mayer, Steinhoff, Bower, & Mars, 1995), the “generative theory of multimedia learning” (Mayer, 1997; Plass, Chun, Mayer, & Leutner, 1998), and the “dual-processing model of multimedia learning” (Mayer & Moreno, 1998).
The name “cognitive theory of multimedia learning” was first used in Mayer, Bove, Bryman, Mars, and Tapangco (1996), but didn’t become the standard name for Mayer’s theory until the year 2000 and beyond. The various models over the years focused on different aspects of the current model, but the underlying assumptions remained unchanged. Elements such as cognitive processes and mental representations were slowly added and refined until we have the model currently described by Mayer (2009).
It is important to note that before her death, Roxana Moreno, a former student of Mayer’s, had begun to develop a cognitive-affective theory of learning with media (Moreno, 2005, 2006, 2007). Moreno (2005) includes factors of self-regulation and motivation in this theory and explains that this new model extends the cognitive theory of multimedia learning by “integrating assumptions regarding the relationship between cognition, metacognition and motivation and affect” (2007, p. 767). Moreno & Mayer (2007) assert that the cognitive-affective theory of learning with media (CATLM) “expands the cognitive theory of multimedia learning to media such as virtual reality, agent-based, and case-based learning environments” (p. 313).
Moreno’s model integrates three assumptions. The first assumption is that humans have a limited working memory capacity (Baddeley, 1992). The second assumption is that long-term memory consists of past experiences and general domain knowledge, which is similar to Tulving’s (1977) distinction between episodic and semantic memory systems. The third assumption is that motivational factors affect learning by increasing or decreasing cognitive engagement (Pintrich, 2003). Paas (1992) discussed a similar distinction between mental load and mental effort from a CLT perspective nearly two decades ago.
Evaluation
There is no single measurement instrument associated with CTML research. Mayer (2009) states that since the goal is to make a causal claim about instructional effectiveness, one of the most useful approaches in CTML research is quantitative experimental comparison, with random assignment and experimental control being two important features. The main question in this type of research is whether a particular instructional method is effective. CTML researchers generally try to identify instructional methods that cause large effect sizes of 0.8 or greater across many different experimental comparisons. Learning is generally measured through tests of retention and transfer, and much of the recent research has focused on the instructional methods discussed earlier in this chapter.
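To make the effect-size threshold concrete, the following minimal sketch computes a standardized mean difference for one experimental comparison. It assumes the statistic is Cohen's d with a pooled standard deviation, for which 0.8 is the conventional cutoff for a large effect; the paragraph above does not name the statistic, so this is an assumption. The transfer-test scores and the function names are invented for illustration and are not data from any CTML study.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treatment, control):
    """Standardized mean difference using a pooled standard deviation.

    Assumption: the effect size discussed in the text is Cohen's d.
    """
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)  # sample standard deviations
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled_sd


def is_large_effect(d, threshold=0.8):
    """True when the effect meets the 0.8 threshold mentioned in the text."""
    return abs(d) >= threshold


# Invented transfer-test scores: a words-and-pictures group versus a
# words-only group, each with six randomly assigned learners.
words_and_pictures = [7, 8, 6, 9, 8, 7]
words_only = [5, 6, 5, 7, 6, 4]

d = cohens_d(words_and_pictures, words_only)

if __name__ == "__main__":
    print(f"d = {d:.2f}, large effect: {is_large_effect(d)}")
```

A single comparison like this says little on its own; the paragraph above stresses that CTML researchers look for large effects that hold up across many different experimental comparisons.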
Because of its central role in CTML research, cognitive load theory research is also of interest. De Jong (2010) provides a lengthy criticism of the instruments and tests of measurement in cognitive load theory. He points out that one of the most frequently used methods for measuring cognitive load is self-reporting in a one-item questionnaire where learners indicate their perceived amount of mental effort. De Jong asserts that this approach often leads to inconsistency in the outcomes of studies that use this type of questionnaire. Another way that cognitive load has been measured is physiologically, using indicators such as heart rate, blood pressure, and pupillary reactions. A third way of measuring cognitive load has been through the dual-task or secondary-task approach, in which slower or inaccurate performance on the secondary task indicates increased consumption of cognitive resources in the primary task (Brünken, Plass, & Leutner, 2003). De Jong criticizes the measurement of cognitive load as a single construct, as most of these approaches tend to do. He calls for the development of better instruments and multidimensional scales that can reliably measure intrinsic, extraneous, and germane load separately.
Applying the Cognitive Theory of Multimedia Instruction
Once we understand the science of learning and the science of instruction, the next question becomes how to apply the principles in order to foster meaningful learning. See Mayer’s (2011a) Applying the Science of Learning for a good overview of what to consider when applying the methods described in this chapter, as well as others.
This section looks at what to keep in mind as the instructional methods in CTML are implemented. In addition to applying the twelve principles and the advanced principles presented in this chapter and in Mayer (2005a, 2009, 2011b), the instructional designer should be aware of the information presented in this section when creating multimedia instruction. These theories come from the cognitive theory of multimedia learning, cognitive load theory, and cognitive science in general. It should be remembered that they are theories, and as such should be applied with caution, but all of them have research and a theoretical background that make them worth considering as guidelines for creating better instruction.
The principles of multimedia learning should be viewed as instructional methods whose primary goal is to foster meaningful learning. An instructional method is a way of presenting a lesson; it does not change the content of the lesson—the covered content is the same. As discussed previously, the principles should not be viewed as absolute rules that have to be applied equally in every situation. They are guidelines that should be adjusted depending on the intended audience, the goals of the instruction, and boundary conditions such as the expertise level of the learner. Most important, the theory is a learner-centered learning theory (Mayer, 2009).
Learner-Centered Focus
A critical perspective to maintain while designing multimedia lessons according to CTML is that the multimedia instructional methods are learner-centered; they are not technology-centered approaches. Mayer (2009) reminds us that multimedia can be as simple as a still image with words and that it is the instructional method, not the technology, that matters. Multimedia instructional designers often fall victim to letting the technology drive the instructional design, rather than looking at the design from the perspective, and limitations, of the learner.
Moreno (2006a) expressed this idea when she distinguished between a method-affects-learning hypothesis and a media-affects-learning hypothesis. A media-affects-learning approach could best be described as what occurred in the 20th century when state-of-the-art technologies such as radio, television, computers, and the Internet were introduced into education with the assumption that they would improve education simply because they were better tools than had previously been available.
Managing Cognitive Load
Because the principles of CTML are organized around the three types of cognitive load, designing instruction according to cognitive load theory (CLT) research findings is important if you are designing according to CTML. Mayer, Fennell, Farmer, and Campbell (2004) cite evidence that two important ways to promote meaningful learning are to design activities that reduce cognitive load, which frees working memory capacity for deep cognitive processing during learning, and to increase the learner’s interest, which encourages the learner to use this freed capacity for deep processing during learning. CLT suggests that for instruction to be effective, care must be taken to design instruction in a way that does not overload the brain’s capacity for processing information.
CLT suggests that instructional techniques that require students to engage in activities that aren’t directed at schema acquisition and automation can quickly exceed the limited capacity of working memory and hinder learning objectives. In simple terms, this means that you shouldn’t create unnecessary activities in connection with a lesson that require excessive attention or concentration that may overload working memory and prevent one from acquiring the essential information that is to be learned. This is an important guideline in any form of instruction, but it is an essential rule in multimedia instruction because of the ease with which distractions can be incorporated. Instructional designers should not fill this limited capacity with unnecessary, flashy bells and whistles (Sorden, 2005).
An example of what this means for multimedia instructional design is that the layout should be visually appealing and intuitive, but that the activities should remain focused on the concepts to be learned, rather than trying too hard to entertain. This is especially true if the entertainment is time-consuming to construct and complicated for the learner to master. Working memory can be overloaded by the entertainment or activity before the learner ever gets to the concept or skill to be learned. Mayer (2009) states that effective “instructional design depends on techniques for reducing extraneous processing, managing essential processing, and fostering generative processing” (p. 57).
Schnotz and Kürschner (2007) echo this idea by stating that techniques to simply reduce cognitive load can be counterproductive. They argue that learning tasks should be adapted to the learner’s zone of proximal development, which in turn depends on the learner’s level of expertise, and that intrinsic and germane cognitive load should be promoted while extraneous cognitive load is reduced. De Jong (2010) states that the three main recommendations that cognitive load theory has contributed to the field of instructional design are: “present material that aligns with the prior knowledge of the learner (intrinsic load), avoid non-essential and confusing information (extraneous load), and stimulate processes that lead to conceptually rich and deep knowledge (germane load)” (p. 111). These three cognitive load processes occur simultaneously in working memory and share its limited capacity, so each can occur only at the expense of the other two. If true, this creates important considerations for multimedia learning.
Task Analysis
Task analysis is tied to the concepts of schemas and levels of expertise. The multimedia lesson should try to ensure that the learner has sufficiently automated key core knowledge or tasks. The learner should do this before trying to tackle an overall task that may be beyond the learner’s current ability range, which could cause unnecessary frustration and possibly even cause the learner to drop out of the activity. Vygotsky’s zone of proximal development and the related concept of scaffolding can be applied here. This suggests that a task analysis should be done during the instructional design of a multimedia lesson in order to break down the skills and information that are needed to learn or perform the educational objective.
Guided Instruction
According to CTML, guided instruction and worked examples are preferable to discovery learning, even though other learning theories often support discovery learning as a useful component of multimedia instruction. Mayer (2004; 2011a) and Kirschner, Sweller, & Clark (2006) caution against using discovery learning and argue that guided instruction is much more effective. Mayer (2011a) presents four principles for “studying by practicing” that support this idea. The four principles supporting guided instruction are spacing, feedback, worked example, and guided discovery.
Interactivity
While the principle of interactivity still requires more research, much of the literature suggests that infusing interactivity such as learner control, feedback, and guidance into a multimedia lesson
will increase the affective conditions that will improve learning transfer and performance (Mayer, 2009; Piaget, 1969; Renkl & Atkinson, 2007; Wittrock, 1990). Domagk, Schwartz, and Plass (2010) define interactivity as “reciprocal activity between a learner and a multimedia learning system, in which the [re]action of the learner is dependent upon the [re]action of the system and vice versa” (p. 1025). They propose a model of interactivity called the Integrated Model of Multimedia Interactivity (INTERACT), which consists of six principal components of an integrated learning system: the learning environment, behavioral activities, cognitive and metacognitive activities, motivation and emotion, learner variables, and the learner’s mental model (learning outcomes). Moreno & Mayer (2007) also describe an interactive multimodal environment that is based on the cognitive affective theory of learning with media (CATLM) and that includes five design principles: guided activity, reflection, feedback, pacing, and pre-training.
Animation and Screencasts
Hasler, Kersten, & Sweller (2007) suggest that animation can be more effective when learners are allowed to stop and start the animation instead of having it just play through in one pass. However, this still leaves the question of whether still images are ultimately just as effective and much easier and cheaper to produce.
Regarding the use of animation to improve student achievement, Dwyer & Dwyer (2006) suggest that animation is not a viable instructional tool for improving achievement when the content to be learned is hierarchically structured. They go on to state that previous research does indicate that animation can be effectively used to teach both factual and conceptual types of information, but that this content can be taught equally well at less cost with other instructional strategies. Schnotz (2008) raises similar questions. This does not necessarily discount CTML studies, as CTML researchers have argued that simple graphical images can be highly effective when combined with words, and have already called into question whether animation is superior to still images in the “advanced” principles of animation and interactivity (Betrancourt, 2005).
Evaluation of the Theory
Validation
Theories are meant to be advanced upon and ultimately cast aside as new information is integrated and new understanding is developed. Moreno (2006a), for example, writes that “we should concede as cognitive scientists, that valid criticisms can be raised
against any existing theory of cognition and that such criticism is essential to progress. Theories and constructs are useful only as long as they evolve in their heuristic, explanatory, and predictive functions” (p. 179). While the cognitive theory of multimedia learning has generally met with acceptance, there remain questions among learning and education theorists in certain quarters about its validity, as well as the validity of the other cognitive theories upon which it is based. Mayer and his colleagues, however, counter that there is an extensive body of research that does validate this theory.
In recent years there have been several prominent researchers who have continued to develop the cognitive theories of multimedia learning and cognitive load. Among these are Richard E. Mayer, Roxana Moreno, John Sweller, Jan Plass, and Wolfgang Schnotz. Significant studies have included Mayer & Anderson (1991); Moreno & Mayer (2000); Schnotz & Bannert (2003); Paas, Renkl, & Sweller (2004); and Plass, Chun, Mayer, & Leutner (2004). Gall (2004) points out that much of Mayer’s research has been published in top peer-reviewed journals such as the Journal of Educational Psychology and is available for deeper study and critique. Dacosta (2008) provides a detailed table of almost 70 published studies by CTML researchers on instructional principles, along with the number of experiments and the particular principle each study measured. For a substantial listing of dozens of CTML studies that support each of the twelve multimedia instructional principles presented in this chapter, see Mayer (2009). Finally, Yuan et al. (2006) also cite a series of studies that suggest that working memory performance correlates with cognitive abilities and academic achievement.
Mayer (2009) states that his research goal is to contribute to the cognitive theory of multimedia learning, and ultimately to practical applied instructional practice. While criticizing the technology-centered use of multimedia for instruction and the misapplication of cognitive load theory, Mayer (2005b), Ballantyne (2008), and Schnotz (2008) have all stated that it is the instructional method that is important, not the technology, no matter how sophisticated. Ultimately, the validation of the theory lies in the fact that it has a large body of studies and literature to support it, that it has exhibited “staying power” and that it continues to demand attention and exert influence in the fields of education and training. The power of the theory lies in its dynamic structure, in which it is expected and even driven to constantly change and morph as new information is discovered and tested in the field of cognitive science.
Conclusion
The cognitive theory of multimedia learning has progressed over the past two decades and is poised to become a mature, robust theory as it enters its third decade. Fortunately, the theoretical cognitive foundations upon which the theory is based go much further back and have contributed heavily to its framework of the “big three” sciences, as well as the structure given to its principles by the triarchic theory of cognitive load. Together, these two areas of study form what we generally understand today to be the cognitive theory of multimedia learning.
The theory is expanding into exciting new areas that will allow it to continue to evolve. Its learner-centered and cognitive-constructivist orientation makes it very relevant in current educational applications. The fact that it focuses on finding effective instructional methods rather than a specific technology makes it a dynamic theory that will allow it to expand well beyond the life cycle of any particular technology.
While the theory continues to have problematic and unanswered areas, the researchers acknowledge this and expect that the theory will continue to develop and change as new and better research techniques are developed for the study of how we learn and how the human brain works. It is an exciting field that is developing very quickly due to advances in technology and neuroscience, and there is a great need for new researchers to contribute new scientific studies to the development of the theory, the principles, the boundary conditions, and finally, the “big three” sciences of learning, instruction, and assessment themselves.