• Open access
  • Published: 19 July 2015

The role of visual representations in scientific practices: from conceptual understanding and knowledge generation to ‘seeing’ how science works

  • Maria Evagorou 1 ,
  • Sibel Erduran 2 &
  • Terhi Mäntylä 3  

International Journal of STEM Education, volume 2, Article number: 11 (2015)


The use of visual representations (e.g., photographs, diagrams, models) has long been part of science, and such representations make it possible for scientists to interact with and represent complex phenomena that are not observable in other ways. Despite a wealth of research in science education on visual representations, the emphasis of such research has mainly been on conceptual understanding when using visual representations and less on visual representations as epistemic objects. In this paper, we argue that by positioning visual representations as epistemic objects of scientific practices, science education can bring a renewed focus on how visualization contributes to knowledge formation in science from the learners’ perspective.

This is a theoretical paper, and in order to argue about the role of visualization, we first present a case study, that of the discovery of the structure of DNA, which highlights the epistemic components of visual information in science. The second case study focuses on Faraday’s use of the lines of magnetic force. Faraday is known for his exploratory, creative, and yet systematic way of experimenting, and the visual reasoning leading to theoretical development was an inherent part of the experimentation. Third, we trace a contemporary account from science focusing on experimental practices and how the reproducibility of experimental procedures can be reinforced through video data.

Conclusions

Our conclusions suggest that in teaching science, the emphasis in visualization should shift from cognitive understanding (using the products of science to understand the content) to engaging in the processes of visualization. Furthermore, we suggest that it is essential to design curriculum materials and learning environments that create a social and epistemic context and invite students to engage in the practice of visualization as evidence, reasoning, experimental procedure, or a means of communication, and to reflect on these practices. Implications for teacher education include the need for teacher professional development programs to problematize the use of visual representations as epistemic objects that are part of scientific practices.

During the last decades, research and reform documents in science education across the world have been calling for an emphasis not only on the content but also on the processes of science (Bybee 2014; Eurydice 2012; Duschl and Bybee 2014; Osborne 2014; Schwartz et al. 2012), in order to make science accessible to students and enable them to understand the epistemic foundation of science. Scientific practices, part of the process of science, are the cognitive and discursive activities that are targeted in science education to develop epistemic understanding and appreciation of the nature of science (Duschl et al. 2008) and have been the emphasis of recent reform documents in science education across the world (Achieve 2013; Eurydice 2012). With the term scientific practices, we refer to the processes that take place during scientific discoveries, including among others: asking questions, developing and using models, engaging in arguments, and constructing and communicating explanations (National Research Council 2012). The emphasis on scientific practices aims to move the teaching of science from knowledge to an understanding of the processes and epistemic aspects of science. Additionally, by placing an emphasis on engaging students in scientific practices, we aim to help students acquire scientific knowledge in meaningful contexts that resemble the reality of scientific discoveries.

Despite a wealth of research in science education on visual representations, the emphasis of such research has mainly been on conceptual understanding when using visual representations and less on visual representations as epistemic objects. In this paper, we argue that by positioning visual representations as epistemic objects, science education can bring a renewed focus on how visualization contributes to knowledge formation in science from the learners’ perspective. Specifically, the use of visual representations (e.g., photographs, diagrams, tables, charts) has been part of science and over the years has evolved with new technologies (e.g., from drawings to advanced digital images and three-dimensional models). Visualization makes it possible for scientists to interact with complex phenomena (Richards 2003), and visual representations might convey important evidence not observable in other ways. Visual representations as a tool to support cognitive understanding in science have been studied extensively (e.g., Gilbert 2010; Wu and Shah 2004). Studies in science education have explored the use of images in science textbooks (e.g., Dimopoulos et al. 2003; Bungum 2008), students’ representations or models when doing science (e.g., Gilbert et al. 2008; Dori et al. 2003; Lehrer and Schauble 2012; Schwarz et al. 2009), and students’ images of science and scientists (e.g., Chambers 1983). Studies in the field of science education have therefore been using the term visualization as “the formation of an internal representation from an external representation” (Gilbert et al. 2008, p. 4) or as a tool for conceptual understanding for students.

In this paper, we do not refer to visualization as mental image, model, or presentation only (Gilbert et al. 2008; Philips et al. 2010) but instead focus on visual representations, or visualization, as epistemic objects. Specifically, we refer to visualization as a process for knowledge production and growth in science. In this respect, modeling is an aspect of visualization, but our focus is not on the use of a model as a tool for cognitive understanding (Gilbert 2010; Wu and Shah 2004); rather, it is on the process of modeling as a scientific practice, which includes the construction and use of models, the use of other representations, communication within groups through visual representations, and an appreciation of the difficulties that scientists face in this process. Therefore, the purpose of this paper is to present, through the history of science, how visualization can be considered not only as a cognitive tool in science education but also as an epistemic object that can potentially support students in understanding aspects of the nature of science.

Scientific practices and science education

According to the Next Generation Science Standards (Achieve 2013), scientific practices refer to: asking questions and defining problems; developing and using models; planning and carrying out investigations; analyzing and interpreting data; using mathematical and computational thinking; constructing explanations and designing solutions; engaging in argument from evidence; and obtaining, evaluating, and communicating information. A significant aspect of scientific practices is that science learning is about more than just learning facts, concepts, theories, and laws. A fuller appreciation of science necessitates understanding science relative to its epistemological grounding and the processes that are involved in the production of knowledge (Hogan and Maglienti 2001; Wickman 2004).

The Next Generation Science Standards, among other changes, shift away from science inquiry and towards the inclusion of scientific practices (Duschl and Bybee 2014; Osborne 2014). Comparing the abilities to do scientific inquiry (National Research Council 2000) with the set of scientific practices makes it evident that the latter is about engaging in the processes of doing science and thereby experiencing science in a more authentic way. Engaging in scientific practices, according to Osborne (2014), “presents a more authentic picture of the endeavor that is science” (p. 183) and also helps students develop a deeper understanding of the epistemic aspects of science. Furthermore, as Bybee (2014) argues, by engaging students in scientific practices, we involve them in an understanding of the nature of science and of the nature of scientific knowledge.

The term scientific practices, and the view of science as a practice, emerged from the work of the philosopher of science Kuhn (Osborne 2014) and refers to the processes in which scientists engage during knowledge production and communication. Subsequent work by historians, philosophers, and sociologists of science (Latour 2011; Longino 2002; Nersessian 2008) revealed the scientific practices in which scientists engage, which include, among others, theory development and specific ways of talking, modeling, and communicating the outcomes of science.

Visualization as an epistemic object

Schematic, pictorial symbols in the design of scientific instruments, and the analysis of the perceptual and functional information stored in such images, have been areas of investigation in the philosophy of scientific experimentation (Gooding et al. 1993). The nature of visual perception, the relationship between thought and vision, and the role of reproducibility as a norm for experimental research form a central aspect of this domain of research in the philosophy of science. For instance, Rothbart (1997) has argued that visualizations are commonplace in the theoretical sciences even if not every scientific theory is defined by visualized models.

Visual representations (e.g., photographs, diagrams, tables, charts, models) have been used in science over the years to enable scientists to interact with complex phenomena (Richards 2003) and might convey important evidence not observable in other ways (Barber et al. 2006). Some authors (e.g., Ruivenkamp and Rip 2010) have argued that visualization is a core activity of some scientific communities of practice (e.g., nanotechnology), while others (e.g., Lynch and Edgerton 1988) have differentiated the role of particular visualization techniques (e.g., digital image processing in astronomy). Visualization in science includes the complex process through which scientists develop or produce imagery, schemes, and graphical representations, and therefore what is of importance in this process is not only the result but also the methodology employed by the scientists, namely, how the result was produced. Visual representations in science may refer to objects that are believed to have some kind of material or physical existence, but they may equally refer to purely mental, conceptual, and abstract constructs (Pauwels 2006). More specifically, visual representations can be found for: (a) phenomena that are not observable with the eye (e.g., microscopic or macroscopic phenomena); (b) phenomena that do not exist as visual representations but can be translated into them (e.g., sound); and (c) experimental settings that require visual data representations (e.g., graphs presenting the velocity of moving objects). Additionally, since science is not only about replicating reality but also about making it more understandable to people (either the public or other scientists), visual representations are not only about reproducing nature but also about: (a) helping to solve a problem, (b) filling gaps in our knowledge, and (c) facilitating knowledge building or transfer (Lynch 2006).

Using or developing visual representations in scientific practice can range from a straightforward to a complicated undertaking. More specifically, scientists can observe a phenomenon (e.g., mitosis) and represent it visually using a picture or diagram, which is quite straightforward. But they can also use a variety of complicated techniques (e.g., crystallography in the case of DNA studies) that are either available or need to be developed or refined in order to acquire the visual information that can be used in the process of theory development (e.g., Latour and Woolgar 1979). Furthermore, some visual representations need decoding, and scientists need to learn how to read these images (e.g., radiologists); using visual representations in the process of science therefore requires learning a new language that is specific to the medium and methods used (e.g., understanding an X-ray picture is different from understanding an MRI scan) and then communicating that language to other scientists and the public.

Visual representations serve many intents and purposes in scientific practices, for example, to make a diagnosis, to compare, describe, and preserve for future study, to verify and explore new territory, to generate new data (Pauwels 2006), or to present new methodologies. According to Latour and Woolgar (1979) and Knorr Cetina (1999), visual representations can be used either as primary data (e.g., an image from a microscope) or to help in concept development (e.g., the models of DNA used by Watson and Crick), to uncover relationships, and to make the abstract more concrete (e.g., graphs of sound waves). Therefore, visual representations and visual practices, in all their forms, are an important aspect of the scientific practices of developing, clarifying, and transmitting scientific knowledge (Pauwels 2006).

Methods and results: merging visualization and scientific practices in science

In this paper, we present three case studies that embody the working practices of scientists, in an effort to present visualization as a scientific practice and to argue that visualization is a complex process that can include, among others, modeling and the use of representations, but is not limited to them. The first case study explores the role of visualization in the construction of knowledge about the structure of DNA, using visuals as evidence. The second case study focuses on Faraday’s use of the lines of magnetic force and the visual reasoning leading to theoretical development that was an inherent part of the experimentation. The third case study focuses on the current practices of scientists in the context of a peer-reviewed journal, the Journal of Visualized Experiments, where methodology is communicated through videotaped procedures. The three case studies represent the research interests of the three authors of this paper and were chosen to present how visualization as a practice can be involved in all stages of doing science, from hypothesizing and evaluating evidence (case study 1) to experimenting and reasoning (case study 2) to communicating findings and methodology to the research community (case study 3), thereby representing the three functions of visualization presented by Lynch (2006). Furthermore, the last case study showcases how the development of visualization technologies has contributed to the communication of findings and methodologies in science, presenting in that way an aspect of current scientific practices. In all three cases, our approach is guided by the observation that visual information is an integral part of scientific practices and, furthermore, that it is particularly central to them.

Case study 1: using visual representations as evidence in the discovery of DNA

The focus of the first case study is the discovery of the structure of DNA. DNA was first isolated in 1869 by Friedrich Miescher, and by the late 1940s, it was known to contain phosphate, sugar, and four nitrogen-containing chemical bases. However, no one had figured out the structure of DNA until Watson and Crick presented their model in 1953. Beyond the social aspects of the discovery of DNA, another important aspect was the role of visual evidence in the development of knowledge in the area. More specifically, by studying the personal accounts of Watson (1968) and Crick (1988) about the discovery of the structure of DNA, the following main ideas regarding the role of visual representations in the production of knowledge can be identified: (a) the use of visual representations was an important part of knowledge growth and was often dependent upon the discovery of new technologies (e.g., better microscopes or better crystallography techniques that would provide better visual representations as evidence of the helical structure of DNA); and (b) three-dimensional models were used as a way to represent the visual images (X-ray images) and connect them to the evidence provided by other sources to see whether the theory could be supported. The model of DNA was therefore built on a combination of visual evidence and experimental data.

An example showcasing the importance of visual representations in the process of knowledge production in this case is provided by Watson, in his book The Double Helix (1968):

…since the middle of the summer Rosy [Rosalind Franklin] had had evidence for a new three-dimensional form of DNA. It occurred when the DNA molecules were surrounded by a large amount of water. When I asked what the pattern was like, Maurice went into the adjacent room to pick up a print of the new form they called the “B” structure. The instant I saw the picture, my mouth fell open and my pulse began to race. The pattern was unbelievably simpler than those previously obtained (A form). Moreover, the black cross of reflections which dominated the picture could arise only from a helical structure. With the A form the argument for the helix was never straightforward, and considerable ambiguity existed as to exactly which type of helical symmetry was present. With the B form however, mere inspection of its X-ray picture gave several of the vital helical parameters. (p. 167-169)

As suggested by Watson’s personal account of the discovery, the photo taken by Rosalind Franklin (Fig. 1) convinced him that the DNA molecule must consist of two chains arranged in a paired helix, resembling a spiral staircase or ladder. On March 7, 1953, Watson and Crick finished and presented their model of the structure of DNA (Watson and Berry 2004; Watson 1968), which was based on the visual information provided by the X-ray image and their knowledge of chemistry.

Fig. 1 X-ray crystallography of DNA
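
As background to Watson’s inference (our illustrative aside, not part of the original accounts), the visual features of such an X-ray pattern encode structural parameters through the reciprocal geometry of diffraction: a repeat distance d in the molecule produces reflections at an angle θ given by Bragg’s law. For the B form, the layer-line spacing corresponds to the helix pitch P ≈ 3.4 nm and the strong meridional reflection to the base-stacking rise h ≈ 0.34 nm, which together imply roughly ten bases per helical turn:

```latex
% Bragg's law: a molecular repeat distance d reflects X-rays of
% wavelength \lambda at angle \theta
\lambda = 2d\sin\theta
% Bases per turn of B-DNA from the two characteristic spacings
% of the pattern (pitch P, base-stacking rise h)
\frac{P}{h} \approx \frac{3.4\,\text{nm}}{0.34\,\text{nm}} = 10
```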

In analyzing the visualization practice in this case study, we observe the following instances that highlight how the visual information played a role:

Asking questions and defining problems: In the model of science, the real world can at some points only be observed through visual representations; using DNA as an example, the structure of DNA was only observable through the crystallography images produced by Rosalind Franklin in the laboratory. There was no other way to observe the structure of DNA, and therefore this aspect of the real world.

Analyzing and interpreting data: The images that resulted from crystallography as well as their interpretations served as the data for the scientists studying the structure of DNA.

Experimenting: The data in the form of visual information were used to predict the possible structure of the DNA.

Modeling: Based on the prediction, an actual three-dimensional model was prepared by Watson and Crick. The first model did not fit the real world (it was refuted by Rosalind Franklin and her research group from King’s College), and Watson and Crick had to go through the same process again to find better visual evidence (better crystallography images) and create an improved visual model.

Example excerpts from Watson’s biography provide further evidence for how visualization practices were applied in the context of the discovery of DNA (Table  1 ).

In summary, by examining the history of the discovery of DNA, we showcased how visual data are used as scientific evidence in science, identifying in that way an aspect of the nature of science that is still underexplored in the history of science and that has been ignored in the teaching of science. Visual representations are used in many ways: as images, as models, as evidence to support or rebut a model, and as interpretations of reality.

Case study 2: applying visual reasoning in knowledge production, the example of the lines of magnetic force

The focus of this case study is on Faraday’s use of the lines of magnetic force. Faraday is known for his exploratory, creative, and yet systematic way of experimenting, and the visual reasoning leading to theoretical development was an inherent part of this experimentation (Gooding 2006). Faraday’s articles and notebooks do not include mathematical formulations; instead, they include images and illustrations, from experimental devices and setups to recapitulations of his theoretical ideas (Nersessian 2008). According to Gooding (2006), “Faraday’s visual method was designed not to copy apparent features of the world, but to analyse and replicate them” (p. 46).

The lines of force played a central role in Faraday’s research on electricity and magnetism and in the development of his “field theory” (Faraday 1852a; Nersessian 1984). Before Faraday, experiments with iron filings around magnets were known, and the term “magnetic curves” was used both for the iron filing patterns and for the geometrical constructs derived from the mathematical theory of magnetism (Gooding et al. 1993). However, Faraday used the lines of force to explain his experimental observations and to construct the theory of forces in magnetism and electricity. Examples of Faraday’s different illustrations of lines of magnetic force are given in Fig. 2. Faraday gave the following experiment-based definition for the lines of magnetic force:

Fig. 2 a Iron filing pattern in the case of a bar magnet, drawn by Faraday (Faraday 1852b, Plate IX, p. 158, Fig. 1). b Faraday’s drawing of lines of magnetic force in the case of a cylinder magnet, where the experimental procedure (a knife blade showing the direction of the lines) is combined into the drawing (Faraday 1855, vol. 1, plate 1)

A line of magnetic force may be defined as that line which is described by a very small magnetic needle, when it is so moved in either direction correspondent to its length, that the needle is constantly a tangent to the line of motion; or it is that line along which, if a transverse wire be moved in either direction, there is no tendency to the formation of any current in the wire, whilst if moved in any other direction there is such a tendency; or it is that line which coincides with the direction of the magnecrystallic axis of a crystal of bismuth, which is carried in either direction along it. The direction of these lines about and amongst magnets and electric currents, is easily represented and understood, in a general manner, by the ordinary use of iron filings. (Faraday 1852a , p. 25 (3071))

The definition describes the connection between the experiments and the visual representation of the results. Initially, the lines of force were just geometric representations, but later, Faraday treated them as physical objects (Nersessian 1984 ; Pocovi and Finlay 2002 ):

I have sometimes used the term lines of force so vaguely, as to leave the reader doubtful whether I intended it as a merely representative idea of the forces, or as the description of the path along which the power was continuously exerted. … wherever the expression line of force is taken simply to represent the disposition of forces, it shall have the fullness of that meaning; but that wherever it may seem to represent the idea of the physical mode of transmission of the force, it expresses in that respect the opinion to which I incline at present. The opinion may be erroneous, and yet all that relates or refers to the disposition of the force will remain the same. (Faraday, 1852a , p. 55-56 (3075))

He also felt that the lines of force had greater explanatory power than the dominant theory of action-at-a-distance:

Now it appears to me that these lines may be employed with great advantage to represent nature, condition, direction and comparative amount of the magnetic forces; and that in many cases they have, to the physical reasoner at least, a superiority over that method which represents the forces as concentrated in centres of action… (Faraday, 1852a, p. 26 (3074))

To give some insight into Faraday’s visual reasoning as an epistemic practice, the following examples of Faraday’s studies of the lines of magnetic force (Faraday 1852a, 1852b) are presented:

(a) Asking questions and defining problems: The iron filing patterns formed the empirical basis for the visual model: a 2D visualization of lines of magnetic force, as presented in Fig. 2. According to Faraday, these iron filing patterns were suitable for illustrating the direction and form of the magnetic lines of force (emphasis added):

It must be well understood that these forms give no indication by their appearance of the relative strength of the magnetic force at different places, inasmuch as the appearance of the lines depends greatly upon the quantity of filings and the amount of tapping; but the direction and forms of these lines are well given, and these indicate, in a considerable degree, the direction in which the forces increase and diminish . (Faraday 1852b , p.158 (3237))

Despite being static and two-dimensional on paper, the lines of magnetic force were dynamical (Nersessian 1992, 2008) and three-dimensional for Faraday (see Fig. 2b). For instance, Faraday described the lines of force as “expanding”, “bending,” and “being cut” (Nersessian 1992). In Fig. 2b, Faraday has summarized his experiment (bar magnet and knife blade) and its results (lines of force) in one picture.

(b) Analyzing and interpreting data: The model was so powerful for Faraday that he ended up thinking of the lines as physical objects (e.g., Nersessian 1984), i.e., making interpretations of the way forces act. He performed many experiments aimed at showing the physical existence of the lines of force, but he did not succeed (Nersessian 1984). The following quote illuminates Faraday’s use of the lines of force in different situations:

The study of these lines has, at different times, been greatly influential in leading me to various results, which I think prove their utility as well as fertility. Thus, the law of magneto-electric induction; the earth’s inductive action; the relation of magnetism and light; diamagnetic action and its law, and magnetocrystallic action, are the cases of this kind… (Faraday 1852a , p. 55 (3174))

(c) Experimenting: Faraday relied heavily on exploratory experiments; in the case of the lines of magnetic force, he used, for example, iron filings, magnetic needles, or current-carrying wires (see the quote above). The magnetic field is not directly observable, and the representation of lines of force was a visual model that captured the direction, form, and magnitude of the field, as the sketch below illustrates.
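
In modern terms, Faraday’s operational definition (the line to which a small magnetic needle is everywhere tangent) describes an integral curve of the field, dr/ds = B(r)/|B(r)|. The following minimal sketch is our illustration, not Faraday’s method: it traces such a line numerically under the assumption that the bar magnet can be approximated by a point dipole.

```python
import numpy as np

def dipole_field(r, m=np.array([0.0, 1.0])):
    """Unnormalized point-dipole field B(r) = (3(m.rhat)rhat - m)/|r|^3,
    a standard far-field approximation of a bar magnet."""
    rnorm = np.linalg.norm(r)
    rhat = r / rnorm
    return (3.0 * np.dot(m, rhat) * rhat - m) / rnorm**3

def trace_field_line(start, step=0.01, n_steps=5000):
    """Follow the local field direction, as Faraday's needle does:
    simple Euler integration of dr/ds = B(r)/|B(r)|."""
    points = [np.array(start, dtype=float)]
    for _ in range(n_steps):
        b = dipole_field(points[-1])
        points.append(points[-1] + step * b / np.linalg.norm(b))
    return np.array(points)

line = trace_field_line(start=(0.05, 0.1))  # one line of force, as (x, y) points
```

Plotting many such curves from different starting points reproduces qualitatively the patterns that the iron filings make visible in Fig. 2.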

(d) Modeling: There is no denying that the lines of magnetic force are visual by nature. Faraday’s views of the lines of force developed gradually over the years, and he applied and developed them in different contexts, such as electromagnetic, electrostatic, and magnetic induction (Nersessian 1984). An example of Faraday’s explanation of the effect of the position of wire b on the experiment is given in Fig. 3, where a few magnetic lines of force are drawn; in the quote below, Faraday explains the effect using these magnetic lines of force (emphasis added):

Fig. 3 Picture of an experiment with different arrangements of wires (a, b’, b”), magnet, and galvanometer. Note the lines of force drawn around the magnet. (Faraday 1852a, p. 34)

It will be evident by inspection of Fig. 3 , that, however the wires are carried away, the general result will, according to the assumed principles of action, be the same; for if a be the axial wire, and b’, b”, b”’ the equatorial wire, represented in three different positions, whatever magnetic lines of force pass across the latter wire in one position, will also pass it in the other, or in any other position which can be given to it. The distance of the wire at the place of intersection with the lines of force, has been shown, by the experiments (3093.), to be unimportant. (Faraday 1852a , p. 34 (3099))

In summary, by examining the history of Faraday’s use of the lines of force, we showed how visual imagery and reasoning played an important part in Faraday’s construction and representation of his “field theory”. As Gooding has stated, “many of Faraday’s sketches are far more than depictions of observation, they are tools for reasoning with and about phenomena” (2006, p. 59).

Case study 3: visualizing scientific methods, the case of a journal

The focus of the third case study is the Journal of Visualized Experiments (JoVE), a peer-reviewed publication indexed in PubMed. The journal is devoted to the publication of biological, medical, chemical, and physical research in a video format. The journal describes its history as follows:

JoVE was established as a new tool in life science publication and communication, with participation of scientists from leading research institutions. JoVE takes advantage of video technology to capture and transmit the multiple facets and intricacies of life science research. Visualization greatly facilitates the understanding and efficient reproduction of both basic and complex experimental techniques, thereby addressing two of the biggest challenges faced by today's life science research community: i) low transparency and poor reproducibility of biological experiments and ii) time and labor-intensive nature of learning new experimental techniques. ( http://www.jove.com/ )

By examining the journal content, we generated a set of categories that can be considered indicators of relevance and significance in terms of the epistemic practices of science that have relevance for science education. For example, the quote above illustrates how scientists view some norms of scientific practice, including the norms of “transparency” and “reproducibility” of experimental methods and results, and how the visual format of the journal facilitates the implementation of these norms. “Reproducibility” can be considered an epistemic criterion that sits at the heart of what counts as an experimental procedure in science:

Investigating what should be reproducible and by whom leads to different types of experimental reproducibility, which can be observed to play different roles in experimental practice. A successful application of the strategy of reproducing an experiment is an achievement that may depend on certain idiosyncratic aspects of a local situation. Yet a purely local experiment that cannot be carried out by other experimenters and in other experimental contexts will, in the end, be unproductive in science. (Sarkar and Pfeifer 2006, p. 270)

We now turn to an article on “Elevated Plus Maze for Mice” that is available for free on the journal website ( http://www.jove.com/video/1088/elevated-plus-maze-for-mice ). The purpose of this experiment was to investigate anxiety levels in mice through behavioral analysis. The journal article consists of a 9-min video accompanied by text. The video illustrates the handling of the mice in a soundproof, dimly lit location; worksheets with the characteristics of the mice; the computer software, apparatus, and resources; the setting up of the computer software; and the video recording of mouse behavior on the computer. The authors describe the apparatus used in the experiment and state how procedural differences between research groups lead to difficulties in the interpretation of results:

The apparatus consists of open arms and closed arms, crossed in the middle perpendicularly to each other, and a center area. Mice are given access to all of the arms and are allowed to move freely between them. The number of entries into the open arms and the time spent in the open arms are used as indices of open space-induced anxiety in mice. Unfortunately, the procedural differences that exist between laboratories make it difficult to duplicate and compare results among laboratories.

The authors’ emphasis on the particularity of procedural context echoes the observations of some philosophers of science:

It is not just the knowledge of experimental objects and phenomena but also their actual existence and occurrence that prove to be dependent on specific, productive interventions by the experimenters. (Sarkar and Pfeifer 2006, pp. 270-271)

The inclusion of a video of the experimental procedure specifies what the apparatus looks like (Fig. 4) and how the behavior of the mice is captured through video recording that feeds into a computer (Fig. 5). Subsequently, computer software captures different variables such as the distance traveled, the number of entries, and the time spent on each arm of the apparatus. Here, there is visual information at different levels of representation, ranging from reconfiguration of raw video data to representations that analyze the data around the variables in question (Fig. 6). The practice of levels of visual representations is not particular to the biological sciences. For instance, it is commonplace in nanotechnological practices:

Fig. 4 Visual illustration of apparatus

Fig. 5 Video processing of experimental set-up

Fig. 6 Computer software for video input and variable recording

In the visualization processes, instruments are needed that can register the nanoscale and provide raw data, which needs to be transformed into images. Some Imaging Techniques have software incorporated already where this transformation automatically takes place, providing raw images. Raw data must be translated through the use of Graphic Software and software is also used for the further manipulation of images to highlight what is of interest to capture the (inferred) phenomena -- and to capture the reader. There are two levels of choice: Scientists have to choose which imaging technique and embedded software to use for the job at hand, and they will then have to follow the structure of the software. Within such software, there are explicit choices for the scientists, e.g. about colour coding, and ways of sharpening images. (Ruivenkamp and Rip 2010 , pp.14–15)
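
The two levels of choice that Ruivenkamp and Rip describe can be made concrete with a minimal sketch (our illustration, using synthetic raw data rather than any actual instrument output): both the sharpening step and the colour coding are explicit choices that shape what the final image highlights.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
import matplotlib.pyplot as plt

# Synthetic stand-in for raw instrument output: a smooth 2D intensity field.
rng = np.random.default_rng(0)
raw = gaussian_filter(rng.normal(size=(256, 256)), sigma=8)

# Choice 1: sharpening, here a simple unsharp mask.
blurred = gaussian_filter(raw, sigma=3)
sharpened = raw + 1.5 * (raw - blurred)

# Choice 2: colour coding; the colormap is an explicit, consequential choice.
plt.imshow(sharpened, cmap="viridis")  # compare with cmap="gray" or "inferno"
plt.colorbar(label="intensity (arbitrary units)")
plt.show()
```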

In the text that accompanies the video, the authors highlight the role of visualization in their experiment:

Visualization of the protocol will promote better understanding of the details of the entire experimental procedure, allowing for standardization of the protocols used in different laboratories and comparisons of the behavioral phenotypes of various strains of mutant mice assessed using this test.

The software that takes the video data and transforms it into various representations allows the researchers to collect data on mouse behavior more reliably. For instance, the distance traveled across the arms of the apparatus or the time spent on each arm would otherwise have been difficult to observe and record precisely; a sketch of how such variables can be derived from tracked positions is given below. A further aspect to note is how the visualization of the experiment facilitates the control of bias. The authors illustrate how olfactory bias between experimental procedures carried out on mice in sequence is avoided by cleaning the equipment.
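
As an illustration of how such quantitative representations can be derived from video, the following sketch is our construction (the zone geometry, frame rate, and variable names are assumptions, not details taken from the JoVE article): it computes distance traveled, time in the open arms, and open-arm entries from a sequence of tracked (x, y) positions.

```python
import numpy as np

FPS = 30  # assumed video frame rate (frames per second)

def zone(x, y, arm_halfwidth=5.0):
    """Toy zone labels for a plus maze centered at the origin:
    open arms along x, closed arms along y, a square center region."""
    if abs(x) <= arm_halfwidth and abs(y) <= arm_halfwidth:
        return "center"
    return "open" if abs(x) > abs(y) else "closed"

def summarize(track):
    """Derive the indices mentioned in the article from per-frame positions."""
    xy = np.asarray(track, dtype=float)
    distance = float(np.sum(np.linalg.norm(np.diff(xy, axis=0), axis=1)))
    zones = [zone(x, y) for x, y in xy]
    time_open = zones.count("open") / FPS  # seconds spent in the open arms
    entries_open = sum(1 for a, b in zip(zones, zones[1:])
                       if a != "open" and b == "open")  # entries into open arms
    return {"distance": distance, "time_open_s": time_open,
            "entries_open": entries_open}

# Example: a short synthetic track moving from the center into an open arm.
print(summarize([(0, 0), (2, 0), (6, 0), (9, 0), (9, 0)]))
```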

Our discussion highlights the role of visualization in science, particularly with respect to presenting visualization as part of scientific practices. We have used case studies from the history of science, highlighting scientists’ accounts of how visualization played a role in the discovery of DNA and of the magnetic field, and a contemporary illustration of a science journal’s practices in incorporating visualization as a way to communicate new findings and methodologies. Our implicit aim in drawing from these case studies was the need to align science education with scientific practices, particularly in terms of how visual representations, static or dynamic, can engage students in the processes of science rather than being used only as tools for cognitive development in science. Our approach was guided by the notion of “knowledge-as-practice” as advanced by Knorr Cetina (1999), who studied scientists and characterized their knowledge as practice, a characterization which shifts the focus away from ideas inside scientists’ minds to practices that are cultural and deeply contextualized within fields of science. She suggests that people working together can be examined as epistemic cultures whose collective knowledge exists as practice.

It is important to stress, however, that visual representations are not used in isolation but are supported by other types of evidence or other theories (e.g., in order to understand the helical structure of DNA, knowledge of chemistry was needed). More importantly, this finding can also have implications for teaching science as argument (e.g., Erduran and Jimenez-Aleixandre 2008), since the verbal evidence used in the science classroom to maintain an argument could be supported by visual evidence (a model, representation, image, graph, etc.). For example, in a group of students discussing the outcomes of an introduced species in an ecosystem, pictures of the species and the ecosystem over time, and videos showing the changes in the ecosystem and the special characteristics of the different species, could serve as visual evidence to help the students support their arguments (Evagorou et al. 2012). Therefore, an important implication for the teaching of science is the use of visual representations as evidence in the science curriculum as part of knowledge production. Even though studies in the area of science education have focused on the use of models and modeling as a way to support students in the learning of science (Dori et al. 2003; Lehrer and Schauble 2012; Mendonça and Justi 2013; Papaevripidou et al. 2007) or on the use of images (e.g., Korfiatis et al. 2003), with the term using visuals as evidence we refer to the collection of all forms of visuals and the processes involved.

Another aspect identified through the case studies is that of visual reasoning (an integral part of Faraday’s investigations). Both verbalization and visualization were part of the process of generating new knowledge (Gooding 2006). Even today, most textbooks use the lines of force (or just field lines) as a geometrical representation of the field, with the number of field lines connected to the quantity of flux. Often, the textbooks use the same kind of visual imagery as that used by scientists. However, when using images, only certain aspects or features of the phenomena or data are captured or highlighted, and often in tacit ways. Especially in textbooks, the process of producing the image is not presented; only the product, the image, is left. This can easily lead to the idea that images (e.g., photos, graphs, visual models) are just representations of knowledge and, in the worst case, to misinterpreted representations of knowledge, as the results of Pocovi and Finlay (2002) in the case of electric field lines show. To avoid this, teachers should be able to explain how the images are produced (what features of the phenomena or data the image captures, on what grounds those features were chosen for that image, and what features are omitted); in this way, the role of visualization in knowledge production can be made “visible” to students by engaging them in the process of visualization.

The implications of these norms for science teaching and learning are numerous. Classroom contexts can model the generation, sharing, and evaluation of evidence and the experimental procedures carried out by students, thereby promoting not only some contemporary cultural norms in scientific practice but also the learning of criteria, standards, and heuristics that scientists use in making decisions on scientific methods. As we have demonstrated with the three case studies, visual representations are part of the process of knowledge growth and communication in science, as shown by two examples from the history of science and an example from current scientific practices. Additionally, visual information, especially with the use of technology, is part of students’ everyday lives. Therefore, we suggest making use of students’ knowledge and technological skills (e.g., producing their own videos showing their experimental method, or identifying or providing appropriate visual evidence for a given topic) in order to teach them the aspects of the nature of science that are often neglected both in the history of science and in the design of curricula. Specifically, what we suggest in this paper is that students should actively engage in visualization processes in order to appreciate the diverse nature of doing science and engage in authentic scientific practices.

However, as a word of caution, we need to distinguish between the products and the processes involved in visualization practices in science:

If one considers scientific representations and the ways in which they can foster or thwart our understanding, it is clear that a mere object approach, which would devote all attention to the representation as a free-standing product of scientific labor, is inadequate. What is needed is a process approach: each visual representation should be linked with its context of production (Pauwels 2006 , p.21).

The aforementioned suggests that the emphasis in visualization should shift from cognitive understanding (using the products of science to understand the content) to engaging in the processes of visualization. Therefore, an implication for the teaching of science is to design curriculum materials and learning environments that create a social and epistemic context and invite students to engage in the practice of visualization as evidence, reasoning, experimental procedure, or a means of communication (as presented in the three case studies) and to reflect on these practices (Ryu et al. 2015).

Finally, a question that arises from including visualization in science education, as well as from including scientific practices in science education more broadly, is whether teachers themselves are prepared to include them as part of their teaching (Bybee 2014). Teacher preparation programs and teacher education have been critiqued, studied, and rethought since the time they emerged (Cochran-Smith 2004). Despite this long history, the debate about initial teacher training and its content still persists in our community and in policy circles (Cochran-Smith 2004; Conway et al. 2009). In the last decades, the debate has shifted from a behavioral view of learning and teaching to a view of teaching as a learning problem, focusing in that way not only on teachers’ knowledge, skills, and beliefs but also on connecting these with how, and whether, pupils learn (Cochran-Smith 2004). The Science Education in Europe report recommended that “Good quality teachers, with up-to-date knowledge and skills, are the foundation of any system of formal science education” (Osborne and Dillon 2008, p. 9).

However, questions such as what the emphasis in pre-service and in-service science teacher training should be, especially with the new emphasis on scientific practices, remain unanswered. As Bybee (2014) argues, starting from the new emphasis on scientific practices in the NGSS, we should consider teacher preparation programs “that would provide undergraduates opportunities to learn the science content and practices in contexts that would be aligned with their future work as teachers” (p. 218). Therefore, engaging pre-service and in-service teachers in visualization as a scientific practice should be one of the purposes of teacher preparation programs.

Achieve. (2013). The next generation science standards (pp. 1–3). Retrieved from http://www.nextgenscience.org/ .

Barber, J, Pearson, D, & Cervetti, G. (2006). Seeds of science/roots of reading . California: The Regents of the University of California.

Bungum, B. (2008). Images of physics: an explorative study of the changing character of visual images in Norwegian physics textbooks. NorDiNa, 4 (2), 132–141.

Bybee, RW. (2014). NGSS and the next generation of science teachers. Journal of Science Teacher Education, 25 (2), 211–221. doi: 10.1007/s10972-014-9381-4 .

Chambers, D. (1983). Stereotypic images of the scientist: the draw-a-scientist test. Science Education, 67 (2), 255–265.

Cochran-Smith, M. (2004). The problem of teacher education. Journal of Teacher Education, 55 (4), 295–299. doi: 10.1177/0022487104268057 .

Conway, PF, Murphy, R, & Rath, A. (2009). Learning to teach and its implications for the continuum of teacher education: a nine-country cross-national study .

Crick, F. (1988). What a mad pursuit . USA: Basic Books.

Dimopoulos, K, Koulaidis, V, & Sklaveniti, S. (2003). Towards an analysis of visual images in school science textbooks and press articles about science and technology. Research in Science Education, 33 , 189–216.

Dori, YJ, Tal, RT, & Tsaushu, M. (2003). Teaching biotechnology through case studies—can we improve higher order thinking skills of nonscience majors? Science Education, 87 (6), 767–793. doi: 10.1002/sce.10081 .

Duschl, RA, & Bybee, RW. (2014). Planning and carrying out investigations: an entry to learning and to teacher professional development around NGSS science and engineering practices. International Journal of STEM Education, 1 (1), 12. doi: 10.1186/s40594-014-0012-6 .

Duschl, R., Schweingruber, H. A., & Shouse, A. (2008). Taking science to school . Washington DC: National Academies Press.

Erduran, S, & Jimenez-Aleixandre, MP (Eds.). (2008). Argumentation in science education: perspectives from classroom-based research . Dordrecht: Springer.

Eurydice. (2012). Developing key competencies at school in Europe: challenges and opportunities for policy – 2011/12 (pp. 1–72).

Evagorou, M, Jimenez-Aleixandre, MP, & Osborne, J. (2012). “Should we kill the grey squirrels?” A study exploring students’ justifications and decision-making. International Journal of Science Education, 34 (3), 401–428. doi: 10.1080/09500693.2011.619211 .

Faraday, M. (1852a). Experimental researches in electricity. – Twenty-eighth series. Philosophical Transactions of the Royal Society of London, 142 , 25–56.

Faraday, M. (1852b). Experimental researches in electricity. – Twenty-ninth series. Philosophical Transactions of the Royal Society of London, 142 , 137–159.

Gilbert, JK. (2010). The role of visual representations in the learning and teaching of science: an introduction (pp. 1–19).

Gilbert, J., Reiner, M. & Nakhleh, M. (2008). Visualization: theory and practice in science education . Dordrecht, The Netherlands: Springer.

Gooding, D. (2006). From phenomenology to field theory: Faraday’s visual reasoning. Perspectives on Science, 14 (1), 40–65.

Gooding, D, Pinch, T, & Schaffer, S (Eds.). (1993). The uses of experiment: studies in the natural sciences . Cambridge: Cambridge University Press.

Hogan, K, & Maglienti, M. (2001). Comparing the epistemological underpinnings of students’ and scientists’ reasoning about conclusions. Journal of Research in Science Teaching, 38 (6), 663–687.

Knorr Cetina, K. (1999). Epistemic cultures: how the sciences make knowledge . Cambridge: Harvard University Press.

Korfiatis, KJ, Stamou, AG, & Paraskevopoulos, S. (2003). Images of nature in Greek primary school textbooks. Science Education, 88 (1), 72–89. doi: 10.1002/sce.10133 .

Latour, B. (2011). Visualisation and cognition: drawing things together (pp. 1–32).

Latour, B, & Woolgar, S. (1979). Laboratory life: the construction of scientific facts . Princeton: Princeton University Press.

Lehrer, R, & Schauble, L. (2012). Seeding evolutionary thinking by engaging children in modeling its foundations. Science Education, 96 (4), 701–724. doi: 10.1002/sce.20475 .

Longino, H. E. (2002). The fate of knowledge . Princeton: Princeton University Press.

Lynch, M. (2006). The production of scientific images: vision and re-vision in the history, philosophy, and sociology of science. In L Pauwels (Ed.), Visual cultures of science: rethinking representational practices in knowledge building and science communication (pp. 26–40). Lebanon, NH: Dartmouth College Press.

Lynch, M, & Edgerton, SY. (1988). Aesthetic and digital image processing: representational craft in contemporary astronomy. In G Fyfe & J Law (Eds.), Picturing power: visual depictions and social relations (pp. 184–220). London: Routledge.

Mendonça, PCC, & Justi, R. (2013). An instrument for analyzing arguments produced in modeling-based chemistry lessons. Journal of Research in Science Teaching, 51 (2), 192–218. doi: 10.1002/tea.21133 .

National Research Council (2000). Inquiry and the national science education standards . Washington DC: National Academies Press.

National Research Council (2012). A framework for K-12 science education . Washington DC: National Academies Press.

Nersessian, NJ. (1984). Faraday to Einstein: constructing meaning in scientific theories . Dordrecht: Martinus Nijhoff Publishers.

Nersessian, NJ. (1992). How do scientists think? Capturing the dynamics of conceptual change in science. In RN Giere (Ed.), Cognitive Models of Science (pp. 3–45). Minneapolis: University of Minnesota Press.

Nersessian, NJ. (2008). Creating scientific concepts . Cambridge: The MIT Press.

Osborne, J. (2014). Teaching scientific practices: meeting the challenge of change. Journal of Science Teacher Education, 25 (2), 177–196. doi: 10.1007/s10972-014-9384-1 .

Osborne, J. & Dillon, J. (2008). Science education in Europe: critical reflections . London: Nuffield Foundation.

Papaevripidou, M, Constantinou, CP, & Zacharia, ZC. (2007). Modeling complex marine ecosystems: an investigation of two teaching approaches with fifth graders. Journal of Computer Assisted Learning, 23 (2), 145–157. doi: 10.1111/j.1365-2729.2006.00217.x .

Pauwels, L. (2006). A theoretical framework for assessing visual representational practices in knowledge building and science communications. In L Pauwels (Ed.), Visual cultures of science: rethinking representational practices in knowledge building and science communication (pp. 1–25). Lebanon, NH: Dartmouth College Press.

Philips, L., Norris, S. & McNab, J. (2010). Visualization in mathematics, reading and science education . Dordrecht, The Netherlands: Springer.

Pocovi, MC, & Finlay, F. (2002). Lines of force: Faraday’s and students’ views. Science & Education, 11 , 459–474.

Richards, A. (2003). Argument and authority in the visual representations of science. Technical Communication Quarterly, 12 (2), 183–206. doi: 10.1207/s15427625tcq1202_3 .

Rothbart, D. (1997). Explaining the growth of scientific knowledge: metaphors, models and meaning . Lewiston, NY: Mellen Press.

Ruivenkamp, M, & Rip, A. (2010). Visualizing the invisible nanoscale study: visualization practices in nanotechnology community of practice. Science Studies, 23 (1), 3–36.

Ryu, S, Han, Y, & Paik, S-H. (2015). Understanding co-development of conceptual and epistemic understanding through modeling practices with mobile internet. Journal of Science Education and Technology, 24 (2-3), 330–355. doi: 10.1007/s10956-014-9545-1 .

Sarkar, S, & Pfeifer, J. (Eds.). (2006). The philosophy of science: an encyclopedia (Vol. 1, A–M). New York: Taylor & Francis.

Schwartz, RS, Lederman, NG, & Abd-el-Khalick, F. (2012). A series of misrepresentations: a response to Allchin’s whole approach to assessing nature of science understandings. Science Education, 96 (4), 685–692. doi: 10.1002/sce.21013 .

Schwarz, CV, Reiser, BJ, Davis, EA, Kenyon, L, Achér, A, Fortus, D, et al. (2009). Developing a learning progression for scientific modeling: making scientific modeling accessible and meaningful for learners. Journal of Research in Science Teaching, 46 (6), 632–654. doi: 10.1002/tea.20311 .

Watson, J. (1968). The Double Helix: a personal account of the discovery of the structure of DNA . New York: Scribner.

Watson, J, & Berry, A. (2004). DNA: the secret of life . New York: Alfred A. Knopf.

Wickman, PO. (2004). The practical epistemologies of the classroom: a study of laboratory work. Science Education, 88 , 325–344.

Wu, HK, & Shah, P. (2004). Exploring visuospatial thinking in chemistry learning. Science Education, 88 (3), 465–492. doi: 10.1002/sce.10126 .

Acknowledgements

The authors would like to acknowledge all reviewers for their valuable comments that have helped us improve the manuscript.

Author information

Authors and affiliations

University of Nicosia, 46, Makedonitissa Avenue, Egkomi, 1700, Nicosia, Cyprus

Maria Evagorou

University of Limerick, Limerick, Ireland

Sibel Erduran

University of Tampere, Tampere, Finland

Terhi Mäntylä

Corresponding author

Correspondence to Maria Evagorou .

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

ME carried out the introductory literature review, the analysis of the first case study, and drafted the manuscript. SE carried out the analysis of the third case study and contributed towards the “Conclusions” section of the manuscript. TM carried out the second case study. All authors read and approved the final manuscript.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( https://creativecommons.org/licenses/by/4.0 ), which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Evagorou, M., Erduran, S. & Mäntylä, T. The role of visual representations in scientific practices: from conceptual understanding and knowledge generation to ‘seeing’ how science works. IJ STEM Ed 2 , 11 (2015). https://doi.org/10.1186/s40594-015-0024-x

Received : 29 September 2014

Accepted : 16 May 2015

Published : 19 July 2015

DOI : https://doi.org/10.1186/s40594-015-0024-x

Keywords

  • Visual representations
  • Epistemic practices
  • Science learning

visual representation about

What is visual representation?

In the vast landscape of communication, where words alone may fall short, visual representation emerges as a powerful ally. In a world inundated with information, the ability to convey complex ideas, emotions, and data through visual means is becoming increasingly crucial. But what exactly is visual representation, and why does it hold such sway in our understanding?

Defining Visual Representation:

Visual representation is the act of conveying information, ideas, or concepts through visual elements such as images, charts, graphs, maps, and other graphical forms. It’s a means of translating the abstract into the tangible, providing a visual language that transcends the limitations of words alone.

The Power of Images:

The adage “a picture is worth a thousand words” encapsulates the essence of visual representation. Images have an unparalleled ability to evoke emotions, tell stories, and communicate complex ideas in an instant. Whether it’s a photograph capturing a poignant moment or an infographic distilling intricate data, images possess a unique capacity to resonate with and engage the viewer on a visceral level.

Facilitating Understanding:

One of the primary functions of visual representation is to enhance understanding. Humans are inherently visual creatures, and we often process and retain visual information more effectively than text. Complex concepts that might be challenging to grasp through written explanations can be simplified and clarified through visual aids. This is particularly valuable in fields such as science, where intricate processes and structures can be elucidated through diagrams and illustrations.

Visual representation also plays a crucial role in education. In classrooms around the world, teachers leverage visual aids to facilitate learning, making lessons more engaging and accessible. From simple charts that break down historical timelines to interactive simulations that bring scientific principles to life, visual representation is a cornerstone of effective pedagogy.

Data Visualization:

In an era dominated by big data, the importance of data visualization cannot be overstated. Raw numbers and statistics can be overwhelming and abstract, but when presented visually, they transform into meaningful insights. Graphs, charts, and maps are powerful tools for conveying trends, patterns, and correlations, enabling decision-makers to glean actionable intelligence from vast datasets.

Consider the impact of a well-crafted infographic that distills complex research findings into a visually digestible format. Data visualization not only simplifies information but also allows for more informed decision-making in fields ranging from business and healthcare to social sciences and environmental studies.

Cultural and Artistic Expression:

Visual representation extends beyond the realm of information and education; it is also a potent form of cultural and artistic expression. Paintings, sculptures, photographs, and other visual arts serve as mediums through which individuals can convey their emotions, perspectives, and cultural narratives. Artistic visual representation has the power to transcend language barriers, fostering a shared human experience that resonates universally.

Conclusion:

In a world inundated with information, visual representation stands as a beacon of clarity and understanding. Whether it’s simplifying complex concepts, conveying data-driven insights, or expressing the depth of human emotion, visual elements enrich our communication in ways that words alone cannot. As we navigate an increasingly visual society, recognizing and harnessing the power of visual representation is not just a skill but a necessity for effective communication and comprehension. So, let us embrace the visual language that surrounds us, unlocking a deeper, more nuanced understanding of the world.


Painting Pictures with Data: The Power of Visual Representations


Picture this. A chaotic world of abstract concepts and complex data, like a thousand-piece jigsaw puzzle. Each piece, a different variable, a unique detail.

Alone, they’re baffling, nearly indecipherable.

But together? They’re a masterpiece of visual information, a detailed illustration.

American data pioneer Edward Tufte, founder of the publisher Graphics Press, believed that the art of seeing is not limited to the physical objects around us. He stated, “The commonality between science and art is in trying to see profoundly – to develop strategies of seeing and showing.”

It’s in this context that we delve into the world of data visualization. This is a process where you create visual representations that foster understanding and enhance decision making.

It’s the transformation of data into visual formats. The information could be anything from theoretical frameworks and research findings to word problems. Or anything in-between. And it has the power to change the way you learn, work, and more.

And with the help of modern technology, you can take advantage of data visualization more easily than ever.

What are Visual Representations?

Think of visuals: a smorgasbord of graphical representations, images, pictures, and drawings. Now blend these with ideas, abstract concepts, and data.

You get visual representations . A powerful, potent blend of communication and learning.

As a more formal definition, visual representation is the use of images to represent different types of data and ideas.

They’re more than simply pictures. Visual representations organize information visually, creating a deeper understanding and fostering conceptual understanding. These can be concrete objects or abstract symbols or forms, each telling a unique story. And they can be used to improve understanding everywhere, from a job site to an online article. University professors can even use them to improve their teaching.

But this only scratches the surface of what can be created via visual representation.

Types of Visual Representation for Improving Conceptual Understanding

Graphs, spider diagrams, cluster diagrams – the list is endless!

Each type of visual representation has its specific uses. A mind map template can help you create a detailed illustration of your thought process. It illustrates your ideas or data in an engaging way and reveals how they connect.

Here are a handful of different types of data visualization tools that you can begin using right now.

1. Spider Diagrams


Spider diagrams , or mind maps, are the master web-weavers of visual representation.

They originate from a central concept and extend outwards like a spider’s web. Different ideas or concepts branch out from the center area, providing a holistic view of the topic.

This form of representation is brilliant for showcasing relationships between concepts, fostering a deeper understanding of the subject at hand.

2. Cluster Diagrams


As champions of grouping and classifying information, cluster diagrams are your go-to tools for usability testing or decision making. They help you group similar ideas together, making it easier to digest and understand information.

They’re great for exploring product features, brainstorming solutions, or sorting out ideas.

3. Pie Charts


Pie charts are the quintessential representatives of quantitative information.

They are a type of visual diagram that transforms complex data and word problems into simple symbols. Each slice of the pie is a story, a visual display of the part-to-whole relationship.

Whether you’re presenting survey results, market share data, or budget allocation, a pie chart offers a straightforward, easily digestible visual representation.

4. Bar Charts


If you’re dealing with comparative data or need a visual for data analysis, bar charts or graphs come to the rescue.

Bar graphs represent different variables or categories against a quantity, making them perfect for representing quantitative information. The vertical or horizontal bars bring the data to life, translating numbers into visual elements that provide context and insights at a glance.
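To make the last two chart types concrete, here is a minimal sketch in Python using the matplotlib library (our choice of tool, not one prescribed by this article); the categories and numbers are invented purely for illustration:

```python
# Minimal bar and pie chart sketch; all data are invented for illustration.
import matplotlib.pyplot as plt

categories = ["Email", "Chat", "Video calls", "Documents"]
hours_per_week = [9, 6, 4, 7]

fig, (left, right) = plt.subplots(1, 2, figsize=(9, 4))

# Bar chart: each category plotted against a quantity, for comparison.
left.bar(categories, hours_per_week)
left.set_ylabel("Hours per week")
left.set_title("Comparison (bar chart)")

# Pie chart: the same data read as a part-to-whole relationship.
right.pie(hours_per_week, labels=categories, autopct="%1.0f%%")
right.set_title("Part-to-whole (pie chart)")

plt.tight_layout()
plt.show()
```

Note how the same four numbers answer two different questions: the bars invite comparison between categories, while the pie invites reading each category as a share of the whole.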

Visual Representations Benefits

1. Deeper Understanding via Visual Perception

Visual representations aren’t just a feast for the eyes; they’re food for thought. They offer a quick way to dig down into more detail when examining an issue.

They mold abstract concepts into concrete objects, breathing life into the raw, quantitative information. As you glimpse into the world of data through these visualization techniques , your perception deepens.

You no longer just see the data; you comprehend it, you understand its story. Complex data sheds its mystifying cloak, revealing itself in a visual format that your mind grasps instantly. It’s like going from a two-dimensional to a three-dimensional picture of the world.

2. Enhanced Decision Making

Navigating through different variables and relationships can feel like walking through a labyrinth. But visualize these with a spider diagram or cluster diagram, and the path becomes clear. Visual representation is one of the most efficient decision making techniques .

Visual representations illuminate the links and connections, presenting a fuller picture. It’s like having a compass in your decision-making journey, guiding you toward the correct answer.

3. Professional Development

Whether you’re presenting research findings, sharing theoretical frameworks, or revealing historical examples, visual representations are your ace. They equip you with a new language, empowering you to convey your message compellingly.

From the conference room to the university lecture hall, they enhance your communication and teaching skills, propelling your professional development. Try to create a research mind map and compare it to a plain text document full of research documentation and see the difference.

4. Bridging the Gap in Data Analysis

What is data visualization if not the mediator between data analysis and understanding? It’s more than a process; it’s a bridge.

It takes you from the shores of raw, complex data to the lands of comprehension and insights. With visualization techniques, such as the use of simple symbols or detailed illustrations, you can navigate through this bridge effortlessly.

5. Enriching Learning Environments

Imagine a teaching setting where concepts are not just told but shown. Where students don’t just listen to word problems but see them represented in charts and graphs. This is what visual representations bring to learning environments.

They transform traditional methods into interactive learning experiences, enabling students to grasp complex ideas and understand relationships more clearly. The result? An enriched learning experience that fosters conceptual understanding.

6. Making Abstract Concepts Understandable

In a world brimming with abstract concepts, visual representations are our saving grace. They serve as translators, decoding these concepts into a language we can understand.

Let’s say you’re trying to grasp a theoretical framework. Reading about it might leave you puzzled. But see it laid out in a spider diagram or a concept map, and the fog lifts. With its different variables clearly represented, the concept becomes tangible.

Visual representations simplify the complex and convert the abstract into the concrete, making the inscrutable suddenly crystal clear. It’s the power of transforming word problems into visual displays, a method that doesn’t just provide the correct answer. It also offers a deeper understanding.

How to Make a Cluster Diagram?

Ready to get creative? Let’s make a cluster diagram.

First, choose your central idea or problem. This goes in the center area of your diagram. Next, think about related topics or subtopics. Draw lines from the central idea to these topics. Each line represents a relationship.
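If you would rather script the drawing than drag shapes around, the same steps translate directly into code. Below is a small sketch using Python’s networkx and matplotlib libraries (an assumed toolchain, not one this article prescribes); the central idea and topics are invented examples:

```python
# Sketch of a cluster diagram: a central idea with related topics
# branching out, where each edge stands for a relationship.
import matplotlib.pyplot as plt
import networkx as nx

G = nx.Graph()
center = "Product Launch"  # hypothetical central idea
clusters = {
    "Marketing": ["Ads", "Social media"],
    "Logistics": ["Suppliers", "Shipping"],
    "Design": ["Packaging", "Branding"],
}
for topic, subtopics in clusters.items():
    G.add_edge(center, topic)      # line from the centre to each topic
    for sub in subtopics:
        G.add_edge(topic, sub)     # lines from topics to subtopics

pos = nx.spring_layout(G, seed=42) # pushes the groups apart automatically
nx.draw(G, pos, with_labels=True, node_size=1600, font_size=8)
plt.show()
```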


While you can create a picture like this by drawing, there’s a better way.

Mindomo is a mind mapping tool that will enable you to create visuals that represent data quickly and easily. It provides a wide range of templates to kick-start your diagramming process. And since it’s web-based, you can access it from anywhere.

With a mind map template, creating a cluster diagram becomes an effortless process. This is especially the case since you can edit its style, colors, and more to your heart’s content. And when you’re done, sharing is as simple as clicking a button.

A Few Final Words About Information Visualization

To wrap it up, visual representations are not just about presenting data or information. They are about creating a shared understanding, facilitating learning, and promoting effective communication. Whether it’s about defining a complex process or representing an abstract concept, visual representations have it all covered. And with tools like Mindomo , creating these visuals is as easy as pie.

In the end, visual representation isn’t just about viewing data, it’s about seeing, understanding, and interacting with it. It’s about immersing yourself in the world of abstract concepts, transforming them into tangible visual elements. It’s about seeing relationships between ideas in full color. It’s a whole new language that opens doors to a world of possibilities.

The correct answer to ‘what is data visualization?’ is simple. It’s the future of learning, teaching, and decision-making.

Keep it smart, simple, and creative! The Mindomo Team


5. Visual Representation

How can you design computer displays that are as meaningful as possible to human viewers? Answering this question requires understanding of visual representation - the principles by which markings on a surface are made and interpreted. The analysis in this article addresses the most important principles of visual representation for screen design, introduced with examples from the early history of graphical user interfaces. In most cases, these principles have been developed and elaborated within whole fields of study and professional skill - typography, cartography, engineering and architectural draughting, art criticism and semiotics. Improving on the current conventions requires serious skill and understanding. Nevertheless, interaction designers should be able, when necessary, to invent new visual representations.

Videos: 'Introduction to Visual Representation' and 'Alan Blackwell on applying theories of Visual Representation', by Alan Blackwell.

  • 5.1 Typography and text

For many years, computer displays resembled paper documents. This does not mean that they were simplistic or unreasonably constrained. On the contrary, most aspects of modern industrial society have been successfully achieved using the representational conventions of paper, so those conventions seem to be powerful ones. Information on paper can be structured using tabulated columns, alignment, indentation and emphasis, borders and shading. All of those were incorporated into computer text displays. Interaction conventions, however, were restricted to operations of the typewriter rather than the pencil. Each character typed would appear at a specific location. Locations could be constrained, like filling boxes on a paper form. And shortcut command keys could be defined using onscreen labels or paper overlays. It is not text itself, but keyboard interaction with text that is limited and frustrating compared to what we can do with paper (Sellen and Harper 2001).

But despite the constraints on keyboard interaction, most information on computer screens is still represented as text. Conventions of typography and graphic design help us to interpret that text as if it were on a page, and human readers benefit from many centuries of refinement in text document design. Text itself, including many writing systems as well as specialised notations such as algebra, is a visual representation that has its own research and educational literature. Documents that contain a mix of bordered or coloured regions containing pictures, text and diagrammatic elements can be interpreted according to the conventions of magazine design, poster advertising, form design, textbooks and encyclopaedias. Designers of screen representations should take care to properly apply the specialist knowledge of those graphic and typographic professions. Position on the page, use of typographic grids, and genre-specific illustrative conventions should all be taken into account.

Figure 5.1: Contemporary example from the grid system website. (Author/copyright holder unknown; copyright terms pending investigation.)

Figure 5.2: Example of a symbolic algebra expression (the single-particle solution to Schrödinger's equation).

Figure 5.3: Table layout of funerals from the plague in London in 1665.

Figure 5.4: Tabular layout of the first page of the Gutenberg Bible: Volume 1, Old Testament, Epistle of St. Jerome. The Gutenberg Bible was printed by Johannes Gutenberg in Mainz, Germany, in the 1450s.

  • 5.1.1 Summary

Most screen-based information is interpreted according to textual and typographic conventions, in which graphical elements are arranged within a visual grid, occasionally divided or contained with ruled and coloured borders. Where to learn more:

thegridsystem.org

Resnick, Elizabeth (2003): Design for Communication: Conceptual Graphic Design Basics. Wiley

  • 5.2 Maps and graphs

The computer has, however, also acquired a specialised visual vocabulary and conventions. Before the text-based computer terminal (or 'glass teletype') became ubiquitous, cathode ray tube displays were already used to display oscilloscope waves and radar echoes. Both could be easily interpreted because of their correspondence to existing paper conventions. An oscilloscope uses a horizontal time axis to trace variation of a quantity over time, as pioneered by William Playfair in his 1786 charts of the British economy. A radar screen shows direction and distance of objects from a central reference point, just as the Hereford Mappa Mundi of 1300 organised places according to their approximate direction and distance from Jerusalem. Many visual displays on computers continue to use these ancient but powerful inventions - the map and the graph. In particular, the first truly large software project, the SAGE air defense system, set out to present data in the form of an augmented radar screen - an abstract map, on which symbols and text could be overlaid. The first graphics computer, the Lincoln Laboratory Whirlwind, was created to show maps, not text.
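As a present-day illustration of the radar convention just described - direction and distance plotted from a central reference point - here is a small sketch in Python with matplotlib (an assumed tool; the three contacts are invented):

```python
# Radar-style display: angle encodes direction, radius encodes distance.
import math
import matplotlib.pyplot as plt

bearings_deg = [30, 110, 250]   # invented directions of three objects
distances_km = [12, 7, 18]      # invented distances from the centre

ax = plt.subplot(projection="polar")
ax.scatter([math.radians(b) for b in bearings_deg], distances_km)
ax.set_theta_zero_location("N")  # 0 degrees at the top, like a compass
ax.set_theta_direction(-1)       # bearings increase clockwise
ax.set_title("Direction and distance from a central reference point")
plt.show()
```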

Figure 5.5: The technique invented by William Playfair for visual representation of time series data.

Figure 5.6: Time series data as shown on an oscilloscope screen. (Courtesy of Premek V.; public domain.)

Figure 5.7: Early radar screen from HMS Belfast, built in 1936. (Courtesy of Remi Kaupp; Creative Commons Attribution-ShareAlike 3.0 Unported.)

Figure 5.8: Early weather radar - Hurricane Abby approaching the coast of British Honduras in 1960. (Courtesy of NOAA's National Weather Service; public domain.)

Figure 5.9: The Hereford Mappa Mundi of 1300, which organised places according to their approximate direction and distance from Jerusalem.

Figure 5.10: The SAGE system in use; it used light guns as interaction devices. (Courtesy of Wikipedia; copyright terms pending investigation.)

Figure 5.11: The Whirlwind computer at the MIT Lincoln Laboratory. (Copyright The MITRE Corporation; all rights reserved, reproduced with permission.)

  • 5.2.1 Summary

Basic diagrammatic conventions rely on quantitative correspondence between a direction on the surface and a continuous quantity such as time or distance. These should follow established conventions of maps and graphs.

Where to learn more:

MacEachren, Alan M. (2004): How Maps Work: Representation, Visualization, and Design. The Guilford Press

  • 5.3 Schematic drawings

Ivan Sutherland's groundbreaking PhD research with Whirlwind's successor TX-2 introduced several more sophisticated alternatives (Sutherland 1963). The use of a light pen allowed users to draw arbitrary lines, rather than relying on control keys to select predefined options. An obvious application, in the engineering context of Massachusetts Institute of Technology (MIT) where Sutherland worked, was to make engineering drawings such as the girder bridge in Figure 5.13. Lines on the screen are scaled versions of the actual girders, and text information can be overlaid to give details of force calculations. Plans of this kind, as a visual representation, are closely related to maps. However, where the plane of a map corresponds to a continuous surface, engineering drawings need not be continuous. Each set of connected components must share the same scale, but white space indicates an interpretive break, so that independent representations can potentially share the same divided surface - a convention introduced in Diderot's encyclopedia of 1772, which showed pictures of multiple objects on a page, but cut them loose from any shared pictorial context.

Figure 5.12: The TX-2 graphics computer, running Ivan Sutherland's Sketchpad software. (Courtesy of Ivan Sutherland; Creative Commons Attribution-ShareAlike 3.0.)

Figure 5.13: An example of a force diagram created using Sutherland's Sketchpad.

Figure 5.14: A page from the Encyclopédie of Diderot and d'Alembert, combining pictorial elements with diagrammatic lines and categorical use of white space.

  • 5.3.1 Summary

Engineering drawing conventions allow schematic views of connected components to be shown in relative scale, and with text annotations labelling the parts. White space in the representation plane can be used to help the reader distinguish elements from each other rather than directly representing physical space. Where to learn more:

Engineering draughting textbooks

Ferguson, Eugene S. (1994): Engineering and the Mind's Eye. MIT Press

  • 5.4 Pictures

The examples so far may seem rather abstract. Isn't the most 'natural' visual representation simply a picture of the thing you are trying to represent? In that case, what is so hard about design? Just point a camera, and take the picture. It seems like pictures are natural and intuitive, and anyone should be able to understand what they mean. Of course, you might want the picture to be more or less artistic, but that isn't a technical concern, is it? Well, Ivan Sutherland also suggested the potential value that computer screens might offer as artistic tools. His Sketchpad system was used to create a simple animated cartoon of a winking girl. We can use this example to ask whether pictures are necessarily 'natural', and what design factors are relevant to the selection or creation of pictures in an interaction design context.

We would not describe Sutherland's girl as 'realistic', but it is an effective representation of a girl. In fact, it is an unusually good representation of a winking girl, because all the other elements of the picture are completely abstract and generic. It uses a conventional graphic vocabulary of lines and shapes that are understood in our culture to represent eyes, mouths and so on - these elements do not draw attention to themselves, and therefore highlight the winking eye. If a realistic picture of an actual person was used instead, other aspects of the image (the particular person) might distract the viewer from this message.

Figure 5.15: Sutherland's 'Winking Girl' drawing, created with the Sketchpad system.

It is important, when considering the design options for pictures, to avoid the 'resemblance fallacy', i.e. that drawings are able to depict real objects or scenes because the viewer's perception of the flat image simulates the visual perception of a real scene. In practice, all pictures rely on conventions of visual representation, and are relatively poor simulations of natural engagement with physical objects, scenes and people. We are in the habit of speaking approvingly of some pictures as more 'realistic' than others (photographs, photorealistic ray-traced renderings, 'old master' oil paintings), but this simply means that they follow more rigorously a particular set of conventions. The informed designer is aware of a wide range of pictorial conventions and options.

As an example of different pictorial conventions, consider the ways that scenes can be rendered using different forms of artistic perspective. The invention of linear perspective introduced a particular convention in which the viewer is encouraged to think of the scene as perceived through a lens or frame while holding his head still, so that nearby objects occupy a disproportionate amount of the visual field. Previously, pictorial representations more often varied the relative size of objects according to their importance - a kind of 'semantic' perspective. Modern viewers tend to think of the perspective of a camera lens as being most natural, due to the ubiquity of photography, but we still understand and respect alternative perspectives, such as the isometric perspective of the pixel art group eBoy, which has been highly influential on video game style.

Figure 5.16: Example of an early work by Masaccio, demonstrating a 'perspective' in which relative size shows symbolic importance. (Courtesy of Masaccio (1401-1428); public domain.)

Figure 5.17: Example of the strict isometric perspective used by the eBoy group. (Copyright eBoy.com; all rights reserved, reproduced with permission.)

Figure 5.18: Masaccio's mature work The Tribute Money, demonstrating linear perspective. (Courtesy of Masaccio (1401-1428); public domain.)

As with most conventions of pictorial representation, new perspective rendering conventions are invented and esteemed for their accuracy by critical consensus, and only more slowly adopted by untrained readers. The consensus on preferred perspective shifts across cultures and historical periods. It would be naïve to assume that the conventions of today are the final and perfect product of technical evolution. As with text, we become so accustomed to interpreting these representations that we are blind to the artifice. But professional artists are fully aware of the conventions they use, even where they might have mechanical elements - the way that a photograph is framed changes its meaning, and a skilled pencil drawing is completely unlike visual edge-detection thresholds. A good pictorial representation need not simulate visual experience any more than a good painting of a unicorn need resemble an actual unicorn. When designing user interfaces, all of these techniques are available for use, and new styles of pictorial rendering are constantly being introduced.

  • 5.4.1 Summary

Pictorial representations, including line drawings, paintings, perspective renderings and photographs rely on shared interpretive conventions for their meaning. It is naïve to treat screen representations as though they were simulations of experience in the physical world. Where to learn more:

Micklewright, Keith (2005): Drawing: Mastering the Language of Visual Expression. Harry N. Abrams

Stroebel, Leslie, Todd, Hollis and Zakia, Richard (1979): Visual Concepts for Photographers. Focal Press

  • 5.5 Node-and-link diagrams

The first impulse of a computer scientist, when given a pencil, seems to be to draw boxes and connect them with lines. These node and link diagrams can be analysed in terms of the graph structures that are fundamental to the study of algorithms (but unrelated to the visual representations known as graphs or charts). A predecessor of these connectivity diagrams can be found in electrical circuit schematics, where the exact location of components, and the lengths of the wires, can be arranged anywhere, because they are irrelevant to the circuit function. Another early program created for the TX-2, this time by Ivan Sutherland's brother Bert, allowed users to create circuit diagrams of this kind. The distinctive feature of a node-and-link connectivity diagram is that, since the position of each node is irrelevant to the operation of the circuit, it can be used to carry other information. Marian Petre's research into the work of electronics engineers (Petre 1995) catalogued the ways in which they positioned components in ways that were meaningful to human readers, but not to the computer - like the blank space between Diderot's objects, this is a form of 'secondary notation' - use of the plane to assist the reader in ways not related to the technical content.
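To see why free node placement matters, consider this small sketch (in Python with networkx and matplotlib, our assumed tooling; the node names are invented). The program only cares about the edges, so the hand-chosen coordinates are available to carry the kind of secondary notation Petre describes - here, a left-to-right reading of signal flow:

```python
# Node-and-link diagram: connectivity is the content; position is free.
import matplotlib.pyplot as plt
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([
    ("sensor", "filter"),
    ("filter", "controller"),
    ("controller", "actuator"),
])

# Hand-chosen positions: the left-to-right ordering mirrors signal flow,
# meaningful to a human reader but irrelevant to the graph itself.
pos = {"sensor": (0, 0), "filter": (1, 0),
       "controller": (2, 0), "actuator": (3, 0)}

nx.draw(G, pos, with_labels=True, node_size=2200, font_size=8)
plt.show()
```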

Circuit connectivity diagrams have been most widely popularised through the London Underground diagram, an invention of electrical engineer Henry Beck. The diagram clarified earlier maps by exploiting the fact that most underground travellers are only interested in order and connectivity, not location, of the stations on the line. (Sadly, the widespread belief that a 'diagram' will be technical and hard to understand means that most people describe this as the London Underground 'map', despite Beck's insistence on his original term).

Figure 5.19: Henry Beck's London Underground Diagram (1933). (Courtesy of Harry C. Beck and possibly F. H. Stingemore (1890-1954), who designed posters for the Underground Group and London Transport 1914-1942; copyright terms pending investigation.)

Figure 5.20: Node and link diagram of the kind often drawn by computing professionals. (Copyright Computer History Museum, Mountain View, CA, USA; all rights reserved, reproduced with permission.)

Figure 5.21: Map of the London Underground network, as it was printed before the design of Beck's diagram (1932).

  • 5.5.1 Summary

Node and link diagrams are still widely perceived as being too technical for broad acceptance. Nevertheless, they can present information about ordering and relationships clearly, especially if consideration is given to the value of allowing human users to specify positions. Where to learn more:

Diagrammatic representation books

Lowe, Ric (1992): Successful Instructional Diagrams.

  • 5.6 Icons and symbols

Maps frequently use symbols to indicate specific kinds of landmark. Sometimes these are recognisably pictorial (the standard symbols for tree and church), but others are fairly arbitrary conventions (the symbol for a railway station). As the resolution of computer displays increased in the 1970s, a greater variety of symbols could be differentiated, by making them more detailed, as in the MIT SDMS (Spatial Data Management System) that mapped a naval battle scenario with symbols for different kinds of ship. However, the dividing line between pictures and symbols is ambiguous. Children's drawings of houses often use conventional symbols (door, four windows, triangle roof and chimney) whether or not their own house has two storeys, or a fireplace. Letters of the Latin alphabet are shapes with completely arbitrary relationship to their phonetic meaning, but the Korean phonetic alphabet is easier to learn because the forms mimic the shape of the mouth when pronouncing those sounds. The field of semiotics offers sophisticated ways of analysing the basis on which marks correspond to meanings. In most cases, the best approach for an interaction designer is simply to adopt familiar conventions. When these do not exist, the design task is more challenging.

It is unclear which of the designers working on the Xerox Star coined the term 'icon' for the small pictures symbolising different kinds of system object. David Canfield Smith winningly described them as being like religious icons, which he said were pictures standing for (abstract) spiritual concepts. But 'icon' is also used as a technical term in semiotics. Unfortunately, few of the Xerox team had a sophisticated understanding of semiotics. It was fine art PhD Susan Kare's design work on the Apple Macintosh that established a visual vocabulary which has informed the genre ever since. Some general advice principles are offered by authors such as Horton (1994), but the successful design of icons is still sporadic. Many software publishers simply opt for a memorable brand logo, while others seriously misjudge the kinds of correspondence that are appropriate (my favourite blooper was a software engineering tool in which a pile of coins was used to access the 'change' command).

It has been suggested that icons, being pictorial, are easier to understand than text, and that pre-literate children, or speakers of different languages, might thereby be able to use computers without being able to read. In practice, most icons simply add decoration to text labels, and those that are intended to be self-explanatory must be supported with textual tooltips. The early Macintosh icons, despite their elegance, were surprisingly open to misinterpretation. One PhD graduate of my acquaintance believed that the Macintosh folder symbol was a briefcase (the folder tag looked like a handle), which allowed her to carry her files from place to place when placed inside it. Although mistaken, this belief never caused her any trouble - any correspondence can work, so long as it is applied consistently.

Figure 5.22: In art, the term icon (from Greek eikon, "image") commonly refers to religious paintings in Eastern Orthodox, Oriental Orthodox, and Eastern-rite Catholic jurisdictions. Here, a 6th-century encaustic icon from Saint Catherine's Monastery, Mount Sinai. (Public domain.)

Figure 5.23: In computing, David Canfield Smith described computer icons as being like religious icons, which he said were pictures standing for (abstract) spiritual concepts. (Copyright Apple Computer, Inc.; all rights reserved, reproduced with permission.)

  • 5.6.1 Summary

The design of simple and memorable visual symbols is a sophisticated graphic design skill. Following established conventions is the easiest option, but new symbols must be designed with an awareness of what sort of correspondence is intended - pictorial, symbolic, metonymic (e.g. a key to represent locking), bizarrely mnemonic, but probably not monolingual puns. Where to learn more:

Napoles, Veronica (1987): Corporate Identity Design.

  • 5.7 Visual metaphor

The ambitious graphic designs of the Xerox Star/Alto and Apple Lisa/Macintosh were the first mass-market visual interfaces. They were marketed to office professionals, making the 'cover story' that they resembled an office desktop a convenient explanatory device. Of course, as was frequently noted at the time, these interfaces behaved nothing like a real desktop. The mnemonic symbol for file deletion (a wastebasket) was ridiculous if interpreted as an object placed on a desk. And nobody could explain why the desk had windows in it (the name was derived from the 'clipping window' of the graphics architecture used to implement them - it was at some later point that they began to be explained as resembling sheets of paper on a desk). There were immediate complaints from luminaries such as Alan Kay and Ted Nelson that strict analogical correspondence to physical objects would become obstructive rather than instructive. Nevertheless, for many years the marketing story behind the desktop metaphor was taken seriously, despite the fact that all attempts to improve the Macintosh design with more elaborate visual analogies, as in General Magic and Microsoft Bob, subsequently failed.

The 'desktop' can be far more profitably analysed (and extended) by understanding the representational conventions that it uses. The size and position of icons and windows on the desktop have no meaning, they are not connected, and there is no visual perspective, so it is neither a map, graph nor picture. The real value is the extent to which it allows secondary notation, with the user creating her own meaning by arranging items as she wishes. Window borders separate areas of the screen into different pictorial, text or symbolic contexts, as in the typographic page design of a textbook or magazine. Icons use a large variety of conventions to indicate symbolic correspondence to software operations and/or company brands, but they are only occasionally or incidentally organised into more complex semiotic structures.

Figure 5.24: Apple marketed the visual metaphor in 1983 as a key benefit of the Lisa computer. This advertisement said 'You can work with Lisa the same familiar way you work at your desk'. However, a controlled study by Carroll and Mazur (1986) found that the claim for immediately familiar operation may have been exaggerated. (Copyright Apple Computer, Inc. and Computer History Museum, Mountain View, CA; all rights reserved, reproduced with permission.)

Figure 5.25: The Xerox Alto and Apple Lisa, early products in which bitmapped displays allowed pictorial icons to be used as mnemonic cues within the 'desktop metaphor'.

Figure 5.26: Apple Lisa. (Courtesy of Mschlindwein; Creative Commons Attribution-ShareAlike 3.0 Unported.)

  • 5.7.1 Summary

Theories of visual representation, rather than theories of visual metaphor, are the best approach to explaining the conventional Macintosh/Windows 'desktop'. There is huge room for improvement. Where to learn more:

Blackwell, Alan (2006): The reification of metaphor as a design tool. In ACM Transactions on Computer-Human Interaction, 13 (4) pp. 490-530

  • 5.8 Unified theories of visual representation

The analysis in this article has addressed the most important principles of visual representation for screen design, introduced with examples from the early history of graphical user interfaces. In most cases, these principles have been developed and elaborated within whole fields of study and professional skill - typography, cartography, engineering and architectural draughting, art criticism and semiotics. Improving on the current conventions requires serious skill and understanding. Nevertheless, interaction designers should be able, when necessary, to invent new visual representations.

One approach is to take a holistic perspective on visual language, information design, notations, or diagrams. Specialist research communities in these fields address many relevant factors from low-level visual perception to critique of visual culture. Across all of them, it can be necessary to ignore (or not be distracted by) technical and marketing claims, and to remember that all visual representations simply comprise marks on a surface that are intended to correspond to things understood by the reader. The two dimensions of the surface can be made to correspond to physical space (in a map), to dimensions of an object, to a pictorial perspective, or to continuous abstract scales (time or quantity). The surface can also be partitioned into regions that should be interpreted differently. Within any region, elements can be aligned, grouped, connected or contained in order to express their relationships. In each case, the correspondence between that arrangement, and the intended interpretation, must be understood by convention, explained, or derived from the structural and perceptual properties of marks on the plane. Finally, any individual element might be assigned meaning according to many different semiotic principles of correspondence.

The following table summarises holistic views, as introduced above, drawing principally on the work of Bertin, Richards, MacEachren, Blackwell & Engelhardt and Engelhardt. Where to learn more:

Engelhardt, Yuri (2002): The Language of Graphics. A framework for the analysis of syntax and meaning in maps, charts and diagrams. PhD Thesis, University of Amsterdam

Table 5.1 : Summary of the ways in which graphical representations can be applied in design, via different systems of correspondence

Table 5.2 : Screenshot from the site gapminder.org, illustrating a variety of correspondence conventions used in different parts of the page

As an example of how one might analyse (or, working backwards, design) a complex visual representation, consider the case of musical scores. These consist of marks on a paper surface, bound into a multi-page book, that is placed on a stand at arm's length in front of a performer. Each page is vertically divided into a number of regions, visually separated by white space and grid alignment cues. The regions are ordered, with that at the top of the page coming first. Each region contains two quantitative axes, with the horizontal axis representing time duration, and the vertical axis pitch. The vertical axis is segmented by lines to categorise pitch class. Symbols placed at a given x-y location indicate a specific pitched sound to be initiated at a specific time. A conventional symbol set indicates the duration of the sound. None of the elements use any variation in colour, saturation or texture. A wide variety of text labels and annotation symbols are used to elaborate these basic elements. Music can be, and is, also expressed using many other visual representations (see e.g. Duignan for a survey of representations used in digital music processing).
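Working backwards from that analysis, the core correspondences of a score can be reproduced in a few lines of code. The sketch below (Python with matplotlib, assumed here; the 'melody' is invented) maps onset time to the horizontal axis, pitch to the vertical axis, and duration to the length of each mark - a piano-roll reduction of the conventions just described:

```python
# Piano-roll reduction of a score: x = onset time, y = pitch,
# mark length = duration. The notes are invented for illustration.
import matplotlib.pyplot as plt

# (onset in beats, MIDI pitch number, duration in beats)
notes = [(0, 60, 1), (1, 62, 1), (2, 64, 2), (4, 67, 4)]

fig, ax = plt.subplots()
for onset, pitch, duration in notes:
    ax.hlines(pitch, onset, onset + duration, linewidth=6)
ax.set_xlabel("Time (beats)")
ax.set_ylabel("Pitch (MIDI note number)")
ax.set_title("Score conventions reduced to two quantitative axes")
plt.show()
```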

  • 5.9 Where to learn more

The historical examples of early computer representations used in this article are mainly drawn from Sutherland (Ed. Blackwell and Rodden 2003), Garland (1994), and Blackwell (2006). Historical reviews of visual representation in other fields include Ferguson (1992), Pérez-Gómez and Pelletier (1997), McCloud (1993), and Tufte (1983). Reviews of human perceptual principles can be found in Gregory (1970), Ittelson (1996), Ware (2004), and Blackwell (2002). Advice on principles of interaction with visual representation is distributed throughout the HCI literature, but classics include Norman (1988), Horton (1994), Shneiderman (Shneiderman and Plaisant 2009, Card et al. 1999, Bederson and Shneiderman 2003) and Spence (2001). Green's Cognitive Dimensions of Notations framework has for many years provided a systematic classification of the design parameters in interactive visual representations. A brief introduction is provided in Blackwell and Green (2003).

Research on visual representation topics is regularly presented at the Diagrams conference series (which has a particular emphasis on cognitive science), the InfoDesign and Vision Plus conferences (which emphasise graphic and typographic information design), the Visual Languages and Human-Centric Computing symposia (emphasising software tools and development), and the InfoVis and Information Visualisation conferences (emphasising quantitative and scientific data visualisation).


  • 5.10 References

Anderson, Michael, Meyer, Bernd and Olivier, Patrick (2002): Diagrammatic Representation and Reasoning. London, UK: Springer

Bederson, Benjamin B. and Shneiderman, Ben (2003): The Craft of Information Visualization: Readings and Reflections. Morgan Kaufmann Publishers

Bertin, Jacques (1967): Semiology of Graphics: Diagrams, Networks, Maps (Sémiologie graphique: Les diagrammes - Les réseaux - Les cartes). English translation by W. J. Berg. Madison, WI, USA: University of Wisconsin Press

Blackwell, Alan (2002): Psychological perspectives on diagrams and their users. In: Anderson, Michael, Meyer, Bernd and Olivier, Patrick (eds.). "Diagrammatic Representation and Reasoning". London, UK. pp. 109-123

Blackwell, Alan and Engelhardt, Yuri (2002): A Meta-Taxonomy for Diagram Research. In: Anderson, Michael, Meyer, Bernd and Olivier, Patrick (eds.). "Diagrammatic Representation and Reasoning". London, UK. pp. 47-64

Blackwell, Alan and Green, T. R. G. (2003): Notational Systems - The Cognitive Dimensions of Notations Framework. In: Carroll, John M. (ed.). "HCI Models, Theories, and Frameworks". San Francisco: Morgan Kaufmann Publishers. pp. 103-133

Carroll, John M. and Mazur, Sandra A. (1986): LisaLearning. In Computer, 19 (11) pp. 35-49

Garland, Ken (1994): Mr Beck's Underground Map. Capital Transport Publishing

Goodman, Nelson (1976): Languages of Art. Hackett Publishing Company

Gregory, Richard L. (1970): The Intelligent Eye. London: Weidenfeld and Nicolson

Horton, William (1994): The Icon Book: Visual Symbols for Computer Systems and Documentation. John Wiley and Sons

Ittelson, W. H. (1996): Visual perception of markings. In Psychonomic Bulletin & Review, 3 (2) pp. 171-187

McCloud, Scott (1994): Understanding Comics: The Invisible Art. Harper Paperbacks

Norman, Donald A. (1988): The Design of Everyday Things. New York: Doubleday

Petre, Marian (1995): Why Looking Isn't Always Seeing: Readership Skills and Graphical Programming. In Communications of the ACM, 38 (6) pp. 33-44

Pérez-Gómez, Alberto and Pelletier, Louise (1997): Architectural Representation and the Perspective Hinge. MIT Press

Richards, Clive (1984): Diagrammatics: an investigation aimed at providing a theoretical framework for studying diagrams and for establishing a taxonomy of their fundamental modes of graphic organization. Unpublished PhD Thesis. Royal College of Art, London, UK

Sellen, Abigail and Harper, Richard H. R. (2001): The Myth of the Paperless Office. MIT Press

Shneiderman, Ben and Plaisant, Catherine (2009): Designing the User Interface: Strategies for Effective Human-Computer Interaction (5th ed.). Addison-Wesley

Spence, Robert (2001): Information Visualization. Addison Wesley

Sutherland, Ivan E. (1963): Sketchpad, A Man-Machine Graphical Communication System. PhD Thesis, Massachusetts Institute of Technology. Online version and editors' introduction by Alan Blackwell & K. Rodden, Technical Report 574, Cambridge University Computer Laboratory

Tufte, Edward R. (1983): The Visual Display of Quantitative Information. Cheshire, CT: Graphics Press

Ware, Colin (2004): Information Visualization: Perception for Design, 2nd Ed. San Francisco: Morgan Kaufmann


5.11 Commentary by Ben Shneiderman

Since computer displays are such powerful visual appliances, careful designers devote extensive effort to getting the visual representation right. They have to balance the demands of many tasks, diverse users, and challenging requirements, such as short learning time, rapid performance, low error rates, and good retention over time. Designing esthetic interfaces that please and even delight users is a further expectation that designers must meet to be successful. For playful and discretionary tasks esthetic concerns may dominate, but for life-critical tasks, rapid performance with low error rates is essential.

Alan Blackwell's competent description of many visual representation issues is a great start for newcomers, with helpful reminders even for experienced designers. The videos make for a pleasant personal accompaniment that bridges visual representation for interface design with thoughtful analyses of representational art.

Blackwell's approach might be enriched by more discussion of visual representations in functional product design tied to meaningful tasks. Learning from paintings of Paris is fine, but aren't there other lessons to learn from visual representations in airport kiosks, automobile dashboards, or intensive care units? These devices, as well as most graphical user interfaces and mobile devices, raise additional questions of changing state visualization and interaction dynamics. Modern designers need to do more than show the right phone icon; they need to show ringing, busy, inactive, no network, conference mode, etc., which may include color changes (highlighted, grayed out), animations, and accompanying sounds. These designers also need to deal with interactive visual representations that happen with a click, double-click, right-click, drag, drag-and-drop, hover, multi-select, region-select, brushing-linking, and more.

The world of mobile devices such as phones, cameras, music players, or medical sensors is the new frontier for design, where visual representations are dynamic and tightly integrated with sound, haptics, and novel actions such as shaking, twisting, or body movements. Even more challenging is the expectation that goes beyond the solitary viewer to the collaboration in which multiple users embedded in a changing physical environment produce new visual representations.

These changing and interactive demands on designers invite creative expressions that are very different from designs for static signs, printed diagrams, or interpretive art. The adventure for visual representation designers is to create a new language of interaction that engages users, accelerates learning, provides comprehensible feedback, and offers appropriate warnings when dangers emerge. Blackwell touches on some of these issues in the closing Gapminder example, but I was thirsty for more.

5.12 Commentary by Clive Richards

If I may be permitted a graphically inspired metaphor, Alan Blackwell provides us with a neat pen sketch of that extensive scene called 'visual representation' (Blackwell 2011).

"Visualisation has a lot more to offer than most people are aware of today" we are told by Robert Kosara at the end of his commentary (Kosara 2010) on Stephen Few's related article on ' Data visualisation for human perception ' (Few 2010). Korsara is right, and Blackwell maps out the broad territory in which many of these visualisation offerings may be located. In this commentary I offer a few observations on some prominent features in that landscape: dynamics, picturing, semiotics and metaphor.

Ben Shneiderman's critique of Blackwell's piece points to a lack of attention to "... additional questions of changing state visualisations and interaction dynamics" (Shneiderman 2010). Indeed, the possibilities offered by these additional questions present some exciting challenges for interaction designers - opportunities to create novel and effective combinations of visual with other sensory and motor experiences in dynamic operational contexts. Shneiderman suggests that: "These changing and interactive demands on designers invite creative expressions that are very different from design for static signs, printed diagrams, or interpretive art". This may be so up to a point, but here Shneiderman and I part company a little.

The focus of Blackwell's essay is properly on the visual representation side of facilities available to interaction designers, and in that context he is quite right to give prominence to highly successful but static visual representation precedents, and also to point out the various specialist fields of endeavour in which they have been developed. Some of these representational approaches have histories reaching back thousands of years and are deeply embedded within our culture. It would be foolhardy to disregard conventions established in, say, the print domain, and to try to re-invent everything afresh for the screen, even if this were a practical proposition. Others have made arguments to support looking to historical precedents. For example, Michael Twyman has pointed out that when considering typographic cueing and "... the problems of the electronic age ... we have much to learn from the manuscript age" (Twyman 1987, p5). He proposes that studying the early scribes' use of colour, spacing and other graphical devices can usefully inform the design of today's screen-based texts. And as Blackwell points out in his opening section on 'Typography and text', "most information on computer screens is still represented as text".

It is also sometimes assumed that the pictorial representation of a dynamic process is best presented dynamically. However, it can be argued that the comic book convention of using a sequence of static frames is sometimes superior for focusing the viewer's attention on the critical events in a process, rather than an animated sequence in which key moments may be missed. This is of course not to deny the immense value of the moving and interactive visual image in the right context. The Gapminder charts are a case in point (http://www.gapminder.org). Blackwell usefully includes one of these, but as a static presentation. These diagrams come to life and really tell their story through the clustering of balloons that inflate or deflate as they move about the screen when driven through simulated periods of time.

While designing a tool for engineers to learn about the operation and maintenance of an oil system for an aircraft jet engine, Detlev Fischer devised a series of interactive animations, called 'Cinegrams', to display various operating procedures in diagrammatic form (Fischer and Richards 1995). He used the cinematic techniques of time compression and expansion in one animated sequence to show how the slow accumulation of debris in an oil filter, over an extended period of time, would eventually create a blockage to the oil flow and trigger the opening of a by-pass device in split seconds. Notwithstanding my earlier comment about the potential superiority of the comic strip genre for displaying some time-dependent processes, this particular Cinegram proved very instructive for the targeted users. There are many other examples one could cite where dynamic picturing of this sort has been deployed to similarly good effect in interactive environments.

Shneiderman also comments that: "Blackwell's approach might be enriched by more discussion of visual representations in functional product design tied to meaningful tasks". An area I have worked in is the pictorial representation of engineering assemblies to show that which is normally hidden from view. Techniques to do this on the printed page include 'ghosting' (making occluding parts appear as if transparent), 'exploding' (showing components separately, set out in dis-assembly order along an axis) and cutting away (taking a slice out of an outer shell to reveal mechanisms beneath). All these three-dimensional picturing techniques were used by, if not actually invented by, Leonardo da Vinci (Richards 2006). All could be enhanced by interactive viewer control - an area of further fruitful exploration for picturing purposes in technical documentation contexts.

Blackwell's section on 'Pictures' warns us, when considering picturing options, to avoid the "resemblance fallacy", pointing out the role that convention plays even in so-called photo-realistic images. He also points out that viewers can be distracted from the message by incidental information in 'realistic' pictures. From my own work in the field I know that technical illustrators' synoptic black and white outline depictions are regarded as best for drawing the viewer's attention to the key features of a pictorial representation. Research in this area has shown that, when using linear perspective drawings, the appropriate deployment of lines of varying 'weight', rather than of a single thickness, can have a significant effect on viewers' levels of understanding of what is depicted (Richards, Bussard and Newman 2007). This work was done specifically to determine an 'easy to read' visual representational style for manipulating on-screen images of CAD objects. The most effective convention was shown to be: thin lines for edges where both planes forming the edge are visible, and thicker lines for edges where only one plane is visible - that is, where an outline edge forms a kind of horizon to the object.
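
As a concrete illustration of that convention (my sketch, not the procedure from the cited study; the geometry, names and weights are invented), the rule can be implemented by testing the visibility of the two faces that meet at each edge:

```python
# Line-weight rule: thin lines where both faces forming an edge are visible;
# thick lines where only one face is visible (an outline or "horizon" edge).
import numpy as np

def face_visible(normal, view_dir):
    """A face is visible when its outward normal points towards the viewer."""
    return np.dot(normal, view_dir) < 0  # view_dir points from eye into scene

def edge_weight(normal_a, normal_b, view_dir, thin=0.3, thick=1.0):
    """Stroke weight for the edge shared by the faces with these normals."""
    vis_a = face_visible(normal_a, view_dir)
    vis_b = face_visible(normal_b, view_dir)
    if vis_a and vis_b:
        return thin   # interior edge: both planes visible
    if vis_a or vis_b:
        return thick  # outline edge: only one plane visible
    return 0.0        # hidden edge: draw nothing

# A cube edge between the front and top faces, viewed from the front and
# slightly above: both faces are visible, so the edge is drawn thin.
view = np.array([0.0, -0.2, -1.0])
front, top = np.array([0.0, 0.0, 1.0]), np.array([0.0, 1.0, 0.0])
print(edge_weight(front, top, view))  # 0.3
```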

These line thickness conventions appear on the face of it to have little to do with how we normally perceive the world, and Blackwell tells us that: "A good pictorial representation need not simulate visual experience any more than a good painting of a unicorn need resemble an actual unicorn". And some particular representations of unicorns can aid our understanding of how to use semiotic theory to figure out how pictures may be interpreted and, importantly, sometimes misunderstood - as I shall describe in what follows.

Blackwell mentions semiotics almost in passing; however, it can help unravel some of the complexities of visual representation. Evelyn Goldsmith uses a Charles Addams cartoon to explain the relevance of the 'syntactic', 'semantic' and 'pragmatic' levels of semiotic analysis when applied to pictures (Goldsmith 1978). The cartoon in question, like many of those by Charles Addams, has no caption. It shows two unicorns standing on a small island in the pouring rain, forlornly watching the Ark sailing away into the distance. Goldsmith suggests that most viewers will have little trouble in interpreting the overlapping elements in the scene, for example that one unicorn is standing behind the other, nor any difficulty understanding that the texture gradient of the sea stands for a receding horizontal plane. These represent the syntactic level of interpretation. Most adults will correctly identify the various components of the picture at the semantic level; however, Goldsmith proposes that a young child might mistake the unicorns for horses and be happy with 'boat' for the Ark. But at the pragmatic level of interpretation, unless a viewer of the picture is aware of the story of Noah's Ark, the joke will be lost - the connection will not be made between the scene depicted in the drawing and the scarcity of unicorns. This reinforces the point that one should not assume that the understanding of pictures is straightforward. There is much more to it than a simple matter of recognition. This is especially the case when metaphor is involved in visual representation.

Blackwell's section on 'Visual metaphor' is essentially a critique of the use of "theories of visual metaphor" as an "approach to explaining the conventional Macintosh/Windows 'desktop'". His is a convincing argument, but there is much more that may be said about the use of visual metaphor - especially to show that which otherwise cannot be pictured. In fact, most diagrams employ a kind of spatial metaphor when not depicting physical arrangements, for example when using the branches of a tree to represent relations within a family (Richards 2002). The capability to represent the invisible is the great strength of the visual metaphor, but there are dangers, and here I refer back to semiotics and particularly the pragmatic level of analysis. One needs to know the story to get the picture.

In our parental home, one of the many books much loved by my two brothers and me was The Practical Encyclopaedia for Children (Odhams circa 1949). In it, a double-page-spread illustration shows the possible evolutionary phases of the elephant. These are depicted as a procession of animals in a primordial swamp-cum-jungle setting. Starting with a tiny fish, and passing to a small aquatic creature climbing out of the water onto the bank, the procession progresses through eight phases of transformation, including the Moeritherium and the Palaeomastodon, finishing up with the land-based giant of today's African elephant. Recently one of my brothers confessed to me that, through studying this graphical diorama, he had believed as a child that the elephant had a life cycle akin to that of a frog. He had understood that the procession was a metaphor for time; he had just got the duration wrong - by several orders of magnitude. He also hadn't understood that each separate depiction was of a different animal: he had used the arguably more sophisticated concept that it was the same animal at different times and stages in its individual development.

Please forgive the cliché if I say that this anecdote clearly illustrates that there can be more to looking at a picture than meets the eye. Blackwell's essay provides some useful pointers for exploring the possibilities of this fascinating territory of picturing and visual representation in general.

  • Blackwell A 2011 'Visual representation' Interaction-Design.org
  • Few S 2010 'Data visualisation for human perception' Interaction-Design.org
  • Fischer D and Richards CJ 1995 'The presentation of time in interactive animated systems diagrams' In: Earnshaw RA and Vince JA (eds) Multimedia Systems and Applications London: Academic Press Ltd (pp 141–159) ISBN 0-12-227740-6
  • Goldsmith E 1978 An analysis of the elements affecting comprehensibility of illustrations intended as supportive of text PhD thesis (CNAA) Brighton Polytechnic
  • Korsa R 2010 'Commentary on Stephen Few's article: Data visualisation for human perception' Interaction-Design.org
  • Odhams c. 1949 The practical encyclopaedia for children (pp 194–195)
  • Richards CJ 2002 'The fundamental design variables of diagramming' In: Olivier P, Anderson M and Meyer B (eds) Diagrammatic representation and reasoning London: Springer Verlag (pp 85–102) ISBN 1-85233-242-5
  • Richards CJ 2006 'Drawing out information - lines of communication in technical illustration' Information Design Journal 14 (2) 93–107
  • Richards CJ, Bussard N, Newman R 2007 'Weighing up line weights: the value of differing line thicknesses in technical illustrations' Information Design Journal 15 (2) 171–181
  • Shneiderman B 2011 'Commentary on Alan Blackwell's article: Visual representation' Interaction-Design.org
  • Twyman M 1982 'The graphic representation of language' Information Design Journal 3 (1) 2–22

5.12 Commentary by Peter C-H. Cheng

Alan Blackwell has provided us with a fine introduction to the design of visual representations. The article does a great job of motivating the novice designer of visual representations to explore some of the fundamental issues that lurk just beneath the surface of creating effective representations. Furthermore, he gives us all quite a challenge.

Alan, quite rightly, claims that we must consider the fundamental principles of symbolic correspondence if we are to design new genres of visual representations beyond the common forms of displays and interfaces. The article begins to equip the novice visual representation designer with an understanding of the nature of symbolic correspondence between the components of visual representations and the things they represent, whether objects, actions or ideas. In particular, it gives a useful survey of how correspondence works in a range of representations and provides a systematic framework for how systems of correspondence can be applied to design. The interactive screenshot is an exemplary visual representation that vividly reveals the correspondence techniques used in each part of the example diagram.

However, suppose you really wished to rise to the challenge of creating novel visual representations: how far will a knowledge of the fundamentals of symbolic correspondence take you? Drawing on my studies of the role of diagrams in the history of science, my experience of inventing novel visual representations, and research on problem solving and learning with diagrams from the perspective of cognitive science, my view is that such knowledge will be necessary but not sufficient for your endeavours. So, what else should the budding visual representation designer consider? From the perspective of cognitive science, there are at least three aspects that we may profitably target.

First, there is knowledge of how humans process information; specifically, the nature of the human cognitive architecture. By this I mean more than visual perception: an understanding of how we mentally receive, store, retrieve, transform and transmit information. The way the mind deals with each of these basic types of information processing provides relevant constraints for the design of visual representations. For instance, humans often, perhaps even typically, encode concepts in the form of hierarchies of schemas, which are information structures that coordinate attributes that describe and differentiate classes of concepts. These hierarchies of schemas underpin our ability to efficiently generalize or specialize concepts. Hence, we can use this knowledge to consider whether particular forms of symbolic correspondence will assist or hinder the forms of inference that we hope the user of the representation may make. For example, are the main symbolic correspondences in a visual representation consistent with the key attributes of the schemas for the concepts being considered?
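
To make the schema idea concrete, here is a minimal sketch (my illustration rather than anything from the commentary; all class names and attributes are invented) of a hierarchy of schemas in which a specialized concept inherits and overrides the attributes of a more general one:

```python
# A hierarchy of schemas: each schema coordinates attributes that describe a
# class of concepts; a child schema specializes its parent by inheriting and
# overriding attributes.

class Schema:
    def __init__(self, name, parent=None, **attributes):
        self.name = name
        self.parent = parent
        self.attributes = attributes

    def attribute(self, key):
        """Look up an attribute, falling back to more general schemas."""
        if key in self.attributes:
            return self.attributes[key]
        if self.parent is not None:
            return self.parent.attribute(key)
        return None

animal = Schema("animal", moves=True, breathes=True)
bird = Schema("bird", parent=animal, flies=True, has_feathers=True)
penguin = Schema("penguin", parent=bird, flies=False)  # overrides 'flies'

print(penguin.attribute("flies"))     # False: overridden by the specialization
print(penguin.attribute("breathes"))  # True: inherited from 'animal'
```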

Second, it may be useful for the designer to consider the broader nature of the tasks that the user may wish to do with the designed representation. Resource allocation, optimization, calculating quantities, inferences about possible outcomes, classification, reasoning about extreme or special cases, and debugging: these are just a few of the many possibilities. These tasks are more generic than the information-oriented options considered in the 'design uses' column of Figure 27 in the article. They are worth addressing because they provide constraints for the initial stages of representation design, by narrowing the search for what are likely to be effective correspondences to adopt. For example, if taxonomic classification is important, then separation and layering will be important correspondences, whereas magnitude calculations may demand scale mapping, Euclidean and metrical correspondences.
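
One way to picture this constraint-narrowing is as a simple lookup from anticipated user tasks to candidate correspondences (a sketch of the idea only; the two entries are taken from the examples just given, and any real table would be filled in by the designer):

```python
# Anticipated user tasks narrow the search for effective correspondences.
TASK_TO_CORRESPONDENCES = {
    "taxonomic classification": ["separation", "layering"],
    "magnitude calculation": ["scale mapping", "Euclidean", "metrical"],
}

def candidate_correspondences(tasks):
    """Union of correspondence techniques suggested by the user's tasks."""
    suggestions = []
    for task in tasks:
        for technique in TASK_TO_CORRESPONDENCES.get(task, []):
            if technique not in suggestions:
                suggestions.append(technique)
    return suggestions

print(candidate_correspondences(["taxonomic classification",
                                 "magnitude calculation"]))
```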

The third aspect concerns situations in which the visual representation must support not just a single task but many diverse tasks. For example, a visual representation to help students learn about electricity will be used to explain the topology of circuits, make computations with electrical quantities, provide explanations of circuit behaviour (in terms of formal algebraic models and as qualitative causal models), and facilitate fault finding or troubleshooting, among other activities. The creation of novel representations in such circumstances is perhaps one of the most challenging for designers. So, what knowledge can help? In this case, I advocate attempting to design representations on the basis of an analysis of the underlying conceptual structure of the knowledge of the target domain. Why? Because the nature of the knowledge is invariant across different classes of task. For example, for problem solving and learning about electricity, all the tasks depend upon the common fundamental conceptual structures of the domain that knit together the laws governing the physical properties of electricity and circuit topology. Hence, a representation that makes these concepts readily available through effective representational design will probably be effective for a wide range of tasks.

In summary, it is desirable for the aspiring visual representation designer to consider symbolic correspondence, but I recommend they cast their net more widely for inspiration by learning about the human cognitive architecture, focusing on the nature of the task for which they are designing, and most critically thinking about the underlying conceptual structure of the knowledge of the target domain.

5.13 Commentary by Brad A. Myers

I have been teaching human-computer interaction to students with a wide range of backgrounds for many years. One of the most difficult areas for them to learn seems to be visual design. Students seem to quickly pick up rules like Nielsen's heuristics for interaction (Nielsen & Molich, 1990), whereas the guidelines for visual design are much more subtle. Alan Blackwell's article presents many useful points, but a designer needs to know so much more! Whereas students can become competent at achieving Nielsen's "consistency and standards," for example, they struggle with selecting an appropriate representation for their information. And only a trained graphic designer is likely to be able to create an attractive and effective icon. Some people have a much better aesthetic sense, and can create much more beautiful and appropriate representations. A key goal of my introductory course, therefore, is to try to impart to the students how difficult visual design is, and how wide the set of choices is. Studying the examples that Blackwell provides will give the reader a small start towards effective visual representations, but the path requires talent, study, and then iterative design and testing to evaluate and improve a design's success.

  • Nielsen, J., & Molich, R. (1990). Heuristic evaluation of user interfaces. Paper presented at the Proc. ACM CHI'90 Conf, Seattle, WA, 249-256.
  • See also: http://www.useit.com/papers/heuristic/heuristic_list.html

Visual Representations

Research Shows

  • Students who use accurate visual representations are six times more likely to correctly solve mathematics problems than are students who do not use them. However, students who use inaccurate visual representations are less likely to correctly solve mathematics problems than those who do not use visual representations at all. (Boonen, van Wesel, Jolles, & van der Schoot, 2014)
  • Students with a learning disability (LD) often do not create accurate visual representations or use them strategically to solve problems. Teaching students to systematically use a visual representation to solve word problems has led to substantial improvements in math achievement for students with learning disabilities. (van Garderen, Scheuermann, & Jackson, 2012; van Garderen, Scheuermann, & Poch, 2014)
  • Students who use visual representations to solve word problems are more likely to solve the problems accurately. This was equally true for students who had LD, were low-achieving, or were average-achieving. (Krawec, 2014)

Visual representations are flexible; they can be used across grade levels and types of math problems. They can be used by teachers to teach mathematics facts and by students to learn mathematics content. Visual representations can take a number of forms. Below are some of the visual representations most commonly used by teachers and students.

How does this practice align?

High-Leverage Practice (HLP)

  • HLP15: Provide scaffolded supports

CCSSM: Standards for Mathematical Practice

  • MP1: Make sense of problems and persevere in solving them.

Number Lines

Definition: A straight line that shows the order of and the relation between numbers.

Common Uses: addition, subtraction, counting

Strip Diagrams

Definition: A bar divided into rectangles that accurately represent quantities noted in the problem.

Common Uses: addition, fractions, proportions, ratios

Pictures

Definition: Simple drawings of concrete or real items (e.g., marbles, trucks).

Common Uses: counting, addition, subtraction, multiplication, division

Graphs/Charts

Definition: Drawings that depict information using lines, shapes, and colors.

Common Uses: comparing numbers, statistics, ratios, algebra

Graphic Organizers

Definition: A visual that assists students in remembering and organizing information, as well as in depicting the relationships between ideas (e.g., word webs, tables, Venn diagrams).

Common Uses: algebra, geometry

Before they can solve problems, however, students must first know what type of visual representation to create and use for a given mathematics problem. Some students (specifically, high-achieving and gifted students) do this automatically, whereas others need to be explicitly taught how. This is especially the case for students who struggle with mathematics and those with mathematics learning disabilities. Without explicit, systematic instruction on how to create and use visual representations, these students often create visual representations that are disorganized or contain incorrect or partial information. Consider the examples below.

Elementary Example

Mrs. Aldridge asks her first-grade students to add 2 + 4 by drawing dots.

Talia's drawing of 2 + 4

Notice that Talia gets the correct answer. However, because Colby draws his dots in a haphazard fashion, he fails to count all of them and consequently arrives at the wrong solution.

High School Example

Mr. Huang asks his students to solve the following word problem:

The flagpole needs to be replaced. The school would like to replace it with the same size pole. When Juan stands 11 feet from the base of the pole, the angle of elevation from Juan’s feet to the top of the pole is 70 degrees. How tall is the pole?

Compare the drawings below created by Brody and Zoe to represent this problem. Notice that Brody drew an accurate representation and applied the correct strategy. In contrast, Zoe drew a picture with partially correct information. The 11 is in the correct place, but the 70° is not. As a result of her inaccurate representation, Zoe is unable to move forward and solve the problem. However, given an accurate representation developed by someone else, Zoe is more likely to solve the problem correctly.

Brody's drawing
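
For reference, a correct drawing like Brody's supports the right-triangle computation below (a worked check added here; the module itself leaves the solution to the student):

```python
# Flagpole problem: Juan stands 11 feet from the base; the angle of elevation
# from his feet to the top is 70 degrees. The pole is the side opposite the
# angle, and the 11-foot distance is the side adjacent to it.
import math

distance_ft = 11
angle_deg = 70
height_ft = distance_ft * math.tan(math.radians(angle_deg))
print(round(height_ft, 1))  # ≈ 30.2 feet
```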

Manipulatives

Some students will not be able to grasp mathematics skills and concepts using only the types of visual representations noted in the table above. Very young children and students who struggle with mathematics often require different types of visual representations known as manipulatives. These concrete, hands-on materials and objects—for example, an abacus or coins—help students to represent the mathematical idea they are trying to learn or the problem they are attempting to solve. Manipulatives can help students develop a conceptual understanding of mathematical topics. (For the purpose of this module, the term concrete objects refers to manipulatives and the term visual representations refers to schematic diagrams.)

It is important that the teacher make explicit the connection between the concrete object and the abstract concept being taught. The goal is for the student to eventually understand the concepts and procedures without the use of manipulatives. For secondary students who struggle with mathematics, teachers should show the abstract along with the concrete or visual representation and explicitly make the connection between them.

A move from concrete objects or visual representations to using abstract equations can be difficult for some students. One strategy teachers can use to help students systematically transition among concrete objects, visual representations, and abstract equations is the Concrete-Representational-Abstract (CRA) framework.


Concrete-Representational-Abstract Framework


  • Concrete — Students interact with and manipulate three-dimensional objects, for example algebra tiles or other algebra manipulatives with representations of variables and units.
  • Representational — Students use two-dimensional drawings to represent problems. These pictures may be presented to them by the teacher, or through the curriculum used in the class, or students may draw their own representation of the problem.
  • Abstract — Students solve problems with numbers, symbols, and words without any concrete or representational assistance.

CRA is effective across all age levels and can assist students in learning concepts, procedures, and applications. When implementing each component, teachers should use explicit, systematic instruction and continually monitor student work to assess their understanding, asking them questions about their thinking and providing clarification as needed. Concrete and representational activities must reflect the actual process of solving the problem so that students are able to generalize the process to solve an abstract equation. The illustration below highlights each of these components.

Illustration: concrete (counting with pencils), representational (tally marks), abstract (numerals)

For Your Information

One promising practice for quickly moving secondary students with mathematics difficulties or disabilities from the use of manipulatives and visual representations to the abstract equation is the CRA-I strategy. In this modified version of CRA, the teacher simultaneously presents the content using concrete objects, visual representations of the concrete objects, and the abstract equation. Studies have shown that this framework is effective for teaching algebra to this population of students (Strickland & Maccini, 2012; Strickland & Maccini, 2013; Strickland, 2017).

Kim Paulsen discusses the benefits of manipulatives and a number of things to keep in mind when using them (time: 2:35).

Kim Paulsen, EdD Associate Professor, Special Education Vanderbilt University


Transcript: Kim Paulsen, EdD

Manipulatives are a great way of helping kids understand conceptually. The use of manipulatives really helps students see that conceptually, and it clicks a little more with them. Some of the things, though, that we need to remember when we’re using manipulatives is that it is important to give students a little bit of free time when you’re using a new manipulative so that they can just explore with them. We need to have specific rules for how to use manipulatives, that they aren’t toys, that they really are learning materials, and how students pick them up, how they put them away, the right time to use them, and making sure that they’re not distracters while we’re actually doing the presentation part of the lesson. One of the important things is that we don’t want students to memorize the algorithm or the procedures while they’re using the manipulatives. It really is just to help them understand conceptually. That doesn’t mean that kids are automatically going to understand conceptually or be able to make that bridge between using the concrete manipulatives into them being able to solve the problems. For some kids, it is difficult to use the manipulatives. That’s not how they learn, and so we don’t want to force kids to have to use manipulatives if it’s not something that is helpful for them. So we have to remember that manipulatives are one way to think about teaching math.

I think part of the reason that some teachers don’t use them is because it takes a lot of time, it takes a lot of organization, and they also feel that students get too reliant on using manipulatives. One way to think about using manipulatives is that you do it a couple of lessons when you’re teaching a new concept, and then take those away so that students are able to do just the computation part of it. It is true we can’t walk around life with manipulatives in our hands. And I think one of the other reasons that a lot of schools or teachers don’t use manipulatives is because they’re very expensive. And so it’s very helpful if all of the teachers in the school can pool resources and have a manipulative room where teachers can go check out manipulatives so that it’s not so expensive. Teachers have to know how to use them, and that takes a lot of practice.


Universal, Intuitive, and Permanent Pictograms, pp 11–31

Step 2: Understanding Visual Representation(s)

  • Daniel Bühler
  • First Online: 28 September 2021


In the second HCD step, the context of use, including the users and the user tasks and goals, needs to be identified, described, and analyzed. This step usually consists of a close examination of real situations in which existing products are used by actual users. Since the context of use in the UIPP project was specified by the Universal Cognitive User Interface project, this chapter discusses universal characteristics of visual representations in HCI. That is, it describes in detail central properties and relations, it discusses existing pictogram systems, and it proposes a taxonomy of visual representations. For example, it argues that two central properties must always be considered: the design and the reference relation. The goal of the chapter is to achieve a general understanding of visual representations that might be the basis for the following steps in the process, as much as for other design projects.
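
The chapter's claim about the two central properties can be pictured with a minimal data-model sketch (my illustration, not Bühler's taxonomy; the field names and example values are invented):

```python
# Every visual representation carries a design (its perceptible form) and a
# reference relation (how that form relates to what it stands for).
from dataclasses import dataclass

@dataclass
class VisualRepresentation:
    design: str              # e.g. "outline pictogram", "photograph"
    reference_relation: str  # e.g. "resemblance", "convention"

warning_sign = VisualRepresentation(design="outline pictogram",
                                    reference_relation="convention")
print(warning_sign)
```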


De Souza argued as early as 1993 (pp. 771–772) that, in addition to findings in the cognitive sciences, semiotics would lead to a better understanding of HCI.


Creating visual explanations improves learning

Eliza Bobek

1 University of Massachusetts Lowell, Lowell, MA USA

Barbara Tversky

2 Stanford University, Columbia University Teachers College, New York, NY USA


Many topics in science are notoriously difficult for students to learn. Mechanisms and processes outside student experience present particular challenges. While instruction typically involves visualizations, students usually explain in words. Because visual explanations can show parts and processes of complex systems directly, creating them should have benefits beyond those of creating verbal explanations. We compared learning from creating visual or verbal explanations for two STEM domains, a mechanical system (bicycle pump) and a chemical system (bonding). Both kinds of explanations were analyzed for content, and learning was assessed by a post-test. For the mechanical system, creating a visual explanation increased understanding, particularly for participants of low spatial ability. For the chemical system, creating both visual and verbal explanations improved learning without new teaching. Creating a visual explanation was superior and benefitted participants of both high and low spatial ability. Visual explanations often included crucial yet invisible features. The greater effectiveness of visual explanations appears attributable to the checks they provide for completeness and coherence as well as to their roles as platforms for inference. The benefits should generalize to other domains like the social sciences, history, and archeology where important information can be visualized. Together, the findings provide support for the use of learner-generated visual explanations as a powerful learning tool.

Electronic supplementary material

The online version of this article (doi:10.1186/s41235-016-0031-6) contains supplementary material, which is available to authorized users.

Significance

Uncovering cognitive principles for effective teaching and learning is a central application of cognitive psychology. Here we show: (1) creating explanations of STEM phenomena improves learning without additional teaching; and (2) creating visual explanations is superior to creating verbal ones. There are several notable differences between visual and verbal explanations; visual explanations map thought more directly than words and provide checks for completeness and coherence as well as a platform for inference, notably from structure to process. Extensions of the technique to other domains should be possible. Creating visual explanations is likely to enhance students’ spatial thinking skills, skills that are increasingly needed in the contemporary and future world.

Dynamic systems such as those in science and engineering, but also in history, politics, and other domains, are notoriously difficult to learn (e.g. Chi, DeLeeuw, Chiu, & Lavancher, 1994 ; Hmelo-Silver & Pfeffer, 2004 ; Johnstone, 1991 ; Perkins & Grotzer, 2005 ). Mechanisms, processes, and behavior of complex systems present particular challenges. Learners must master not only the individual components of the system or process (structure) but also the interactions and mechanisms (function), which may be complex and frequently invisible. If the phenomena are macroscopic, sub-microscopic, or abstract, there is an additional level of difficulty. Although the teaching of STEM phenomena typically relies on visualizations, such as pictures, graphs, and diagrams, learning is typically revealed in words, both spoken and written. Visualizations have many advantages over verbal explanations for teaching; can creating visual explanations promote learning?

Learning from visual representations in STEM

Given the inherent challenges in teaching and learning complex or invisible processes in science, educators have developed ways of representing these processes to enable and enhance student understanding. External visual representations, including diagrams, photographs, illustrations, flow charts, and graphs, are often used in science to both illustrate and explain concepts (e.g., Hegarty, Carpenter, & Just, 1990 ; Mayer, 1989 ). Visualizations can directly represent many structural and behavioral properties. They also help to draw inferences (Larkin & Simon, 1987 ), find routes in maps (Levine, 1982 ), spot trends in graphs (Kessell & Tversky, 2011 ; Zacks & Tversky, 1999 ), imagine traffic flow or seasonal changes in light from architectural sketches (e.g. Tversky & Suwa, 2009 ), and determine the consequences of movements of gears and pulleys in mechanical systems (e.g. Hegarty & Just, 1993 ; Hegarty, Kriz, & Cate, 2003 ). The use of visual elements such as arrows is another benefit to learning with visualizations. Arrows are widely produced and comprehended as representing a range of kinds of forces as well as changes over time (e.g. Heiser & Tversky, 2002 ; Tversky, Heiser, MacKenzie, Lozano, & Morrison, 2007 ). Visualizations are thus readily able to depict the parts and configurations of systems; presenting the same content via language may be more difficult. Although words can describe spatial properties, because the correspondences of meaning to language are purely symbolic, comprehension and construction of mental representations from descriptions is far more effortful and error prone (e.g. Glenberg & Langston, 1992 ; Hegarty & Just, 1993 ; Larkin & Simon, 1987 ; Mayer, 1989 ). Given the differences in how visual and verbal information is processed, how learners draw inferences and construct understanding in these two modes warrants further investigation.

Benefits of generating explanations

Learner-generated explanations of scientific phenomena may be an important learning strategy to consider beyond the utility of learning from a provided external visualization. Explanations convey information about concepts or processes with the goal of making clear and comprehensible an idea or set of ideas. Explanations may involve a variety of elements, such as the use of examples and analogies (Roscoe & Chi, 2007 ). When explaining something new, learners may have to think carefully about the relationships between elements in the process and prioritize the multitude of information available to them. Generating explanations may require learners to reorganize their mental models by allowing them to make and refine connections between and among elements and concepts. Explaining may also help learners metacognitively address their own knowledge gaps and misconceptions.

Many studies have shown that learning is enhanced when students are actively engaged in creative, generative activities (e.g. Chi, 2009 ; Hall, Bailey, & Tillman, 1997 ). Generative activities have been shown to benefit comprehension of domains involving invisible components, including electric circuits (Johnson & Mayer, 2010 ) and the chemistry of detergents (Schwamborn, Mayer, Thillmann, Leopold, & Leutner, 2010 ). Wittrock’s ( 1990 ) generative theory stresses the importance of learners actively constructing and developing relationships. Generative activities require learners to select information and choose how to integrate and represent the information in a unified way. When learners make connections between pieces of information, knowledge, and experience, by generating headings, summaries, pictures, and analogies, deeper understanding develops.

The information learners draw upon to construct their explanations is likely important. For example, Ainsworth and Loizou ( 2003 ) found that asking participants to self-explain with a diagram resulted in greater learning than self-explaining from text. How might learners explain with physical mechanisms or materials with multi-modal information?

Generating visual explanations

Learner-generated visualizations have been explored in several domains. Gobert and Clement ( 1999 ) investigated the effectiveness of student-generated diagrams versus student-generated summaries on understanding plate tectonics after reading an expository text. Students who generated diagrams scored significantly higher on a post-test measuring spatial and causal/dynamic content, even though the diagrams contained less domain-related information. Hall et al. ( 1997 ) showed that learners who generated their own illustrations from text performed equally as well as learners provided with text and illustrations. Both groups outperformed learners only provided with text. In a study concerning the law of conservation of energy, participants who generated drawings scored higher on a post-test than participants who wrote their own narrative of the process (Edens & Potter, 2003 ). In addition, the quality and number of concept units present in the drawing/science log correlated with performance on the post-test. Van Meter ( 2001 ) found that drawing while reading a text about Newton’s Laws was more effective than answering prompts in writing.

One aspect to explore is whether visual and verbal productions contain different types of information. Learning advantages for the generation of visualizations could be attributed to learners’ translating across modalities, from a verbal format into a visual format. Translating verbal information from the text into a visual explanation may promote deeper processing of the material and more complete and comprehensive mental models (Craik & Lockhart, 1972). Ainsworth and Iacovides (2005) addressed this issue by asking two groups of learners to self-explain while learning about the circulatory system of the human body. Learners given diagrams were asked to self-explain in writing, and learners given text were asked to explain using a diagram. The results showed no overall differences in learning outcomes; however, the learners provided with text included significantly more information in their diagrams than the other group. Aleven and Koedinger (2002) argue that explanations are most helpful if they can integrate visual and verbal information. Translating across modalities may serve this purpose, although translating is not necessarily an easy task (Ainsworth, Bibby, & Wood, 2002).

It is important to remember that not all studies have found advantages to generating explanations. Wilkin (1997) found that directions to self-explain using a diagram hindered understanding of examples of physical motion when students were presented with text and instructed to draw a diagram. She argues that the diagrams encouraged learners to connect familiar but unrelated knowledge. In particular, “low benefit learners” in her study inappropriately used spatial adjacency and location to connect parts of diagrams, instead of the particular properties of those parts. Wilkin argues that these learners are novices and that experts may not make the same mistake since they have the skills to analyze features of a diagram according to their relevant properties. She also argues that the benefits of self-explaining are highest when the learning activity is constrained so that learners are limited in their possible interpretations. Other studies that have not found a learning advantage from generating drawings have in common an absence of support for the learner (Alesandrini, 1981; Leutner, Leopold, & Sumfleth, 2009). Another mediating factor may be the learner’s spatial ability.

The role of spatial ability

Spatial thinking involves objects, their size, location, shape, their relation to one another, and how and where they move through space. How then, might learners with different levels of spatial ability gain structural and functional understanding in science and how might this ability affect the utility of learner-generated visual explanations? Several lines of research have sought to explore the role of spatial ability in learning science. Kozhevnikov, Hegarty, and Mayer ( 2002 ) found that low spatial ability participants interpreted graphs as pictures, whereas high spatial ability participants were able to construct more schematic images and manipulate them spatially. Hegarty and Just ( 1993 ) found that the ability to mentally animate mechanical systems correlated with spatial ability, but not verbal ability. In their study, low spatial ability participants made more errors in movement verification tasks. Leutner et al. ( 2009 ) found no effect of spatial ability on the effectiveness of drawing compared to mentally imagining text content. Mayer and Sims ( 1994 ) found that spatial ability played a role in participants’ ability to integrate visual and verbal information presented in an animation. The authors argue that their results can be interpreted within the context of dual-coding theory. They suggest that low spatial ability participants must devote large amounts of cognitive effort into building a visual representation of the system. High spatial ability participants, on the other hand, are more able to allocate sufficient cognitive resources to building referential connections between visual and verbal information.

Benefits of testing

Although not presented that way, creating an explanation could be regarded as a form of testing. Considerable research has documented positive effects of testing on learning. Presumably taking a test requires retrieving and sometimes integrating the learned material and those processes can augment learning without additional teaching or study (e.g. Roediger & Karpicke, 2006 ; Roediger, Putnam, & Smith, 2011 ; Wheeler & Roediger, 1992 ). Hausmann and Vanlehn ( 2007 ) addressed the possibility that generating explanations is beneficial because learners merely spend more time with the content material than learners who are not required to generate an explanation. In their study, they compared the effects of using instructions to self-explain with instructions to merely paraphrase physics (electrodynamics) material. Attending to provided explanations by paraphrasing was not as effective as generating explanations as evidenced by retention scores on an exam 29 days after the experiment and transfer scores within and across domains. Their study concludes, “the important variable for learning was the process of producing an explanation” (p. 423). Thus, we expect benefits from creating either kind of explanation but for the reasons outlined previously, we expect larger benefits from creating visual explanations.

Present experiments

This study set out to answer a number of related questions about the role of learner-generated explanations in learning and understanding of invisible processes. (1) Do students learn more when they generate visual or verbal explanations? We anticipate that learning will be greater with the creation of visual explanations, as they encourage completeness and the integration of structure and function. (2) Does the inclusion of structural and functional information correlate with learning as measured by a post-test? We predict that including greater counts of information, particularly invisible and functional information, will positively correlate with higher post-test scores. (3) Does spatial ability predict the inclusion of structural and functional information in explanations, and does spatial ability predict post-test scores? We predict that high spatial ability participants will include more information in their explanations, and will score higher on post-tests.

Experiment 1

The first experiment examines the effects of creating visual or verbal explanations on the comprehension of a bicycle tire pump’s operation in participants with low and high spatial ability. Although the pump itself is not invisible, the components crucial to its function, notably the inlet and outlet valves, and the movement of air, are located inside the pump. It was predicted that visual explanations would include more information than verbal explanations, particularly structural information, since their construction encourages completeness and the production of a whole mechanical system. It was also predicted that functional information would be biased towards a verbal format, since much of the function of the pump is hidden and difficult to express in pictures. Finally, it was predicted that high spatial ability participants would be able to produce more complete explanations and would thus also demonstrate better performance on the post-test. Explanations were coded for structural and functional content, essential features, invisible features, arrows, and multiple steps.

Participants

Participants were 127 (59 female) seventh and eighth grade students, aged 12–14 years, enrolled in an independent school in New York City. The school’s student body is 70% white, 30% other ethnicities. Approximately 25% of the student body receives financial aid. The sample consisted of three class sections of seventh grade students and three class sections of eighth grade students. Both seventh and eighth grade classes were integrated science (earth, life, and physical sciences) and students were not grouped according to ability in any section. Written parental consent was obtained by means of signed informed consent forms. Each participant was randomly assigned to one of two conditions within each class. The 64 participants in the visual condition explained the bicycle pump’s function by drawing, and the 63 participants in the verbal condition explained the pump’s function in writing.

The materials consisted of a 12-inch Spalding bicycle pump, a blank 8.5 × 11 in. sheet of paper, and a post-test (Additional file 1 ). The pump’s chamber and hose were made of clear plastic; the handle and piston were black plastic. The parts of the pump (e.g. inlet valve, piston) were labeled.

Spatial ability was assessed using the Vandenberg and Kuse ( 1978 ) mental rotation test (MRT). The MRT is a 20-item test in which two-dimensional drawings of three-dimensional objects are compared. Each item consists of one “target” drawing and four drawings that are to be compared to the target. Two of the four drawings are rotated versions of the target drawing and the other two are not. The task is to identify the two rotated versions of the target. A score was determined by assigning one point to each question if both of the correct rotated versions were chosen. The maximum score was 20 points.
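
The scoring rule lends itself to a compact sketch (illustrative code, not the authors' procedure; the answer key and responses are invented):

```python
# MRT scoring: each item earns 1 point only if the participant selects
# exactly the two drawings that are rotated versions of the target.

def score_mrt(answer_key, responses):
    """Total score: 1 point per item iff both correct choices were made."""
    return sum(1 for correct, chosen in zip(answer_key, responses)
               if set(chosen) == set(correct))

answer_key = [("B", "D"), ("A", "C")]  # key for two of the 20 items
responses = [("D", "B"), ("A", "D")]   # one participant's selections
print(score_mrt(answer_key, responses))  # 1: only the first item is correct
```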

The post-test consisted of 16 true/false questions printed on a single sheet of paper measuring 8.5 × 11 in. Half of the questions related to the structure of the pump and the other half related to its function. The questions were adapted from Heiser and Tversky ( 2002 ) in order to be clear and comprehensible for this age group.

The experiment was conducted over the course of two non-consecutive days during the normal school day and during regularly scheduled class time. On the first day, participants completed the MRT as a whole-class activity. After completing an untimed practice test, they were given 3 min for each of the two parts of the MRT. On the second day, occurring between two and four days after completing the MRT, participants were individually asked to study an actual bicycle tire pump and were then asked to generate explanations of its function. The participants were tested individually in a quiet room away from the rest of the class. In addition to the pump, each participant was given one instruction sheet and one blank sheet of paper for their explanations. The post-test was given upon completion of the explanation. The instruction sheet was read aloud to participants and they were instructed to read along. The first set of instructions was as follows: “A bicycle pump is a mechanical device that pumps air into bicycle tires. First, take this bicycle pump and try to understand how it works. Spend as much time as you need to understand the pump.” The next set of instructions differed for participants in each condition. The instructions for the visual condition were as follows: “Then, we would like you to draw your own diagram or set of diagrams that explain how the bike pump works. Draw your explanation so that someone else who has not seen the pump could understand the bike pump from your explanation. Don’t worry about the artistic quality of the diagrams; in fact, if something is hard for you to draw, you can explain what you would draw. What’s important is that the explanation should be primarily visual, in a diagram or diagrams.” The instructions for the verbal condition were as follows: “Then, we would like you to write an explanation of how the bike pump works. Write your explanation so that someone else who has not seen the pump could understand the bike pump from your explanation.” All participants then received these instructions: “You may not use the pump while you create your explanations. Please return it to me when you are ready to begin your explanation. When you are finished with the explanation, you will hand in your explanation to me and I will then give you 16 true/false questions about the bike pump. You will not be able to look at your explanation while you complete the questions.” Study and test were untimed. All students finished within the 45-min class period.

Spatial ability

The mean score on the MRT was 10.56, with a median of 11. Boys scored significantly higher (M = 13.5, SD = 4.4) than girls (M = 8.8, SD = 4.5), F(1, 126) = 19.07, p  < 0.01, a typical finding (Voyer, Voyer, & Bryden, 1995 ). Participants were split into high or low spatial ability by the median. Low and high spatial ability participants were equally distributed in the visual and verbal groups.
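
The median split used here can be sketched as follows (scores invented; in the study the median was 11):

```python
# Median split: classify participants as high or low spatial ability
# relative to the sample median of their MRT scores.
import statistics

mrt_scores = [4, 8, 10, 11, 13, 17]
median = statistics.median(mrt_scores)
groups = ["high" if s >= median else "low" for s in mrt_scores]
print(median, groups)  # 10.5 ['low', 'low', 'low', 'high', 'high', 'high']
```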

Learning outcomes

It was predicted that high spatial ability participants would be better able to mentally animate the bicycle pump system and therefore score higher on the post-test, and that post-test scores would be higher for those who created visual explanations. Table 1 shows the scores on the post-test by condition and spatial ability. A two-way factorial ANOVA revealed a marginally significant main effect of spatial ability, F(1, 124) = 3.680, p = 0.06, with high spatial ability participants scoring higher on the post-test. There was also a significant interaction between spatial ability and explanation type, F(1, 124) = 4.094, p < 0.01 (see Fig. 1). Creating a visual explanation of the bicycle pump selectively helped low spatial ability participants.

Table 1. Post-test scores by explanation type and spatial ability

Fig. 1. Scores on the post-test by condition and spatial ability

Coding explanations

Explanations (see Fig.  2 ) were coded for structural and functional content, essential features, invisible features, arrows, and multiple steps. A subset of the explanations (20%) was coded by the first author and another researcher using the same coding system as a guide. The agreement between scores was above 90% for all measures. Disagreements were resolved through discussion. The first author then scored the remaining explanations.

Fig. 2. Examples of visual and verbal explanations of the bicycle pump

Coding for structure and function

A maximum score of 12 points was awarded for the inclusion and labeling of six structural components: chamber, piston, inlet valve, outlet valve, handle, and hose. For the visual explanations, 1 point was given for a component drawn correctly and 1 additional point if the component was labeled correctly. For verbal explanations, sentences were divided into propositions, the smallest units of meaning in a sentence. Descriptions of structural location, e.g. “at the end of the piston is the inlet valve,” or of features of the components, e.g. the shape of a part, counted as structural components. Information was coded as functional if it depicted (typically with an arrow) or described the function/movement of an individual part, or the way multiple parts interact. No explanation contained more than ten functional units.
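
As a sketch of this tally (illustrative code, not the authors' coding instrument; the example judgments are invented):

```python
# Structural score for one explanation: 1 point for each of six components
# included correctly, plus 1 point if that component is also labeled
# correctly, for a maximum of 12.

COMPONENTS = ["chamber", "piston", "inlet valve",
              "outlet valve", "handle", "hose"]

def structural_score(drawn, labeled):
    """drawn/labeled: sets of component names a coder judged correct."""
    score = 0
    for part in COMPONENTS:
        if part in drawn:
            score += 1
            if part in labeled:
                score += 1  # labeling counts only for included components
    return score

print(structural_score(drawn={"chamber", "piston", "hose"},
                       labeled={"piston"}))  # 4 of a possible 12
```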

Visual explanations contained significantly more structural components (M = 6.05, SD = 2.76) than verbal explanations (M = 4.27, SD = 1.54), F(1, 126) = 20.53, p < 0.05. The number of functional components did not differ between visual and verbal explanations, as displayed in Figs. 3 and 4. Many visual explanations (67%) contained verbal components; the structural and functional information in explanations was coded as depictive or descriptive. Structural and functional information were equally likely to be expressed in words or pictures in visual explanations. It was predicted that explanations created by high spatial participants would include more functional information. However, there were no significant differences found between low spatial (M = 5.15, SD = 2.21) and high spatial (M = 4.62, SD = 2.16) participants in the number of structural units, or between low spatial (M = 3.83, SD = 2.51) and high spatial (M = 4.10, SD = 2.13) participants in the number of functional units.

Fig. 3. Average number of structural and functional components in visual and verbal explanations

Fig. 4. Visual and verbal explanations of chemical bonding

Coding of essential features

To further establish a relationship between the explanations generated and outcomes on the post-test, explanations were also coded for the inclusion of information essential to the pump’s function, according to a 4-point scale (adapted from Hall et al., 1997). One point was given if both the inlet and the outlet valve were clearly present in the drawing or described in writing, 1 point if the piston inserted into the chamber was shown or described to be airtight, and 1 point for each of the two valves if they were shown or described to be opening/closing in the correct direction.

Visual explanations contained significantly more essential information (M = 1.78, SD = 1.0) than verbal explanations (M = 1.20, SD = 1.21), F(1, 126) = 7.63, p < 0.05. Inclusion of essential features correlated positively with post-test scores, r = 0.197, p < 0.05.
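A correlation like this can be checked with a one-liner. The sketch below uses scipy's pearsonr on invented scores; the arrays are illustrative, not the study's data.

```python
# Illustrative Pearson correlation between essential-features scores and
# post-test scores; both arrays are invented examples.
from scipy import stats

essential = [0, 1, 1, 2, 3, 2, 4, 3, 1, 2]
posttest = [0.55, 0.60, 0.58, 0.70, 0.72, 0.66, 0.80, 0.74, 0.59, 0.68]

r, p = stats.pearsonr(essential, posttest)
print(f"r = {r:.3f}, p = {p:.3f}")
```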

Coding arrows and multiple steps

For the visual explanations, three uses of arrows were coded and tallied: labeling a part or action, showing motion, and indicating sequence. Analysis of visual explanations revealed that 87% contained arrows. No significant differences were found between low and high spatial participants’ use of arrows to label, and no significant correlations were found between the use of arrows and learning outcomes measured on the post-test.

The explanations were coded for the number of discrete steps used to explain the process of using the bike pump. The number of steps used by participants ranged from one to six. Participants whose explanations, whether verbal or visual, contained multiple steps scored significantly higher (M = 0.76, SD = 0.18) on the post-test than participants whose explanations consisted of a single step (M = 0.67, SD = 0.19), F(1, 126) = 5.02, p  < 0.05.

Coding invisible features

The bicycle tire pump, like many mechanical devices, contains several structural features that are hidden or invisible and must be inferred from the function of the pump. For the bicycle pump, the invisible features are the inlet and outlet valves and the three phases of the movement of air: entering the pump, moving through the pump, and exiting the pump. Each feature received 1 point, for a total of 5 possible points.

The mean score for the inclusion of invisible features was 3.26 (SD = 1.25). A linear regression revealed that the total score for invisible parts significantly predicted scores on the post-test, F(1, 118) = 3.80, p = 0.05.
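The analysis above is a simple linear regression of post-test score on the invisible-features score. As an illustration, the following sketch fits such a regression with scipy on made-up data; the sample size and coefficients are invented.

```python
# Sketch of a simple regression: invisible-features rubric score (0-5)
# predicting post-test score. Data are simulated for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
invisible = rng.integers(0, 6, 120)                     # rubric score, 0-5
posttest = 0.55 + 0.03 * invisible + rng.normal(0, 0.15, 120)

result = stats.linregress(invisible, posttest)
print(f"slope={result.slope:.3f}, r={result.rvalue:.3f}, p={result.pvalue:.3f}")
```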

In the first experiment, students learned the workings of a bicycle pump from interacting with an actual pump and creating a visual or verbal explanation of its function. Understanding the functionality of a bike pump depends on the actions and consequences of parts that are not visible. Overall, the results provide support for the use of learner-generated visual explanations in developing understanding of a new scientific system. The results show that low spatial ability participants were able to learn as successfully as high spatial ability participants when they first generated an explanation in a visual format.

Visual explanations may have led to greater understanding for a number of reasons. As discussed previously, visual explanations encourage completeness. They force learners to decide on the size, shape, and location of parts/objects. Understanding the “hidden” function of the invisible parts is key to understanding the function of the entire system and requires an understanding of how both the visible and invisible parts interact. The visual format may have been able to elicit components and concepts that are invisible and difficult to integrate into the formation of a mental model. The results show that including more of the essential features and showing multiple steps correlated with superior test performance. Understanding the bicycle pump requires understanding how all of these components are connected through movement, force, and function. Many (67%) of the visual explanations also contained written components. Arguably, some types of information may be difficult to depict visually, and verbal language has many possibilities that allow for specificity. The inclusion of text as a complement to visual explanations may be key to the success of learner-generated explanations and the development of understanding.

A limitation of this experiment is that participants were not provided with detailed instructions for completing their explanations. In addition, this experiment does not fully clarify the role of spatial ability, since high spatial participants in the visual and verbal groups demonstrated equivalent knowledge of the pump on the post-test. One possibility is that the interaction with the bicycle pump prior to generating explanations was a sufficient learning experience for the high spatial participants. Other researchers (e.g. Flick, 1993 ) have shown that hands-on interactive experiences can be effective learning situations. High spatial ability participants may be better able to imagine the movement and function of a system (e.g. Hegarty, 1992 ).

Experiment 1 examined learning a mechanical system with invisible (hidden) parts. Participants were introduced to the system by being able to interact with an actual bicycle pump. While we did not assess participants’ prior knowledge of the pump with a pre-test, participants were randomly assigned to each condition. The findings have promising implications for teaching. Creating visual explanations should be an effective way to improve performance, especially in low spatial students. Instructors can guide the creation of visual explanations toward the features that augment learning. For example, students can be encouraged to show every step and action and to focus on the essential parts, even if invisible. The coding system shows that visual explanations can be objectively evaluated to provide feedback on students’ understanding. The utility of visual explanations may differ for scientific phenomena that are more abstract, or contain elements that are invisible due to their scale. Experiment 2 addresses this possibility by examining a sub-microscopic area of science: chemical bonding.

Experiment 2

In this experiment, we examine visual and verbal explanations in an area of chemistry: ionic and covalent bonding. Chemistry is often regarded as a difficult subject; one of the essential or inherent features of chemistry which presents difficulty is the interplay between the macroscopic, sub-microscopic, and representational levels (e.g. Bradley & Brand, 1985 ; Johnstone, 1991 ; Taber, 1997 ). In chemical bonding, invisible components engage in complex processes whose scale makes them impossible to observe. Chemists routinely use visual representations to investigate relationships and move between the observable, physical level and the invisible particulate level (Kozma, Chin, Russell, & Marx, 2002 ). Generating explanations in a visual format may be a particularly useful learning tool for this domain.

For this topic, we expect that creating a visual rather than verbal explanation will aid students of both high and low spatial abilities. Visual explanations demand completeness; they were predicted to include more information than verbal explanations, particularly structural information. The inclusion of functional information should lead to better performance on the post-test, since understanding how and why atoms bond is crucial to understanding the process. Participants with high spatial ability may be better able to explain function, since the sub-microscopic nature of bonding requires mentally imagining invisible particles and how they interact. This experiment also asks whether creating an explanation per se can increase learning in the absence of additional teaching: participants completed two post-tests of knowledge, one immediately following instruction but before creating an explanation, and one after creating an explanation. The scores on the immediate post-test were used to confirm that the visual and verbal groups were equivalent prior to the generation of explanations. Explanations were coded for structural and functional information, arrows, specific examples, and multiple representations. Do the acts of selecting, integrating, and explaining knowledge serve learning even in the absence of further study or teaching?

Participants were 126 (58 female) eighth grade students, aged 13–14 years, with written parental consent and enrolled in the same independent school described in Experiment 1. None of the students previously participated in Experiment 1. As in Experiment 1, randomization occurred within-class, with participants assigned to either the visual or verbal explanation condition.

The materials consisted of the MRT (same as Experiment 1), a video lesson on chemical bonding, two versions of the instructions, the immediate post-test, the delayed post-test, and a blank page for the explanations. All paper materials were typed on 8.5 × 11 in. sheets of paper. Both immediate and delayed post-tests consisted of seven multiple-choice items and three free-response items. The video lesson on chemical bonding was 13 min 22 s long. It began with a brief review of atoms and their structure and introduced the idea that atoms combine to form molecules. Next, the lesson showed that location in the periodic table reveals the behavior and reactivity of atoms, in particular the gain, loss, or sharing of electrons. Examples of atoms, their valence shell structure, stability, charges, transfer and sharing of electrons, and the formation of ionic, covalent, and polar covalent bonds were discussed. The example of NaCl (table salt) was used to illustrate ionic bonding, and the examples of O2 and H2O (water) were used to illustrate covalent bonding. Information was presented verbally, accompanied by drawings, written notes of keywords and terms, and a color-coded periodic table.

On the first of three non-consecutive school days, participants completed the MRT as a whole-class activity. On the second day (occurring between two and three days after completing the MRT), participants viewed the recorded lesson on chemical bonding. They were instructed to pay close attention to the material but were not allowed to take notes. Immediately following the video, participants had 20 min to complete the immediate post-test; all finished within this time frame. On the third day (occurring on the next school day after viewing the video and completing the immediate post-test), the participants were randomly assigned to either the visual or verbal explanation condition. The typed instructions were given to participants along with a blank 8.5 × 11 in. sheet of paper for their explanations. The instructions differed for each condition. For the visual condition, the instructions were as follows: “You have just finished learning about chemical bonding. On the next piece of paper, draw an explanation of how atoms bond and how ionic and covalent bonds differ. Draw your explanation so that another student your age who has never studied this topic will be able to understand it. Be as clear and complete as possible, and remember to use pictures/diagrams only. After you complete your explanation, you will be asked to answer a series of questions about bonding.”

For the verbal condition the instructions were: “You have just finished learning about chemical bonding. On the next piece of paper, write an explanation of how atoms bond and how ionic and covalent bonds differ. Write your explanation so that another student your age who has never studied this topic will be able to understand it. Be as clear and complete as possible. After you complete your explanation, you will be asked to answer a series of questions about bonding.”

Participants were instructed to read the instructions carefully before beginning the task. The participants completed their explanations as a whole-class activity. Participants were given unlimited time to complete their explanations. Upon completion of their explanations, participants were asked to complete the ten-question delayed post-test (comparable to but different from the first) and were given a maximum of 20 min to do so. All participants completed their explanations as well as the post-test during the 45-min class period.

The mean score on the MRT was 10.39, with a median of 11. Boys (M = 12.5, SD = 4.8) scored significantly higher than girls (M = 8.0, SD = 4.0), F(1, 125) = 24.49, p < 0.01. Participants were split into low and high spatial ability groups at the median.

The maximum score for both the immediate and delayed post-test was 10 points. A repeated measures ANOVA showed that the difference between the immediate post-test scores (M = 4.63, SD = 0.469) and delayed post-test scores (M = 7.04, SD = 0.299) was statistically significant, F(1, 125) = 18.501, p < 0.05. Without any further instruction, scores increased following the generation of a visual or verbal explanation. Both groups improved significantly: those who created visual explanations (M = 8.22, SD = 2.08), F(1, 125) = 51.24, p < 0.01, Cohen’s d = 1.27, as well as those who created verbal explanations (M = 6.31, SD = 2.73), F(1, 125) = 15.796, p < 0.05, Cohen’s d = 0.71. As seen in Fig. 5, participants who generated visual explanations scored considerably higher on the delayed post-test than participants who generated verbal explanations, F(1, 125) = 19.707, p < 0.01, Cohen’s d = 0.88. In addition, high spatial participants (M = 8.24, SD = 2.73) scored significantly higher than low spatial participants (M = 6.36, SD = 2.07), F(1, 125) = 19.94, p < 0.01, Cohen’s d = 0.87. The interaction between explanation type and spatial ability was not significant.

Fig. 5. Scores on the post-tests by explanation type and spatial ability
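For reference, Cohen’s d can be computed directly from group means and standard deviations. The sketch below uses the equal-n pooled-SD form, which is an assumption on our part; applied to the delayed post-test group statistics above, it gives a value close to, but not exactly, the reported d = 0.88, since the original analysis may have pooled differently.

```python
# Cohen's d from summary statistics; the equal-group-size pooled SD is an
# illustrative assumption, so the result need not match the reported value.
import math

def cohens_d(m1, sd1, m2, sd2):
    pooled_sd = math.sqrt((sd1**2 + sd2**2) / 2)
    return (m1 - m2) / pooled_sd

# Visual vs. verbal delayed post-test means/SDs from the text above
print(round(cohens_d(8.22, 2.08, 6.31, 2.73), 2))  # -> 0.79
```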

Explanations were coded for structural and functional content, arrows, specific examples, and multiple representations. A subset of the explanations (20%) was coded by both the first author and a middle school science teacher with expertise in chemistry, using the same coding system as a guide. The percentage of agreement between scorers was above 90% for all measures. The first author then scored the remainder of the explanations. As evident from Fig. 4, the visual explanations were individual inventions; they neither resembled each other nor those used in teaching. Most contained language, especially labels and symbolic notation such as NaCl.

Structure, function, and modality

Visual and verbal explanations were coded for depicting or describing structural and functional components. The structural components included: the correct number of valence electrons, the correct charges of atoms, bonds between non-metals for covalent molecules and between a metal and non-metal for ionic molecules, the crystalline structure of ionic compounds, and that covalent bonding produces individual molecules. The functional components included: transfer of electrons in ionic bonds, sharing of electrons in covalent bonds, attraction between ions of opposite charge, bonding resulting in atoms with neutral charge and stable electron shell configurations, and the outcome of bonding being molecules with an overall neutral charge. The presence of each component was awarded 1 point; the maximum possible score was 5 points for structural and 5 for functional information. The modality, visual or verbal, of each component was also coded; if the information was given in both formats, both were coded.

As displayed in Fig.  6 , visual explanations contained a significantly greater number of structural components (M = 2.81, SD = 1.56) than verbal explanations (M = 1.30, SD = 1.54), F(1, 125) = 13.69, p  < 0.05. There were no differences between verbal and visual explanations in the number of functional components. Structural information was more likely to be depicted (M = 3.38, SD = 1.49) than described (M = 0.429, SD = 1.03), F(1, 62) = 21.49, p  < 0.05, but functional information was equally likely to be depicted (M = 1.86, SD = 1.10) or described (M = 1.71, SD = 1.87).

Fig. 6. Number of structural and functional components in visual and verbal explanations

Functional information expressed verbally in the visual explanations significantly predicted scores on the post-test, F(1, 62) = 21.603, p < 0.01, while functional information in verbal explanations did not. The inclusion of structural information did not significantly predict test scores. As seen in Fig. 7, explanations created by high spatial participants contained significantly more functional components, F(1, 125) = 7.13, p < 0.05, but there were no ability differences in the amount of structural information in either visual or verbal explanations.

Fig. 7. Average number of structural and functional components created by low and high spatial ability learners

Ninety-two percent of visual explanations contained arrows. Arrows were used to indicate motion as well as to label. The use of arrows was positively correlated with scores on the post-test, r = 0.293, p  < 0.05. There were no significant differences in the use of arrows between low and high spatial participants.

Specific examples

Explanations were coded for the use of specific examples, such as NaCl to illustrate ionic bonding and CO2 and O2 to illustrate covalent bonding. High spatial participants (M = 1.6, SD = 0.69) used specific examples in their verbal and visual explanations more often than low spatial participants (M = 1.07, SD = 0.79), a marginally significant effect, F(1, 125) = 3.65, p = 0.06. Visual and verbal explanations did not differ in the presence of specific examples. The inclusion of a specific example was positively correlated with delayed test scores, r = 0.555, p < 0.05.

Use of multiple representations

Many of the explanations (65%) contained multiple representations of bonding. For example, ionic bonding and its properties can be represented at the level of individual atoms or at the level of many atoms bonded together in a crystalline compound. The representations coded were: symbolic (e.g. NaCl), atomic (showing the structure of atoms), and macroscopic (visible). Participants who created visual explanations generated significantly more representations (M = 1.79, SD = 1.20) than those who created verbal explanations (M = 1.33, SD = 0.48), F(1, 125) = 6.03, p < 0.05. However, the use of multiple representations did not significantly correlate with delayed post-test scores.

Metaphoric explanations

Although there were too few examples to be included in the statistical analyses, some participants in the visual group created explanations that used metaphors and/or analogies to illustrate the differences between the types of bonding. Figure  4 shows examples of metaphoric explanations. In one example, two stick figures are used to show “transfer” and “sharing” of an object between people. In another, two sharks are used to represent sodium and chlorine, and the transfer of fish instead of electrons.

In the second experiment, students were introduced to chemical bonding, a more abstract and complex set of phenomena than the bicycle pump used in the first experiment. Students were tested immediately after instruction. The following day, half the students created visual explanations and half created verbal explanations. Following creation of the explanations, students were tested again, with different questions. Performance was considerably higher as a consequence of creating either explanation despite the absence of new teaching. Generating an explanation in this way could be regarded as a test of learning. Seen this way, the results echo and amplify previous research showing the advantages of testing over study (e.g. Roediger et al., 2011 ; Roediger & Karpicke, 2006 ; Wheeler & Roediger, 1992 ). Specifically, creating an explanation requires selecting the crucial information, integrating it temporally and causally, and expressing it clearly, processes that seem to augment learning and understanding without additional teaching. Importantly, creating a visual explanation gave an extra boost to learning outcomes over and above the gains provided by creating a verbal explanation. This is most likely due to the directness of mapping complex systems to a visual-spatial format, a format that can also provide a natural check for completeness and coherence as well as a platform for inference. In the case of this more abstract and complex material, generating a visual explanation benefited both low spatial and high spatial participants even if it did not bring low spatial participants up to the level of high spatial participants as for the bicycle pump.

Participants high in spatial ability not only scored better, they also generated better explanations, including more of the information that predicted learning: more functional information and more specific examples. Their visual explanations in particular contained more functional information.

As in Experiment 1, qualities of the explanations predicted learning outcomes. Including more arrows, typically used to indicate function, predicted delayed test scores as did articulating more functional information in words in visual explanations. Including more specific examples in both types of explanation also improved learning outcomes. These are all indications of deeper understanding of the processes, primarily expressed in the visual explanations. As before, these findings provide ways that educators can guide students to craft better visual explanations and augment learning.

General discussion

Two experiments examined how learner-generated explanations, particularly visual explanations, can be used to increase understanding in scientific domains, notably those that contain “invisible” components. It was proposed that visual explanations would be more effective than verbal explanations because they encourage completeness and coherence, are more explicit, and are typically multimodal. These two experiments differ meaningfully from previous studies in that the information selected for drawing was not taken from a written text, but from a physical object (bicycle pump) and a class lesson with multiple representations (chemical bonding).

The results show that creating an explanation of a STEM phenomenon benefits learning, even when the explanations are created after learning and in the absence of new instruction. These gains in performance in the absence of teaching bear similarities to recent research showing gains in learning from testing in the absence of new instruction (e.g. Roediger et al., 2011 ; Roediger & Karpicke, 2006 ; Wheeler & Roediger, 1992 ). Many researchers have argued that the retrieval of information required during testing strengthens or enhances the retrieval process itself. Formulating explanations may be an especially effective form of testing for post-instruction learning. Creating an explanation of a complex system requires the retrieval of critical information and then the integration of that information into a coherent and plausible account. Other factors, such as the timing of the creation of the explanations, and whether feedback is provided to students, should help clarify the benefits of generating explanations and how they may be seen as a form of testing. There may even be additional benefits to learners, including increasing their engagement and motivation in school, and increasing their communication and reasoning skills (Ainsworth, Prain, & Tytler, 2011 ). Formulating a visual explanation draws upon students’ creativity and imagination as they actively create their own product.

As in previous research, students with high spatial ability both produced better explanations and performed better on tests of learning (e.g. Uttal et al., 2013). The visual explanations of high spatial students contained more information, and more of the information that predicts learning outcomes. For the workings of a bicycle pump, creating a visual as opposed to verbal explanation had little impact on students of high spatial ability but brought students of lower spatial ability up to the level of students with high spatial abilities. For the more difficult set of concepts, chemical bonding, creating a visual explanation led to much larger gains than creating a verbal one for students both high and low in spatial ability. It is likely a mistake to assume that low and high spatial learners will remain that way; there is evidence that spatial ability develops with experience (Baenninger & Newcombe, 1989). It is possible that low spatial learners need more support in constructing explanations that require imagining the movement and manipulation of objects in space. Students learned the function of the bike pump by examining an actual pump and learned bonding through a video presentation. Future work investigating methods of presenting material to students may also help to clarify the utility of generating explanations.

Creating visual explanations had greater benefits than those accruing from creating verbal ones. Surely some of the effectiveness of visual explanations is because they represent and communicate more directly than language. Elements of a complex system can be depicted and arrayed spatially to reflect actual or metaphoric spatial configurations of the system parts. They also allow, indeed, encourage, the use of well-honed spatial inferences to substitute for and support abstract inferences (e.g. Larkin & Simon, 1987 ; Tversky, 2011 ). As noted, visual explanations provide checks for completeness and coherence, that is, verification that all the necessary elements of the system are represented and that they work together properly to produce the outcomes of the processes. Visual explanations also provide a concrete reference for making and checking inferences about the behavior, causality, and function of the system. Thus, creating a visual explanation facilitates the selection and integration of information underlying learning even more than creating a verbal explanation.

Creating visual explanations appears to be an underused method of supporting and evaluating students’ understanding of dynamic processes. Two obstacles to using visual explanations in classrooms seem to be developing guidelines for creating visual explanations and developing objective scoring systems for evaluating them. The present findings give insights into both. Creating a complete and coherent visual explanation entails selecting the essential components and linking them by behavior, process, or causality. This structure and organization is familiar from recipes or construction sets: first the ingredients or parts, then the sequence of actions. The same ingredients underlie theater and stories: the players and their actions. In fact, the creation of visual explanations can be practiced on these more familiar cases and then applied to new ones in other domains. Deconstructing and reconstructing knowledge and information in these ways has more generality than visual explanations: these techniques of analysis serve thought and provide skills and tools that underlie creative thought. Next, we have shown that objective scoring systems can be devised, beginning with separating the information into structure and function, then further decomposing the structure into the central parts or actors and the function into the qualities of the sequence of actions and their consequences. Assessing students’ prior knowledge and misconceptions can also easily be accomplished by having students create explanations at different times in a unit of study. Teachers can see how their students’ ideas change and if students can apply their understanding by analyzing visual explanations as a culminating activity.

Creating visual explanations of a range of phenomena should be an effective way to augment students’ spatial thinking skills, thereby increasing the effectiveness of these explanations as spatial ability increases. The proverbial reading, writing, and arithmetic are routinely regarded as the basic curriculum of school learning and teaching. Spatial skills are not typically taught in schools, but they should be: these skills can be learned and are essential to functioning in the contemporary and future world (see Uttal et al., 2013). In our lives, both daily and professional, we need to understand the maps, charts, diagrams, and graphs that appear in the media and public places, with our apps and appliances, in forms we complete, in equipment we operate. In particular, spatial thinking underlies the skills needed for professional and amateur understanding in STEM fields, and knowledge and understanding of STEM concepts are increasingly required even in fields not traditionally regarded as STEM, notably the largest employers: business and service.

This research has shown that creating visual explanations has clear benefits to students, both specific and potentially general. There are also benefits to teachers, specifically, revealing misunderstandings and gaps in knowledge. Visualizations could be used by teachers as a formative assessment tool to guide further instructional activities and scoring rubrics could allow for the identification of specific misconceptions. The bottom line is clear. Creating a visual explanation is an excellent way to learn and master complex systems.

Additional file

Post-tests. (DOC 44 kb)

Acknowledgments

The authors are indebted to the Varieties of Understanding Project at Fordham University and The John Templeton Foundation and to the following National Science Foundation grants for facilitating the research and/or preparing the manuscript: National Science Foundation NSF CHS-1513841, HHC 0905417, IIS-0725223, IIS-0855995, and REC 0440103. We are grateful to James E. Corter for his helpful suggestions and to Felice Frankel for her inspiration. The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the funders. Please address correspondence to Barbara Tversky at the Columbia Teachers College, 525 W. 120th St., New York, NY 10025, USA. Email: [email protected].

Authors’ contributions

This research was part of EB’s doctoral dissertation under the advisement of BT. Both authors contributed to the design, analysis, and drafting of the manuscript. Both authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

References

• Ainsworth SE, Bibby PA, Wood DJ. Examining the effects of different multiple representational systems in learning primary mathematics. Journal of the Learning Sciences. 2002;11(1):25–62. doi:10.1207/S15327809JLS1101_2.
• Ainsworth SE, Iacovides I. Learning by constructing self-explanation diagrams. Paper presented at the 11th Biennial Conference of the European Association for Research on Learning and Instruction, Nicosia, Cyprus; 2005.
• Ainsworth SE, Loizou AT. The effects of self-explaining when learning with text or diagrams. Cognitive Science. 2003;27(4):669–681. doi:10.1207/s15516709cog2704_5.
• Ainsworth S, Prain V, Tytler R. Drawing to learn in science. Science. 2011;333:1096–1097. doi:10.1126/science.1204153.
• Alesandrini KL. Pictorial-verbal and analytic-holistic learning strategies in science learning. Journal of Educational Psychology. 1981;73:358–368. doi:10.1037/0022-0663.73.3.358.
• Aleven V, Koedinger KR. An effective metacognitive strategy: learning by doing and explaining with a computer-based cognitive tutor. Cognitive Science. 2002;26:147–179.
• Baenninger M, Newcombe N. The role of experience in spatial test performance: a meta-analysis. Sex Roles. 1989;20(5–6):327–344. doi:10.1007/BF00287729.
• Bradley JD, Brand M. Stamping out misconceptions. Journal of Chemical Education. 1985;62(4):318. doi:10.1021/ed062p318.
• Chi MT. Active-constructive-interactive: a conceptual framework for differentiating learning activities. Topics in Cognitive Science. 2009;1:73–105. doi:10.1111/j.1756-8765.2008.01005.x.
• Chi MTH, DeLeeuw N, Chiu M, LaVancher C. Eliciting self-explanations improves understanding. Cognitive Science. 1994;18:439–477.
• Craik F, Lockhart R. Levels of processing: a framework for memory research. Journal of Verbal Learning and Verbal Behavior. 1972;11:671–684. doi:10.1016/S0022-5371(72)80001-X.
• Edens KM, Potter E. Using descriptive drawings as a conceptual change strategy in elementary science. School Science and Mathematics. 2003;103(3):135–144. doi:10.1111/j.1949-8594.2003.tb18230.x.
• Flick LB. The meanings of hands-on science. Journal of Science Teacher Education. 1993;4:1–8. doi:10.1007/BF02628851.
• Glenberg AM, Langston WE. Comprehension of illustrated text: pictures help to build mental models. Journal of Memory and Language. 1992;31:129–151. doi:10.1016/0749-596X(92)90008-L.
• Gobert JD, Clement JJ. Effects of student-generated diagrams versus student-generated summaries on conceptual understanding of causal and dynamic knowledge in plate tectonics. Journal of Research in Science Teaching. 1999;36:39–53. doi:10.1002/(SICI)1098-2736(199901)36:1<39::AID-TEA4>3.0.CO;2-I.
• Hall VC, Bailey J, Tillman C. Can student-generated illustrations be worth ten thousand words? Journal of Educational Psychology. 1997;89(4):677–681. doi:10.1037/0022-0663.89.4.677.
• Hausmann RGM, VanLehn K. Explaining self-explaining: a contrast between content and generation. In: Luckin R, Koedinger KR, Greer J, editors. Artificial intelligence in education: building technology rich learning contexts that work. Amsterdam: IOS Press; 2007. p. 417–424.
• Hegarty M. Mental animation: inferring motion from static displays of mechanical systems. Journal of Experimental Psychology: Learning, Memory & Cognition. 1992;18:1084–1102.
• Hegarty M, Carpenter PA, Just MA. Diagrams in the comprehension of scientific text. In: Barr R, Kamil MS, Mosenthal P, Pearson PD, editors. Handbook of reading research. New York: Longman; 1990. p. 641–669.
• Hegarty M, Just MA. Constructing mental models of machines from text and diagrams. Journal of Memory and Language. 1993;32:717–742. doi:10.1006/jmla.1993.1036.
• Hegarty M, Kriz S, Cate C. The roles of mental animations and external animations in understanding mechanical systems. Cognition & Instruction. 2003;21(4):325–360. doi:10.1207/s1532690xci2104_1.
• Heiser J, Tversky B. Diagrams and descriptions in acquiring complex systems. In: Proceedings of the Cognitive Science Society. Hillsdale: Erlbaum; 2002.
• Hmelo-Silver C, Pfeffer MG. Comparing expert and novice understanding of a complex system from the perspective of structures, behaviors, and functions. Cognitive Science. 2004;28:127–138. doi:10.1207/s15516709cog2801_7.
• Johnson CI, Mayer RE. Applying the self-explanation principle to multimedia learning in a computer-based game-like environment. Computers in Human Behavior. 2010;26:1246–1252. doi:10.1016/j.chb.2010.03.025.
• Johnstone AH. Why is science difficult to learn? Things are seldom what they seem. Journal of Computer Assisted Learning. 1991;7:75–83.
• Kessell AM, Tversky B. Visualizing space, time, and agents: production, performance, and preference. Cognitive Processing. 2011;12:43–52. doi:10.1007/s10339-010-0379-3.
• Kozhevnikov M, Hegarty M, Mayer R. Revising the visualizer–verbalizer dimension: evidence for two types of visualizers. Cognition & Instruction. 2002;20:37–77. doi:10.1207/S1532690XCI2001_3.
• Kozma R, Chin E, Russell J, Marx N. The roles of representations and tools in the chemistry laboratory and their implications for chemistry learning. Journal of the Learning Sciences. 2002;9(2):105–143. doi:10.1207/s15327809jls0902_1.
• Larkin J, Simon H. Why a diagram is (sometimes) worth ten thousand words. Cognitive Science. 1987;11:65–100. doi:10.1111/j.1551-6708.1987.tb00863.x.
• Leutner D, Leopold C, Sumfleth E. Cognitive load and science text comprehension: effects of drawing and mentally imagining text content. Computers in Human Behavior. 2009;25:284–289. doi:10.1016/j.chb.2008.12.010.
• Levine M. You-are-here maps: psychological considerations. Environment and Behavior. 1982;14:221–237. doi:10.1177/0013916584142006.
• Mayer RE. Systematic thinking fostered by illustrations in scientific text. Journal of Educational Psychology. 1989;81:240–246. doi:10.1037/0022-0663.81.2.240.
• Mayer RE, Sims VK. For whom is a picture worth a thousand words? Extensions of a dual-coding theory of multimedia learning. Journal of Educational Psychology. 1994;86(3):389–401. doi:10.1037/0022-0663.86.3.389.
• Perkins DN, Grotzer TA. Dimensions of causal understanding: the role of complex causal models in students’ understanding of science. Studies in Science Education. 2005;41:117–166. doi:10.1080/03057260508560216.
• Roediger HL, Karpicke JD. Test-enhanced learning: taking memory tests improves long-term retention. Psychological Science. 2006;17:249–255. doi:10.1111/j.1467-9280.2006.01693.x.
• Roediger HL, Putnam AL, Smith MA. Ten benefits of testing and their applications to educational practice. In: Ross BH, editor. The psychology of learning and motivation. New York: Elsevier; 2011. p. 1–36.
• Roscoe RD, Chi MTH. Understanding tutor learning: knowledge-building and knowledge-telling in peer tutors’ explanations and questions. Review of Educational Research. 2007;77:534–574. doi:10.3102/0034654307309920.
• Schwamborn A, Mayer RE, Thillmann H, Leopold C, Leutner D. Drawing as a generative activity and drawing as a prognostic activity. Journal of Educational Psychology. 2010;102:872–879. doi:10.1037/a0019640.
• Taber KS. Student understanding of ionic bonding: molecular versus electrostatic framework? School Science Review. 1997;78(285):85–95.
• Tversky B. Visualizing thought. Topics in Cognitive Science. 2011;3:499–535. doi:10.1111/j.1756-8765.2010.01113.x.
• Tversky B, Heiser J, MacKenzie R, Lozano S, Morrison JB. Enriching animations. In: Lowe R, Schnotz W, editors. Learning with animation: research implications for design. New York: Cambridge University Press; 2007. p. 263–285.
• Tversky B, Suwa M. Thinking with sketches. In: Markman AB, Wood KL, editors. Tools for innovation. Oxford: Oxford University Press; 2009. p. 75–84.
• Uttal DH, Meadow NG, Tipton E, Hand LL, Alden AR, Warren C, et al. The malleability of spatial skills: a meta-analysis of training studies. Psychological Bulletin. 2013;139:352–402. doi:10.1037/a0028446.
• Van Meter P. Drawing construction as a strategy for learning from text. Journal of Educational Psychology. 2001;93(1):129–140. doi:10.1037/0022-0663.93.1.129.
• Vandenberg SG, Kuse AR. Mental rotations: a group test of three-dimensional spatial visualization. Perceptual and Motor Skills. 1978;47:599–604. doi:10.2466/pms.1978.47.2.599.
• Voyer D, Voyer S, Bryden MP. Magnitude of sex differences in spatial abilities: a meta-analysis and consideration of critical variables. Psychological Bulletin. 1995;117:250–270. doi:10.1037/0033-2909.117.2.250.
• Wheeler MA, Roediger HL. Disparate effects of repeated testing: reconciling Ballard’s (1913) and Bartlett’s (1932) results. Psychological Science. 1992;3:240–245. doi:10.1111/j.1467-9280.1992.tb00036.x.
• Wilkin J. Learning from explanations: diagrams can “inhibit” the self-explanation effect. In: Anderson M, editor. Reasoning with diagrammatic representations II. Menlo Park: AAAI Press; 1997.
• Wittrock MC. Generative processes of comprehension. Educational Psychologist. 1990;24:345–376. doi:10.1207/s15326985ep2404_2.
• Zacks J, Tversky B. Bars and lines: a study of graphic communication. Memory and Cognition. 1999;27:1073–1079. doi:10.3758/BF03201236.


15 Effective Visual Presentation Tips To Wow Your Audience

By Krystle Wong, Sep 28, 2023


So, you’re gearing up for that big presentation and you want it to be more than just another snooze-fest with slides. You want it to be engaging, memorable and downright impressive. 

Well, you’ve come to the right place — I’ve got some slick tips on how to create a visual presentation that’ll take your presentation game up a notch. 

Packed with easily customizable presentation templates, this blog post will walk you through the secret sauce behind crafting presentations that captivate, inform and remain etched in the memory of your audience.

Click to jump ahead:

• What is a visual presentation & why is it important?
• 15 effective tips to make your visual presentations more engaging
• 6 major types of visual presentation you should know
• What are some common mistakes to avoid in visual presentations
• Visual presentation FAQs
• 5 steps to create a visual presentation with Venngage

What is a visual presentation & why is it important?

A visual presentation is a communication method that utilizes visual elements such as images, graphics, charts, slides and other visual aids to convey information, ideas or messages to an audience. 

Visual presentations aim to enhance comprehension, engagement and the overall impact of the message through the strategic use of visuals. People remember what they see, making your point last longer in their heads.

Without further ado, let’s jump right into some great visual presentation tips and examples that will keep your audience interested and get your point across.

15 effective tips to make your visual presentations more engaging

In today’s fast-paced world, where information is constantly bombarding our senses, creating engaging visual presentations has never been more crucial. To help you design a presentation that’ll leave a lasting impression, I’ve compiled the following tips to elevate your game.

1. Use the rule of thirds for layout

Ever heard of the rule of thirds? It’s a presentation layout trick that can instantly up your slide game. Imagine dividing your slide into a 3×3 grid and then placing your text and visuals at the intersection points or along the lines. This simple tweak creates a balanced and seriously pleasing layout that’ll draw everyone’s eyes.

2. Get creative with visual metaphors

Got a complex idea to explain? Skip the jargon and use visual metaphors. Throw in images that symbolize your point – for example, using a road map to show your journey towards a goal or using metaphors to represent answer choices or progress indicators in an interactive quiz or poll.

3. Visualize your data with charts and graphs

The right data visualization tools not only make content more appealing but also aid comprehension and retention. Choosing the right visual presentation for your data is all about finding a good match. 

For ordinal data, where things have a clear order, consider using ordered bar charts or dot plots. When it comes to nominal data, where categories are on an equal footing, stick with the classics like bar charts, pie charts or simple frequency tables. And for interval-ratio data, where there’s a meaningful order, go for histograms, line graphs, scatterplots or box plots to help your data shine.
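If you happen to build your charts in code, here is a quick Python sketch of that matching logic using matplotlib; the data and labels are invented for illustration, and matplotlib itself is our choice of tool, not something the article prescribes.

```python
# Minimal illustration of matching chart type to data type; all data invented.
import matplotlib.pyplot as plt
import numpy as np

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

# Nominal data (unordered categories) -> bar chart
categories = ["Email", "Social", "Ads", "Referral"]
counts = [120, 95, 60, 40]
ax1.bar(categories, counts)
ax1.set_title("Nominal: bar chart")

# Interval-ratio data (meaningful scale) -> histogram
values = np.random.default_rng(0).normal(50, 12, 500)
ax2.hist(values, bins=20)
ax2.set_title("Interval-ratio: histogram")

plt.tight_layout()
plt.show()
```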

In an increasingly visual world, effective visual communication is a valuable skill for conveying messages. Here’s a guide on how to use visual communication to engage your audience while avoiding information overload.


4. Employ the power of contrast

Want your important stuff to pop? That’s where contrast comes in. Mix things up with contrasting colors, fonts or shapes. It’s like highlighting your key points with a neon marker – an instant attention grabber.

5. Tell a visual story

Structure your slides like a storybook and create a visual narrative by arranging your slides in a way that tells a story. Each slide should flow into the next, creating a visual narrative that keeps your audience hooked till the very end.

Icons and images are essential for adding visual appeal and clarity to your presentation. Venngage provides a vast library of icons and images, allowing you to choose visuals that resonate with your audience and complement your message. 


6. Show the “before and after” magic

Want to drive home the impact of your message or solution? Whip out the “before and after” technique. Show the current state (before) and the desired state (after) in a visual way. It’s like showing a makeover transformation, but for your ideas.

7. Add fun with visual quizzes and polls

To break the monotony and see if your audience is still with you, throw in some quick quizzes or polls. It’s like a mini-game break in your presentation — your audience gets involved and it makes your presentation way more dynamic and memorable.

8. End with a powerful visual punch

Your presentation closing should be a showstopper. Think a stunning visual that wraps up your message with a bow, a killer quote that lingers in the mind or a call to action that gets hearts racing.


9. Engage with storytelling through data

Use storytelling magic to bring your data to life. Don’t just throw numbers at your audience—explain what they mean, why they matter and add a bit of human touch. Turn those stats into relatable tales and watch your audience’s eyes light up with understanding.


10. Use visuals wisely

Your visuals are the secret sauce of a great presentation. Cherry-pick high-quality images, graphics, charts and videos that not only look good but also align with your message’s vibe. Each visual should have a purpose – they’re not just there for decoration. 

11. Utilize visual hierarchy

Employ design principles like contrast, alignment and proximity to make your key info stand out. Play around with fonts, colors and placement to make sure your audience can’t miss the important stuff.

12. Engage with multimedia

Static slides are so last year. Give your presentation some sizzle by tossing in multimedia elements. Think short video clips, animations, a touch of sound or even an animated logo when it makes sense. But remember, these are sidekicks, not the main act, so use them smartly.

13. Interact with your audience

Turn your presentation into a two-way street. Start your presentation by encouraging your audience to join in with thought-provoking questions, quick polls or using interactive tools. Get them chatting and watch your presentation come alive.


When it comes to delivering a group presentation, it’s important to have everyone on the team on the same page. Venngage’s real-time collaboration tools enable you and your team to work together seamlessly, regardless of geographical locations. Collaborators can provide input, make edits and offer suggestions in real time. 

14. Incorporate stories and examples

Weave in relatable stories, personal anecdotes or real-life examples to illustrate your points. It’s like adding a dash of spice to your content – it becomes more memorable and relatable.

15. Nail that delivery

Don’t just stand there and recite facts like a robot — be a confident and engaging presenter. Lock eyes with your audience, mix up your tone and pace and use gestures to drive your points home. Practice until you’ve got your delivery down pat, so your persuasive presentation flows like a pro’s.

Venngage offers a wide selection of professionally designed presentation templates, each tailored for different purposes and styles. By choosing a template that aligns with your content and goals, you can create a visually cohesive and polished presentation that captivates your audience.

Looking for more presentation ideas? Why not try a presentation software that combines user-friendly interfaces, stunning visuals, collaboration features and innovative functionalities to take your presentations to the next level.

6 major types of visual presentation you should know

Visual presentations come in various formats, each uniquely suited to convey information and engage audiences effectively. Here are six major types of visual presentations that you should be familiar with:

1. Slideshows or PowerPoint presentations

Slideshows are one of the most common forms of visual presentations. They typically consist of a series of slides containing text, images, charts, graphs and other visual elements. Slideshows are used for various purposes, including business presentations, educational lectures and conference talks.


2. Infographics

Infographics are visual representations of information, data or knowledge. They combine text, images and graphics to convey complex concepts or data in a concise and visually appealing manner. Infographics are often used in marketing, reporting and educational materials.

Don’t worry, they are also super easy to create thanks to Venngage’s fully customizable infographics templates that are professionally designed to bring your information to life. Be sure to try it out for your next visual presentation!


3. Video presentation

Videos are your dynamic storytellers. Whether it’s pre-recorded or happening in real time, videos are the showstoppers. You can have interviews, demos, animations or even your own mini-documentary. Video presentations are highly engaging and can be shared in both in-person and virtual presentations.

4. Charts and graphs

Charts and graphs are visual representations of data that make it easier to understand and analyze numerical information. Common types include bar charts, line graphs, pie charts and scatterplots. They are commonly used in scientific research, business reports and academic presentations.

Effective data visualizations are crucial for simplifying complex information, and Venngage has got you covered. Venngage’s tools enable you to create engaging charts, graphs and infographics that enhance audience understanding and retention, leaving a lasting impression in your presentation.


5. Interactive presentations

Interactive presentations involve audience participation and engagement. These can include interactive polls, quizzes, games and multimedia elements that allow the audience to actively participate in the presentation. Interactive presentations are often used in workshops, training sessions and webinars.

Venngage’s interactive presentation tools enable you to create immersive experiences that leave a lasting impact and enhance audience retention. By incorporating features like clickable elements, quizzes and embedded multimedia, you can captivate your audience’s attention and encourage active participation.

6. Poster presentations

Poster presentations are the stars of the academic and research scene. They consist of a large poster that includes text, images and graphics to communicate research findings or project details and are usually used at conferences and exhibitions. For more poster ideas, browse through Venngage’s gallery of poster templates to inspire your next presentation.


Different visual presentations aside, different presentation methods also serve unique purposes, tailored to specific objectives and audiences. Find out which type of presentation works best for the message you are sending to better capture attention, maintain interest and leave a lasting impression.

What are some common mistakes to avoid in visual presentations

To make a good presentation, it’s crucial to be aware of common mistakes and how to avoid them. Without further ado, let’s explore some of these pitfalls, along with valuable insights on how to sidestep them.

Overloading slides with text

Text heavy slides can be like trying to swallow a whole sandwich in one bite – overwhelming and unappetizing. Instead, opt for concise sentences and bullet points to keep your slides simple. Visuals can help convey your message in a more engaging way.

Using low-quality visuals

Grainy images and pixelated charts are the equivalent of a scratchy vinyl record at a DJ party. High-resolution visuals are your ticket to professionalism. Ensure that the images, charts and graphics you use are clear, relevant and sharp.

Choosing the right visuals for presentations is important. To find great visuals for your visual presentation, browse Venngage’s extensive library of high-quality stock photos. These images can help you convey your message effectively, evoke emotions and create a visually pleasing narrative.

Ignoring design consistency

Imagine a book with every chapter in a different font and color – it’s a visual mess. Consistency in fonts, colors and formatting throughout your presentation is key to a polished and professional look.

Reading directly from slides

Reading your slides word-for-word is like inviting your audience to a one-person audiobook session. Slides should complement your speech, not replace it. Use them as visual aids, offering key points and visuals to support your narrative.

Lack of visual hierarchy

Neglecting visual hierarchy is like trying to find Waldo in a crowd of clones. Use size, color and positioning to emphasize what’s most important. Guide your audience’s attention to key points so they don’t miss the forest for the trees.

Ignoring accessibility

Accessibility isn’t an option these days; it’s a must. Forgetting alt text for images, color contrast and closed captions for videos can exclude individuals with disabilities from understanding your presentation. 

Relying too heavily on animation

While animations can add pizzazz and draw attention, overdoing it can overshadow your message. Use animations sparingly and with purpose to enhance, not detract from, your content.

Using jargon and complex language

Keep it simple. Use plain language and explain terms when needed. You want your message to resonate, not leave people scratching their heads.

Not testing interactive elements

Interactive elements can be the life of your whole presentation, but not testing them beforehand is like jumping into a pool without checking if there’s water. Ensure that all interactive features, from live polls to multimedia content, work seamlessly. A smooth experience keeps your audience engaged and avoids those awkward technical hiccups.

Presenting complex data and information in a clear and visually appealing way has never been easier with Venngage. Build professional-looking designs with our free visual chart slide templates for your next presentation.

What software or tools can I use to create visual presentations?

You can use various software and tools to create visual presentations, including Microsoft PowerPoint, Google Slides, Adobe Illustrator, Canva, Prezi and Venngage, among others.

What is the difference between a visual presentation and a written report?

The main difference between a visual presentation and a written report is the medium of communication. Visual presentations rely on visuals, such as slides, charts and images to convey information quickly, while written reports use text to provide detailed information in a linear format.

How do I effectively communicate data through visual presentations?

To effectively communicate data through visual presentations, simplify complex data into easily digestible charts and graphs, use clear labels and titles and ensure that your visuals support the key messages you want to convey.

Are there any accessibility considerations for visual presentations?

Accessibility considerations for visual presentations include providing alt text for images, ensuring good color contrast, using readable fonts and providing transcripts or captions for multimedia content to make the presentation inclusive.

Most design tools today make accessibility hard, but Venngage’s Accessibility Design Tool comes with accessibility features baked in, including accessibility-friendly and inclusive icons.

How do I choose the right visuals for my presentation?

Choose visuals that align with your content and message. Use charts for data, images for illustrating concepts, icons for emphasis and color to evoke emotions or convey themes.

What is the role of storytelling in visual presentations?

Storytelling plays a crucial role in visual presentations by providing a narrative structure that engages the audience, helps them relate to the content and makes the information more memorable.

How can I adapt my visual presentations for online or virtual audiences?

To adapt visual presentations for online or virtual audiences, focus on concise content, use engaging visuals, ensure clear audio, encourage audience interaction through chat or polls and rehearse for a smooth online delivery.

What is the role of data visualization in visual presentations?

Data visualization in visual presentations simplifies complex data by using charts, graphs and diagrams, making it easier for the audience to understand and interpret information.

How do I choose the right color scheme and fonts for my visual presentation?

Choose a color scheme that aligns with your content and brand and select fonts that are readable and appropriate for the message you want to convey.

How can I measure the effectiveness of my visual presentation?

Measure the effectiveness of your visual presentation by collecting feedback from the audience, tracking engagement metrics (e.g., click-through rates for online presentations) and evaluating whether the presentation achieved its intended objectives.

Ultimately, creating a memorable visual presentation isn’t just about throwing together pretty slides. It’s about mastering the art of making your message stick, captivating your audience and leaving a mark.

Lucky for you, Venngage simplifies the process of creating great presentations, empowering you to concentrate on delivering a compelling message. Follow the 5 simple steps below to make your entire presentation visually appealing and impactful:

1. Sign up and log in: Log in to your Venngage account or sign up for free to gain access to Venngage’s templates and design tools.

2. Choose a template: Browse through Venngage’s presentation template library and select one that best suits your presentation’s purpose and style. Venngage offers a variety of pre-designed templates for different types of visual presentations, including infographics, reports, posters and more.

3. Edit and customize your template: Replace the placeholder text, images and graphics with your own content and customize the colors, fonts and visual elements to align with your presentation’s theme or your organization’s branding.

4. Add visual elements: Venngage offers a wide range of visual elements, such as icons, illustrations, charts, graphs and images, that you can easily add to your presentation with the user-friendly drag-and-drop editor.

5. Save and export your presentation: Export your presentation in a format that suits your needs and then share it with your audience via email, social media or by embedding it on your website or blog.

So, as you gear up for your next presentation, whether it’s for business, education or pure creative expression, don’t forget to keep these visual presentation ideas in your back pocket.

Feel free to experiment and fine-tune your approach and let your passion and expertise shine through in your presentation. With practice, you’ll not only build presentations but also leave a lasting impact on your audience – one slide at a time.


  • Open access
  • Published: 06 April 2021

Limits to visual representational correspondence between convolutional neural networks and the human brain

  • Yaoda Xu   ORCID: orcid.org/0000-0002-8697-314X 1 &
  • Maryam Vaziri-Pashkam 2  

Nature Communications volume  12 , Article number:  2065 ( 2021 ) Cite this article

12k Accesses

42 Citations

14 Altmetric

Metrics details

  • Neural decoding
  • Object vision

A Publisher Correction to this article was published on 06 May 2021

This article has been updated

Convolutional neural networks (CNNs) are increasingly used to model human vision due to their high object categorization capabilities and general correspondence with human brain responses. Here we evaluate the performance of 14 different CNNs compared with human fMRI responses to natural and artificial images using representational similarity analysis. Despite the presence of some CNN-brain correspondence and CNNs’ impressive ability to fully capture lower level visual representation of real-world objects, we show that CNNs do not fully capture higher level visual representations of real-world objects, nor those of artificial objects, either at lower or higher levels of visual representations. The latter is particularly critical, as the processing of both real-world and artificial visual stimuli engages the same neural circuits. We report similar results regardless of differences in CNN architecture, training, or the presence of recurrent processing. This indicates some fundamental differences exist in how the brain and CNNs represent visual information.


Introduction

Recent hierarchical convolutional neural networks (CNNs) have achieved human-like object categorization performance 1 , 2 , 3 , 4 . It has additionally been shown that representations formed in lower and higher layers of the network track those of the human lower and higher visual processing regions, respectively 5 , 6 , 7 , 8 . Similar results have also been obtained in monkey neurophysiological studies 9 , 10 . CNNs incorporate the known architectures of the primate lower visual processing regions and then repeat this design motif multiple times. Although the detailed neural mechanisms governing high-level primate vision remain largely unknown, the brain–CNN correspondence has generated the excitement that perhaps the algorithms governing high-level vision would automatically emerge in CNNs to provide us with a shortcut to fully understand and model primate vision. Consequently, CNNs have been regarded by some as the current best models of primate vision (e.g., 11 , 12 ). So much so that it has recently become common practice in human functional magnetic resonance imaging (fMRI) studies to compare fMRI measures to CNN outputs (e.g., 13 , 14 , 15 ).

Here, we reevaluate the key fMRI finding showing that representations formed in lower and higher layers of the CNN could track those of the human lower and higher visual processing regions, respectively. Our goal here is neither to deny that CNNs can capture some aspects of brain responses better than previous models nor to enter a “glass half empty” vs. “glass half full” subjective debate. Rather, we aim to evaluate CNN modeling as a viable scientific method to understand primate vision and to ask whether there are fundamental differences in visual processing between the brain and CNNs that would limit CNN modeling as a shortcut for understanding primate vision.

Two approaches have been previously used for establishing a close brain and CNN representation correspondence 5 , 6 , 7 , 8 . One approach has used linear transformation to link individual fMRI voxels to the units of CNN layers through training and cross-validation 6 , 7 . While this is a valid approach, it is computationally costly and requires large amounts of training data to map a large number of fMRI voxels to an even larger number of CNN units. The other approach has bypassed this direct voxel-to-unit mapping, and instead, has examined the correspondence in visual representational structures between the human brain and CNNs using representational similarity analysis (RSA 16 ). With this approach, both Khaligh-Razavi and Kriegeskorte 8 and Cichy et al. 5 reported a close correspondence in the representational structure of lower and higher human visual areas to lower and higher CNN layers, respectively. Khaligh-Razavi and Kriegeskorte 8 additionally showed that such correlations exceeded the noise ceiling for both brain regions, indicating that the representations formed in a CNN could fully capture those of human visual areas (but see ref. 17 ).
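To make the first, voxel-to-unit mapping approach concrete, here is a minimal sketch assuming ridge regression and a simple train/test split; the regression method, regularization strength, and data shapes are illustrative assumptions, not details taken from the studies cited above:

```python
# Voxel-wise linear mapping sketch: predict each fMRI voxel's response to an
# image from the activations of one CNN layer (toy data throughout).
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.standard_normal((120, 4096))   # 120 images x 4096 CNN units (toy data)
Y = rng.standard_normal((120, 500))    # 120 images x 500 voxels (toy data)

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.2, random_state=0)
model = Ridge(alpha=1.0).fit(X_tr, Y_tr)   # one linear map per voxel
pred = model.predict(X_te)

# Score each voxel by the correlation between predicted and observed responses.
r = [np.corrcoef(pred[:, v], Y_te[:, v])[0, 1] for v in range(Y.shape[1])]
print(f"median prediction r across voxels: {np.median(r):.3f}")
```

With real data, the learned weights are cross-validated and the mapping quality is assessed per voxel; the sketch mainly conveys why this approach needs far more training data than RSA, since a separate weight vector must be fit for every voxel.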

These human findings are somewhat at odds with results from neurophysiological studies showing that the current best CNNs can only capture about 50–60% of the explainable variance of macaque V4 and IT 9 , 10 , 18 , 19 . The studies of Khaligh-Razavi and Kriegeskorte 8 and Cichy et al. 5 were also underpowered by a number of factors, raising concerns regarding the robustness of their findings. Most importantly, none of the above fMRI studies tested altered real-world object images (such as images that have been filtered to contain only the high or low spatial frequency components). As human participants have no trouble recognizing such filtered real-world object images, it is critical to know if a brain–CNN correspondence exists for these filtered real-world object images. Decades of vision research have successfully utilized simple and artificial visual stimuli to uncover the complexity of visual processing in the primate brain, showing that the same algorithms used in the processing of natural images would manifest themselves in the processing of artificial visual stimuli. If CNNs are to be used as working models of the primate visual brain, it is equally critical to test whether a close brain–CNN correspondence exists for the processing of artificial objects.

Here, we compared human fMRI responses from three experiments with those from 14 different CNNs (including both shallow and very deep CNNs and a recurrent CNN) 20 . In particular, following Khaligh-Razavi and Kriegeskorte 8 and Cichy et al. 5 and using the lower bound of the noise ceiling from the human brain data as our threshold, we examined how well visual representational structures in the human brain may be captured by CNNs, with “fully capture” meaning that the brain–CNN correlation would be as good as the brain–brain correlation between the human participants, which in turn would indicate that the CNN is able to fully account for the total amount of explainable brain variance. We found that while a number of CNNs were successful at fully capturing the visual representational structures of lower-level human visual areas during the processing of both the original and filtered real-world object images, none could do so for these object images at higher-level visual areas. In addition, none of the CNNs tested could fully capture the visual representations of artificial objects in lower-level human visual areas, with all but one also failing to do so for these objects in higher-level human visual areas. Some fundamental differences thus exist between the human brain and CNNs and preclude CNNs from fully modeling the human visual system at their current states.

In this study, we reexamined previous findings that showed close brain–CNN correspondence in visual processing 5 , 6 , 7 , 8 . We noticed the two studies that used the RSA approach were underpowered in two aspects. First, both Khaligh-Razavi and Kriegeskorte 8 and Cichy et al. 5 used an event-related fMRI design, known to produce a low signal-to-noise ratio (SNR). This can be seen in the low brain–CNN correlation values reported, with the highest correlation being less than 0.2 in both studies. While Cichy et al. 5 did not calculate the noise ceiling, thus making it difficult to assess how good the correlations were, the lower bounds of the noise ceiling were around 0.15–0.2 in Khaligh-Razavi and Kriegeskorte 8 , which is fairly low. Second, both studies defined human brain regions anatomically rather than functionally in each individual participant. This could affect the reliability of fMRI responses, potentially contributing to the low noise ceiling and low correlation obtained. Here, we took advantage of existing data sets from three fMRI experiments that overcome these drawbacks and compared visual processing in the human brain with those of 14 different CNNs. These data sets were collected while human participants viewed both unfiltered and filtered real-world object images and artificial object images. This allowed us to test not only the robustness of brain–CNN correlation, but also its generalization across different image sets. Because the RSA approach allows easy comparisons of multiple fMRI data sets with multiple CNNs, and because a noise ceiling can be easily derived to quantify the degree of the brain–CNN correspondence, we used this approach in the present study.

Our fMRI data were collected with a block design in which responses were averaged over a whole block of multiple exemplars to increase SNR. In three fMRI experiments, human participants viewed blocks of sequentially presented cut-out images on a gray background at fixation and pressed a response button whenever the same image repeated back to back (Fig.  1a ). Each image block contained different exemplars from the same object category, with the exemplars varied in identity, viewpoint/orientation, and pose (for the animal categories) to minimize the low-level similarities among them (see Supplementary Figs.  1 and 2 for the full set of images used). A total of eight real-world natural and manmade object categories were used, including bodies, cars, cats, chairs, elephants, faces, houses, and scissors 21 , 22 . In Experiment 1, both the original images and the controlled version of the same images were shown (Fig.  1b ). Controlled images were generated using the SHINE technique to achieve spectrum, histogram, and intensity normalization and equalization across images from the different categories 23 . In Experiment 2, the original, high and low SF contents of an image from six of the eight real-world object categories were shown (Fig.  1b ). In Experiment 3, both the images from the eight real-world image categories and images from nine artificial object categories 24 were shown (Fig.  1b ).

Figure 1

A An illustration of the block design paradigm used. Participants performed a one-back repetition detection task on the images. An actual block in the experiment contained ten images with two repetitions per block. See “Methods” for more details. B The stimuli used in the three fMRI experiments. Experiment 1 included the original and the controlled images from eight real-world object categories. Experiment 2 included the images from six of the eight real-world object categories shown in the original, high SF, and low SF format. Experiment 3 included images from the same eight real-world object categories and images from nine artificial object categories. Each category contained ten different exemplars varying in identity, viewpoint/orientation, and pose (for the animal categories) to minimize the low-level image similarities among them. See Supplementary Figs.  1 and 2 for the full set of images used. C The human visual regions examined. They included topographically defined early visual areas V1–V4 and functionally defined higher object processing regions LOT and VOT. D The representational similarity analysis used to compare the representational structures between the brain and CNNs. In this approach, a representational dissimilarity matrix was first formed by computing all pairwise Euclidean distances of fMRI response patterns or CNN layer output for all the object categories. The off-diagonal elements of this matrix were then used to form a representational dissimilarity vector. These dissimilarity vectors were correlated between each brain region and each sampled CNN layer to assess the similarity between the two. C is reproduced from Xu and Vaziri-Pashkam 61 with permission.

For a given brain region, we averaged fMRI responses from a block of trials containing exemplars of the same category and extracted the beta weights (from a general linear model) for the entire block from each voxel. The responses from all the voxels in a given region were then taken as the fMRI response pattern for that object category in that brain region. Following this, fMRI response patterns were extracted for each category from six independently defined visual regions along the human occipito-temporal cortex (OTC). They included lower visual areas V1 to V4 and higher visual object processing regions in lateral occipito-temporal (LOT) and ventral occipito-temporal (VOT) cortex (Fig.  1c ). LOT and VOT have been considered as the homolog of the macaque inferotemporal (IT) cortex involved in visual object processing 25 . Their responses have been shown to correlate with successful visual object detection and identification 26 , 27 , and their lesions have been linked to visual object agnosia 28 , 29 .

The 14 CNNs we examined here included both shallower networks, such as Alexnet, VGG16, and VGG19, and deeper networks, such as Googlenet, Inception-v3, Resnet-50, and Resnet-101 (Supplementary Table  1 ). We also included a recurrent network, Cornet-S, that has been shown to capture the recurrent processing in macaque IT cortex with a shallower structure 12 , 19 . This CNN is argued to be the current best model of the primate ventral visual regions 19 . All CNNs were pretrained with ImageNet images 30 . To understand how the specific training images would impact CNN representations, we also examined Resnet-50 trained with stylized ImageNet images 31 . Following a previous study (O’Connor et al., 2018 32 ), we sampled from 6 to 11 mostly pooling layers of each CNN (see Supplementary Table  1 for the CNN layers sampled). Pooling layers were selected because they typically mark the end of processing for a block of layers when information is pooled to be passed on to the next block of layers. We extracted the response from each sampled CNN layer for each exemplar of a category and then averaged the responses from the different exemplars to generate a category response, similar to how an fMRI category response was extracted. Following Khaligh-Razavi and Kriegeskorte 8 and Cichy et al. 5 , using RSA, we compared the representational structures of real-world and artificial object categories between the different CNN layers and different human visual regions.
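As an illustration of this layer-sampling procedure (not the authors’ code), the sketch below extracts one pooling layer’s activations for a set of exemplars via a forward hook in PyTorch and averages them into a single category response; the choice of AlexNet and of `features[2]` as its first max-pooling layer is an assumption based on the torchvision implementation:

```python
import torch
import torchvision.models as models

# Pretrained AlexNet from torchvision; the weights argument assumes a
# reasonably recent torchvision version.
cnn = models.alexnet(weights="IMAGENET1K_V1").eval()

activations = {}
def hook(module, args, output):
    # Flatten each exemplar's feature map into a unit vector: (batch, units).
    activations["pool1"] = output.flatten(1)

# Assumption: features[2] is the first max-pooling layer in torchvision's AlexNet.
cnn.features[2].register_forward_hook(hook)

# Ten toy exemplars standing in for one object category (e.g., ten cat images).
exemplars = torch.randn(10, 3, 224, 224)
with torch.no_grad():
    cnn(exemplars)

# Average over exemplars to form the category response, mirroring how the
# fMRI category response was formed from a block of exemplars.
category_response = activations["pool1"].mean(dim=0)
print(category_response.shape)   # one response vector for this layer/category
```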

The existence of brain–CNN correspondence for representing real-world object images

In Experiments 1 and 2, we wanted to verify the previously reported brain–CNN correspondence for representing real-world object images. We also tested whether this finding could be generalized to filtered real-world images.

To compare the representational structure between the human brain and CNNs, in each brain region examined, we first calculated pairwise Euclidean distances of the z-normalized fMRI response patterns among the different object categories in each experiment, with shorter Euclidean distance indicating greater similarity between a pair of fMRI response patterns. From these pairwise Euclidean distances, we constructed a category representational dissimilarity matrix (RDM, see Fig.  1d ) for each of the six brain regions examined. Likewise, from the z-normalized category responses of each sampled CNN layer, we calculated pairwise Euclidean distances among the different categories to form a CNN category RDM for that layer. We then correlated category RDMs between brain regions and CNN layers using Spearman rank correlation following Nili et al. 33 and Cichy et al. 5 (Fig.  1d ). A Spearman rank correlation compares the representational geometry between the brain and a CNN without requiring the two to have a strictly linear relationship. All our results remained the same when Pearson correlation was applied and when correlation measures, instead of Euclidean distance measures, were used to construct the category RDMs (see Supplementary Figs.  3 , 6 , 7 , and 16 ).
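The following is a compact sketch of this RSA pipeline, assuming toy data shapes (eight categories, with arbitrary voxel and unit counts): each category’s pattern is z-normalized, the Euclidean-distance RDM is built, and the off-diagonal vectors of a brain RDM and a CNN-layer RDM are Spearman-correlated.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr, zscore

def rdm(patterns):
    """patterns: (n_categories, n_features) -> condensed dissimilarity vector."""
    z = zscore(patterns, axis=1)         # z-normalize each category's pattern
    return pdist(z, metric="euclidean")  # all pairwise distances (off-diagonal)

rng = np.random.default_rng(1)
brain = rng.standard_normal((8, 300))    # 8 categories x 300 voxels (toy data)
layer = rng.standard_normal((8, 1000))   # 8 categories x 1000 CNN units (toy data)

# Spearman rank correlation compares the two representational geometries
# without assuming a strictly linear relationship between them.
rho, _ = spearmanr(rdm(brain), rdm(layer))
print(f"brain-CNN RDM correlation (Spearman): {rho:.3f}")
```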

Previous studies have reported a correspondence of lower and higher CNN layers to lower and higher visual processing regions, respectively 5 , 8 . To evaluate the presence of such correspondence in our data, for each CNN, we identified the layer that showed the best RDM correlation with each of the six included brain regions in each participant. We then assessed whether the resulting layer numbers increased from low-to-high visual regions using Spearman rank correlation. If a close brain–CNN correspondence in representation exists, then the Fisher-transformed correlation coefficient of this Spearman rank correlation should be significantly above zero at the group level (one-tailed t tests were used, as only values above zero are meaningful; all stats reported were corrected for multiple comparisons for the number of comparisons included in each experiment using the Benjamini–Hochberg procedure at false discovery rate q  = 0.05, see ref. 34 ).
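A small sketch of this correspondence test, with made-up best-layer numbers for three hypothetical participants (the real analysis used each participant’s actual best-correlating layers and further applied the Benjamini–Hochberg correction across image conditions):

```python
import numpy as np
from scipy.stats import spearmanr, ttest_1samp

# Toy input: best_layer[p, r] = best-correlating CNN layer for participant p,
# region r, with regions ordered V1, V2, V3, V4, LOT, VOT.
best_layer = np.array([[1, 2, 2, 3, 5, 6],
                       [1, 1, 3, 4, 5, 5],
                       [2, 2, 3, 3, 6, 6]])   # 3 participants x 6 regions
region_order = np.arange(6)

z = []
for layers in best_layer:
    rho, _ = spearmanr(region_order, layers)  # does best layer rise with region?
    z.append(np.arctanh(rho))                 # Fisher transformation
t, p_two = ttest_1samp(z, 0.0)
p_one = p_two / 2 if t > 0 else 1 - p_two / 2 # one-tailed: only > 0 is meaningful
print(f"t = {t:.2f}, one-tailed p = {p_one:.4f}")
```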

In Experiment 1, we contrasted original real-world object images with the controlled version of these images. Figure  2a shows the average CNN layer that best correlated with each brain region for each CNN during the processing of these images (the exact significance levels of the brain–CNN correspondence are marked with asterisks at the top of each plot). Here, 10 out of the 14 CNNs examined showed a significant brain–CNN correspondence for the original images. The same correspondence was also seen for the controlled images, with 11 out of the 14 CNNs showing a significant brain–CNN correspondence.

Figure 2

A The results from Experiment 1, in which original and controlled images from real-world object categories were shown. N  = 6 human participants. B The results from Experiment 2, in which full, high and low SF components of the images from real-world object categories were shown. N  = 10 human participants. C The results from Experiment 3, in which unaltered images from both real-world (natural) and artificial object categories were shown. N  = 6 human participants. Plotted here are the averaged CNN layer numbers across the human participants that showed the greatest RDM correlation for each brain region in each experimental condition, with the error bars indicating the standard errors of the mean across participants. To evaluate brain–CNN correspondence, in each human participant, the CNN layer that showed the highest RDM correlation with each of the six brain regions was identified. A Spearman rank correlation was carried out for each participant to assess whether the resulting layer numbers increased from low to high human visual regions. The resulting correlation coefficients (Fisher-transformed) were tested for being greater than zero at the participant group level using one-tailed t tests. The asterisks at the top of each plot mark the significance level of these statistical tests, with a significant result indicating that the RDMs from lower CNN layers better correlated with those of lower than higher visual regions and the reverse is true for higher CNN layers. All t tests were corrected for multiple comparisons for the number of image conditions included in each experiment using the Benjamini–Hochberg procedure. † p  < 0.1, * p  < 0.05, ** p  < 0.01, *** p  < 0.001. Source data are provided as a Source Data file.

In Experiment 2, we contrasted original real-world images with the high and low SF component versions of these images (Fig.  2b ). For the original images, we replicated the findings from Experiment 1, with 13 out of the 14 CNNs showing a significant brain–CNN correspondence. The same correspondence was also present in 13 CNNs for the high SF images and in 8 CNNs for the low SF images. In fact, Alexnet, Cornet-S, Googlenet, Inception-v3, Mobilenet-v2, Resnet-18, Resnet-50, Squeezenet, and VGG16 showed a significant brain–CNN correspondence for all five image sets across the two experiments. These results remained the same when correlations, instead of Euclidean distance measures, were used to construct the category RDMs, and when Pearson, instead of Spearman, correlation was applied to compare CNN and brain RDMs (Supplementary Fig.  3 ).

These results replicate previous findings using the RSA approach 5 , 8 and show that there indeed existed a brain–CNN correspondence, with representations in lower and higher visual areas better resembling those of lower and higher CNN layers, respectively. Importantly, such a brain–CNN correspondence is generalizable to filtered real-world object images.

Quantifying the amount of brain–CNN correspondence for representing real-world object images

A linear correspondence between CNN and brain representations, however, only tells us that lower CNN layers are relatively more similar to lower than higher visual areas and that the reverse is true for higher CNN layers. It does not tell us about the amount of similarity. To assess this, we evaluated how successfully the category RDM from a CNN layer could capture the RDM from a brain region. To do so, we first obtained the reliability of the category RDM in a brain region across human participants by calculating the lower and upper bounds of the fMRI noise ceiling 33 . Overall, the lower bounds of fMRI noise ceilings for the different brain regions were much higher in our two experiments than those of Khaligh-Razavi and Kriegeskorte 8 (Supplementary Figs.  4A and 5A). These results indicate that the object category representational structures in our data are fairly similar and consistent across participants.

If the category RDM from a CNN layer successfully captures that from a brain region, then the correlation between the two should exceed the lower bound of the fMRI noise ceiling. This can be re-represented as the proportion of explainable brain RDM variance captured by the CNN (by dividing the brain–CNN RDM correlation by the lower bound of the corresponding noise ceiling and then taking the square of the resulting ratio; all correlation results are reported in Supplementary Figs.  4 – 7 ). For the original real-world object images in Experiment 1, the brain RDM variance from lower visual areas was fully captured by three CNNs (Fig.  3a ), including Alexnet, Googlenet, and Vgg16 (with no difference between 1 and the highest proportion of variance explained by a CNN layer for V1–V3, one-tailed t tests, ps  > 0.1; see the asterisks marking the exact significance levels at the top of each plot; one-tailed t tests were used here as only testing the values below 1 was meaningful; all p values reported were corrected for multiple comparisons for the six brain regions included using the Benjamini–Hochberg procedure at false discovery rate q  = 0.05). However, no CNN layer was able to fully capture the RDM variance from visual areas LOT and VOT (with significant differences between 1 and the highest proportion of variance explained by a CNN layer for LOT and VOT, ps  < 0.05, one-tailed and corrected). The same pattern of results was observed when the controlled images were used in Experiment 1 (Fig.  3b ): several CNNs were able to fully capture the RDM variance of lower visual areas but none was able to do so for higher visual areas. We obtained similar results for the original, high SF, and low SF images in Experiment 2 (Fig.  4a–c ). Here again, a number of CNNs fully captured the RDM variance of lower visual areas, but none could do so for higher visual areas. All these results remained the same when correlations, instead of Euclidean distance measures, were used to construct the category RDMs, and Pearson, instead of Spearman, correlations were applied to compare CNN and brain RDMs (see the correlation results in Supplementary Figs.  4 – 7 ; note that although using Euclidean distance measures after pattern z-normalization to construct the RDMs produced highly similar results as those from correlation measures, they were not identical).
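To make this quantification concrete, here is a sketch under simple assumptions (toy participant RDMs; the leave-one-out lower-bound scheme follows the general logic of Nili et al. 33 , though the implementation details here are illustrative): the brain–CNN RDM correlation is divided by the lower bound of the noise ceiling, and the ratio is squared to give the proportion of explainable RDM variance captured.

```python
import numpy as np
from scipy.stats import spearmanr

def lower_noise_ceiling(subject_rdms):
    """subject_rdms: (n_subjects, n_pairs) off-diagonal RDM vectors."""
    rs = []
    for s in range(len(subject_rdms)):
        # Correlate each subject's RDM with the mean RDM of the others.
        others = np.delete(subject_rdms, s, axis=0).mean(axis=0)
        rs.append(spearmanr(subject_rdms[s], others)[0])
    return np.mean(rs)

rng = np.random.default_rng(2)
base = rng.standard_normal(28)                    # 8 categories -> 28 pairs
rdms = base + 0.3 * rng.standard_normal((6, 28))  # 6 toy participant RDMs

ceiling = lower_noise_ceiling(rdms)
brain_cnn_r = 0.55                                # example brain-CNN correlation
prop_var = (brain_cnn_r / ceiling) ** 2           # proportion of explainable variance
print(f"lower noise ceiling: {ceiling:.3f}; variance captured: {prop_var:.2f}")
```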

Figure 3

A Results for the Original images. B Results for the Controlled images. N  = 6 human participants. The asterisks at the top of each plot mark the significance level of the difference between 1 and the highest proportion of variance explained by a CNN for each brain region; one-tailed t -tests were used as only values below 1 were meaningful here; all p values reported were corrected for multiple comparisons for the six brain regions included using the Benjamini–Hochberg procedure. Error bars indicate standard errors of the means. † p  < 0.1, * p  < 0.05, ** p  < 0.01, *** p  < 0.001. Source data are provided as a Source Data file.

Figure 4

A Results for Full SF images. B Results for High SF images. C Results for Low SF images. N  = 10 human participants. The asterisks at the top of each plot mark the significance level of the difference between 1 and the highest proportion of variance explained by a CNN for each brain region; one-tailed t tests were used and all p values reported were corrected for multiple comparisons for the six brain regions included using the Benjamini–Hochberg procedure. Error bars indicate standard errors of the means. † p  < 0.1, * p  < 0.05, ** p  < 0.01, *** p  < 0.001. Source data are provided as a Source Data file.

In our fMRI experiments, we used a randomized presentation order for each of the experimental runs with two image repetitions. When we simulated the exact fMRI design in Alexnet by generating a matching number of randomized presentation sequences with image repetitions and then averaging CNN responses for these sequences, we obtained virtually identical Alexnet results as those without this simulation (Supplementary Fig.  4D ). Thus, the disagreement between our fMRI and CNN results could not be due to a difference in stimulus presentation. The very fact that CNNs could fully capture the brain RDM variance in lower visual areas for real-world objects further supports this idea and additionally shows that the non-linearity in fMRI measures had a minimal impact on RDM extraction. The latter speaks to the robustness of the RSA approach as extensively reviewed elsewhere 16 .

Together, these results showed that, although lower layers of several CNNs could fully capture the explainable brain RDM variance for lower-level visual representations of both the original and filtered real-world object images in the human brain, none could do so for higher-level neural representations of these images. In fact, the highest amount of explainable brain RDM variance that could be captured by CNNs from higher visual regions LOT and VOT was about 60%, on par with previous neurophysiological results from macaque IT cortex 9 , 10 , 18 , 19 .

To directly visualize the object representational structures in different brain regions and CNN layers, using multi-dimensional scaling (MDS, Shepard, 1980 35 ), we placed the RDMs on 2D spaces with the distances among the categories approximating their relative similarities to each other. Figure  5a, b shows the MDS plots from the two lowest and the two highest brain regions examined (i.e., V1, V2, LOT, and VOT) and from the two lowest and the two highest layers sampled from four example CNNs (i.e., Alexnet, Cornet-S, Googlenet, and Vgg-19) from Experiments 1 and 2 (see Supplementary Figs.  8–12 for the MDS plots from all brain regions and CNN layers sampled). Consistent with our quantitative analysis, for the real-world objects, there were some striking brain–CNN representational similarities at lower levels of object representation (such as in Alexnet and Googlenet). At higher levels, both the brain and CNNs showed a broad distinction between animate and inanimate objects (i.e., bodies, cats, elephants, and faces vs. cars, chairs, houses, and scissors), but they differed in how these categories were represented relative to each other. For example, within the animate objects, while faces and bodies are far apart in both VOT and LOT, they are next to each other in higher CNN layers (see the objects marked by the dotted circles in Fig.  5 ); and within the inanimate objects, while cars, chairs, houses, and scissors tend to form a square in VOT and LOT, they tend to form a line in higher CNN layers (see the objects marked by the dashed ovals in Fig.  5 ).

Figure 5

A Results for the Original real-world object images. B Results for the Controlled real-world object images. C Results for the artificial object images. Brain responses included here are those for the original real-world images from both Experiments 1 and 3, those for the controlled real-world images from Experiment 1, and those for the artificial object images from Experiment 3. The distances among the object categories in each MDS plot approximate their relative similarities to each other in the corresponding RDM. Only MDS plots from the two lowest and the two highest brain regions examined (i.e., V1, V2, LOT, and VOT) and from the two lowest and two highest layers sampled from four example CNNs (i.e., Alexnet, Cornet-S, Googlenet, and Vgg-19) are included here. See Supplementary Figs.  8 – 12 and 17 for MDS plots from all brain regions and CNN layers examined. Since rotations and flips preserve distances on these MDS plots, to make these plots more informative and to see how the representational structure evolved across brain regions and CNN layers, we manually rotated and/or flipped each MDS when necessary. For real-world objects, there were some remarkable brain–CNN similarities at lower levels of object representations (see Alexnet and Googlenet). At higher levels, although both showed a broad distinction between animate and inanimate objects (i.e., bodies, cats, elephants, and faces vs. cars, chairs, houses, and scissors), they differ in how these categories are represented relative to each other. For example, within the animate objects, while faces and bodies are far apart in both VOT and LOT, they are next to each other in higher CNN layers (see the objects marked by the dotted circles in ( A )); and within the inanimate objects, while cars, chairs, houses, and scissors tend to form a square in VOT and LOT, they tend to form a line in higher CNN layers (see the objects marked by the dashed ovals in ( A )). For the artificial object images, brain–CNN differences at the lower level are not easily interpretable. Differences at the higher level suggest that while the brain takes both local and global shape similarities into account when grouping objects, CNNs rely mainly on local shape similarities. This can be seen in the grouping of the objects at higher CNN layers and by comparing the purple and fuchsia shapes that share the same global but different local features (see the objects marked by the dotted circles in ( C )). Source data are provided as a Source Data file.
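A minimal sketch of producing such an MDS embedding with scikit-learn, assuming a precomputed RDM (the toy dissimilarities below are random placeholders, not the experimental data):

```python
import numpy as np
from sklearn.manifold import MDS

categories = ["bodies", "cars", "cats", "chairs",
              "elephants", "faces", "houses", "scissors"]

# Toy symmetric RDM with a zero diagonal, standing in for a brain or CNN RDM.
rng = np.random.default_rng(3)
d = rng.random((8, 8))
rdm = (d + d.T) / 2
np.fill_diagonal(rdm, 0)

# Embed the dissimilarities into 2D; distances between points approximate
# the pairwise dissimilarities in the RDM.
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(rdm)

for name, (x, y) in zip(categories, coords):
    print(f"{name:>9}: ({x:+.2f}, {y:+.2f})")
```

Because rotations and flips of the embedding preserve all pairwise distances, the resulting plots can be freely rotated or mirrored for display, as noted in the caption above.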

LOT and VOT included a large swath of the ventral and lateral OTC and likely overlapped to a great extent with regions selective for specific object categories, such as faces, bodies, or scenes. Because CNNs may not automatically develop category-selective units during object categorization training, it is possible that the brain–CNN RDM discrepancy we observed so far at higher levels of visual processing is solely driven by the category-selective voxels in the human brain. To investigate this possibility, using the main experimental data, we evaluated the category selectivity of each voxel in LOT and VOT (see “Methods”). We then excluded all voxels showing a significant category selectivity for faces, bodies, or scenes (i.e., houses) and repeated our analysis. In most cases, the amount of the brain RDM variance that could be captured by CNNs remained unchanged whether category-selective voxels were included or excluded (see Supplementary Figs.  13 and 14 ). Significant differences were observed in only 6% of the comparisons ( ps  < 0.05, uncorrected, see the caption of Supplementary Figs.  13 and 14 for a list of these cases). However, even in these cases, the maximum amount of LOT and VOT RDM variance captured by CNNs was still significantly less than 1 ( ps  < 0.05, corrected). Moreover, when the same unaltered images were shown across the different experiments, the improvement seen in one experiment was not replicated in another experiment (e.g., the improvement seen in Alexnet for Experiment 2 Full-SF was not replicated in Experiment 3 Natural, see Supplementary Figs.  14 and 18 ). Consistent with these results, MDS plots for LOT and VOT look quite similar whether or not category-selective voxels were included (see Supplementary Figs.  8–12 ). As such, the failure of CNNs to fully capture brain RDM at higher levels of visual processing cannot be attributed to the presence of category-selective voxels in LOT and VOT.

One could argue that CNNs generally do not encounter disembodied heads or headless bodies in their training data. They are thus unlikely to have distinctive representations for heads and bodies. Note that the human visual system generally does not see such stimuli in its training data either. The goal of the study is, therefore, not to test images that a system has been exposed to during training, but rather how it handles images that it has not. If the two systems are similar in their underlying representation, then they should still respond similarly to images that they have not been exposed to during training. If not, then it indicates that the two systems represent visual objects in different ways. We present a stronger test case in the next experiment by comparing the representations of artificial visual stimuli between the brain and CNNs.

The brain–CNN correspondence for representing artificial object images

Previous comparisons of visual processing in the brain and CNN have focused entirely on the representation of real-world objects. Decades of visual neuroscience research, however, have successfully utilized simple and artificial visual stimuli to uncover the complexity of visual processing in the primate brain (e.g., 36 , 37 , 38 , 39 ), with Tanaka and colleagues, in particular, showing that IT responses to some real-world objects are highly similar to their responses to artificial shapes 39 . The same algorithms used in the processing of natural images thus manifest themselves in the processing of artificial visual stimuli. If CNNs are to be used as working models of the primate visual brain, it would be critical to test if this principle applies to CNNs.

Testing simple and artificial visual stimuli also allows us to address a remaining concern for the results obtained so far. It could be argued that the reason CNNs performed poorly in fully tracking high-level processing of the real-world objects even when category-selective voxels were removed was due to interactions between category-selective and non-selective brain regions. With the artificial visual stimuli, however, no preexisting category information, semantic knowledge, or experience with the stimuli could affect visual processing at a higher level. This would put the brain and CNN on even grounds. If CNNs still fail to track the processing of the artificial visual stimuli at higher levels, it would indicate some fundamental differences in how the brain and CNNs process visual information, rather than the particularity of the stimuli used.

In Experiment 3, we compared the processing of both real-world objects and artificial objects between the brain and CNNs. As in Experiments 1 and 2, the processing of real-world objects showed a consistent brain–CNN correspondence in 8 out of the 14 CNNs tested (Fig.  2c ). The same correspondence was also obtained in eight CNNs when artificial objects were shown, with lower visual representations in the brain better resembling those of lower than higher CNN layers and the reverse being true for higher visual representations in the brain (Fig.  2c and Supplementary Fig.  3 ). In fact, across Experiments 1–3, Alexnet, Cornet-S, Googlenet, Resnet-18, Resnet-50, Squeezenet, and VGG16 were the seven CNNs showing a consistent brain–CNN correspondence across all our image sets, including the original and filtered real-world object images, as well as the artificial object images.

As before, for real-world objects, while some of the CNNs were able to fully capture the brain RDM variance from lower visual areas, none could do so for higher visual areas (Fig.  6a ). For artificial object images, while the majority of the CNNs still failed to fully capture the brain RDM variance of higher visual areas, surprisingly, no CNN was able to do so for lower visual areas anymore (with significant differences between 1 and the highest proportion of variance explained by a CNN layer for V1 and V2, all ps  < 0.05, one-tailed and corrected; see the asterisks marking the exact significance levels at the top of each plot for the full stats). In fact, the amount of the brain RDM variance captured in lower visual areas dropped significantly or marginally significantly between the natural and artificial objects in several CNNs (Alexnet, p  = 0.062 for V1, p  = 0.074 for V2; Googlenet, p  = 0.012 for V1, p  = 0.023 for V2; Mobilenet-v2, p  = 0.032 for V2; Squeezenet, p  = 0.022 for V1, p  = 0.0085 for V2; Vgg-16, p  = 0.003 for V1, p  = 0.0042 for V2, p  = 0.094 for V3; and Vgg-19, p  = 0.048 for V1, p  = 0.0077 for V2; all reported p values were corrected for multiple comparisons for the six brain regions examined). This rendered the few CNNs that were capable of fully capturing the brain variance from the lower visual areas during the processing of real-world objects no longer able to do so during the processing of artificial objects (Fig.  6b ; see also the correlation results in Supplementary Figs.  15 and 16 ). In other words, as a whole, CNNs performed much worse in capturing visual processing of artificial than real-world objects in the human brain, and their ability to capture lower-level visual processing of real-world objects in the brain did not generalize to the processing of artificial objects.

Figure 6

A Results for real-world object images. B Results for artificial object images. N  = 6 human participants. The asterisks at the top of each plot mark the significance level of the difference between 1 and the highest proportion of variance explained by a CNN for each brain region; one-tailed t tests were used and all p values reported were corrected for multiple comparisons for the six brain regions included using the Benjamini–Hochberg procedure. Error bars indicate standard errors of the means. † p  < 0.1, * p  < 0.05, ** p  < 0.01, *** p  < 0.001. Source data are provided as a Source Data file.

For artificial objects, RDM differences between lower brain regions and lower CNN layers were not easily interpretable from the MDS plots (Fig.  5c and Supplementary Fig.  17 ). RDM differences between higher brain regions and higher CNN layers suggest that while the brain takes both local and global shape similarities into consideration when grouping objects, CNNs rely mainly on local shape similarities (e.g., compare higher brain and CNN representations of the shapes marked by purple and fuchsia colors that share the same global but different local features; see the objects marked by the dotted circles in Fig.  5c ). This is consistent with other findings that specifically manipulated local and global shape similarities (see “Discussion”). Lastly, as in Experiments 1 and 2, removing the category-selective voxels in LOT and VOT did not improve CNN performance (see Supplementary Fig.  18 ).

Overall, taking both the linear correspondence and RDM correlation into account, none of the CNNs examined here could fully capture either lower or higher levels of neural processing of artificial objects. This is particularly critical given that a number of CNNs were able to fully capture the lower-level neural processing of real-world objects.

The effect of training a CNN on original vs. stylized ImageNet images

Although CNNs are believed to explicitly represent object shapes in the higher layers 1 , 40 , 41 , emerging evidence suggests that CNNs may largely use local texture patches to achieve successful object classification 42 , 43 or local rather than global shape contours for object recognition 44 . In a recent demonstration, CNNs were found to be poor at classifying objects defined by silhouettes and edges. In addition, when texture and shape cues were in conflict, they classified objects according to texture rather than shape cues 31 (see also ref. 44 ). However, when Resnet-50 was trained with stylized ImageNet images in which the original texture of every single image was replaced with the style of a randomly chosen painting, object classification performance significantly improved, relied more on shape than texture cues, and became more robust to noise and image distortions 31 . It thus appears that a suitable training data set may overcome the texture bias in standard CNNs and allow them to utilize more shape cues.

We tested if the category RDM in a CNN may become more brain-like when a CNN was trained with stylized ImageNet images. To do so, we compared the representations formed in Resnet-50 pretrained with ImageNet images with those from Resnet-50 pretrained with three other procedures 31 : trained only with the stylized ImageNet images, trained with both the original and the stylized ImageNet images, and trained with both sets of images and then fine-tuned with the stylized ImageNet images. Despite differences in training, the category RDM correlations between brain regions and CNN layers were remarkably similar among these Resnet-50s, and all were substantially different from those of the human visual regions (Supplementary Fig.  19 ). If anything, training with the original ImageNet images resulted in a better brain–CNN correspondence in several cases than the other training conditions. The incorporation of stylized ImageNet images in training thus did not result in more brain-like visual representations in Resnet-50.

It has become common practice in recent human fMRI research to regard CNNs as a working model of the human visual system. This is largely based on fMRI studies showing that representations formed in CNN lower and higher layers track those of the human lower and higher visual processing regions, respectively 5 , 6 , 7 , 8 . Here, we reevaluated this finding with more robust fMRI data sets from three experiments and 14 different CNNs and tested the generality of this finding to filtered real-world object images and artificial object images.

We found a significant correspondence in visual representational structure between the CNNs and the human brain across various image manipulations for both real-world and artificial object images, with representations formed in CNN lower layers more closely resembling those of lower than higher human visual areas and the reverse being true for higher CNN layers. In addition, we found that lower layers of several CNNs fully captured the representational structures of real-world objects of human lower visual areas for both the original and the filtered versions of these images. This replicated earlier results and showed that CNNs are capable of capturing some aspects of visual processing in the human brain.

Despite these successes, however, no CNN tested could fully capture the representational structure of the real-world object images in human higher visual areas. The same results were obtained regardless of whether or not category-selective voxels were included in human higher visual areas. Overall, the highest amount of explainable brain RDM variance that could be captured by CNNs from higher visual regions was about 60%. This is in agreement with previous neurophysiological studies on macaque IT cortex 9 , 10 , 18 , 19 . When artificial object images were used, not only did most of the CNNs still fail to capture visual processing in higher human visual areas but also none could do so for lower human visual areas. Overall, no CNN examined could fully capture all levels of visual processing for both real-world and artificial objects, with similar performance observed in both shallow and deep CNNs (e.g., Alexnet vs. Googlenet). Although the recurrent CNN examined here, Cornet-S, closely models neural processing and is argued to be the current best model of the primate ventral visual regions 12 , 19 , it did not outperform the other CNNs. The same results were also obtained when a CNN was trained with stylized object images that emphasized shape features in its representation. The main results across the three experiments are summarized in Fig.  7 , with Fig.  7a showing the results from the six conditions across the three experiments examining the real-world objects (i.e., the results from Figs.  3 , 4 , and 6a ) and Fig.  7b showing the results for the artificial objects (i.e., the results from Fig.  6b ). Alexnet, Googlenet, Squeezenet, and Vgg-16 showed the best brain–CNN correspondence overall for representing real-world objects among the 14 CNNs examined.

Figure 7

A Summary of results from the six conditions across the three experiments that examined the processing of real-world object images (i.e., a summary of results from Figs.  3 , 4 , and 6 ). B Summary of results for the processing of artificial objects (i.e., results from Fig.  6b ). In A , each colored bar represents the averaged proportion of brain variance explained, with that from each condition marked by a black symbol. For real-world objects, a few CNNs (i.e., Alexnet, Googlenet, Squeezenet, and Vgg-16) were able to consistently capture brain RDM variance from lower human visual regions (i.e., V1–V3). No CNN was able to do so for higher human visual regions (i.e., LOT and VOT). The CNNs capable of fully capturing lower-level brain RDM variance for real-world objects all failed to capture that of the artificial objects from either lower or higher human visual regions. Source data are provided as a Source Data file.

Although we examined object category responses averaged over multiple exemplars rather than responses to each object, previous research has shown similar category and exemplar response profiles in macaque IT and human lateral occipital cortex with more robust responses for categories than individual exemplars due to an increase in SNR 45 , 46 . Rajalingham et al. 2 additionally reported better behavior-CNN correspondence at the category but not at the individual exemplar level. Thus, comparing the representational structure at the category level, rather than at the exemplar level, should have increased our chance of finding a close brain–CNN correspondence. Yet despite the overall brain and CNN correlations for object categories being much higher here than in previous studies for individual objects 5 , 8 , CNNs failed to fully capture the representational structure of real-world objects in the human brain and performed even worse for artificial objects. Object category information is shown to be better represented by higher than lower visual regions (e.g., 47 ). Our use of object category was thus not optimal for finding a close brain–CNN correspondence at lower levels of visual processing. Yet we found better brain–CNN correspondence at lower than higher levels of visual processing for real-world object categories. This suggests that information that defines the different real-world object categories is present at lower levels of visual processing and is captured by both lower visual regions and lower CNN layers. This is not surprising as many categories may be differentiated based on low-level features even with a viewpoint/orientation change, such as curvature and the presence of unique features (e.g., the large round outline of a face/head, the protrusion of the limbs in animals) 48 . Finally, it could be argued that the dissimilarity between the brain and CNNs at higher levels of visual processing for real-world object categories could be driven by feedback from high-level nonvisual regions and/or feedback from category-selective regions in the human ventral cortex for some of the categories used (i.e., faces, bodies, and houses). However, such feedback should greatly decrease for artificial object categories. Yet we failed to see much improvement in brain–CNN correspondence at higher levels of processing for these objects. If anything, even the strong correlation at lower levels of visual processing for real-world objects no longer existed for these artificial objects.

Decades of vision science research have relied on using simple and artificial visual stimuli to uncover the complexity of visual processing in the primate brain, showing that the same algorithms used in the processing of natural images would manifest themselves in the processing of artificial visual stimuli. The artificial object images tested here have been used in previous fMRI studies to understand object processing in the human brain (e.g., Op de Beeck et al., 2008 21 , 24 , 27 ). In particular, we showed that the transformation of visual representational structures across occipito-temporal and posterior parietal cortices follows a similar pattern for both the real-world objects and the artificial objects used here 21 . The disconnection between the representation of real-world and artificial object images in CNNs is in disagreement with this long-held principle in primate vision research and suggests that, even at lower levels of visual processing, CNNs differ from the primate brain in fundamental ways. Such a divergence will undoubtedly contribute to even greater divergence at higher levels of processing between the primate brain and CNNs.

Using real-world object images, recent studies have tried to improve the brain and CNN RDM correlation by incorporating brain responses during CNN training. Using a recurrent network architecture, Kietzmann et al. 49 used both brain RDM and object categorization to guide CNN training and found that brain and CNN RDM correlation was still significantly below the noise ceiling in all human ventral visual regions examined. Khaligh-Razavi et al. 50 used a mixed RSA approach by first finding the best linear transformation between fMRI voxels and CNN layer units and then performing RDM correlations (see also ref. 10). The key idea here is that CNNs may contain all the right brain features in visual processing but that these features are improperly combined. Training enables remixing and recombination of these features and can result in a better brain–CNN alignment in representational structure. Using the mixed RSA approach, Khaligh-Razavi et al. 50 reported that the correlation between brain and CNN was able to reach the noise ceiling for LO. However, brain–CNN correlations were fairly low for all brain regions examined (i.e., V1–V4 and LO), with the noise ceiling ranging from just below 0.5 in V1 to just below 0.2 in LO (thus the amount of explainable variance in LO was less than 4%). The low LO noise ceiling again raises concerns about the robustness of this finding (as it did for ref. 8). Khaligh-Razavi et al. 50 used a large data set from Kay et al. 51, which contained 1750 unique training images, each shown twice, and 120 unique testing images, each shown 13 times. Our data in comparison are limited, containing between 16 and 18 stimulus conditions, each shown 16 to 18 times. We are thus underpowered to perform the mixed RSA analysis here and provide an objective evaluation of this approach. It should be noted that applying the mixed RSA analysis is not as straightforward as it seems, as we do not fully understand the balance between decreased model performance due to overfitting and increased model performance due to feature mixing, nor the minimum amount of data needed for training and testing. In addition, a mixed RSA approach requires brain responses from a large number of single images. This necessarily results in lower power and lower reliability across participants. In other words, due to noise, only a small amount of consistent neural response is preserved across participants (as in Khaligh-Razavi et al. 50), so much of the neural data used to train the model is likely just subject-specific noise. This can significantly weaken the mixed RSA approach. In addition, whether a mixed RSA model trained with one kind of object image (e.g., real-world object images) can accurately predict the responses to another kind of object image (e.g., artificial object images) has not been tested. Thus, although the general principle of a mixed RSA approach is promising, what it can actually deliver remains to be seen. In our study, we found good brain–CNN correspondence between lower CNN layers and lower visual areas for the processing of real-world objects. Thus, the mixing of the different features in lower CNN layers is well matched with that of lower visual areas. Yet these same lower CNN layers fail to capture lower visual areas’ responses to artificial objects. This indicates that some fundamental differences exist between the brain and CNNs at lower levels of visual processing that may not be overcome by remixing the CNN features.

What could be driving the difference between the brain and CNNs in visual processing? In recent studies, Baker et al. 44 and Geirhos et al. 31 , 52 reported that CNNs rely on local texture and shape features rather than global shape contours. This may explain why in our study lower CNN layers were able to fully capture the representational structures of real-world object images in lower visual areas, as processing in these brain areas likely relies more on local contours and texture patterns given their smaller receptive field sizes. As high-level object vision relies more on global shape contour processing (e.g., 53 ), the lack of such processing in CNNs may account for CNNs’ inability to fully capture processing in higher visual areas. This can be seen more directly in higher-level representations of our artificial objects (which share similar texture and contour elements at the local level but differ in how these elements are conjoined at the local and global levels). Specifically, while the brain takes both local and global shape similarities into consideration when grouping these objects, CNNs may rely mainly on local shape similarities (see the MDS plots in Fig.  5 and Supplementary Fig.  17 ). At lower levels of visual processing, the human brain likely encodes both shape elements and how they are conjoined at the local level to help differentiate the different artificial objects. CNNs, on the other hand, may rely more on the presence/absence of a particular texture patch or a shape element than on how they are conjoined at the local level to differentiate these objects. This may account for the divergence between the brain and CNNs at lower levels of visual processing for these artificial objects. Training with stylized images did not appear to improve performance in Resnet-50, suggesting that the differences between CNNs and the human brain may not be overcome by this type of training.

In two other studies involving real-world object images, we found additional differences between the human brain and CNNs in the development of transformation-tolerant visual representations and in the relative coding strength of object identity and nonidentity features 54, 55. Forming transformation-tolerant object identity representations has been argued to be the hallmark of primate vision, as it reduces the complexity of learning by requiring far fewer training examples, with the resulting representations generalizing better to new instances of an object (e.g., in different viewing conditions) and to new exemplars of a category not included in the training. It could potentially dictate how objects are organized in the representational space in the brain, as examined in this study. While the magnitude of invariant object representation increases from lower to higher visual areas in the human brain, in the same 14 CNNs tested here, such invariance actually decreases from lower to higher CNN layers 54. With their vast computing power, CNNs likely associate different instances of an object via a brute-force approach (e.g., by simply grouping all instances of an object encountered under the same object label) without necessarily preserving the relationships among the objects across transformations and forming transformation-tolerant object representations. This again suggests that CNNs use a fundamentally different mechanism to group objects and solve the object recognition problem than the primate brain does. In another study 55, we documented the relative coding strength of object identity and nonidentity features during visual processing in the human brain and CNNs. We found that identity representation increased and nonidentity feature representation decreased along the ventral visual pathway. In the same 14 CNNs examined here, while identity representation increased over the course of visual processing, nonidentity feature representation showed an initial large increase followed by a decrease at later stages of processing, unlike the brain responses. As a result, higher CNN layers deviated more from the corresponding brain regions than lower layers did in how object identity and nonidentity features are coded with respect to each other. This is consistent with the RDM comparison results reported in this study.

CNNs’ success in object categorization and their response correspondence with the primate visual areas have opened the exciting possibility that CNN modeling may serve as a viable scientific method to study primate vision. Presently, the detailed computations performed by CNNs are difficult for humans to understand, rendering them poorly understood information-processing systems 3, 56. By analyzing results from three fMRI experiments and comparing visual representations in the human brain with those in 14 different CNNs, we found that CNNs’ performance is related to how they are built and trained: they are built following the known architecture of the primate lower visual areas and are trained with real-world object images. Consequently, the best-performing CNNs (i.e., Alexnet, Googlenet, Squeezenet, and Vgg-16) fully capture the visual representational structures of lower human visual areas during the processing of both original and filtered real-world images, but not those of higher human visual areas during the processing of these images, nor those of artificial images at either level of processing. The close brain–CNN correspondence found in earlier fMRI studies thus might have been overly optimistic, as those studies included only real-world objects (which CNNs are generally trained on) and tested on data with relatively low power. When we expanded the comparisons here to a broader set of filtered real-world stimuli and to artificial stimuli, and tested on brain data with higher power, we found large discrepancies between the brain and CNNs at both lower and higher levels of visual processing. While CNNs are successful in object recognition, some fundamental differences likely exist between the human brain and CNNs that preclude CNNs from fully modeling the human visual system in their current state. This is unlikely to be remedied by simply changing the training images, changing the depth of the network, and/or adding recurrent processing; rather, more fundamental changes may be needed to make CNNs brain-like. This may only be achieved through continued research into the precise algorithms used by the primate brain in visual processing, which can in turn guide CNN model development.

fMRI experimental details

Details of the fMRI experiments have been described in two previously published studies 21 , 22 . They are summarized here for the readers’ convenience (see also Table  1 ).

Six, ten, and six healthy human participants with normal or corrected-to-normal visual acuity, all right-handed and aged between 18 and 35, took part in Experiments 1–3, respectively. The sample size for each fMRI experiment was chosen based on prior published studies (e.g., refs. 57, 58). All participants gave their written informed consent before the experiments and received payment for their participation. The experiments were approved by the Committee on the Use of Human Subjects at Harvard University. Each main experiment was performed in a separate session lasting between 1.5 and 2 h. Each participant also completed two additional sessions for topographic mapping and functional localizers. MRI data were collected using a Siemens MAGNETOM Trio, A Tim System 3T scanner, with a 32-channel receiver array head coil. For all the fMRI scans, a T2*-weighted gradient echo pulse sequence with a TR of 2 s and a voxel size of 3 mm × 3 mm × 3 mm was used. fMRI data were analyzed using FreeSurfer (surfer.nmr.mgh.harvard.edu), FsFast 59, and in-house MATLAB code. fMRI data preprocessing included 3D motion correction, slice timing correction, and linear and quadratic trend removal. Following standard practice, a general linear model was applied to the fMRI data to extract beta weights as response estimates.

In Experiment 1, we used cut-out gray-scaled images from eight real-world object categories (faces, bodies, houses, cats, elephants, cars, chairs, and scissors) and modified them to occupy roughly the same area on the screen (Fig.  1b ). For each object category, we selected ten exemplar images that varied in identity, viewpoint/orientation, and pose (for the animal categories) to minimize the low-level similarities among them (see Supplementary Fig.  1 for the full set of images used). In this and the two experiments reported below, objects were always presented at fixation, and object positions never varied. In the original image condition, unaltered images were shown. In the controlled image condition, images were shown with contrast, luminance, and spatial frequency equalized across all the categories using the SHINE toolbox 23 (see Fig.  1b ). Participants fixated at a central red dot throughout the experiment. Eye-movements were monitored in all the fMRI experiments to ensure proper fixation.

During the experiment, blocks of images were shown. Each block contained a random sequential presentation of ten exemplars from the same object category. Each image was presented for 200 ms followed by a 600 ms blank interval between the images (Fig. 1a). Participants detected a one-back repetition of the exact same image. This task focused participants’ attention on the object shapes and ensured robust fMRI responses. However, similar visual representations may be obtained when participants attend to the color of the objects 60, 61 (see also ref. 62). Two image repetitions occurred randomly in each image block. Each experimental run contained 16 blocks, one for each of the 8 categories in each image condition (original or controlled). The order of the eight object categories and the two image conditions was counterbalanced across runs and participants. Each block lasted 8 s and was followed by an 8-s fixation period. There was an additional 8-s fixation period at the beginning of the run. Each participant completed one scan session with 16 runs for this experiment, each lasting 4 min 24 s.

In Experiment 2, only six of the original eight object categories were used: faces, bodies, houses, elephants, cars, and chairs. Images were shown in three conditions: Full-SF, High-SF, and Low-SF. In the Full-SF condition, the full-spectrum images were shown without modification of the SF content. In the High-SF condition, images were high-pass filtered using an FIR filter with a cutoff frequency of 4.40 cycles per degree (Fig. 1b). In the Low-SF condition, the images were low-pass filtered using an FIR filter with a cutoff frequency of 0.62 cycles per degree (Fig. 1b). The DC component was restored after filtering so that the image backgrounds were equal in luminance. Each run contained 18 blocks, one for each combination of category and SF condition. Each participant completed a single scan session containing 18 experimental runs, each lasting 5 min. Other details of the experiment design were identical to those of Experiment 1.
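
As a rough illustration of this kind of spatial frequency filtering, the sketch below applies low- and high-pass cutoffs in the Fourier domain while restoring the DC component. Note that the experiment used FIR filters, so this frequency-domain version, along with the pixels-per-degree value, is only an illustrative approximation and not the exact procedure used:

```python
import numpy as np

def sf_filter(img, cutoff_cpd, pixels_per_degree, mode="low"):
    """Low- or high-pass filter a grayscale image at a cutoff in cycles/degree.

    `pixels_per_degree` depends on viewing geometry, which is not reported
    here, so the value used below is hypothetical.
    """
    h, w = img.shape
    fy = np.fft.fftfreq(h) * pixels_per_degree  # cycles/degree along y
    fx = np.fft.fftfreq(w) * pixels_per_degree  # cycles/degree along x
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    mask = radius <= cutoff_cpd if mode == "low" else radius >= cutoff_cpd
    spectrum = np.fft.fft2(img)
    dc = spectrum[0, 0]          # save the DC component so background
    filtered = spectrum * mask   # luminance is preserved after filtering
    filtered[0, 0] = dc
    return np.real(np.fft.ifft2(filtered))

# `image` is assumed to be a 2D float array (a grayscale stimulus image)
low_sf = sf_filter(image, 0.62, pixels_per_degree=40, mode="low")
high_sf = sf_filter(image, 4.40, pixels_per_degree=40, mode="high")
```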

In Experiment 3, we used unaltered images from both real-world and artificial object categories. The real-world categories were the same eight categories used in Experiment 1, with the exemplars varying in identity, viewpoint/orientation, and pose (for the animal categories) to minimize the low-level similarities among them. The artificial object categories were nine categories of computer-generated 3D shapes (ten images per category) adopted from Op de Beeck et al. 24 and shown in random orientations to increase image variation within a category and to match the image variation of the exemplars used for the real-world object categories (see Fig.  1b ; for the full set of artificial object images used, see Supplementary Fig.  2 ). Each run of the experiment contained 17 stimulus blocks, one for each object category (either real-world or artificial). Each participant completed 18 runs, each lasting 4 min 40 s. Other details of the experiment design were identical to that of Experiment 1.

We examined responses from independently localized lower visual areas V1–V4 and higher visual processing regions LOT and VOT. V1–V4 were mapped with flashing checkerboards using standard techniques 63. Following the detailed procedures described in Swisher et al. 64 and by examining phase reversals in the polar angle maps, we identified areas V1–V4 in the occipital cortex of each participant (see also ref. 65) (Fig. 1c). To identify LOT and VOT, following Kourtzi and Kanwisher 66, participants viewed blocks of face, scene, object, and scrambled object images. These two regions were then defined as clusters of contiguous voxels in the lateral and ventral occipital cortex, respectively, that responded more to the original than to the scrambled object images (Fig. 1c). LOT and VOT loosely correspond to the locations of LO and pFs 66, 67, 68 but extend further into the temporal cortex in an effort to include as many object-selective voxels as possible in occipito-temporal regions.

LOT and VOT included a large swath of the ventral and lateral OTC and likely overlapped to a great extent with regions selective for specific object categories, including faces, bodies, or scenes. To understand how the inclusion of these category-specific regions may affect the brain–CNN correlation, we also constructed LOT and VOT ROIs without the category-selective voxels. This was done by testing the category selectivity of each voxel in these two ROIs using the data from the main experiment. Specifically, since there were at least 16 runs in each experiment, using paired t tests, we defined a LOT or a VOT voxel as face-selective if its response was higher for faces than for each of the other non-face categories at p < 0.05. Similarly, a voxel was defined as body-selective if its response was higher for the average of bodies, cats, and elephants (in Experiment 2, only the average of bodies and elephants was used, as cats were excluded from that experiment) than for each of the non-body categories at p < 0.05. Finally, a voxel was defined as scene-selective if its response was higher for houses than for each of the other non-scene categories at p < 0.05. In this analysis, a given object category’s responses in the different formats (e.g., original and controlled) were averaged together. Given that each experiment contained at least 16 runs, using the main experimental data to define the category-selective voxels in LOT and VOT is comparable to how these voxels are traditionally defined. We used a relatively lenient threshold of p < 0.05 here to ensure that we excluded any voxels that exhibited any category selectivity, even if this occurred just by chance.

To generate the fMRI response pattern for each ROI in a given run, we first convolved an 8-s stimulus presentation boxcar (corresponding to the length of each image block) with a hemodynamic response function for each condition; we then conducted a general linear model analysis to extract the beta weight for each condition in each voxel of that ROI. These voxel beta weights were used as the fMRI response pattern for that condition in that run. Following Tarhan and Konkle 69, we selected the 75 most reliable voxels in each ROI for further analyses. This was done by splitting the data into odd and even halves, averaging the data across the runs within each half, correlating the beta weights from all the conditions between the two halves for each voxel, and then selecting the 75 voxels showing the highest correlation. This is akin to including the best units in monkey neurophysiological studies. For example, Cadieu et al. 10 only selected a small subset of all recorded single units for their brain–CNN analysis. We obtained the fMRI response pattern for each condition from the 75 most reliable voxels in each ROI of each run. We then averaged the fMRI response patterns across all runs and applied z-normalization to the averaged pattern for each condition in each ROI to remove amplitude differences between conditions and ROIs.
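
A minimal sketch of this reliability-based voxel selection, assuming the GLM beta weights are already organized as a runs × conditions × voxels array (all variable names here are illustrative):

```python
import numpy as np

def select_reliable_voxels(betas, n_keep=75):
    """Split-half reliability voxel selection, as described above.

    betas: array of shape (n_runs, n_conditions, n_voxels) of GLM beta weights.
    Returns indices of the n_keep voxels whose condition profiles correlate
    best between odd and even runs.
    """
    odd = betas[0::2].mean(axis=0)    # (n_conditions, n_voxels)
    even = betas[1::2].mean(axis=0)
    n_vox = betas.shape[2]
    reliability = np.empty(n_vox)
    for v in range(n_vox):
        reliability[v] = np.corrcoef(odd[:, v], even[:, v])[0, 1]
    return np.argsort(reliability)[::-1][:n_keep]

# After selection: average across runs, then z-normalize across voxels
# within each condition to remove amplitude differences.
keep_idx = select_reliable_voxels(betas)
pattern = betas[:, :, keep_idx].mean(axis=0)
pattern = (pattern - pattern.mean(axis=1, keepdims=True)) / \
          pattern.std(axis=1, keepdims=True)
```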

CNN details

We tested 14 CNNs in our analyses (see Supplementary Table 1). They included both shallower networks, such as Alexnet, VGG16, and VGG19, and deeper networks, such as Googlenet, Inception-v3, Resnet-50, and Resnet-101. We also included a recurrent network, Cornet-S, which has been shown to capture the recurrent processing in macaque IT cortex with a shallower structure 12, 19. This CNN has recently been argued to be the current best model of the primate ventral visual processing regions 19. All the CNNs used were trained with ImageNet images 30.

To understand how the specific training images would impact CNN representations, besides CNNs trained with ImageNet images, we also examined Resnet-50 trained with stylized ImageNet images 31. We examined the representations formed in Resnet-50 pretrained with three different procedures 31: trained only with the stylized ImageNet images (RN50-SIN), trained with both the original and the stylized ImageNet images (RN50-SININ), and trained with both sets of images and then fine-tuned with the stylized ImageNet images (RN50-SININ-IN).

Following O’Connell & Chun 32, we sampled between 6 and 11 mostly pooling and FC layers of each CNN (see Supplementary Table 1 for the specific CNN layers sampled). Pooling layers were selected because they typically mark the end of processing for a block of layers, when information is pooled to be passed on to the next block of layers. When there were no obvious pooling layers present, the last layer of a block was chosen. For a given CNN layer, we extracted the CNN layer output for each object image in a given condition, averaged the output from all images in a given category for that condition, and then z-normalized the responses to generate the CNN layer response for that object category in that condition (similar to how fMRI category responses were extracted). Cornet-S and the different versions of Resnet-50 were implemented in Python. All other CNNs were implemented in MATLAB. The output from all CNNs was analyzed and compared with brain responses using MATLAB.
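
The sketch below illustrates this kind of layer sampling and category-response extraction for one CNN using PyTorch forward hooks. The choice of Alexnet, the pooling-layer hooks, and the `images_by_category` container are assumptions for illustration, not the exact pipeline used in the study:

```python
import torch
import torchvision.models as models

net = models.alexnet(pretrained=True).eval()
features = {}

def save_output(name):
    # Forward hook: stash the flattened activations of this layer.
    def hook(module, inputs, output):
        features[name] = output.flatten(start_dim=1).detach()
    return hook

# Hook every max-pooling layer (a stand-in for "mostly pooling layers").
for name, module in net.named_modules():
    if isinstance(module, torch.nn.MaxPool2d):
        module.register_forward_hook(save_output(name))

category_responses = {}
# images_by_category: hypothetical dict mapping category name to a
# preprocessed image batch of shape (n_images, 3, 224, 224).
for category, imgs in images_by_category.items():
    with torch.no_grad():
        net(imgs)
    # Average over exemplars, then z-normalize across units,
    # mirroring how the fMRI category responses were extracted.
    category_responses[category] = {
        name: (acts.mean(0) - acts.mean(0).mean()) / acts.mean(0).std()
        for name, acts in features.items()
    }
```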

Comparing the representational structures between the brain and CNNs

To determine the extent to which object category representations were similar between brain regions and CNN layers, we correlated the object category representational structure between brain regions and CNN layers. To do so, we obtained the RDM from each brain region by computing all pairwise Euclidean distances for the object categories included in an experiment and then taking the off-diagonal values of this RDM as the category dissimilarity vector for that brain region. This was done separately for each participant. Likewise, from the CNN layer output, we computed pairwise Euclidean distances for the object categories included in an experiment to form the RDM and then took the off-diagonal values of this RDM as the category dissimilarity vector for that CNN layer. We applied this procedure to each sampled layer of each CNN.
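
In code, the category dissimilarity vector reduces to a condensed distance matrix; a minimal sketch with SciPy:

```python
from scipy.spatial.distance import pdist

def category_dissimilarity_vector(patterns):
    """Off-diagonal RDM entries for one brain region or CNN layer.

    patterns: (n_categories, n_features) array of z-normalized responses.
    Returns a condensed vector of length n*(n-1)/2 containing all pairwise
    Euclidean distances, i.e., the off-diagonal values of the RDM.
    """
    return pdist(patterns, metric="euclidean")
```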

We then correlated the category dissimilarity vectors between each brain region of each participant and each sampled CNN layer. Following Cichy et al. 5, all correlations were calculated using Spearman rank correlation to compare the rank order, rather than the absolute magnitude, of the category representational similarity between the brain and CNNs (see also ref. 33; similar results were obtained when Pearson correlation was used instead; see the results reported in Supplementary Figs. 3, 6, 7, and 16). All correlation coefficients were Fisher z-transformed before group-level statistical analyses were carried out.

To evaluate the correspondence in representation between lower and higher CNN layers to lower and higher visual processing regions, for each CNN examined, we identified, in each human participant, the CNN layer that showed the best RDM correlation with each of the six brain regions included. We then assessed whether the resulting layer numbers increased from low to high visual regions using Spearman rank correlation. Finally, we tested the resulting correlation coefficients at the participant group level. If a close correspondence in representation exists between the brain and CNNs, the averaged correlation coefficients should be significantly above zero. All stats reported were from one-tailed t tests. One-tailed t tests were used here as only values above zero were meaningful. In addition, all stats reported were corrected for multiple comparisons for the number of comparisons included in each experiment using the Benjamini–Hochberg procedure with the false-discovery rate (FDR) controlled at q  = 0.05 34 .
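
A sketch of this layer-to-region correspondence test, with hypothetical containers for the per-subject brain dissimilarity vectors and per-layer CNN dissimilarity vectors:

```python
import numpy as np
from scipy.stats import spearmanr, ttest_1samp
from statsmodels.stats.multitest import multipletests

# brain_rdvs[s][r]: dissimilarity vector for subject s, region r (6 regions,
# ordered low to high); cnn_rdvs[l]: dissimilarity vector for CNN layer l.
# Both are illustrative data structures.
n_subjects, n_regions, n_layers = len(brain_rdvs), 6, len(cnn_rdvs)

rhos = np.empty(n_subjects)
for s in range(n_subjects):
    # For each region, the CNN layer with the best RDM correlation.
    best_layer = [
        max(range(n_layers),
            key=lambda l: spearmanr(brain_rdvs[s][r], cnn_rdvs[l])[0])
        for r in range(n_regions)
    ]
    # Does the best layer number increase from low to high visual regions?
    rhos[s] = spearmanr(np.arange(n_regions), best_layer)[0]

z = np.arctanh(rhos)                             # Fisher z-transform
t, p = ttest_1samp(z, 0, alternative="greater")  # one-tailed: only > 0 meaningful

# FDR correction across all comparisons in an experiment (Benjamini-Hochberg):
# reject, p_fdr, _, _ = multipletests(all_pvalues, alpha=0.05, method="fdr_bh")
```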

To assess how successfully the category RDM from a CNN layer could capture the RDM from a brain region, we first obtained the reliability of the category RDM in a brain region across the group of human participants by calculating the lower and upper bounds of the noise ceiling of the fMRI data following the procedure described by Nili et al. 33 . Specifically, the upper bound of the noise ceiling for a brain region was established by taking the average of the correlations between each participant’s RDM and the group average RDM including all participants, whereas the lower bound of the noise ceiling for a brain region was established by taking the average of the correlations between each participant’s RDM and the group average RDM excluding that participant.
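
A compact sketch of this leave-one-subject-out noise-ceiling estimate:

```python
import numpy as np
from scipy.stats import spearmanr

def noise_ceiling(rdvs):
    """Noise ceiling for one brain region, as described above.

    rdvs: (n_subjects, n_pairs) array of category dissimilarity vectors.
    Upper bound: each subject vs. the group mean including that subject.
    Lower bound: each subject vs. the group mean excluding that subject.
    """
    n = rdvs.shape[0]
    group_mean = rdvs.mean(axis=0)
    upper = np.mean([spearmanr(rdvs[s], group_mean)[0] for s in range(n)])
    lower = np.mean([
        spearmanr(rdvs[s], np.delete(rdvs, s, axis=0).mean(axis=0))[0]
        for s in range(n)
    ])
    return lower, upper
```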

To evaluate the degree to which CNN category RDMs may capture those of the different brain regions, for each CNN, using one-tailed t tests, we examined how close the highest correlation between a CNN layer and a brain region was to the lower bound of the noise ceiling of that brain region. These correlation results are reported in Supplementary Figs.  6 , 7 , and 16 . To transform these correlation results into the proportion of explainable brain RDM variance captured by the CNN, we divided the brain–CNN RDM correlation by the corresponding lower bound of the noise ceiling and then squared the resulting value. We evaluated whether a CNN could fully capture the RDM variance of a brain region by testing the difference between 1 and the highest proportion of variance captured by the CNN using one-tailed t tests. One-tailed t tests were used as only testing values below the lower bound of the noise ceiling (for measuring correlation values) or below 1 (for measuring the amount of variance captured) were meaningful here. The t test results were corrected for multiple comparisons for the six brain regions included using the Benjamini–Hochberg procedure at q  = 0.05. If a CNN layer was able to fully capture the representational structure of a brain region, then its RDM correlation with the brain region should exceed the lower bound of the noise ceiling of that brain region, and the proportion of variance explained should not differ from 1. Because the lower bound of the noise ceiling varied somewhat among the different brain regions, for illustration purposes, in Supplementary Figs.  6 , 7 , 16 , and 19 , we plotted the lower bound of the noise ceiling from all brain regions at 0.7 while maintaining the differences between the CNN and brain correlations with respect to their lower bound noise ceilings (i.e., by subtracting the difference between the actual noise ceiling and 0.7 from each brain–CNN correlation value). This did not affect any statistical test results.
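
The variance-explained measure then reduces to a one-line computation (here `brain_cnn_r` and `lower_bound` are assumed to come from the correlation and noise-ceiling sketches above):

```python
# Proportion of explainable brain RDM variance captured by a CNN layer:
# the brain-CNN correlation divided by the lower bound of the noise
# ceiling, squared. Values indistinguishable from 1 indicate full capture.
var_captured = (brain_cnn_r / lower_bound) ** 2
```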

To directly visualize the object representational structures in different brain regions and CNN layers, using classical multidimensional scaling, we placed the category RDMs onto 2D spaces with the distances among the categories approximating their relative similarities to each other. The same scaling factor was used to plot the MDS plot for each sampled layer of each CNN. Thus the distances among the categories can be directly compared across the different sampled layers of a given CNN and across CNNs. The scaling factor was doubled for the brain MDS plots for Experiments 1 and 3 and quadrupled for Experiment 2 to allow better visibility of the different categories in each plot. Thus the distances among the categories can still be directly compared across the different brain regions within a given experiment and between Experiments 1 and 3. Since rotations and flips preserve distances on these MDS plots, to make these plots more informative and to see how the representational structure evolved across brain regions and CNN layers, we manually rotated and/or flipped each MDS plot when necessary. In some cases, to maintain consistency across plots, we arbitrarily picked a few categories as anchor points and then rotated and/or flipped the MDS plots accordingly.
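
Classical MDS can be computed directly from an RDM via double-centering and an eigendecomposition; a minimal sketch:

```python
import numpy as np

def classical_mds(D, dims=2):
    """Classical (Torgerson) MDS: embed an (n x n) distance matrix D into
    `dims` dimensions so inter-point distances approximate dissimilarities."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n     # centering matrix
    B = -0.5 * J @ (D ** 2) @ J             # double-centered squared distances
    evals, evecs = np.linalg.eigh(B)
    order = np.argsort(evals)[::-1][:dims]  # largest eigenvalues first
    return evecs[:, order] * np.sqrt(np.maximum(evals[order], 0))
```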

Reporting summary

Further information on research design is available in the  Nature Research Reporting Summary linked to this article.

Data availability

Data supporting the findings of this study are available at https://osf.io/tsz47/. Source data are provided with this paper.

Code availability

Standard code from the listed software was used. No special code was developed for this study.

Change history

06 May 2021

A Correction to this paper has been published: https://doi.org/10.1038/s41467-021-23110-2

Kriegeskorte, N. Deep neural networks: a new framework for modeling biological vision and brain information processing. Annu. Rev. Vis. Sci. 1 , 417–446 (2015).

Rajalingham, R. et al. Large-scale, high-resolution comparison of the core visual object recognition behavior of humans, monkeys, and state-of-the-art deep artificial neural networks. J. Neurosci. 38 , 7255–7269 (2018).

Serre, T. Deep learning: the good, the bad, and the ugly. Annu. Rev. Vis. Sci. 5 , 21.1–21.28 (2019).

Yamins, D. L. K. & DiCarlo, J. J. Using goal-driven deep learning models to understand sensory cortex. Nat. Neurosci. 19 , 356–365 (2016).

Cichy, R. M., Khosla, A., Pantazis, D., Torralba, A. & Oliva, A. Comparison of deep neural networks to spatiotemporal cortical dynamics of human visual object recognition reveals hierarchical correspondence. Sci. Rep. 6 , 27755 (2016).

Eickenberg, M., Gramfort, A., Varoquaux, G. & Thirion, B. Seeing it all: convolutional network layers map the function of the human visual system. NeuroImage 152 , 184–194 (2017).

Güçlü, U. & van Gerven, M. A. J. Increasingly complex representations of natural movies across the dorsal stream are shared between subjects. NeuroImage 145 , 329–336 (2017).

Khaligh-Razavi, S.-M. & Kriegeskorte, N. Deep supervised, but not unsupervised, models may explain IT cortical representation. PLOS Comput. Biol. 10 , e1003915 (2014).

Yamins, D. L. K. et al. Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc. Natl Acad. Sci. USA 111 , 8619–8624 (2014).

Cadieu, C. F. et al. Deep neural networks rival the representation of primate IT cortex for core visual object recognition. PLOS Comput. Biol. 10 , e1003963 (2014).

Cichy, R. M. & Kaiser, D. Deep neural networks as scientific models. Trends Cogn. Sci. 23 , 305–317 (2019).

Kubilius, J., et al. Brain-like object recognition with high-performing shallow recurrent ANNs. in Advances in Neural Information Processing Systems, 32, NeurIPS Proceedings . (2019).

Long, B. & Konkle, T. Mid-level visual features underlie the high-level categorical organization of the ventral stream. Proc. Natl Acad. Sci. USA 115 , E9015–E9024 (2018).

Bracci, S., Ritchie, J. B., Kalfas, I. & Op de Beeck, H. P. The ventral visual pathway represents animal appearance over animacy, unlike human behavior and deep neural networks. J. Neurosci. 39 , 6513–6525 (2019).

King, M. L., Groen, I. I. A., Steel, A., Kravitz, D. J. & Baker, C. I. Similarity judgments and cortical visual responses reflect different properties of object and scene categories in naturalistic images. NeuroImage 197 , 368–382 (2019).

Kriegeskorte, N. & Kievit, R. A. Representational geometry: integrating cognition, computation, and the brain. Trends Cogn. Sci. 17 , 401–412 (2013).

Storrs, K. R., Khaligh-Razavi, S.-M. & Kriegeskorte, N. Noise ceiling on the cross validated performance of reweighted models of representational dissimilarity: Addendum to Khaligh-Razavi & Kriegeskorte (2014). Preprint at bioRxiv https://doi.org/10.1101/2020.03.23.003046 (2020).

Bao, P., She, L., McGill, M. & Tsao, D. Y. A map of object space in primate inferotemporal cortex. Nature 583 , 103–108 (2020).

Kar, K., Kubilius, J., Schmidt, K., Issa, E. B. & DiCarlo, J. J. Evidence that recurrent circuits are critical to the ventral stream’s execution of core object recognition behavior. Nat. Neurosci. 22 , 974–983 (2019).

Xu, Y. Comparing visual object representations in the human brain and convolutional neural networks. https://doi.org/10.17605/OSF.IO/TSZ47 (2021).

Vaziri-Pashkam, M. & Xu, Y. An information-driven two-pathway characterization of occipito-temporal and posterior parietal visual object representations. Cereb. Cortex 29 , 2034–2050 (2019).

Vaziri-Pashkam, M., Taylor, J. & Xu, Y. Spatial frequency tolerant visual object representations in the human ventral and dorsal visual processing pathways. J. Cogn. Neurosci. 31 , 49–63 (2019).

Willenbockel, V. et al. Controlling low-level image properties: the SHINE toolbox. Behav. Res. Methods 42 , 671–684 (2010).

Op de Beeck, H. P., Torfs, K. & Wagemans, J. Perceived shape similarity among unfamiliar objects and the organization of the human object vision pathway. J. Neurosci. 28 , 10111–10123 (2008).

Orban, G. A., Van Essen, D. & Vanduffel, W. Comparative mapping of higher visual areas in monkeys and humans. Trends Cogn. Sci. 8 , 315–324 (2004).

Grill-Spector, K., Kushnir, T., Hendler, T. & Malach, R. The dynamics of object-selective activation correlate with recognition performance in humans. Nat. Neurosci. 3 , 837–843 (2000).

Williams, M. A., Dang, S. & Kanwisher, N. G. Only some spatial patterns of fMRI response are read out in task performance. Nat. Neurosci. 10 , 685–686 (2007).

Farah, M. J. Visual Agnosia . (MIT Press, Cambridge, Mass, 2004).

Goodale, M. A., Milner, A. D., Jakobson, L. S. & Carey, D. P. A neurological dissociation between perceiving objects and grasping them. Nature 349 , 154–156 (1991).

Deng, J., et al. ImageNet: a largescale hierarchical image database. in Proc. IEEE conference on computer vision and pattern recognition (CVPR) 248–255 (2009).

Geirhos, R., et al. ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. in Proc. International Conference on Learning Representations (2019).

O’Connell, T. P. & Chun, M. M. Predicting eye movement patterns from fMRI responses to natural scenes. Nat. Commun. 9 , 5159 (2018).

Nili, H. et al. A toolbox for representational similarity analysis. PLOS Comput. Biol. 10 , e1003553 (2014).

Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate—a practical and powerful approach to multiple testing. J. R. Stat. Soc. B Methods 57 , 289–300 (1995).

Shepard, R. N. Multidimensional scaling, tree-fitting, and clustering. Science 210 , 390–398 (1980).

Hubel, D. H. Eye, Brain, and Vision . (WH Freeman, New York, 1988).

von der Heydt, R. Form analysis in visual cortex. in The Cognitive Neurosciences (ed Gazzaniga M. S.), 365–382. (MIT Press, Cambridge, Mass, 1994).

Kourtzi, Z. & Connor, C. E. Neural representations for object perception: structure, category, and adaptive coding. Annu. Rev. Neurosci. 34 , 45–67 (2011).

Tanaka, K. Columns for complex visual object features in the inferotemporal cortex: clustering of cells with similar but slightly different stimulus selectivities. Cereb. Cortex 13 , 90–99 (2003).

Kubilius, J., Bracci, S. & Op de Beeck, H. P. Deep neural networks as a computational model for human shape sensitivity. PLOS Comput. Biol. 12 , e1004896 (2016).

LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521 , 436–444 (2015).

Gatys, L. A., Ecker, A. S. & Bethge, M. Texture and art with deep neural networks. Curr. Opin. Neurobiol. 46 , 178–186 (2017).

Ballester, P. & de Araújo, R. M. On the Performance of GoogLeNet and AlexNet Applied to Sketches. in AAAI 1124–1128 (2016).

Baker, N., Lu, H., Erlikhman, G. & Kellman, P. J. Deep convolutional networks do not classify based on global object shape. PLOS Comput. Biol. 14 , e1006613 (2018).

Cichy, R. M., Chen, Y. & Haynes, J. D. Encoding the identity and location of objects in human LOC. Neuroimage 54 , 2297–2307 (2011).

Hung, C. P., Kreiman, G., Poggio, T. & DiCarlo, J. J. Fast readout of object identity from macaque inferior temporal cortex. Science 310 , 863–866 (2005).

Hong, H., Yamins, D. L. K., Majaj, N. J. & DiCarlo, J. J. Explicit information for category-orthogonal object properties increases along the ventral stream. Nat. Neurosci. 19 , 613–622 (2016).

Rice, G. E., Watson, D. M., Hartley, T. & Andrews, T. J. Low-level image properties of visual objects predict patterns of neural response across category selective regions of the ventral visual pathway. J. Neurosci. 34 , 8837–8844 (2014).

Kietzmann, T. et al. Recurrence required to capture the dynamic computations of the human ventral visual stream. Proc. Natl Acad. Sci. USA 116 , 21854–21863 (2019).

Khaligh-Razavi, S.-M., Henriksson, L., Kay, K. & Kriegeskorte, N. Fixed versus mixed RSA: explaining visual representations by fixed and mixed feature sets from shallow and deep computational models. J. Math. Psychol. 76, 184–197 (2017).

Kay, K. N., Naselaris, T., Prenger, R. J. & Gallant, J. L. Identifying natural images from human brain activity. Nature 452 , 352–355 (2008).

Geirhos, R., et al. Generalisation in humans and deep neural networks. in Advances in Neural Information Processing Systems 31, (ed S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, R. Garnett), 7549–7561. (Curran Assoc., Red Hook, NY, 2018).

Biederman, I. Recognition-by-components: a theory of human image understanding. Psychol. Rev. 94 , 115–147 (1987).

Xu, Y. & Vaziri-Pashkam, M. The development of transformation tolerant visual representations differs between the human brain and convolutional neural networks. Preprint at bioRxiv https://doi.org/10.1101/2020.08.11.246934 (2020a).

Xu, Y. & Vaziri-Pashkam, M. The coding of object identity and nonidentity features in human occipito-temporal cortex and convolutional neural networks. J. Neurosci. https://doi.org/10.1101/2020.08.11.246967 . (In press).

Kay, K. N. Principles for models of neural information processing. NeuroImage 180 , 101–109 (2018).

Haxby, J. V. et al. Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science 293 , 2425–2430 (2001).

Kamitani, Y. & Tong, F. Decoding the visual and subjective contents of the human brain. Nat. Neurosci. 8 , 679–685 (2005).

Dale, A. M., Fischl, B. & Sereno, M. I. Cortical surface-based analysis. I. Segmentation and surface reconstruction. Neuroimage 9 , 179–194 (1999).

Vaziri-Pashkam, M. & Xu, Y. Goal-directed visual processing differentially impacts human ventral and dorsal visual representations. J. Neurosci. 37 , 8767–8782 (2017).

Xu, Y. & Vaziri-Pashkam, M. Task modulation of the 2-pathway characterization of occipitotemporal and posterior parietal visual object representations. Neuropsychologia 132 , 107140 (2019).

Xu, Y. A tale of two visual systems: invariant and adaptive visual information representations in the primate brain. Annu. Rev. Vis. Sci. 4 , 311–336 (2018).

Sereno, M. I. et al. Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science 268 , 889–893 (1995).

Swisher, J. D., Halko, M. A., Merabet, L. B., McMains, S. A. & Somers, D. C. Visual topography of human intraparietal sulcus. J. Neurosci. 27 , 5326–5337 (2007).

Bettencourt, K. C. & Xu, Y. Understanding location- and feature-based processing along the human intraparietal sulcus. J. Neurophysiol. 116 , 1488–1497 (2016).

Kourtzi, Z. & Kanwisher, N. Cortical regions involved in perceiving object shape. J. Neurosci. 20 , 3310–3318 (2000).

Grill‐Spector, K. et al. A sequence of object‐processing stages revealed by fMRI in the human occipital lobe. Hum. Brain Mapp. 6 , 316–328 (1998).

Malach, R. et al. Object-related activity revealed by functional magnetic resonance imaging in human occipital cortex. Proc. Natl Acad. Sci. USA 92 , 8135–8139 (1995).

Tarhan, L. & Konkle, T. Reliability-based voxel selection. Neuroimage 207 , 116350 (2020).

Acknowledgements

We thank Martin Schrimpf for help implementing CORnet-S, JohnMark Tayler for extracting the features from the three Resnet-50 models trained with the stylized images, and Thomas O’Connell, Brian Scholl, JohnMark Taylor, and Nick Turk-Brown for helpful discussions and feedback on the results. The project is supported by NIH grants 1R01EY022355 and 1R01EY030854 to Y.X. M.V.P. was supported in part by NIH Intramural Research Program ZIA MH002035.

Author information

Authors and Affiliations

Yaoda Xu: Psychology Department, Yale University, New Haven, CT, USA

Maryam Vaziri-Pashkam: Laboratory of Brain and Cognition, National Institute of Mental Health, Bethesda, MD, USA

Contributions

The fMRI data used here were from two prior publications, with M.V.-P. and Y.X. designing the fMRI experiments and M.V.-P. collecting and analyzing the fMRI data. Y.X. conceptualized the present study and performed the brain–CNN correlation analyses reported here. Y.X. wrote the paper with comments from M.V.-P.

Corresponding author

Correspondence to Yaoda Xu .

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Communications thanks Mark Lescroart and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer review reports are available.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

  • Supplementary Information
  • Peer Review File
  • Reporting Summary
  • Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article

Xu, Y., Vaziri-Pashkam, M. Limits to visual representational correspondence between convolutional neural networks and the human brain. Nat Commun 12 , 2065 (2021). https://doi.org/10.1038/s41467-021-22244-7

Received: 30 March 2020

Accepted: 05 March 2021

Published: 06 April 2021

DOI: https://doi.org/10.1038/s41467-021-22244-7

17 Data Visualization Techniques All Professionals Should Know

17 Sep 2019

There’s a growing demand for business analytics and data expertise in the workforce. But you don’t need to be a professional analyst to benefit from data-related skills.

Becoming skilled at common data visualization techniques can help you reap the rewards of data-driven decision-making, including increased confidence and potential cost savings. Learning how to effectively visualize data could be the first step toward using data analytics and data science to your advantage to add value to your organization.

Several data visualization techniques can help you become more effective in your role. Here are 17 essential data visualization techniques all professionals should know, as well as tips to help you effectively present your data.

What Is Data Visualization?

Data visualization is the process of creating graphical representations of information. This process helps the presenter communicate data in a way that’s easy for the viewer to interpret and draw conclusions.

There are many different techniques and tools you can leverage to visualize data, so it’s important to know which ones to use and when. Here are some of the most important data visualization techniques all professionals should know.

Data Visualization Techniques

The type of data visualization technique you leverage will vary based on the type of data you’re working with, in addition to the story you’re telling with your data.

Here are some important data visualization techniques to know:

  • Pie Chart
  • Bar Chart
  • Histogram
  • Gantt Chart
  • Heat Map
  • Box and Whisker Plot
  • Waterfall Chart
  • Area Chart
  • Scatter Plot
  • Pictogram Chart
  • Timeline
  • Highlight Table
  • Bullet Graph
  • Choropleth Map
  • Word Cloud
  • Network Diagram
  • Correlation Matrix

1. Pie Chart

Pie Chart Example

Pie charts are one of the most common and basic data visualization techniques, used across a wide range of applications. Pie charts are ideal for illustrating proportions, or part-to-whole comparisons.

Because pie charts are relatively simple and easy to read, they’re best suited for audiences who might be unfamiliar with the information or are only interested in the key takeaways. For viewers who require a more thorough explanation of the data, pie charts fall short in their ability to display complex information.
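
If you work in Python, a pie chart takes only a few lines with matplotlib (the data below is made up for illustration):

```python
import matplotlib.pyplot as plt

# A part-to-whole pie chart with sample data.
labels = ["Product A", "Product B", "Product C", "Other"]
shares = [45, 30, 15, 10]  # percentages of total sales

plt.pie(shares, labels=labels, autopct="%1.0f%%", startangle=90)
plt.title("Sales by Product")
plt.show()
```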

2. Bar Chart

Bar Chart Example

The classic bar chart, or bar graph, is another common and easy-to-use method of data visualization. In this type of visualization, one axis of the chart shows the categories being compared, and the other, a measured value. The length of the bar indicates how each group measures according to the value.

One drawback is that labeling and clarity can become problematic when there are too many categories included. Like pie charts, they can also be too simple for more complex data sets.
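
A matplotlib sketch of a basic bar chart, again with sample data:

```python
import matplotlib.pyplot as plt

# Categories on one axis, a measured value on the other.
regions = ["North", "South", "East", "West"]
revenue = [120, 95, 140, 80]  # sample revenue figures in $k

plt.bar(regions, revenue)
plt.ylabel("Revenue ($k)")
plt.title("Revenue by Region")
plt.show()
```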

3. Histogram

Histogram Example

Unlike bar charts, histograms illustrate the distribution of data over a continuous interval or defined period. These visualizations are helpful in identifying where values are concentrated, as well as where there are gaps or unusual values.

Histograms are especially useful for showing the frequency of a particular occurrence. For instance, if you’d like to show how many clicks your website received each day over the last week, you can use a histogram. From this visualization, you can quickly determine which days your website saw the greatest and fewest number of clicks.
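
Here’s what that click-count example might look like as a matplotlib histogram, using simulated data:

```python
import matplotlib.pyplot as plt
import numpy as np

# Simulated daily click counts for one year of traffic.
clicks = np.random.poisson(lam=200, size=365)

plt.hist(clicks, bins=20, edgecolor="black")
plt.xlabel("Clicks per day")
plt.ylabel("Number of days")
plt.show()
```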

4. Gantt Chart

Gantt Chart Example

Gantt charts are particularly common in project management, as they’re useful in illustrating a project timeline or progression of tasks. In this type of chart, tasks to be performed are listed on the vertical axis and time intervals on the horizontal axis. Horizontal bars in the body of the chart represent the duration of each activity.

Utilizing Gantt charts to display timelines can be incredibly helpful, and enable team members to keep track of every aspect of a project. Even if you’re not a project management professional, familiarizing yourself with Gantt charts can help you stay organized.
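
Matplotlib doesn’t have a dedicated Gantt function, but broken_barh gets you most of the way there; the tasks and dates below are sample data:

```python
import matplotlib.pyplot as plt

# Each task is a horizontal bar defined by (start_day, duration).
tasks = [("Research", 0, 5), ("Design", 4, 7),
         ("Build", 10, 14), ("Launch", 24, 3)]

fig, ax = plt.subplots()
for i, (name, start, length) in enumerate(tasks):
    ax.broken_barh([(start, length)], (i - 0.4, 0.8))
ax.set_yticks(range(len(tasks)))
ax.set_yticklabels([t[0] for t in tasks])
ax.invert_yaxis()            # first task at the top
ax.set_xlabel("Project day")
plt.show()
```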

5. Heat Map

Heat Map Example

A heat map is a type of visualization used to show differences in data through variations in color. These charts use color to communicate values in a way that makes it easy for the viewer to quickly identify trends. A clear legend is necessary for a user to successfully read and interpret a heat map.

There are many possible applications of heat maps. For example, if you want to analyze which time of day a retail store makes the most sales, you can use a heat map that shows the day of the week on the vertical axis and time of day on the horizontal axis. Then, by shading in the matrix with colors that correspond to the number of sales at each time of day, you can identify trends in the data that allow you to determine the exact times your store experiences the most sales.
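
A sketch of that day-by-hour sales heat map in matplotlib, with random sample data standing in for real sales figures:

```python
import matplotlib.pyplot as plt
import numpy as np

# Sales by day of week (rows) and hour of day (columns), sample data.
sales = np.random.randint(0, 50, size=(7, 12))
days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
hours = [f"{h}:00" for h in range(9, 21)]

fig, ax = plt.subplots()
im = ax.imshow(sales, cmap="YlOrRd")
ax.set_xticks(range(12))
ax.set_xticklabels(hours, rotation=45, ha="right")
ax.set_yticks(range(7))
ax.set_yticklabels(days)
fig.colorbar(im, label="Number of sales")  # the clear legend viewers need
plt.show()
```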

6. Box and Whisker Plot

Box and Whisker Plot Example

A box and whisker plot, or box plot, provides a visual summary of data through its quartiles. First, a box is drawn from the first quartile to the third quartile of the data set. A line within the box represents the median. “Whiskers,” or lines, are then drawn extending from the box to the minimum (lower extreme) and maximum (upper extreme). Outliers are represented by individual points plotted beyond the whiskers.

This type of chart is helpful in quickly identifying whether or not the data is symmetrical or skewed, as well as providing a visual summary of the data set that can be easily interpreted.
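
A quick box plot comparison in matplotlib, using randomly generated sample data:

```python
import matplotlib.pyplot as plt
import numpy as np

# Three groups of scores with different spreads (sample data).
data = [np.random.normal(100, 10, 200),
        np.random.normal(90, 20, 200),
        np.random.normal(110, 15, 200)]

plt.boxplot(data, labels=["Team A", "Team B", "Team C"])
plt.ylabel("Score")
plt.show()
```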

7. Waterfall Chart

Waterfall Chart Example

A waterfall chart is a visual representation that illustrates how a value changes as it’s influenced by different factors, such as time. The main goal of this chart is to show the viewer how a value has grown or declined over a defined period. For example, waterfall charts are popular for showing spending or earnings over time.
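
Matplotlib has no built-in waterfall chart, but you can build one from a bar chart by starting each bar where the running total left off (sample data below):

```python
import matplotlib.pyplot as plt
import numpy as np

# Quarterly changes to a starting value (sample data).
labels = ["Start", "Q1", "Q2", "Q3", "Q4"]
changes = [100, 30, -20, 45, -10]

# Each bar's bottom is the cumulative total before that bar.
bottoms = np.concatenate([[0], np.cumsum(changes)[:-1]])
colors = ["gray"] + ["green" if c >= 0 else "red" for c in changes[1:]]

plt.bar(labels, changes, bottom=bottoms, color=colors)
plt.ylabel("Value")
plt.title("Running total by quarter")
plt.show()
```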

8. Area Chart

Area Chart Example

An area chart, or area graph, is a variation on a basic line graph in which the area underneath the line is shaded to represent the total value of each data point. When several data series must be compared on the same graph, stacked area charts are used.

This method of data visualization is useful for showing changes in one or more quantities over time, as well as showing how each quantity combines to make up the whole. Stacked area charts are effective in showing part-to-whole comparisons.
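
A stacked area chart in matplotlib using stackplot, with sample data:

```python
import matplotlib.pyplot as plt

# Three product lines stacked to show both individual and total sales.
months = range(1, 7)
product_a = [10, 12, 14, 13, 15, 18]
product_b = [5, 6, 8, 9, 9, 10]
product_c = [2, 3, 3, 4, 5, 6]

plt.stackplot(months, product_a, product_b, product_c,
              labels=["Product A", "Product B", "Product C"])
plt.legend(loc="upper left")
plt.xlabel("Month")
plt.ylabel("Sales")
plt.show()
```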

9. Scatter Plot

Scatter Plot Example

Another technique commonly used to display data is a scatter plot. A scatter plot displays data for two variables as represented by points plotted against the horizontal and vertical axes. This type of data visualization is useful in illustrating the relationships that exist between variables and can be used to identify trends or correlations in data.

Scatter plots are most effective for fairly large data sets, since it’s often easier to identify trends when more data points are present. Additionally, the more tightly the data points cluster around a line, the stronger the correlation or trend tends to be.
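
A scatter plot sketch in matplotlib, using simulated ad-spend and sales data with a built-in positive trend:

```python
import matplotlib.pyplot as plt
import numpy as np

# Two related variables: ad spend and sales (simulated, noisy trend).
ad_spend = np.random.uniform(1, 10, 100)
sales = 3 * ad_spend + np.random.normal(0, 2, 100)

plt.scatter(ad_spend, sales, alpha=0.6)
plt.xlabel("Ad spend ($k)")
plt.ylabel("Sales ($k)")
plt.show()
```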

10. Pictogram Chart

Pictogram Example

Pictogram charts, or pictograph charts, are particularly useful for presenting simple data in a more visual and engaging way. These charts use icons to visualize data, with each icon representing a different value or category. For example, data about time might be represented by icons of clocks or watches. Each icon can correspond to either a single unit or a set number of units (for example, each icon represents 100 units).

In addition to making the data more engaging, pictogram charts are helpful in situations where language or cultural differences might be a barrier to the audience’s understanding of the data.

11. Timeline

Timeline Example

Timelines are the most effective way to visualize a sequence of events in chronological order. They’re typically linear, with key events outlined along the axis. Timelines are used to communicate time-related information and display historical data.

Timelines allow you to highlight the most important events that occurred, or need to occur in the future, and make it easy for the viewer to identify any patterns appearing within the selected time period. While timelines are often relatively simple linear visualizations, they can be made more visually appealing by adding images, colors, fonts, and decorative shapes.

12. Highlight Table

Highlight Table Example

A highlight table is a more engaging alternative to traditional tables. By highlighting cells in the table with color, you can make it easier for viewers to quickly spot trends and patterns in the data. These visualizations are useful for comparing categorical data.

Depending on the data visualization tool you’re using, you may be able to add conditional formatting rules to the table that automatically color cells that meet specified conditions. For instance, when using a highlight table to visualize a company’s sales data, you may color cells red if the sales data is below the goal, or green if sales were above the goal. Unlike a heat map, the colors in a highlight table are discrete and represent a single meaning or value.

13. Bullet Graph

Bullet Graph Example

A bullet graph is a variation of a bar graph that can act as an alternative to dashboard gauges to represent performance data. The main use for a bullet graph is to inform the viewer of how a business is performing in comparison to benchmarks that are in place for key business metrics.

In a bullet graph, the darker horizontal bar in the middle of the chart represents the actual value, while the vertical line represents a comparative value, or target. If the horizontal bar passes the vertical line, the target for that metric has been surpassed. Additionally, the segmented colored sections behind the horizontal bar represent range scores, such as “poor,” “fair,” or “good.”

14. Choropleth Map

Choropleth Map Example

A choropleth map uses color, shading, and other patterns to visualize numerical values across geographic regions. These visualizations use a progression of color (or shading) on a spectrum to distinguish high values from low.

Choropleth maps allow viewers to see how a variable changes from one region to the next. A potential downside to this type of visualization is that the exact numerical values aren’t easily accessible because the colors represent a range of values. Some data visualization tools, however, allow you to add interactivity to your map so the exact values are accessible.
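To illustrate that interactive workaround, here is a minimal choropleth sketch using Plotly Express (an assumed dependency) with made-up per-state values; hovering over a state reveals the exact number behind its color:

```python
import pandas as pd
import plotly.express as px

# Hypothetical values keyed by two-letter US state codes
df = pd.DataFrame({"state": ["CA", "TX", "NY", "FL"],
                   "value": [88, 72, 95, 61]})

fig = px.choropleth(
    df,
    locations="state",
    locationmode="USA-states",
    color="value",
    scope="usa",
    color_continuous_scale="Blues",   # single-hue progression, low to high
)
fig.show()   # interactive: tooltips expose the exact values
```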

15. Word Cloud

Word Cloud Example

A word cloud, or tag cloud, is a visual representation of text data in which the size of each word is proportional to its frequency. The more often a specific word appears in a dataset, the larger it appears in the visualization. In addition to size, words often appear bolder or follow a specific color scheme depending on their frequency.

Word clouds are often used on websites and blogs to identify significant keywords and compare differences in textual data between two sources. They are also useful when analyzing qualitative datasets, such as the specific words consumers used to describe a product.
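A minimal sketch using the third-party wordcloud package (an assumed dependency) on a made-up snippet of customer-review text:

```python
import matplotlib.pyplot as plt
from wordcloud import WordCloud   # third-party: pip install wordcloud

# Hypothetical customer-review text; repeated words render larger
text = ("fast reliable fast simple support friendly fast "
        "reliable intuitive simple fast support reliable")

wc = WordCloud(width=600, height=300, background_color="white").generate(text)

plt.imshow(wc, interpolation="bilinear")   # word size tracks frequency
plt.axis("off")
plt.show()
```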

16. Network Diagram

Network Diagram Example

Network diagrams are a type of data visualization that represents relationships between qualitative data points. These visualizations are composed of nodes and links, also called edges. Nodes are individual data points connected to other nodes through edges, which show the relationships between them.

There are many use cases for network diagrams, including depicting social networks, highlighting the relationships between employees at an organization, or visualizing product sales across geographic regions.
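A minimal sketch of nodes and edges using NetworkX (an assumed dependency); the employee names and working relationships here are hypothetical:

```python
import matplotlib.pyplot as plt
import networkx as nx

# Hypothetical collaboration links between employees
G = nx.Graph()
G.add_edges_from([
    ("Ana", "Ben"), ("Ana", "Cam"), ("Ben", "Dee"),
    ("Cam", "Dee"), ("Dee", "Eli"),
])

nx.draw(G, with_labels=True, node_color="lightblue", node_size=1200)
plt.show()
```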

17. Correlation Matrix

Correlation Matrix Example

A correlation matrix is a table that shows correlation coefficients between variables. Each cell represents the relationship between two variables, and a color scale is used to communicate whether the variables are correlated and to what extent.

Correlation matrices are useful to summarize and find patterns in large data sets. In business, a correlation matrix might be used to analyze how different data points about a specific product might be related, such as price, advertising spend, launch date, etc.
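As a sketch of that product example, assuming pandas and matplotlib are available, the snippet below computes pairwise correlation coefficients for made-up price, ad-spend, and unit-sales columns and renders them on a color scale:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical product data
df = pd.DataFrame({
    "price":    [9.99, 12.49, 10.99, 14.99, 11.49],
    "ad_spend": [1.2, 2.5, 1.8, 3.1, 2.0],
    "units":    [410, 520, 450, 600, 480],
})

corr = df.corr()            # pairwise correlation coefficients, -1 to 1
print(corr.round(2))

fig, ax = plt.subplots()
im = ax.imshow(corr, cmap="coolwarm", vmin=-1, vmax=1)  # color encodes strength
ax.set_xticks(range(len(corr)), corr.columns)   # labeled ticks (matplotlib >= 3.5)
ax.set_yticks(range(len(corr)), corr.columns)
fig.colorbar(im)
plt.show()
```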

Other Data Visualization Options

While the examples listed above are some of the most commonly used techniques, there are many other ways you can visualize data to become a more effective communicator. Some other data visualization options include:

  • Bubble clouds
  • Circle views
  • Dendrograms
  • Dot distribution maps
  • Open-high-low-close charts
  • Polar areas
  • Radial trees
  • Ring charts
  • Sankey diagrams
  • Span charts
  • Streamgraphs
  • Violin plots
  • Wedge stack graphs


Tips For Creating Effective Visualizations

Creating effective data visualizations requires more than knowing how to choose the best technique for your needs. There are several other considerations to take into account to maximize your effectiveness when presenting data.

Related : What to Keep in Mind When Creating Data Visualizations in Excel

One of the most important steps is to evaluate your audience. For example, if you’re presenting financial data to a team that works in an unrelated department, you’ll want to choose a fairly simple illustration. On the other hand, if you’re presenting financial data to a team of finance experts, it’s likely you can safely include more complex information.

Another helpful tip is to avoid unnecessary distractions. Although visual elements like animation can be a great way to add interest, they can also distract from the key points the illustration is trying to convey and hinder the viewer’s ability to quickly understand the information.

Finally, be mindful of the colors you utilize, as well as your overall design. While it’s important that your graphs or charts are visually appealing, there are more practical reasons you might choose one color palette over another. For instance, using low contrast colors can make it difficult for your audience to discern differences between data points. Using colors that are too bold, however, can make the illustration overwhelming or distracting for the viewer.

Related : Bad Data Visualization: 5 Examples of Misleading Data

Visuals to Interpret and Share Information

No matter your role or title within an organization, data visualization is a skill that’s important for all professionals. Being able to present complex data through easy-to-understand visual representations is invaluable when communicating information to people both inside and outside your business.

There’s no shortage of ways data visualization can be applied in the real world. Data is playing an increasingly important role in the marketplace today, and data literacy is the first step in understanding how analytics can be used in business.

Are you interested in improving your analytical skills? Learn more about Business Analytics, our eight-week online course that can help you use data to generate insights and tackle business decisions.

This post was updated on January 20, 2022. It was originally published on September 17, 2019.


Learning Through Visuals

Visual imagery in the classroom

Haig Kouyoumdjian, Ph.D.

Posted July 20, 2012

A large body of research indicates that visual cues help us to better retrieve and remember information. The research outcomes on visual learning make complete sense when you consider that our brain is mainly an image processor (much of our sensory cortex is devoted to vision), not a word processor. In fact, the part of the brain used to process words is quite small in comparison to the part that processes visual images.

Words are abstract and rather difficult for the brain to retain, whereas visuals are concrete and, as such, more easily remembered. To illustrate, think about your past school days of having to learn a set of new vocabulary words each week. Now, think back to the first kiss you had or your high school prom date. Most probably, you had to put forth great effort to remember the vocabulary words. In contrast, when you were actually having your first kiss or your prom date, I bet you weren’t trying to commit them to memory. Yet, you can quickly and effortlessly visualize these experiences (now, even years later). You can thank your brain’s amazing visual processor for your ability to easily remember life experiences. Your brain memorized these events for you automatically and without you even realizing what it was doing.

There are countless studies that have confirmed the power of visual imagery in learning. For instance, one study asked students to remember many groups of three words each, such as dog, bike, and street. Students who tried to remember the words by repeating them over and over again did poorly on recall. In comparison, students who made the effort to make visual associations with the three words, such as imagining a dog riding a bike down the street, had significantly better recall.

Various types of visuals can be effective learning tools: photos, illustrations, icons, symbols, sketches, figures, and concept maps, to name only a few. Consider how memorable the visual graphics in logos are, for example. You recognize the brand by seeing the visual graphic, even before reading the name of the brand. This type of visual can be so effective that earlier this year Starbucks simplified its logo by dropping the printed name and keeping only the graphic image popularly referred to as a mermaid (technically, it’s a siren). I think we can safely assume that the Starbucks Corporation must be keenly aware of how our brains have automatically and effortlessly committed its graphic image to memory.

So powerful is visual learning that I embrace it in my teaching and writing. Each page in the psychology textbooks I coauthor has been individually formatted to maximize visual learning. Each lecture slide I use in class is presented in a way to make the most of visual learning. I believe the right visuals can help make abstract and difficult concepts more tangible and welcoming, as well as make learning more effective and long lasting. This is why I scrutinize every visual I use in my writing and teaching to make sure it is paired with content in a clear, meaningful manner.

Based upon research outcomes, the effective use of visuals can decrease learning time, improve comprehension, enhance retrieval, and increase retention. In addition, the many testimonials I hear from my students and readers weigh heavily in my mind as support for the benefits of learning through visuals. I hear it often and still I can’t hear it enough times . . . by retrieving a visual cue presented on the pages of a book or on the slides of a lecture presentation, a learner is able to accurately retrieve the content associated with the visual.

McDaniel, M. A., & Einstein, G. O. (1986). Bizarre imagery as an effective memory aid: The importance of distinctiveness. Journal of Experimental Psychology: Learning, Memory, and Cognition, 12(1), 54-65.

Meier, D. (2000). The accelerated learning handbook. New York: McGraw-Hill.

Patton, W. W. (1991). Opening students’ eyes: Visual learning theory in the Socratic classroom. Law and Psychology Review, 15, 1-18.

Schacter, D. L. (1996). Searching for memory. New York: Basic Books.

Verdi, M. P., Johnson, J. T., Stock, W. A., Kulhavy, R. W., & Whitman-Ahern, P. (1997). Organized spatial displays and texts: Effects of presentation order and display type on learning outcomes. Journal of Experimental Education, 65, 303-317.

About the Author

Haig Kouyoumdjian, Ph.D., is a clinical psychologist and coauthor of Introduction to Psychology, 9th ed., and the innovative Discovery Series: Introduction to Psychology.


The Daily Sound

Everything You Need to Know About Visual Representation

School reports, proposal presentations, office meetings, or teaching students: whatever you’re doing that involves numbers and large amounts of data, visual representation can make your job easier. For many years, we have used it to make data perceivable to our audiences. It gives a clear idea of what information means by placing it in visual context.

Coming up with an effective, broadly readable way to display data takes creativity and effort. Fortunately, graph makers like Venngage can help you make your presentations vivid and communicative.


What is a visual representation and why is it important?

Visual representation is an effective and fast method to convey figures and statistical data in a comprehensible manner. It makes information easily accessible by tapping into your audience’s natural tendency to learn by seeing and interacting.

Companies use it to identify areas that need improvement or to compare, for example, sales across the last six months. It gives the numbers in a study shape and pattern and allows quick comparison of data, which is crucial in analysis and decision making.

Without a visual representation of data, identifying correlations and relationships between variables is challenging.

Who uses visual representation?

With today’s technology, people gravitate toward easier ways to complete tasks. With the help of desktops, laptops, and smartphones, we can easily gather data and digitize reports.

That’s why almost everyone uses visual representation. Students, employees, and big companies all benefit from it. It’s universal, fast, and efficient in conveying data and knowledge.

Types of Visual Representations


Visual representation comes in different forms and shapes that help us organize and understand data in a much simpler way. Venngage also offers a wide range of ready-to-use templates for such projects. It lets you choose from a simple line graph to much more complex and detailed presentations.

To give you an idea, here are some examples:

Graphs and Charts


Graphs and charts compress a large quantity of information into a comprehensible format that clearly and effectively communicates important elements.

To present your data effectively, you need to know the purpose of your graph or chart and what you want to present. Filter what you want to include and decide whether values should be expressed as frequencies or categories.

Graphs and charts come in many different types; familiarity with them makes it easier for a business to choose the one that best fits its needs.

Maps


You can use a map to clearly illustrate gathered demographic information such as age, ethnicity, race, gender, and marital status. This type of representation is best used for plotting population data and survey results.

Relatable data illustrations

Using relatable icons or symbols in your data report can also help convey your message to your audience. It adds life and color to your presentations, making your report comprehensible and pleasing to the eye.

Tree diagram

A tree diagram represents variables or events in sequence, branching into the possible outcomes or actions that follow. It is also well suited to organizing a hierarchy of systems.

How to use visual representation effectively

Behind every effective visual representation is a hardworking person who puts their ideas and plans into action. Here are some quick tips to help you create a successful visual presentation:

Gather all the needed data

Gathering all the needed data for your report is a crucial step in making a successful presentation. Incomplete data will show in your visual presentation’s outcome. Remember that every variable matters in statistical reports.

Organize all data

After gathering all the important data for your presentation, review it item by item and group related data together. Pairing or grouping your data makes the next step easier.

Choose the right format

After organizing your data, analyze what type of data visualization would best suit your report. Choosing the right format increases your visual representation’s chances of success.

Be creative and fun

Do not forget to add color to your presentation. Reports can sometimes drag, and one way to spice them up is to make your presentation visually enticing.

A light, fun mood can make presentations and learning exciting and easy.

Using visual representation is easy and is a sure way to present your variables effectively. All you have to do is consider your resources and make sure you have all the essential tools for crafting your visuals. Graph-maker platforms can also be a big help if you’re unsure which template best fits your data.

Now that you have all the information you need to make data presentations, you can start creating one today!


What is visual communication, and how can it revolutionize your workflow?

From emojis to GIFs and video calls to presentations, visuals have a strong influence on our everyday lives. But how can we use visual information as a communication process—and can this help productivity? Find the answers to these questions in this guide to visual communication.

In modern life, we’re surrounded by visuals—phone calls have been swapped for FaceTime and Zoom, social media is long past the days of text-only posts, and marketing has become increasingly reliant on images, videos, and illustrations to capture audiences’ attention.

These visual elements are also being adopted in communication strategies, particularly in the workplace. New digital technologies, such as screen capture tools for sharing information asynchronously , are harnessing the power of visual communication to help teams work in more flexible and productive ways.

In this article, we’re going to explain how you can use visual communication to improve team collaboration and streamline workflows. But first, let’s dive into the different types of visual communication and why visual information is so important.

What is visual communication?

Put simply, visual communication is the process of conveying meaning—be it ideas, instructions, data, or other kinds of information—through graphics rather than text or audio. For many, this is a more efficient and accessible way of sharing knowledge and adding context than written communication.

Visual communication can be achieved in a variety of different ways. Examples of visual communication include:

  • Videos and photos
  • Graphs, charts, infographics, and other types of data visualization
  • Maps (such as mind maps and content maps)
  • Illustrations and graphic design
  • Slide decks and presentations
  • Screen capture and recordings

These types of visual assets are commonly used in social media and content marketing to communicate ideas and information where more text-heavy formats fail to make an impact.

Why is visual communication important?

According to research, 50% of people are visual learners and prefer visual content to learn information more effectively. This means that visual communication doesn’t just matter in the workplace—it’s beneficial for wider society as well.

Presenting information visually allows you to convey your message with more impact than text can achieve. A visual communication strategy should be an essential part of your business activities—especially your content marketing .

Visual elements are crucial in the content creation process. On certain social media platforms like Instagram and TikTok, text content simply doesn’t have the same impact as images and videos.

According to research, our collective attention span is narrowing and the always-on nature of digital life presents all kinds of distractions. Visual communication helps you cut through the noise and get your message across through high-quality, memorable content.

These aren’t the only reasons why visual communication is important. This form of communication:

  • Attracts attention and boosts engagement
  • Evokes stronger emotions from your target audience
  • Improves information recall
  • Saves time, as information is relayed faster and more efficiently
  • Solidifies brand identity, ensuring a shared experience and a unified message

How visual communication can improve your workflow

It can be difficult to connect teams working remotely—and often it seems easier to set up a quick meeting over a video call to collaborate on ideas or discuss projects. But any virtual-first worker will tell you that video meetings can sometimes waste time rather than improve productivity.

With an effective visual communication strategy, you can reduce unnecessary meetings . The use of visual elements and visual aids enables you to relay complex information—such as instructions for using certain tools, or the specifics of a new project—in a more focused, engaging, and digestible way. In turn, this helps to streamline your workflows and simplifies any decision-making processes.

Because there are so many different types of visual communication, you can get creative with how you share information and collaborate with clients and colleagues. With innovative new technologies, you’re no longer restricted to pie charts and presentations for creating and sharing a visual message. 

In fact, visual communication can benefit your business in all kinds of ways:

  • Make the employee onboarding process quicker and more efficient with narrated screen captures in Dropbox Capture
  • Use Capture to create GIFs that explain organization systems, programs, and training tools
  • Take screenshots of ideas and sources of inspiration to share with your team and refer back to when it’s time to use them
  • Host async meetings to keep workflows on track and avoid wasting time in unnecessary or unfocused meetings!
  • Tools like Dropbox Replay help you give clearer feedback, as you can pinpoint specific points of improvement in videos using on-screen markups

Tips for implementing visual communication in the workplace

The best visual communication strategy for your team will depend on a few different things, including the team size and goals. To facilitate this process and make the most of the benefits of visual communication, there is a range of things you can do.

Consistency is key

Firstly, you’ll need to be consistent with the styling and branding of your visual assets—and not just for external resources such as marketing materials. Value quality over quantity, so that everything you produce achieves its potential impact. Create templates , video tutorials, and brand guidelines, and ensure everyone on your team knows where these are stored and how to use them when creating visual elements. 

Don’t forget narrative

Additionally, you’ll want to consider your storytelling strategies and how you can weave narrative into the visual materials you create. For both internal and external communications, this will come down to knowing your audience. It may be beneficial to build audience personas in collaborative tools like Dropbox Paper , to guide your strategy for customer-facing content.

Prioritize ease of use and accessibility

Introducing software that is too complicated to use could discourage your team members from adopting visual communication into their day-to-day work. Make sure you only use collaborative tools that are accessible to everyone—this includes how you organize your visual assets. 

With Dropbox Capture , you can create screenshots, GIFs, or simple videos recorded right on your screen. Your creations will be saved to your Dropbox account , where you can easily share them with your colleagues to watch or review on their own time. 

Create team folders for templates, training videos, meeting minutes, and other communications, so that everyone in your team has access to the visual materials they need when they need them.

Harness the power of visual storytelling

With Dropbox Capture , you can clearly say what you mean without scheduling anything. Replace lengthy emails and meetings, streamline your onboarding and support processes, and walk through ideas, proposals, tutorials, and projects in a way that gives everyone the complete picture.

Say more, meet less.

Try Dropbox Capture


Title: XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception

Abstract: Speech recognition and translation systems perform poorly on noisy inputs, which are frequent in realistic environments. Augmenting these systems with visual signals has the potential to improve robustness to noise. However, audio-visual (AV) data is only available in limited amounts and for fewer languages than audio-only resources. To address this gap, we present XLAVS-R, a cross-lingual audio-visual speech representation model for noise-robust speech recognition and translation in over 100 languages. It is designed to maximize the benefits of limited multilingual AV pre-training data, by building on top of audio-only multilingual pre-training and simplifying existing pre-training schemes. Extensive evaluation on the MuAViC benchmark shows the strength of XLAVS-R on downstream audio-visual speech recognition and translation tasks, where it outperforms the previous state of the art by up to 18.5% WER and 4.7 BLEU given noisy AV inputs, and enables strong zero-shot audio-visual ability with audio-only fine-tuning.



LOOK: Dodgers' Shohei Ohtani unveils logo with New Balance, a 'visual representation' of his baseball journey

The MLB superstar now has his own logo.


So many of the top athletes in the world have their own brand, which is represented by their own logo. MLB superstar Shohei Ohtani joined that club today when he unveiled his logo in a partnership with New Balance.

Ohtani, who just signed a massive 10-year contract worth $700 million with the Los Angeles Dodgers in the offseason, has been with New Balance for several years now. As of Tuesday, Ohtani now has his own logo with the sports apparel company.

The image features Ohtani rounding first base, and it will be featured on a number of different products. "In that run, every soul delights," New Balance wrote in the caption on social media.


"To finally reveal this special logo that I've worked closely on is truly an exciting moment for me," Ohtani said in a statement. "It is a visual representation of my journey in baseball and I am excited to share it with the world. I also look forward to using this logo with future projects that we will reveal throughout the 2024 season."

Through his first six MLB seasons, Ohtani has established himself as a household name by playing both ways. In his 701 games, Ohtani has hit .274 with 171 home runs and 437 RBI. In 86 starts on the mound, Ohtani has posted a 3.01 ERA while piling up 608 strikeouts.

