Using an Amylase Activity to Introduce Statistics
Embedding Statistics in Biology Classes
In my high school biology classes, I decided to focus on scaffolding statistics skills right from the start of the year and embed them into lessons I was already teaching, instead of having a skills unit and then moving on to content.
This approach, embedding a science practice like data analysis into content teaching, reflects a shift in science teaching over the last decade, from the minutiae of content to the actual practice of science. Both the AP® Sciences and the Next Generation Science Standards stress teaching content through skills using the science practices. Teachers are encouraged to have their students “do science” by constructing explanations and engaging in argumentation. Providing evidence for scientific claims and engaging in argument from evidence often require statistical analysis of data, which makes statistics a key science practice.
Challenges and Solutions for Teachers
Despite the importance of statistical skills, many teachers worry that they don’t have enough background in statistics to teach it to their students. Many biology teachers have never taken a course in statistics; I hadn’t had one since college and had to go back to my textbooks and to the internet to brush up.
I was worried that I didn’t have the background to write good lessons that incorporated both biology content and statistics skills. So I looked for activities I could give to my students that combined biology and statistics, and found that BioInteractive has a wide variety of resources that do just that. One of these resources is the activity “Diet and the Evolution of Salivary Amylase,” which embeds statistical analysis in a study of enzymes, genes, and human evolution. Just what I was looking for!
What Is Salivary Amylase?
Salivary amylase is a digestive enzyme that helps to begin the breakdown of complex carbohydrates in the mouth. I teach about biomolecules and enzymes early in the year, as do many teachers, which makes it easy to introduce BioInteractive’s amylase activity early on.
By doing the activity and discussing carbohydrates in class, students learn about one particular complex carbohydrate, starch, which is made of repeating units of glucose. Salivary α-amylase (the type of salivary amylase found in humans) breaks bonds between the glucose units in starch, producing the disaccharide maltose. Maltose is then broken down to glucose by maltase in the small intestine. Figure 1 summarizes this process.

As discussed in the activity, the ability to digest starch more efficiently may be the result of having multiple copies of the salivary amylase gene AMY1. During crossing over, it is possible for genes to be duplicated, giving some individuals more copies of this gene. Consequently, they could have more amylase in their saliva.
The availability of starch-rich foods increased during the agricultural revolution. As a result, having more copies of AMY1 may have conferred a survival advantage, because it allowed higher-starch foods to be metabolized more efficiently, making energy available more quickly.
Using Part A of the Amylase Activity
The “Diet and the Evolution of Salivary Amylase” activity has two parts, A and B, that each focus on a different published data set. The first task students encounter in Part A is to analyze data on AMY1 copy numbers and amylase production (Table 1). They must determine whether there is a correlation between the number of copies of AMY1 an individual has and the amount of salivary amylase, i.e., AMY1 protein, that person produces.
Table 1. AMY1 copy number and amylase production in a European-American population
Individual | Number of AMY1 Gene Copies | AMY1 Protein in Saliva (mg/mL) |
---|---|---|
1 | 7 | 3.85 |
2 | 5 | 1.09 |
3 | 12 | 5.17 |
4 | 6 | 3.24 |
5 | 8 | 2.80 |
6 | 6 | 3.30 |
7 | 7 | 2.89 |
8 | 11 | 3.76 |
9 | 6 | 2.65 |
10 | 3 | 0.93 |
11 | 8 | 2.46 |
12 | 5 | 1.37 |
13 | 5 | 2.33 |
14 | 7 | 3.37 |
15 | 9 | 3.72 |
16 | 7 | 5.67 |
17 | 6 | 4.61 |
18 | 6 | 4.33 |
19 | 3 | 3.13 |
20 | 4 | 4.24 |
21 | 7 | 4.33 |
22 | 8 | 1.89 |
23 | 8 | 3.48 |
24 | 4 | 1.83 |
25 | 7 | 3.41 |
Source: Data from G. H. Perry et al., “Diet and the evolution of human amylase gene copy number variation,” Nature Genetics 39 (2007): 1256–1260.
Consider the data above. Which data display do you think is most appropriate for it?
After examining the data, students determine which type of graph to use for plotting this data. Early in the year, I like to have my students learn to make graphs they might not be familiar with by hand. (If you teach AP® Biology, you may want to start having them make graphs by hand again closer to the exam to give them practice.) Alternatively, students can use the spreadsheet from the “Data File (Excel)” download on the activity page. The data from Table 1 is already populated in the first tab of the spreadsheet, so it is quick to have students choose the type of graph they want in Excel or Google Sheets. If your students have never used a spreadsheet program to do graphing or statistics, BioInteractive has tutorials on how to do this.
Since they are correlating two variables, the appropriate graph for this data is a scatter plot (example shown in Figure 2). Most students have been exposed to line graphs and bar graphs, but fewer are familiar with scatter plots. Many try to make a histogram or line graph and struggle with how to represent and/or interpret the data. For example, looking at the way the data is represented in the data table, some of my students tried to make a bar for each individual along the x-axis, but got confused as to what should go on the y-axis, since there are two variables. I would ask them if they had ever heard of a scatter plot and some had, but for those who hadn’t, a quick “just in time” lesson usually did the trick. Another option is to give them a short rundown of different types of graphs and then let them decide where to use them.

The activity then asks students whether there is a correlation between the number of amylase genes and the concentration of amylase produced. In general, students agree that there appears to be a correlation. However, a scatter plot alone cannot show whether the correlation is statistically significant; for that, students need to do a regression analysis and test the correlation coefficient. Once students have the appropriate preparation in statistics, they can do this analysis themselves in the “Math Extension for Part A” at the end of the activity. Once a regression line is drawn and Pearson’s correlation coefficient (r) is calculated, students see that the correlation is significant, though not strong, which surprises many of them.
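For teachers who want to check the numbers outside a spreadsheet, here is a minimal sketch in Python of the Part A analysis. It is not taken from the activity: the language, the function name `pearson_r`, and the choice to compare a t statistic against a tabled critical value (2.069 for 23 degrees of freedom, two-tailed, α = 0.05) are my own assumptions, and the Math Extension may walk students through the steps differently.

```python
import math

# Table 1 data (Perry et al. 2007): one entry per individual.
copies = [7, 5, 12, 6, 8, 6, 7, 11, 6, 3, 8, 5, 5,
          7, 9, 7, 6, 6, 3, 4, 7, 8, 8, 4, 7]
protein = [3.85, 1.09, 5.17, 3.24, 2.80, 3.30, 2.89, 3.76, 2.65,
           0.93, 2.46, 1.37, 2.33, 3.37, 3.72, 5.67, 4.61, 4.33,
           3.13, 4.24, 4.33, 1.89, 3.48, 1.83, 3.41]


def pearson_r(xs, ys):
    """Return Pearson's r plus the least-squares slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sxy / math.sqrt(sxx * syy), slope, intercept


r, slope, intercept = pearson_r(copies, protein)
n = len(copies)

# t statistic for testing r against zero, with n - 2 degrees of freedom.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Assumption: critical value copied from a standard t table
# (two-tailed, alpha = 0.05, df = 23).
T_CRIT_DF23 = 2.069

print(f"r = {r:.2f}, regression line: y = {slope:.2f}x + {intercept:.2f}")
print(f"t = {t_stat:.2f}; significant at 0.05: {abs(t_stat) > T_CRIT_DF23}")
```

Comparing against a tabled critical value mirrors what students do by hand; a statistics package would report an exact p-value instead.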
Using Part B of the Amylase Activity
In Part B of the activity, students explore populations that traditionally eat a high-starch diet versus a low-starch diet. They analyze a new data set showing how many copies of AMY1 individuals from these populations have (Table 2).
Table 2. AMY1 copy number and dietary starch in individuals from different populations
Population (High-Starch Diet Profile) | # of AMY1 Gene Copies | Population (Low-Starch Diet Profile) | # of AMY1 Gene Copies |
---|---|---|---|
European-American | 4 | Biaka | 8 |
European-American | 8 | Biaka | 4 |
European-American | 11 | Biaka | 2 |
European-American | 6 | Biaka | 5 |
European-American | 5 | Biaka | 4 |
European-American | 6 | Biaka | 4 |
European-American | 6 | Biaka | 6 |
European-American | 15 | Biaka | 7 |
European-American | 8 | Biaka | 4 |
European-American | 8 | Mbuti | 4 |
European-American | 7 | Mbuti | 7 |
Hadza | 15 | Mbuti | 4 |
Hadza | 5 | Mbuti | 4 |
Hadza | 7 | Mbuti | 5 |
Hadza | 6 | Mbuti | 4 |
Hadza | 3 | Yakut | 9 |
Hadza | 7 | Yakut | 4 |
Japanese | 10 | Yakut | 5 |
Japanese | 6 | Yakut | 5 |
Japanese | 6 | Yakut | 9 |
Japanese | 5 | Yakut | 10 |
Japanese | 6 | Yakut | 8 |
Japanese | 5 | Yakut | 5 |
Japanese | 6 | Datog | 2 |
Japanese | 7 | Datog | 8 |
Source: Data from G. H. Perry et al., “Diet and the evolution of human amylase gene copy number variation,” Nature Genetics 39 (2007): 1256–1260.
Consider the data above. Which data display do you think is most appropriate for it?
This part of the activity offers another opportunity for students to practice graphing and also introduces adding error bars to a bar graph. As my students were doing this part, I walked around the room looking over their shoulders to see how they were graphing the data. Most knew that a bar graph was the appropriate graph in this situation, but some graphed each data point as its own bar. The graph was very “busy,” and I asked if they were able to make any sense of this data. I stepped back and let them discuss it, and they realized that graphing the mean for each diet profile with error bars (example in Figure 3) would be easier to interpret. I have learned that allowing students to struggle with how to represent the data is important for them to gain an understanding of how data is analyzed. It is so much easier to just tell a student what mistakes they are making; it’s tougher to let them figure it out on their own.
In Part B, students also practice calculating and plotting means, standard deviations, and 95% confidence intervals. You can have the students calculate the statistics, graph the data, and add the error bars for the 95% confidence intervals by hand. Alternatively, they can do this in the spreadsheet from the “Data File (Excel)” download on the activity page, by using built-in spreadsheet functions on the data in the “Table 2” tab. BioInteractive has a spreadsheet tutorial that explains how to use such functions, including how to add error bars, in both Excel and Google Sheets.
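If you would like to check your students' hand or spreadsheet calculations, the summary statistics can be sketched in a few lines of Python. This is my own illustration, not part of the activity. The helper name `summarize` is hypothetical, and I am assuming the confidence interval is computed as the mean ± t × SEM with the tabled critical value 2.064 (24 degrees of freedom); the activity or your spreadsheet may use a slightly different multiplier, such as 2 × SEM.

```python
import math
import statistics

# Table 2 data (Perry et al. 2007), pooled by diet profile.
high_starch = [4, 8, 11, 6, 5, 6, 6, 15, 8, 8, 7,   # European-American
               15, 5, 7, 6, 3, 7,                   # Hadza
               10, 6, 6, 5, 6, 5, 6, 7]             # Japanese
low_starch = [8, 4, 2, 5, 4, 4, 6, 7, 4,            # Biaka
              4, 7, 4, 4, 5, 4,                     # Mbuti
              9, 4, 5, 5, 9, 10, 8, 5,              # Yakut
              2, 8]                                 # Datog

# Assumption: critical value from a standard t table
# (two-tailed, alpha = 0.05, df = 24, since each profile has n = 25).
T_CRIT_DF24 = 2.064


def summarize(values, t_crit):
    """Return the mean, sample standard deviation, and 95% CI bounds."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)          # sample SD (n - 1 denominator)
    sem = sd / math.sqrt(n)                # standard error of the mean
    margin = t_crit * sem                  # half-width of the error bar
    return mean, sd, (mean - margin, mean + margin)


high_mean, high_sd, high_ci = summarize(high_starch, T_CRIT_DF24)
low_mean, low_sd, low_ci = summarize(low_starch, T_CRIT_DF24)

# Two intervals overlap when each one starts before the other ends.
intervals_overlap = high_ci[0] <= low_ci[1] and low_ci[0] <= high_ci[1]

for label, mean, sd, ci in [("High-starch", high_mean, high_sd, high_ci),
                            ("Low-starch", low_mean, low_sd, low_ci)]:
    print(f"{label}: mean = {mean:.2f}, SD = {sd:.2f}, "
          f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
print("Error bars overlap:", intervals_overlap)
```

The `margin` value is the length of each error bar above and below the mean, which is the number students enter as a custom error amount in a spreadsheet chart.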

My students noticed that the error bars overlap for the two diet profiles. I had taught them that overlapping confidence intervals mean that there isn’t enough evidence to determine whether the means are significantly different. To determine whether the difference in the means is statistically significant, a t-test should be done. Once students have the appropriate preparation in statistics, they can perform the test themselves in the “Math Extension for Part B” at the end of the activity. When students perform the t-test, they find that there is a statistically significant difference between the means. The t-tests can also be performed within the spreadsheet, and BioInteractive has a tutorial for this as well if your students need help.
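For anyone who wants to verify the result without a spreadsheet, here is a hedged Python sketch of a two-sample t-test on the Table 2 data. It is my own illustration rather than the activity's procedure. I am assuming an equal-variance (pooled) test and a tabled critical value of 2.011 for 48 degrees of freedom (two-tailed, α = 0.05); the function name `pooled_t` is hypothetical, and the Math Extension may use a different form of the test.

```python
import math
import statistics

# Table 2 data (Perry et al. 2007), pooled by diet profile.
high_starch = [4, 8, 11, 6, 5, 6, 6, 15, 8, 8, 7,   # European-American
               15, 5, 7, 6, 3, 7,                   # Hadza
               10, 6, 6, 5, 6, 5, 6, 7]             # Japanese
low_starch = [8, 4, 2, 5, 4, 4, 6, 7, 4,            # Biaka
              4, 7, 4, 4, 5, 4,                     # Mbuti
              9, 4, 5, 5, 9, 10, 8, 5,              # Yakut
              2, 8]                                 # Datog


def pooled_t(a, b):
    """Return the equal-variance two-sample t statistic and its df."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pool the two sample variances, weighting each by its own df.
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    # Standard error of the difference between the two means.
    se_diff = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se_diff, df


t_stat, df = pooled_t(high_starch, low_starch)

# Assumption: critical value from a standard t table
# (two-tailed, alpha = 0.05, df = 48).
T_CRIT_DF48 = 2.011
significant = abs(t_stat) > T_CRIT_DF48

print(f"t = {t_stat:.2f}, df = {df}, significant at 0.05: {significant}")
```

Because both diet profiles have the same number of individuals, the pooled and unequal-variance (Welch) versions give the same t statistic here; they differ only in the degrees of freedom.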
Extensions
The amylase activity can be supplemented with other activities and BioInteractive resources. For example, before my students did the activity, I had them watch the BioInteractive short film Got Lactase? The Co-Evolution of Genes and Culture to spark their interest in enzymes and human evolution. Students learn about the enzyme lactase, which, like amylase, is a digestive enzyme involved in recent human adaptations. The film discusses the evolution of lactase persistence and how a change in the regulation of the lactase gene in populations that live a pastoral lifestyle may have been a selective advantage.
I also like having my students investigate their own salivary amylase. To do this, I give each student a starch agar plate and a sterile swab. I have them wet the swab with saliva, “draw a picture” with the swab on the starch plate, and set it aside for five minutes or so. They then flood the plate with iodine, which stains starch a bluish black; the picture they drew appears as a clearing on this plate because amylase in their saliva broke down the starch to maltose.
A fun extension to this is to give a student a swab to take home, collect a saliva sample from their dog or cat, and bring it back to test. The dog or cat will test negative for salivary amylase, because they do not have the AMY1 gene. However, they (and all other mammals) do produce pancreatic amylase. Dogs actually have more copies of the pancreatic amylase gene AMY2B than wolves do (Figure 4). When dogs were domesticated, it seems that there was selective pressure to adapt to a human diet, which included higher amounts of starch than the wolf diet. Students can learn more about amylase genes in dogs and how they have evolved over time in the BioInteractive video Dog Genomics and Dogs as Model Organisms (starting around 14:45).

Conclusions
I have found that starting the year by introducing statistics skills right away has given my students more confidence in using them. By embedding the stats in context, they start to make sense of why it is necessary to include statistics as evidence for their claims. As the year goes on, I add other BioInteractive activities that teach statistics skills when we get to genetics, evolution, ecology, etc. I’m hopeful that when students leave me and are confronted with claims that are made to them, they will naturally ask “What is the evidence?”
Kathy Van Hoeck taught high school biology for 24 years. Since retiring in 2017, she keeps busy consulting, writing curricula, and giving (and attending) professional development. Kathy loves traveling and spending time with family, especially at her cabin in the Northwoods of Wisconsin.