
Sample Size and Reproducibility

Above Image: Tomato plants grow in a greenhouse © Goldlocki, Wikimedia Commons

Even a well-planned experimental inquiry can include variability and errors beyond the experimenter’s control. There may be small variations in environmental conditions or the experimenter may not take the most accurate readings. So, how do experimenters know if the results they are observing are affected by one of these errors or by the effects of the independent variable (what the experimenter changed on purpose)?

Let’s look at an example. Say you decided to grow three tomato plants and you wanted to observe how a certain type of fertilizer affected the plants’ height after a certain number of days. You treated all three plants the same and controlled what you thought were all of the variables. But in the end the three plants were very different sizes. With so few plants, and so much variation, it would be impossible to determine how the fertilizer affected the height of the plants or if it had any effect at all.

Sample Size

One way that an experimenter can reduce the influence of random errors and natural variation is by using a larger sample size. The term sample size refers to the number of repeated measurements (e.g., the number of plants grown). In the example above, the sample size was three. In statistics, the sample size is represented by the letter “n.”

In general, the larger the sample size, the more confident the experimenter can be about the results and findings. With a larger sample size, there is a greater likelihood that random factors will cancel each other out and the average result will more accurately represent the phenomenon overall, or the population as a whole.
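The idea that random factors tend to cancel out in larger samples can be explored with a small simulation. The sketch below is only an illustration under made-up assumptions: the function name `spread_of_averages`, the "true" height of 19 cm, and the random variation of about 5 cm are all hypothetical values chosen for the example, not measurements from a real experiment.

```python
import random
import statistics


def spread_of_averages(sample_size, trials=2000, true_height=19.0,
                       noise_sd=5.0, seed=1):
    """Simulate many experiments and report how much their averages vary."""
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    averages = []
    for _ in range(trials):
        # Each simulated plant: assumed true height plus random variation.
        heights = [rng.gauss(true_height, noise_sd) for _ in range(sample_size)]
        averages.append(statistics.mean(heights))
    # Standard deviation of the averages: smaller means more consistent results.
    return statistics.stdev(averages)


for n in (3, 10, 30):
    spread = spread_of_averages(n)
    print(f"n = {n:2d}: averages typically vary by about {spread:.1f} cm")
```

In a run of this sketch, the averages from experiments with 30 plants should sit much closer together than the averages from experiments with 3 plants, which is one way to picture why a larger sample gives more confidence.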

Let’s say that we take an average of the heights of the three plants in the example above.

  • Heights: 16 cm, 20 cm, 30 cm
  • Average: 22 cm

We may ask ourselves, does this average tell us the whole story? If we were to grow other plants, what height would we expect them to be?

Let’s say that we increased our sample size to 10, and all variables being the same, had the following results:

  • Heights: 15 cm, 17 cm, 18 cm, 17 cm, 20 cm, 25 cm, 31 cm, 15 cm, 16 cm, 16 cm
  • Average: 19 cm

Now we have a much more accurate idea of what height plants would be when given the fertilizer and could conclude with greater certainty. We can also see that most of the plants (6/10) are between 15 cm and 17 cm, and the rest are taller. We could even try to find out why some plants are taller. Were they closer to the window? Did they get more fertilizer? Etc.
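For anyone who would like to check the arithmetic, the two averages above can be recomputed with a few lines of Python. This is a minimal sketch using the heights listed in the text; the variable names are chosen just for this example.

```python
import statistics

small_sample = [16, 20, 30]  # heights in cm, n = 3
large_sample = [15, 17, 18, 17, 20, 25, 31, 15, 16, 16]  # heights in cm, n = 10

for heights in (small_sample, large_sample):
    n = len(heights)
    average = statistics.mean(heights)
    print(f"n = {n}: average = {average} cm, "
          f"shortest = {min(heights)} cm, tallest = {max(heights)} cm")

# Count how many plants in the larger sample are between 15 cm and 17 cm.
in_cluster = sum(1 for h in large_sample if 15 <= h <= 17)
print(f"{in_cluster} of {len(large_sample)} plants are between 15 cm and 17 cm")
```

Printing the shortest and tallest plants alongside the average is a reminder that the average on its own hides how spread out the heights are.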


So you may be wondering, what is a good sample size to choose? Too small a sample and you might miss the real effect of the independent variable. Too large a sample and you might waste resources and time. The size of the sample really depends on the type of experiment and what sort of difference you are expecting to observe. If you are interested in very tiny differences (e.g., differences in leaf size), you need a very large sample size, but if you only care about big differences (e.g., living or dead after the test), you can use a smaller sample size.
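The link between the size of the difference and the sample size needed can also be sketched as a simulation. Everything in the example below is hypothetical: the function name `fraction_showing_effect`, the control height of 20 cm, the random variation of 4 cm, and the "small" (1 cm) and "large" (10 cm) differences are assumptions made up for illustration. The sketch simply counts how often the treated group's average comes out higher than the control group's average.

```python
import random


def average(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def fraction_showing_effect(sample_size, true_difference, noise_sd=4.0,
                            trials=2000, seed=7):
    """Fraction of simulated experiments in which the treated group's
    average is higher than the control group's average."""
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    hits = 0
    for _ in range(trials):
        # Assumed control height of 20 cm plus random variation.
        control = [rng.gauss(20.0, noise_sd) for _ in range(sample_size)]
        treated = [rng.gauss(20.0 + true_difference, noise_sd)
                   for _ in range(sample_size)]
        if average(treated) > average(control):
            hits += 1
    return hits / trials


for difference in (1.0, 10.0):
    for n in (3, 100):
        share = fraction_showing_effect(n, difference)
        print(f"true difference {difference:4.1f} cm, n = {n:3d}: "
              f"{share:.0%} of experiments show the treated group ahead")
```

Under these assumptions, the large difference should show up in nearly every experiment even with three plants per group, while the small difference is much less reliable with three plants per group than with one hundred.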

For classroom inquiries, the practical side of sample size also needs to be taken into consideration. In general, it is best to have as large a sample size as you can without jeopardizing the experiment or having the experiment take over the classroom!

Reproducibility and Repeatability

The ways things behave in the natural world are pretty consistent. Nature is not capricious; she doesn’t make errors, systematic or random. Scientists could not make progress if Nature could not be trusted. Imagine you stirred a spoonful of sugar into a glass of warm water. You notice that you can no longer see the sugar. Interesting! You wonder whether you would get the same results if you did the same thing again. So, you do exactly what you did the first time and you get observations very similar to those of your original test. You try the same thing many more times and each time you get the same results.

Reproducing, or repeating, an experiment in the exact same way is also known as replicating it. By doing an experiment more than once, an experimenter is able to check that the method works as expected and gives reliable results. An experiment is said to be repeatable when the original experimenter repeats it using the same procedure, the same equipment and the same measuring devices, in the same location, and obtains similar results.

But say no one has ever tried your sugar-in-water experiment before. Maybe you have made a new discovery! What you need now is for someone else to try your experiment, using the same variables and methods that you used. The ability of another experimenter to duplicate an experiment and obtain similar results is known as reproducibility.

Reproducibility is one of the key aspects of scientific processes. If an experiment done by one experimenter gets certain results and other experimenters obtain the same results, then they can be pretty confident that the results are reliable, which is how new knowledge comes about. On the other hand, sometimes experiments cannot be reproduced. This may be because the experimenter was biased or there were errors in the methods or the observations. This is also useful information and can be one of the ways that scientists discover flaws in each other’s thinking.

For more on bias and errors:

Bias and Error

Sample Size, Reproducibility and Tomatosphere™

The Tomatosphere™ Seed Investigation has been done by students every year since 2001. Thousands of classes each year repeat the simple experiment first created in partnership with Dr. Michael Dixon from the University of Guelph. In 2015 alone, 19 000 classes across Canada and the United States grew tomato seeds and recorded the germination – that’s a big sample size! To see the results of the Seed Investigation submitted by all of the participating classes, make sure you upload your results on the Submit your results page.


Guided Practice

Exercise 1

What is the average surface area for the following samples of plant leaves?

  1. Areas: 15 cm², 17 cm², 18 cm²
  2. Areas: 15 cm², 17 cm², 18 cm², 17 cm², 20 cm²
  3. Areas: 15 cm², 17 cm², 18 cm², 17 cm², 20 cm², 16 cm², 17 cm², 18 cm²
  4. How did the sample size affect the average?

Exercise 2

What is the average number of seeds germinated?

  1. Number germinated: 15, 20, 25
  2. Number germinated: 21, 20, 22, 21, 20, 25, 21, 24
  3. Number germinated: 21, 20, 22, 21, 20, 25, 21, 24, 20, 21, 22, 22
  4. How did the sample size affect the average?

Exercise 3

Have the students find the mean, median and mode of the height for the following sample of 10 plants.

Heights: 15 cm, 17 cm, 18 cm, 17 cm, 20 cm, 25 cm, 15 cm, 16 cm, 16 cm, 16 cm

Exercise 4

Provide students with the following scenario and have them answer the questions below it.

A student developed an experiment in which she tested different brands of fertilizer on the growth of tomato plants. The student found that plants growing in Fertilizer A grew MUCH taller than the plants growing in the other fertilizer brands. Another student who was interested in the results grew some more of the same tomato plants and kept all of the variables the same, but did not find that the plants growing in Fertilizer A grew much taller than any of the other brands.

  1. Was the experiment reproducible?
  2. What might have happened to give the observed results?
  3. What kinds of errors or biases may have occurred?

Exercise 5

Brainstorm with students in which domains of science reproducibility might be difficult if not impossible.

Answers

Exercise 1

  1. 16.7
  2. 17.4
  3. 17.25
  4. The larger the sample size, the more accurately the data represents the surface area.

Exercise 2

  1. 20
  2. 21.75
  3. 21.58
  4. The larger the sample size, the more accurately the data represents the number of seeds that germinated.

Exercise 3

Mean = 17.5, Median = 16.5, Mode = 16
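Students who have access to Python could check these three values themselves. The sketch below uses the standard `statistics` module with the heights from Exercise 3; the variable names are chosen just for this example.

```python
import statistics

# Heights in cm for the 10 plants in Exercise 3.
heights = [15, 17, 18, 17, 20, 25, 15, 16, 16, 16]

mean_height = statistics.mean(heights)      # sum of the values divided by n
median_height = statistics.median(heights)  # middle of the sorted values
mode_height = statistics.mode(heights)      # most frequently occurring value

print(f"Mean = {mean_height}, Median = {median_height}, Mode = {mode_height}")
```

Because there is an even number of plants, the median is the average of the two middle values once the heights are sorted.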

Exercise 5

Possible responses: palaeontology, climatology, meteorology, oceanography, seismology, etc.

Additional Resources