Once you have established the hypothesis you want to test, decided on an experimental design, and selected the response and predictor variables you will measure, you need to figure out how to collect the information necessary to answer your study question. This tutorial will introduce you to some of the sampling methods you can use to collect your data. At the end of this tutorial, you will find a list of references that will describe how to establish sampling strategies and collect field data in much greater detail.
Sampling is about selecting and measuring representative examples of your subject of interest. As such, it is fundamentally concerned with the spatial (and temporal) arrangement of your population of interest within its environment. To figure out the appropriate sampling technique, you need to ask yourself several questions.
The first question is what your subject of interest is. The answer should be firmly embedded in your hypothesis. It is the first question to answer when establishing a sampling design because its answer will set the spatial and temporal boundaries on your sampling area. In the bird diversity example from the developing hypotheses tutorial, the subject of interest would be the bird communities present in the two fields. From a sampling standpoint, your study area would then be delimited by the two fields (and possibly their surrounding habitats, depending on your hypothesis). Similarly, if bird use of the fields varied during the day, you may also need to set temporal boundaries by restricting your sampling to the morning hours when many bird species are more active.
If your hypothesis concerns the abundance or distribution of dandelions in neighborhood lawns, your sample unit will be either individual lawns or sections of one lawn. However, if you are looking at the length of leaves on dandelions, then your sample unit will be individual dandelion plants. This may seem obvious, but it is important to clearly understand where you will (and will not) be collecting information.
If you are sampling attributes of individuals (for example, the length of leaves on dandelions, elytra on ladybugs, or the number of fruit on shrubs), the sample unit is the individual. Ecologists will often define a limited spatial area in which their research subjects may be found. Within this area, called a quadrat or plot, they search for occurrences of their research subjects, often measuring attributes of any examples they find. When using quadrats or plots, it is important to make the size of the sample unit proportionate to the size of the organism(s) being studied. For example, herbaceous vegetation, such as wildflowers or grasses, is often sampled using a 1 m² quadrat; however, sampling trees will require much larger plots (100 m² to 400 m²) or plotless distance-based methods. For mobile organisms, such as birds, you might record the number of individuals at a certain location over a fixed duration of time. See the sample methods section on the next pages for more detailed information about how to implement these various techniques.
How you place your sample units is an important consideration in your sampling design. Randomly (or systematically) locating your sample units is an important way to limit bias in your study. Random sample selection is also a fundamental assumption of most statistical analyses (as is the assumption that samples are independent of one another). For the purposes of your research project, randomly located sample units are not essential in all circumstances (a study of bird behavior or abundance at different bird feeders would be an example), but random placement is best practice and should be employed whenever possible. Different ways of placing or selecting sample units are discussed on the following pages.
In simple random sampling, each sample unit is randomly selected. Specifically, this means that each sample unit has an equal probability of being selected and that the selection of any one sample unit does not influence the selection of any other sample unit.
For example, say you wanted to sample the wetland pictured here. With a simple random scheme, you could “construct” imaginary baselines (the x- and y-axes) and then use a random number generator (a computer, a table of random numbers, or numbers drawn blindly from a hat) to randomly select x- and y-coordinates. Using a compass, you could then pace off or measure your way to the coordinates. This is where you would place or locate your sample unit (for example, a plot or the nearest shrub). You could also use GPS, or even Google Maps or Earth on a tablet or smartphone, to select and locate coordinates. You would repeat this exercise until you had collected a sufficient number of samples.
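If you use a computer as your random number generator, drawing the coordinates takes only a few lines of Python. The sketch below is one possible approach, not a prescribed method. It assumes a rectangular study area with baselines 100 m and 60 m long and a sample of 20 plots; those numbers, and the function name `simple_random_points`, are made up for illustration.

```python
import random

def simple_random_points(n, x_max, y_max, seed=None):
    """Draw n random (x, y) locations inside a rectangular study area.

    x_max and y_max are the lengths (in metres) of the two imaginary
    baselines. Each location is drawn independently of the others.
    """
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return [
        (round(rng.uniform(0, x_max), 1), round(rng.uniform(0, y_max), 1))
        for _ in range(n)
    ]

# Hypothetical study area: 100 m by 60 m, 20 sample units
points = simple_random_points(n=20, x_max=100, y_max=60, seed=1)
for x, y in points:
    print(f"x = {x:5.1f} m, y = {y:5.1f} m")
```

A real wetland is rarely a perfect rectangle. One simple way to handle this is to discard any coordinates that fall outside the study area and draw again until you have enough points.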
Stratified random sampling is a useful technique to employ when your study area is not uniform but contains subareas with different physical or environmental conditions. Examples of this would be an old field that contains both grassland and shrub patches or a wetland that contains different vegetation zones (open water, marsh, wet meadow, upland transition) depending on water level. Each differentiated subarea, internally uniform but different from the others, is termed a stratum. A purely random selection method will not necessarily ensure that all the different strata will be sufficiently sampled. As the name implies, stratified random sampling first divides the study area by strata and then selects random samples from within each stratum. Sampling effort by stratum can be determined based on the size of each stratum (for example, a stratum occupying 40% of the study area would receive 40% of the sample effort) or by other variables (perhaps some strata are harder to reach than others and so receive a reduced sampling effort).
Consider the wetland example again. Let’s say that the total sampling effort consists of 20 plots. This image shows the locations that were randomly selected.
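When sampling effort is divided according to the size of each stratum, the arithmetic can be sketched as follows. The area percentages are hypothetical values chosen for illustration; they are not measurements of the wetland in the image. The function name `allocate_plots` is also made up. Any plots left over after rounding down go to the strata with the largest remainders, which is one reasonable convention among several.

```python
def allocate_plots(total_plots, area_percent):
    """Split total_plots among strata in proportion to their area.

    area_percent maps each stratum name to its whole-number share of the
    study area; the shares must add up to 100.
    """
    if sum(area_percent.values()) != 100:
        raise ValueError("area percentages must sum to 100")

    plots, remainders = {}, {}
    for name, pct in area_percent.items():
        # Whole plots for this stratum, plus the part lost by rounding down
        plots[name], remainders[name] = divmod(total_plots * pct, 100)

    # Give any leftover plots to the strata with the largest remainders
    leftover = total_plots - sum(plots.values())
    for name in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        plots[name] += 1
    return plots

# Hypothetical share of the study area occupied by each vegetation zone
zones = {"open water": 10, "marsh": 40, "wet meadow": 30, "upland transition": 20}
allocation = allocate_plots(20, zones)
for zone, n in allocation.items():
    print(f"{zone}: {n} plots")
```

With these assumed percentages, the marsh occupies 40% of the area and so receives 8 of the 20 plots. The plot locations within each stratum would then be chosen at random.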
Another approach is to systematically place samples. This often occurs with transect sampling, where plots are placed at regular distances along the transect. Systematic sampling can be more time efficient than randomly choosing each sampling location. If the initial sample point is randomly chosen, then a systematic sampling scheme can also satisfy the statistical assumption of randomness. Systematic sampling can be problematic if the sample points correspond to some underlying environmental pattern.
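A systematic layout with a randomly chosen first point might be generated like this. This is a minimal sketch under assumed values: a 50 m transect with plots every 5 m. Both numbers and the function name `systematic_positions` are hypothetical.

```python
import random

def systematic_positions(transect_length, spacing, seed=None):
    """Return plot positions (in metres) along a transect.

    The first plot is placed at a random distance between 0 and one
    spacing interval; every later plot follows at a fixed spacing.
    """
    rng = random.Random(seed)
    start = rng.uniform(0, spacing)  # random starting point
    n_plots = int((transect_length - start) // spacing) + 1
    return [round(start + i * spacing, 2) for i in range(n_plots)]

# Hypothetical transect: 50 m long, one plot every 5 m
positions = systematic_positions(transect_length=50, spacing=5, seed=3)
print(positions)
```

Only the starting point is random. Once it is chosen, every other plot location is fixed, which is why this layout is quick to set up in the field.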
Let’s use the Placid Lake example from the developing hypotheses tutorial. If the undulating mound-hollow topography had a regular occurrence, with mounds (then hollows) recurring every 5 m, and you systematically sampled every 5 m, you would only ever sample the same topographical position. However, nature is not usually so tidy, and in practice this is not typically a concern.
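The mound-hollow problem can be checked with a short calculation. The sketch below assumes a perfectly regular 5 m cycle in which the first half of each cycle is mound and the second half is hollow, with the first plot placed 1 m along the transect. These details are simplifying assumptions for illustration, not a description of the actual site.

```python
PERIOD = 5  # metres between successive mounds, as in the example

def topographic_position(distance_m):
    """Classify a point on the transect as mound or hollow.

    Assumes (hypothetically) that the first half of each 5 m cycle is
    mound and the second half is hollow.
    """
    return "mound" if (distance_m % PERIOD) < PERIOD / 2 else "hollow"

def positions_sampled(start_m, spacing_m, n_plots):
    """List the topographic position of each plot along the transect."""
    return [topographic_position(start_m + i * spacing_m) for i in range(n_plots)]

# Spacing equal to the 5 m cycle: every plot lands at the same point in the cycle
every_5m = positions_sampled(start_m=1, spacing_m=5, n_plots=10)
# Spacing of 3 m: plots fall at different points in the cycle
every_3m = positions_sampled(start_m=1, spacing_m=3, n_plots=10)

print("5 m spacing samples:", sorted(set(every_5m)))
print("3 m spacing samples:", sorted(set(every_3m)))
```

Under these assumptions, the 5 m spacing samples only mounds, while the 3 m spacing samples both mounds and hollows.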
Finally, samples can be placed subjectively or opportunistically. There is a long tradition of subjective sampling when developing classifications of plant communities. In this tradition, researchers often sample representative examples of plant communities, which are subjectively chosen. Sometimes researchers sample unique habitats or features, such as salt licks or bird feeders, which are attractants to the organisms that they want to study.
Strictly speaking, haphazard (non-random) sampling does not meet the random-selection assumption described earlier, and so does not allow for the use of conventional statistical analysis.
Plots and quadrats delimit spatial areas within which the researcher searches for their subject of interest. This approach is very commonly used in studies of vegetation. As previously mentioned, the size of the sample unit is scaled in proportion to the type of vegetation being studied. For herbaceous species, such as wildflowers and grasses, small plots called quadrats are typically used. Quadrats vary in size, but commonly range in area from 0.1 m² to 1.0 m² and are usually square or rectangular in shape. Woody vegetation is usually much bigger than 1 m², so shrub and forested communities are often sampled using larger square, rectangular, or circular plots ranging in area from 100 m² to 400 m². Herbaceous vegetation can also be measured in larger plots and often is. For example, in a study comparing vegetation between two forest types using 400 m² plots, all the vegetation, woody and herbaceous, would be sampled.
Once the sample unit has been laid out, the presence or abundance of the item being studied is measured. Depending on the research question, this could be the number of dandelions, the cover of grass, or presence of all the plant species found in the sample unit. You can also use plots and quadrats to measure other response variables, such as animal tracks and scat or animal and insect browse damage.