Hows and Whens of Citizen Science

The advancement of technology and the scientific process seem to work hand in hand: research in the sciences can lead to new technological developments, and those developments in turn open new scientific avenues to explore. Another result of improving technology is the automation of data collection. For the disciplines that require work “in the field,” such as ecology, geology, and natural resource sciences (wildlife, forestry, fisheries), better technology often means easier data collection. Cue the advent of “citizen science,” or scientific data collection or research done by people who are not formally educated and trained in a scientific discipline. Citizen science can lower the cost of conducting scientific research while educating and mobilizing a portion of the general public that cares about what’s going on in the world.

Does this mean every researcher should consider firing all field and lab technicians to open up more funds for the actual scientific process? After all, the point of citizen science is that volunteers are helping for free because they want to help. The simple answer to this question is no, but to understand why, and to see in which instances it might be beneficial to use volunteers, it is best to identify the conceptually different situations where citizens could be, and have been, employed.

In terms of data collection, there are circumstances where the experiment has been laid out and all that is needed is to collect a large quantity of data points (the more, the better).  Then, there are also circumstances where the details of the actual data collection process are less clear because the protocol is not easily laid out and implemented.  In the latter case, there is a much greater risk in utilizing citizens for data collection.

The dotted line represents perfect 1:1 agreement between the professionally collected (FDEP, vertical axis) and citizen-collected (LW, horizontal axis) data on total phosphorus, total nitrogen, and chlorophyll in the water. The dots and solid line represent the actual data values, which are quite close to perfect agreement. Taken from Hoyer et al. (2012).

The first set of circumstances, the “the more, the better” scenario, raises a few concerns, but none too difficult to overcome. Often, people are concerned about the reliability of data collected by volunteers, but the main issues boil down to the methodology and techniques used to physically collect the data. For example, Florida has an extensive citizen-driven program called Florida Lakewatch, which has the goal of monitoring water chemistry in freshwater bodies across the state. A two-hour training session is required so that the volunteer learns how to collect water samples. If the volunteer can do this and then safely deliver the samples, the job has been done with roughly the same accuracy a professional biologist would achieve, not to mention the sheer increase in data collected with the help of so many more people. Still, the key here is that the logistics of the sampling have already been figured out by the professionals.

Another common example of using volunteers is in the building of databases for invasive species in a region. In this instance, the problem of citizen “reliability” is also not a major concern because no rigorously designed experiment is being conducted per se; if a person can accurately identify an individual as belonging to a nonnative species, it will simply contribute to a growing list of detected species. The question is then whether or not a person can correctly identify plants and animals. Even if the person is not sure, a picture submitted to an expert often allows the expert to determine whether the species is introduced or native. Overall, this is a scientific endeavor that actually requires the use of volunteers, as there are not enough scientists to spend their time digging around for potential invasive species.

Unfortunately, it is not always that easy to get high-quality data from citizen-driven initiatives. In this second set of circumstances, the data collection protocol has not been laid out for the volunteers, and the quality of the data depends on the citizens happening to provide random, representative data by chance.

To illustrate this seemingly complicated situation, consider a lake with ten fishermen. If a scientist wants to know the impact that fishing has on the fish in that lake, he or she needs information not only about the fish, but also about the activity of the anglers (also called fishermen or fishers) themselves. Often, scientists have to rely solely on information derived from fishing trips to learn about the status of the fish population. For small lakes and/or small populations of anglers, scientists or agencies can hire people to interview/sample the fishermen who are already collecting the “data points” anyway.

What happens when scientists need to know about fishing activity in the entire Gulf of Mexico?  Or along the coast of California, Oregon, and Washington?  In reality, they still employ contracted interviewers to randomly sample different fishing sites and different anglers.  However, the cost of employing interviewers can be extremely high.

Now think about what citizen science might look like in a system such as this one.  Instead of having paid interviewers, why not just have the fishers report their own fishing trip information?  Here the angler is the citizen scientist.  However, not all fishing trips are equal, so this system only works if all types of fishing trips are “sampled.”  Even assuming an angler honestly reports his or her catches, there are still major concerns. 

Fish catches on a good day (left) and a bad day (right).  Photo credit: Wikimedia Commons


Most times, it is relatively simple to estimate the number of fishers out there (using fishing license sales or phone surveys), so scientists often scale up the reported catches by the number of anglers they think there are, and use that total to estimate the size of the fish population. What happens if the people who don’t catch any fish are too embarrassed to report? If only the people who are catching a lot of fish volunteer their “data,” scientists can only assume (incorrectly) that all the anglers they know exist are catching as much. That leads scientists to think more fish are caught than really are, which in turn leads them to think there are more fish out there than there really are. When natural resource agencies hire interviewers to sample anglers, they randomly assign the interviewers to locations to maximize the probability that they collect information from all types of fishers. This is very difficult to do when asking anglers to voluntarily report fishing information. If everyone does his or her part, the system works, but how can it be known if that is the case?
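
The reporting problem described above can be made concrete with a small sketch. The catch numbers below are invented purely for illustration (they are not real data), and the function name extrapolate_total is hypothetical. The sketch assumes the simplest possible extrapolation, the average reported catch multiplied by the known number of anglers, which is a simplification and not necessarily how any agency actually does it.

```python
# Hypothetical catches (fish per trip) for ten anglers on one lake.
# These numbers are made up for illustration only.
catches = [0, 0, 0, 1, 1, 2, 3, 5, 8, 10]


def extrapolate_total(reported, n_anglers):
    """Scale the mean reported catch up to the whole angler population."""
    return sum(reported) / len(reported) * n_anglers


n_anglers = len(catches)
true_total = sum(catches)

# Scenario 1: every angler reports, including those who caught nothing.
estimate_all = extrapolate_total(catches, n_anglers)

# Scenario 2: anglers with empty coolers stay silent (assumed behavior).
nonzero_reports = [c for c in catches if c > 0]
estimate_biased = extrapolate_total(nonzero_reports, n_anglers)

print(f"True total catch:           {true_total}")
print(f"Estimate, everyone reports: {estimate_all:.1f}")
print(f"Estimate, zeros missing:    {estimate_biased:.1f}")
```

With these made-up numbers, the estimate matches the true total when everyone reports, but it is inflated when the three anglers who caught nothing stay silent, because their zeros are replaced by the average of the more successful anglers.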

The most important point to be taken away is that there are different types of “citizen science” initiatives, and for scientists, it is important to understand the differences.  If the logistics of the experiment are correctly laid out—or, as in the case of the invasive species database, there is no true experiment—volunteers can be trained to serve as reliable scientists in that specific role. 

However, when the protocol is unclear, or is complicated by the misrepresentation of certain elements of the “data,” care must be taken when deciding how to use this information. Deciding this can be a challenge for any team of researchers, and it becomes more difficult when working with volunteer data. In the case of fisheries management, there do appear to be instances where anglers can self-report their data and provide reliable information for scientific assessment of fish populations. Still, this conclusion could only be reached after extra studies were done to assess the reliability of these data points.

This idea of validating volunteers is most likely a useful exercise for all types of citizen-driven data collection, even the cases of “the more, the better.”  After all, science can be a tricky business, regardless of the researcher’s amount of experience.  That is why it is such a collaborative effort, where everyone has the ability to contribute his or her skills and resources in the best way possible.

Ryan Jiorle is finishing up his master’s degree in fisheries and aquatic sciences at the University of Florida. His research has focused on determining the potential utility of opt-in, self-reporting smartphone apps as a way to provide recreational fisheries data for stock assessment.  Despite the very “indoorsy” nature of his research, Ryan enjoys snowboarding, surfing, and fishing, as well as many other outdoor activities.  His other interests include watching martial arts films, professional wrestling, and reading and writing fiction.  Sometimes, he grows a beard.  Ryan was born and raised in Phillipsburg, New Jersey and—for better or worse—will gladly tell you all about it.

References:

Hoyer, MV, N Wellendorf, R Frydenborg, D Bartlett, and DE Canfield Jr.  2012.  A comparison between professionally (Florida Department of Environmental Protection) and volunteer (Florida LAKEWATCH) collected trophic state chemistry data in Florida.  Lake and Reservoir Management 28(4): 277-281.

For further reading:

Instructable on how to get started in citizen science

 Examples of citizen science projects

 Florida Lakewatch

 Invasive species initiative in Florida

 Angler self-reporting