# Citizen Science restores trust in science; isn’t exploitative

A recent review comment claimed that Citizen Science (referring to my Jungle Rhythms project) is exploitative. Provided researchers exercise due diligence, this comment is not only misguided, it is also a testament to a pervasive ivory tower way of thinking about science.

Science is often perceived as a field of the select few with limited interaction between the scientific world and the public. Or as aptly put in the Guardian Science section: “Science is the invisible profession. Most people have no idea what scientists do, and may harbour a vague feeling of suspicion or uneasiness about the whole endeavour.”

This lack of transparency has been abused many times over to create doubt and confusion in order to push a political agenda, fueling, among others, climate skepticism. In addition, a lack of transparency and limited communication create a less educated public, one which is less used to dealing with complexity.

Citizen science provides a way to counter all these issues. It allows citizens to actively contribute to science, communicate directly with scientists and, at times, attain PhD-worthy knowledge through self-study. In today’s society, with distrust in science increasing through fake news and “alternative” facts, this direct and transparent communication between scientists or science communicators and the public is key to retaining or restoring trust in science.

# R arctic polar plots

For a project I needed to create an appealing plot of the arctic, showing the location of some field sites. I posted this map on Twitter earlier today. Below, I’ll outline a simple routine to recreate this plot and, if need be, adjust it to your liking.

First of all I downloaded an appealing background from the Blue Marble dataset as created by NASA. Geotiffs can be downloaded here or by direct download following this link. Alternatively you can download the less realistic and more summary style graphics as produced by Natural Earth.

After downloading the Blue Marble geotiff, I trim the data to the lowest latitude I want to plot. Subsequently, I reproject the data to the EPSG:3995 projection (arctic polar stereographic). All this is done using GDAL. This step could be done using rgdal, but at times this doesn’t play nice. For now I post the command line GDAL code.

UPDATE: the command line GDAL code below is no longer necessary, as I now call the raster library in R, which handles the reprojection fine after some fiddling.

gdalwarp -te -180 55 180 90 world.topo.bathy.200407.3x5400x2700.tif tmp.tif
gdalwarp -wo SOURCE_EXTRA=200 -s_srs EPSG:4326 -t_srs EPSG:3995 -dstnodata "255 255 255" tmp.tif blue_marble.tif

The remaining R code ingests this background image and overlays a graticule and some labels. For this I heavily borrowed from the sp map gallery.

# load required libraries
library(sp)
library(maps)
library(rgeos)

# function to slice and dice a map and convert it to an sp() object
maps2sp = function(xlim, ylim, l.out = 100, clip = TRUE) {
  stopifnot(require(maps))
  m = map(xlim = xlim, ylim = ylim, plot = FALSE, fill = TRUE)
  p = rbind(cbind(xlim[1], seq(ylim[1], ylim[2], length.out = l.out)),
            cbind(seq(xlim[1], xlim[2], length.out = l.out), ylim[2]),
            cbind(xlim[2], seq(ylim[2], ylim[1], length.out = l.out)),
            cbind(seq(xlim[2], xlim[1], length.out = l.out), ylim[1]))
  LL = CRS("+init=epsg:4326")
  IDs = sapply(strsplit(m$names, ":"), function(x) x[1])
  stopifnot(require(maptools))
  m = map2SpatialPolygons(m, IDs = IDs, proj4string = LL)
  bb = SpatialPolygons(list(Polygons(list(Polygon(list(p))), "bb")), proj4string = LL)
  if (!clip) {
    m
  } else {
    stopifnot(require(rgeos))
    gIntersection(m, bb)
  }
}

# set colours for map grid
grid.col.light = rgb(0.5, 0.5, 0.5, 0.8)
grid.col.dark = rgb(0.5, 0.5, 0.5)

# coordinate systems
polar = CRS("+init=epsg:3995")
longlat = CRS("+init=epsg:4326")

# download the blue marble data if it doesn't exist
if (!file.exists("blue_marble.tif")) {
  download.file("http://neo.sci.gsfc.nasa.gov/servlet/RenderData?si=526312&cs=rgb&format=TIFF&width=5400&height=2700", "blue_marble.tif")
}

# read in the raster map and
# set the extent, crop to extent and reproject to polar
r = raster::brick("blue_marble.tif")
e = raster::extent(c(-180, 180, 55, 90))
r_crop = raster::crop(r, e)

# traps NA values and sets them to 1
r_crop[is.na(r_crop)] = 1
r_polar = raster::projectRaster(r_crop, crs = polar, method = "bilinear")

# some values are not valid after transformation
# (rgb range = 1 - 255) set these back to 1
# as they seem to be the black areas
r_polar[r_polar < 1] = 1

# define the graticule / grid lines by first specifying
# the larger bounding box in which to place them, and
# feeding this into the sp() gridlines function
# finally the grid lines are transformed to
# the EPSG 3995 projection
pts = SpatialPoints(rbind(c(-180, 55), c(0, 55), c(180, 85), c(180, 85)), CRS("+init=epsg:4326"))
gl = gridlines(pts, easts = seq(-180, 180, 30), norths = seq(50, 85, 10), ndiscr = 100)
gl.polar = spTransform(gl, polar)

# I also create a single line which I use to mark the
# edge of the image (which is rather unclean due to pixelation)
# this line sits at 55 degrees North similar to where I trimmed
# the image
pts = SpatialPoints(rbind(c(-180, 55), c(0, 55), c(180, 80), c(180, 80)), CRS("+init=epsg:4326"))
my_line = SpatialLines(list(Lines(Line(cbind(seq(-180, 180, 0.5), rep(55, 721))), ID = "outer")),
                       CRS("+init=epsg:4326"))

# crop a map object (make the x component a bit larger not to exclude
# some of the eastern islands; the centroid defines the bounding box
# and would artificially cut off these islands)
m = maps2sp(c(-180, 200), c(55, 90), clip = TRUE)

#----- below this point is the plotting routine

# set margins to let the figure "breathe" and accommodate labels
par(mar = rep(1, 4))

# plot the grid, to initiate the area
# plotRGB() overrides margin settings in default plotting mode
plot(spTransform(gl, polar), lwd = 2, lty = 2, col = "white")

# plot the blue marble raster data (the reprojected brick)
raster::plotRGB(r_polar, add = TRUE)

# plot grid lines / graticule
lines(spTransform(gl, polar), lwd = 2, lty = 2, col = grid.col.light)

# plot outer margin of the greater circle
lines(spTransform(my_line, polar), lwd = 3, lty = 1, col = grid.col.dark)

# plot continent outlines, for clarity
plot(spTransform(m, polar), lwd = 1, lty = 1, col = "transparent", border = grid.col.dark, add = TRUE)

# plot longitude labels
l = labels(gl.polar, longlat, side = 1)
l$pos = NULL
text(l, cex = 1, adj = c(0.5, 2), col = "black")

# plot latitude labels
l = labels(gl.polar, longlat, side = 2)
l$srt = 0
l$pos = NULL
text(l, cex = 1, adj = c(1.2, -1), col = "white")

# After all this you can plot your own site locations etc
# but don't forget to transform the data from lat / long
# into the arctic polar stereographic projection using
# spTransform()
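
The last step, transforming your own site locations, could look roughly like the sketch below. The site names and coordinates are made up for illustration. Depending on your sp version, spTransform() relies on rgdal or sf being installed.

```r
library(sp)

# hypothetical field sites (longitude / latitude in decimal degrees)
sites <- data.frame(
  name = c("site_a", "site_b", "north_pole"),
  lon  = c(0, 90, 0),
  lat  = c(70, 70, 90)
)

# promote the data frame to a spatial object in geographic coordinates
coordinates(sites) <- ~ lon + lat
proj4string(sites) <- CRS("+init=epsg:4326")

# reproject to the arctic polar stereographic projection (EPSG:3995)
sites_polar <- spTransform(sites, CRS("+init=epsg:3995"))
xy <- coordinates(sites_polar)

# on an open map the sites can then be added with e.g.
# points(sites_polar, pch = 19, col = "red")
```

In this projection the pole sits at the origin and the prime meridian points straight down, which is a quick way to check that the transformation did what you expect.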


# COBECORE project accepted for funding

Last September (2016) I wrote the “Congo basin eco-climatological data recovery and valorisation” or COBECORE proposal together with several partners, building upon and inspired by the success of the Jungle Rhythms project.

The Jungle Rhythms project uses citizen science to transcribe old colonial records of tree phenology (seasonal changes in the state of the tree). The project illustrates nicely that historical data can still hold significant value for current-day research, and that data which seem out of reach due to the challenging nature of transcription can be tackled with the generous help of citizen scientists.

Given this notion, I expanded upon the basic idea of Jungle Rhythms in order to digitize and transcribe further historical colonial records of eco-climatological importance as stored in the state archives in Brussels. Although the competition was stiff in thematic axes 3 & 6 (cultural, historical and scientific heritage) of the Belgian Science Policy Office BRAIN call, with close to 90 submissions and a rather rough 16% success rate, the project was still selected!

I’m therefore happy to announce the informal start of the COBECORE project as funded by the Belgian Science Policy Office (pending political approval of the science budget).

# snotelr – an R package for easy access to SNOTEL data

I recently created the MCD10A1 product. This is a combined MODIS MOD10A1 and MYD10A1 product, alleviating some of the low bias introduced by either overpass through a maximum value approach. This approach has been used in the study by Gascoin et al. (2013) but I wanted some additional validation of the retrieved values.
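
As a minimal sketch of the maximum value approach, assuming the two products are already aligned pixel by pixel, the combination boils down to an element-wise maximum. The values and variable names below are made up for illustration and are not the actual processing code.

```r
# hypothetical snow cover values for the same five pixels on one day
mod10a1 <- c(40, NA, 75, 0, NA)   # Terra overpass
myd10a1 <- c(55, 20, 60, NA, NA)  # Aqua overpass

# element-wise maximum; na.rm = TRUE keeps the valid value when only
# one overpass has data and returns NA when both are missing
mcd10a1 <- pmax(mod10a1, myd10a1, na.rm = TRUE)
```

Taking the maximum means an underestimate in one overpass is replaced by the higher value of the other, which is where the reduction in low bias comes from.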

As such I looked at the SNOTEL network which “… is composed of over 800 automated data collection sites located in remote, high-elevation mountain watersheds in the western U.S. They are used to monitor snowpack, precipitation, temperature, and other climatic conditions. The data collected at SNOTEL sites are transmitted to a central database, called the Water and Climate Information System, where they are used for water supply forecasting, maps, and reports.” Here, the snowpack metrics could provide the needed validation data for my MCD10A1 product.

The SNOTEL website offers plenty of plotting options for casual exploration and the occasional report, but the interface remains rather clumsy with respect to full automation. As such, and similar to my amerifluxr package (both in spirit and execution), I created the snotelr R package. Below you find a brief description of the package and its functions.

## Installation

You can quickly install the package by first installing the devtools dependency and then installing snotelr from GitHub:

install.packages("devtools")

library(devtools)
install_github("khufkens/snotelr")



## Use

Most people will prefer the GUI to explore data on the fly. To invoke the GUI use the following command:

library(snotelr)
snotel.explorer()

This will start a shiny application with an R backend in your default browser. The first window will display all site locations, and allows for subsetting of the data based upon state or a bounding box. The bounding box can be selected by clicking top-left and bottom-right.
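
Outside the GUI, the same kind of subsetting can be sketched in a few lines of base R. The site table and its column names below are assumptions made for illustration; they are not the package's actual metadata format.

```r
# hypothetical site table; column names and values are made up
sites <- data.frame(
  site_id   = c(1, 2, 3, 4),
  state     = c("CO", "CO", "UT", "MT"),
  latitude  = c(39.5, 40.2, 40.6, 46.1),
  longitude = c(-106.0, -105.6, -111.6, -112.5),
  stringsAsFactors = FALSE
)

# bounding box defined by its top-left and bottom-right corners
top_left     <- c(lon = -107, lat = 41)
bottom_right <- c(lon = -105, lat = 39)

in_box <- sites$longitude >= top_left[["lon"]] &
  sites$longitude <= bottom_right[["lon"]] &
  sites$latitude  <= top_left[["lat"]] &
  sites$latitude  >= bottom_right[["lat"]]

# subset by bounding box or by state
sites_box   <- sites[in_box, ]
sites_state <- sites[sites$state == "UT", ]
```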

The plot data tab allows for interactive viewing of the snow water equivalent (SWE) data together with a covariate (temperature, precipitation). The SWE time series will also mark snow phenology statistics, mainly the day of:

• first snow melt
• a continuous snow free season (last snow melt)
• first snow accumulation (first snow deposited)
• continuous snow accumulation (permanent snow cover)
• maximum SWE (and its amount)

For in-depth analysis the above statistics can be retrieved using the snow.phenology() function:

# with df a SNOTEL file or data frame in your R workspace
snow.phenology(df)
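
To give a feel for what such statistics mean, here is a rough base R sketch on a synthetic SWE series. The definitions used (any SWE above zero counts as snow cover) are my simplifying assumptions for this example and are not necessarily how snow.phenology() derives its values.

```r
# synthetic daily SWE (mm) series; values are made up for illustration
dates <- seq(as.Date("2015-10-01"), by = "day", length.out = 12)
swe   <- c(0, 0, 5, 0, 10, 30, 60, 45, 20, 0, 0, 0)

# assumed definition: a day counts as snow covered when SWE > 0
snow <- swe > 0

# maximum SWE and the day it occurs
max_swe      <- max(swe)
max_swe_date <- dates[which.max(swe)]

# first snow accumulation: first day with snow on the ground
first_snow_acc <- dates[which(snow)[1]]

# last snow melt: first day after the last snow covered day
# (NA if the series ends while snow is still present)
last_snow_melt <- dates[max(which(snow)) + 1]

# continuous snow accumulation: start of the longest snow covered run
runs       <- rle(snow)
run_ends   <- cumsum(runs$lengths)
run_starts <- run_ends - runs$lengths + 1
longest    <- which.max(ifelse(runs$values, runs$lengths, 0))
cont_snow_acc <- dates[run_starts[longest]]
```

Real SWE records are noisier than this, so a threshold above zero or some smoothing would likely be needed before the same logic gives sensible dates.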

To access the full list of SNOTEL sites and associated meta-data use the snotel.info() function.

# returns the site info as snotel_metadata.txt in the current working directory
snotel.info(path = ".")

# export to data frame
data = snotel.info(path = NULL)

To query data for e.g. site 924 as shown in the image above use:

download.snotel(site = 924)

# Tree mortality: common causes and preliminary statistics

In the Jungle Rhythms project volunteers tag observations with #hashtags on the online forum. One observation in particular is not only informative for post-processing of the annotations but also has scientific value in its own right. Namely, the cause of death of an observed tree within the Jungle Rhythms project holds information on the ecology of the tree and the human and natural stresses it experienced, which led to its demise.

Within this context I ran some quick statistics on the hashtags of the online forum of the Jungle Rhythms project.

Overall, several sources of tree death exist as nicely summarized by @itsmestephanie and I quote:

“Abattu as in abattoir. Felled, cut down.  / Coupé as in coupon. Cut, presumably down. / Passants - passers-by. Coupé par les passants - cut down by passers-by. / Sec as in desiccation. Dry. / Brûlé as in crème brûlée. Burnt. / Cassé: Broken. / Tombé: Fallen. / Vent as in ventilation. Wind. Tombé par le vent = Fallen (rather pushed down) by a really big vent. / Mort: Mortician, mortality, mortuary. Morbidity, moribund, morbid. Jack Mort. Lord Voldemort.”

I counted all instances of the hashtags on subjects in the forum and summed them by cause, either natural or human. Double mentions were excluded, so as not to count a hashtag multiple times within the same forum post.
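
The counting itself can be sketched in a few lines of R. The forum posts below are invented, the hashtags are written without accents to keep the pattern simple, and the grouping into human and natural causes is my assumption for this example rather than the exact lookup used for the numbers reported here.

```r
# hypothetical forum posts
posts <- c(
  "tree 12 #coupe #coupe par les passants",
  "#casse after storm",
  "#tombe par le vent",
  "looks #coupe to me",
  "no cause given"
)

# extract the hashtags per post and drop repeats within the same post
tags_per_post <- lapply(
  regmatches(posts, gregexpr("#[a-z]+", posts)),
  unique
)
tags <- unlist(tags_per_post)

# number of posts mentioning each hashtag
tag_counts <- table(tags)

# assumed grouping of hashtags into human and natural causes
cause <- c("#abattu" = "human", "#coupe" = "human",
           "#casse" = "natural", "#tombe" = "natural",
           "#sec" = "natural", "#brule" = "natural")

# totals per cause
cause_counts <- table(cause[tags])
```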

The largest class is the “coupé” class, with a total of 257 occurrences. Second on the list is the “cassé” class with 74 mentions, followed by “mort” (72) and “tombé” (46). All other classes list smaller numbers.

Summing all human-caused events results in a total of 264 deaths, while natural causes account for only approximately half this number (133). These values account for 10 and 5%, respectively, of the total number of observed trees. With the project currently at 90% completion and no incentive to report the events, I’ll have to validate the true numbers.

However, a few conclusions can be drawn from these simple statistics. Firstly, the human influence on the experiment was significant (human-caused deaths are twice as numerous as natural deaths). As far as I could tell most trees were located along forest paths. This increased the likelihood of a tree being cut down due to easy accessibility (see, e.g., more elaborate descriptions such as: coupé par les passants). Within the natural causes the classes “cassé” and “tombé” account for a large fraction of the deaths. In other words, over 50% of the natural deaths are related to physical instability and tree fall of either the whole tree (tombé) or the bole supporting the canopy (cassé).

Treefall is an important process in forest regeneration. Statistics derived from the Jungle Rhythms project therefore not only give insight into seasonal processes of the trees observed but also provide mortality rates and causes.