in Op-ed / Research / Science / Software on Api, Code, Data, Open access, Open data, Research, Science, Software
Over the past few years, writing software left and right, I noticed a trend. Most of the software I write serves one purpose: making open access data - accessible! This should not be!
There has been a steady push for open data, increasing transparency and reproducibility of scientific reporting. Although many scientific data sources are indeed open, their access is not (easy), especially for the less computer savvy.
Looking at the field of ecology, some projects do well and maintain APIs, such as Daymet and Tropicos, although Tropicos requires a personal key, which makes writing toolboxes cumbersome. The latter left me with no other choice than to scrape the website. The Oak Ridge National Laboratory (ORNL) also offers MODIS land product subsets through an API, with interfaces coded by users. These data are truly open access (as in relatively easy to query), and this approach should be considered the way forward.
This contrasts with resources such as The Plant List, which offers a wealth of botanical knowledge guarded behind a search box on a web page, a limitation that can only be worked around by either downloading the whole database or using a third-party website. Similarly, the National Snow and Ice Data Center's oldest snow and ice data are stored in an incomprehensible format (the more recent data are offered in an accessible GeoTIFF format). Surprisingly, even large projects such as Ameriflux, with a rather prominent internet presence, suffer the same fate, i.e. a wealth of data largely inaccessible for quick and easy use and review.
Pooling several data access issues in the above examples, I think I’ve illustrated that open access data does not equal easily accessible data. A good few of us write and maintain toolboxes to alleviate these problems for ourselves and the community at large. However, these efforts take up valuable time and resources and can’t be academically remunerated, as only a handful of tools would qualify as substantial enough to publish.
I would therefore plead with data producers (and projects alike) to make their open data easily accessible by:
creating proper APIs to access all data or metadata (for querying)
making APIs truly open so writing plugins can be easy if you don't do it yourself
writing toolboxes that do not rely on proprietary software (e.g. Matlab)
in Research / Science / Software on R, Research, Science, Software
Today I launch the first version of AmerifluxR. The AmerifluxR package is an R toolbox to facilitate easy Ameriflux Level2 data exploration and downloads through a convenient R Shiny-based GUI. This toolset was motivated by my need to quickly assess what data was available (metadata) and what the inter-annual variability in ecosystem fluxes looked like (true data).
The package provides a mapping interface to explore the distribution of the data (metadata). Subsets can be made geographically and/or by vegetation type. Summary statistics (# sites / # site years) are provided at the top of the page. The Data Explorer tab allows for more in-depth analysis of the true data (which is downloaded and merged into one convenient file on the fly). A snapshot of the initial Map and Site Selection landing page is shown below.
In the Data Explorer tab one can plot ecosystem productivity data (GPP / NEE) for a selected site. You can select a plot displaying all data on a daily basis (consecutively) or overlaying data yearly. Note that although all sites are listed, not all of them have accessible data. The plot area will notify you of this.
The package can be conveniently installed using only 3 commands on the R terminal: the first line takes care of dependencies, the second loads devtools, which is required to install from a GitHub repository, and the third installs the package itself.
Once the package is installed, launching the explorer from the R command line will make the above screen pop up in your favourite browser (preferably Chrome).
Future development will include higher level products as well as other metrics (yearly summaries, etc…). I welcome anyone to join this effort and potential scientific endeavours that spring from this. Drop me a line by email or on GitHub.
in Software on Gis, Image processing, Research, Software
This is a quick post originating from a discussion I had recently. Sometimes GIS data does not come with its original colour map but only as raw numbers. These raw numbers (classes) are fine for calculations, but rather limit the way you visualize things. Here, I’ll show how to map colours to the classes or ranges using the Geospatial Data Abstraction Library (GDAL).
All you need is a list of classes which you want to map to particular colours. The format of this colour table is rather flexible and is described in full on the GRASS r.colors page. For this particular example I used the colours of the 0.5 km MODIS-based Global Land Cover Climatology map, which translates into a table with 16 classes (I attached the table at the end of the blog post). You can download the data from the USGS website if you want to try this example (warning: large file - 4GB unzipped).
# If the colour table is saved in colours.csv the following
# command links a proper colour table to a geotiff file
# without this information.
gdaldem color-relief input.tif colours.csv output.tif
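For readers who have not seen such a colour table before, here is a minimal sketch of what one could look like: one row per class, holding the class value followed by red, green and blue values. The class numbers and colours below are illustrative placeholders, not the actual MODIS land cover table used in this post, and the file names colours_example.txt and output_example.tif are assumptions. For categorical data it may also be worth passing the -exact_color_entry option, so that gdaldem does not interpolate colours between classes.

```shell
#!/bin/sh
# Write a small colour table: one "value R G B" row per class.
# NOTE: these classes and colours are illustrative placeholders,
# not the real land cover colour table from this post.
cat > colours_example.txt <<'EOF'
0 0 0 255
1 0 100 0
2 124 252 0
3 255 215 0
EOF

# Apply the table only if GDAL and an input file are available.
# -exact_color_entry disables interpolation between table entries,
# which suits class codes better than a continuous colour ramp.
if command -v gdaldem >/dev/null 2>&1 && [ -f input.tif ]; then
    gdaldem color-relief input.tif colours_example.txt output_example.tif -exact_color_entry
fi
```

With -exact_color_entry, pixel values that do not appear in the table are not assigned an interpolated colour, so a complete table matters more than it does for a continuous ramp.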
The above command might map the colours to the classes but the map still remains rather static. If you want to create a Google Earth compatible file (mapped onto a 3D sphere), you can do so by translating the file format. The resulting KML file should open in Google Earth if you have a copy running.
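The format translation can be sketched as follows, assuming GDAL's KMLSUPEROVERLAY output driver is used; this is one possible route and not necessarily the exact command behind this post. The file names are carried over from the command above. Depending on the input, the raster may first need to be reprojected to geographic coordinates (for example with gdalwarp).

```shell
#!/bin/sh
# Input produced by the colour-relief step; the output name is an assumption.
in_file="output.tif"
out_file="output.kml"

# Convert only when GDAL is installed and the input exists,
# otherwise report that the step was skipped.
if command -v gdal_translate >/dev/null 2>&1 && [ -f "$in_file" ]; then
    # KMLSUPEROVERLAY writes a KML file plus image tiles
    # that Google Earth can load as an overlay.
    gdal_translate -of KMLSUPEROVERLAY "$in_file" "$out_file"
    status="converted"
else
    status="skipped"
fi
echo "KML conversion: $status"
```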
in Harvard 360 / Research / Science on 360, Harvard 360, Research, Science, Vr
I’m starting a new series of blog posts called Harvard 360 (a bit of my life and research on the Harvard campus). I’ll be posting 360 immersive pictures to provide a real feel for my work around Harvard University, and in particular during field work.
I kick off the series with an image of Harvard Yard. Harvard Yard houses the oldest buildings on campus as well as most first-year undergraduate housing and several libraries. In front of the white building in the distance is the statue of John Harvard. [click the grey bar to load the image if not loading automatically, or click the link to access the VR Flickr page]