# Rosetta reaches comet 67P/Churyumov-Gerasimenko

Today the Rosetta spacecraft reached comet 67P. It’s the first time in history a space probe has made such a close approach to a comet. In the coming weeks the current distance to the comet of ~100 km will be lowered to ~30 km, bringing the probe into orbit. Once this orbit is established, the robotic lander “Philae” will be dropped along a ballistic trajectory onto the comet itself.

Below you find one of the latest images taken by the Rosetta spacecraft. Considering that the spacecraft has travelled 6 billion kilometres, taking 10 years to do so, I consider this a major scientific and technological feat.

Comet 67P close-up - OSIRIS. Credit: ESA/Rosetta/MPS for OSIRIS Team MPS/UPD/LAM/IAA/SSO/INTA/UPM/DASP/IDA

I commend ESA for supporting such long-term, high-risk missions. I hope such wonderful pictures, after this amazing journey, can inspire future engineers and scientists all over the world.

Space programs, and science funding in general, have seen some serious budget cuts, but results such as these should remind policy makers that good science sometimes takes considerable time and money. My congratulations go out to the Rosetta team!

For last-minute updates on the project, and for more images and videos, I refer you to the Rosetta ESA website.

# IMS / CMC snow product code on bitbucket

In a previous post I mentioned my efforts to standardize and consolidate the IMS and CMC snow products. To keep the ball rolling on development of the code, I created a new Bitbucket project.

You can find the link to the project on my software page. I will add new functions to process the data at a later date. If I find the space I will also upload the consolidated data; if not, I will ask NSIDC to host it. As of now the code will allow you to compile the data into stacked GeoTIFFs yourself. Happy hacking everyone.

# Georeferencing daily CMC and IMS snow analysis data

Recently I was in need of some snow cover data for the Arctic. The National Snow and Ice Data Center (NSIDC) is focused on tracking snow and ice properties, and is hence “the source” of snow and ice data. Although all data are freely available, figuring out how to read them often presents an issue for the uninitiated, as the files lack easy-to-interpret geo-referencing information, presenting most users with a rather steep learning curve. Although you can produce statistics from the raw ASCII files, which are relatively easy to reformat, it is rather hard to figure out a given snow depth statistic at any particular location.

For the Canadian Meteorological Centre (CMC) Daily Snow Depth Analysis product, as well as the IMS Daily Northern Hemisphere Snow and Ice Analysis, both hosted by the NSIDC, I figured out the geo-referencing details, which can be concisely summarized as an extent (x-min, x-max, y-min, y-max) of:

-8405812,8405812,-8405812,8405812 (CMC)

and

-12126597,12126840,-12126597,12126840 (IMS)

and a CRS projection string:

“+proj=stere +lat_0=90 +lat_ts=60 +lon_0=10 +k=1 +x_0=0 +y_0=0 +a=6371200.0 +b=6371200.0 +units=m +no_defs” (CMC/IMS)

These two pieces of information should suffice to hack together a bash script with some GDAL magic to georeference the data. However, I decided to go the R route this time around, as I’ve been doing most of my image processing with the raster package lately.
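In R, these two constants amount to stamping an extent and a CRS onto a raster object. A minimal sketch, using a dummy 706 × 706 matrix standing in for one day of CMC data:

```r
library(raster)

# dummy data: one day of the 706 x 706 CMC grid
r <- raster(matrix(0, 706, 706))

# stamp on the extent and polar stereographic projection from above
extent(r) <- c(-8405812, 8405812, -8405812, 8405812)
projection(r) <- "+proj=stere +lat_0=90 +lat_ts=60 +lon_0=10 +k=1 +x_0=0 +y_0=0 +a=6371200.0 +b=6371200.0 +units=m +no_defs"

res(r)  # grid cell size in metres, extent / 706
```

The functions below do exactly this, plus the bookkeeping needed to read the raw ASCII files.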

## CMC processing

Below you find a function in R to process a year of CMC Snow Depth Analysis data, in ASCII format, into compressed GeoTIFFs. The ASCII files were cleaned up by removing the header and the dates which split up the daily matrices (or maps). This preprocessing is done using bash command line tools, so you will need a *nix machine to successfully run the code below.

After cleaning the original ASCII file, the data are read into memory as one big data table and folded into a 3D array with a layer for each day of the year (DOY). Layers are renamed to their proper day-of-year values using the “DOY_#” format. All this is finally written to a compressed GeoTIFF (~50 MB per year), roughly equal in size to the original compressed ASCII data but georeferenced and accessible with GDAL-compatible GIS software (e.g. QGIS / GRASS). For the details of the procedure I refer to the code below.
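The fold into a 3D array is the one step that is easy to get wrong, since `array()` fills column by column while `scan()` returns the values row by row. A toy 2 × 2 grid over two days illustrates the `aperm()` trick used in the function below:

```r
# scan() returns values row by row, but array() fills column by column,
# so build the array with the first two dimensions swapped and permute back
v <- c(1, 2,   # day 1, row 1
       3, 4,   # day 1, row 2
       5, 6,   # day 2, row 1
       7, 8)   # day 2, row 2
a <- aperm(array(v, c(2, 2, 2)), c(2, 1, 3))
a[1, , 1]  # first row of day 1: 1 2
a[2, , 2]  # second row of day 2: 7 8
```

This mirrors what `read.table(..., byrow = TRUE)` would do for a flat table, but for a stack of daily maps.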

```r
# CMC conversion function
georeference_CMC_snow_data <- function(filename = "", geotiff = FALSE) {

  # load required packages
  require(raster)     # GIS functionality
  require(lubridate)  # to detect leap years

  # the projection as used (polar stereographic (north))
  # latitude at natural origin is 60 degrees
  # I use the same spherical projection as the IMS product
  proj = CRS("+proj=stere +lat_0=90 +lat_ts=60 +lon_0=10 +k=1 +x_0=0 +y_0=0 +a=6371200.0 +b=6371200.0 +units=m +no_defs")

  # with an x / y resolution of 23815.5 and the pole at 353,353
  e = c(-8405812, 8405812, -8405812, 8405812)

  # some feedback
  cat("Converting file:\n")
  cat(paste(filename, "\n"))

  # grab the year from the filename
  no_path = basename(filename)
  no_extension = sub("^([^.]*).*", "\\1", no_path)
  year = as.numeric(unlist(strsplit(no_extension, split = "_"))[3])

  # remove the breaks in the data file due to the insertion of dates
  # (note the space between the sed expression and the filename)
  system(paste("sed '/", year, "/d' ", filename, " > tmp.txt", sep = ""))

  # read the data using scan and force it to use a double format;
  # this command returns a vector
  # adjust the number of layers for leap years
  # (the 1998 file only covers DOY 213 onward, i.e. 153 layers)
  if (year == 1998) {
    nr_layers = 153
  } else {
    if (leap_year(year) == TRUE) {
      nr_layers = 366
    } else {
      nr_layers = 365
    }
  }
  data_vector = scan("tmp.txt", skip = 0, nlines = 706 * nr_layers, what = double())

  # clean up the temporary file
  system("rm tmp.txt")

  # convert the vector into a 3D array, using aperm to sort row by row;
  # this is similar to read.table(..., byrow = TRUE) for tables
  data_array = aperm(array(data_vector, c(706, 706, nr_layers)), c(2, 1, 3))

  # remove the data vector and free up some memory
  rm(data_vector); gc()

  # convert the 3D array to a rasterBrick
  rb = brick(data_array)

  # remove the data array and free up some memory
  rm(data_array); gc()

  # assign the above projection and extent to the raster
  projection(rb) = proj
  extent(rb) = extent(e)

  # set layer names by day of year (DOY)
  if (year == 1998) {
    names(rb) = c(paste("DOY_", 213:365, sep = ""))
  } else {
    if (leap_year(year) == TRUE) {
      names(rb) = c(paste("DOY_", 1:366, sep = ""))
    } else {
      names(rb) = c(paste("DOY_", 1:365, sep = ""))
    }
  }

  # return data as a raster brick or write a geotiff;
  # keep in mind that these are rather large files
  # when returning data to the workspace
  if (geotiff == FALSE) {
    return(rb)
  } else {
    # write data to file
    filename = paste("cmc_analysis_", year, ".tif", sep = "")
    writeRaster(rb, filename, overwrite = TRUE, options = c("COMPRESS=DEFLATE"))
  }
}
```

## IMS processing

The IMS processing is similar to the CMC processing, although the variable header info makes things more complicated. The first part of the routine takes care of finding the line on which the actual data begin. The rest of the routine reads in the data line by line, splitting each long character string into its individual digits and converting the result to a matrix.

This matrix is converted to a raster object, which can be assigned an extent and projection. Finally, the data are written to a compressed GeoTIFF.
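The per-line splitting can be illustrated on a short made-up string (real IMS data lines are 1024 characters long):

```r
# split a line of touching single-digit class values into numbers
line <- "00124"
digits <- as.numeric(substring(line, seq(1, nchar(line)), seq(1, nchar(line))))
digits  # 0 0 1 2 4
```

`substring()` with vectors of start and stop positions returns one single-character string per position, which `as.numeric()` then converts.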

```r
georeference_IMS_snow_data <- function(filename = "", geotiff = FALSE) {

  # you need the raster library to georeference
  # the image matrix
  require(raster)

  # the projection as used (polar stereographic (north))
  proj = CRS("+proj=stere +lat_0=90 +lat_ts=60 +lon_0=10 +k=1 +x_0=0 +y_0=0 +a=6371200.0 +b=6371200.0 +units=m +no_defs")

  # set the extent based upon the documented x / y resolution
  # and top left coordinate
  e = extent(-12126597.0, 12126840, -12126597, 12126840.0)

  # scan for the line on which "Dim Units" occurs for the second
  # time; I scan at most the first 200 lines, assuming no header
  # will be longer than this even given some changes in the header
  # info through time. I also assume that the basic naming of the
  # header info remains the same (file size etc.)

  # set the string occurrence counter to 0
  str_occurence = 0
  skip_lines = NA

  # open the file connection
  con = file(filename)
  open(con)
  for (i in 1:200) {

    # read a line of the ascii file
    l = readLines(con, n = 1)

    # check whether the line contains "Dim Units"
    occurence = grep("Dim Units", l, value = FALSE)

    # only process valid results (skip empty strings)
    if (length(occurence) != 0) {
      str_occurence = str_occurence + occurence
    }

    # when two occurrences are found, store the line number
    # of the second one and stop scanning
    if (str_occurence == 2) {
      skip_lines = i
      break
    }
  }

  # close the file connection
  close(con)

  # read in the data after the header; assumes a 1024 x 1024 matrix
  data_list = scan(filename, skip = skip_lines, nlines = 1024, what = character())

  # split every element (a long character string) in the list into its
  # individual characters and return to the original list object
  data_list = lapply(data_list, function(x) substring(x, seq(1, nchar(x), 1), seq(1, nchar(x), 1)))

  # unlist the list and wrap it into a matrix, filling by row so the
  # first line of the file becomes the first row of the image
  # (matrix() fills column by column by default)
  data_matrix = matrix(as.numeric(unlist(data_list)), 1024, 1024, byrow = TRUE)

  # convert to a raster object
  r = raster(data_matrix)

  # assign the raster a proper extent and projection
  extent(r) = e
  projection(r) = proj

  # return the raster or write it to a compressed geotiff
  if (geotiff == FALSE) {
    return(r)
  } else {
    no_extension = unlist(strsplit(basename(filename), split = "\\."))[1]
    writeRaster(r, paste(no_extension, ".tif", sep = ""), overwrite = TRUE, options = c("COMPRESS=DEFLATE"))
  }
}
```

# Spatial Panel Analysis Method in R

I just posted a new project on my software page. It's a fixed effects panel analysis function, which uses a method from econometrics to detect trends over time across a window of pixels. This increases the robustness of the trend analysis compared to a pixel-by-pixel regression analysis. An added benefit is the speed you gain, as you downscale the original data by a factor defined by the panel size used. The function makes use of the panel data econometrics "plm" package in R. A detailed overview can be found in this publication.
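The idea behind the fixed effects estimator can be sketched in base R (the project itself uses plm; the data below are made up): each pixel in the panel gets its own intercept, while all pixels share a single trend term, so the trend is estimated from the pooled observations rather than from one pixel's short time series.

```r
# toy panel: 4 pixels, 5 years, a shared trend of 0.1 per year
# plus a pixel-specific (fixed) offset
df <- data.frame(
  pixel = factor(rep(1:4, each = 5)),
  year  = rep(1:5, times = 4)
)
df$gpp <- 0.1 * df$year + c(0, 1, 2, 3)[df$pixel]

# fixed effects via dummy variables: one intercept per pixel,
# one common slope for the trend
fit <- lm(gpp ~ year + pixel, data = df)
coef(fit)["year"]  # 0.1, the shared trend
```

plm's `model = "within"` estimator gives the same slope more efficiently for large panels; the dummy-variable version above is just the most transparent way to see what is being fit.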

For a project in Alaska I ran a quick analysis of MODIS GPP trends across the whole dataset (2001-2013). Below you see the results of a panel analysis of the MODIS GPP data using a panel size of 11, or roughly ~1/10th of a degree. Given the pronounced warming of the Arctic, a steady increase in GPP is noted across Alaska (up to 0.03 kg C m-2).

# The Jornada experimental range

The Jornada LTER just after a thunderstorm.

The image above shows a view across the landscape at the Jornada Experimental Range, a Long Term Ecological Research (LTER) site in the northern Chihuahuan Desert, approximately 25 km northeast of Las Cruces, New Mexico, USA (32.5° N, 106.8° W, elevation 1188 m). The Chihuahuan Desert is the largest desert in North America and one of the largest deserts in the world.

The Jornada Experimental Range was established by the U.S. Department of Agriculture (USDA) in 1912 by Presidential Executive Order, and research at the site now addresses four main goals:

1. Methods for assessment and monitoring of rangelands at landscape, watershed, and regional scales
2. Ecologically-based technologies for remediation of degraded rangelands
3. Animal behavior-based strategies for livestock management
4. Predictive models of ecosystem responses to changes in climate and other management-dependent and independent drivers.

My current research focusses on phenology and vegetation growth in arid ecosystems, including rangelands such as those at the Jornada LTER. The image above shows thunderstorms and associated showers passing across the landscape in the distance, lit by a low evening sun. Unlike forest ecosystems, these shrub- and grassland ecosystems derive much of their growth from intermittent pulses of rain, which allow for fast vegetation growth. However, the water from these pulses is not well retained, as it is quickly used by the vegetation, evaporated from the surface, or drained away. And as quickly as these ecosystems grow just after the rain, so quickly comes their decline once a few weeks pass without rain.