in Blog on Data_recovery, Digitization, Citizen_science, Meta-data
I previously outlined how dealing with 70K+ scans in the COBECORE project presents an issue when it comes to processing and extracting data. Thanks to template matching, a large part of these issues has been automated away. Yet, even when the data can be extracted, one hurdle remains: empty cells in tables.
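The post only names template matching; the sketch below is one common way to implement it, via normalized cross-correlation, using plain NumPy. The function name and the brute-force search are illustrative assumptions, not the project's actual code (which would more likely rely on an optimized routine such as OpenCV's `matchTemplate`).

```python
import numpy as np

def match_template(image, template):
    """Locate a template in a grayscale image via normalized
    cross-correlation; returns the (row, col) of the best-matching
    top-left corner. Brute-force sketch for illustration only."""
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    best_score, best_pos = -np.inf, (0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            w = image[r:r + th, c:c + tw]
            w = w - w.mean()
            denom = np.sqrt((w ** 2).sum()) * t_norm
            if denom == 0:
                continue  # flat window, correlation undefined
            score = (w * t).sum() / denom
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos

# toy example: find a diagonal 2x2 pattern embedded in a blank image
img = np.zeros((10, 10))
img[4, 8] = 1.0
img[5, 7] = 1.0
tpl = np.array([[0.0, 1.0], [1.0, 0.0]])
print(match_template(img, tpl))  # -> (4, 7)
```

Once a reference template (e.g. a blank form header) is located this way, the known table layout can be anchored to it, which is what lets the cell extraction run unattended.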
in Blog on Data_recovery, Digitization, Citizen_science
With all images scanned and sorted in my COBECORE project, the next step involves transcribing the images into meaningful, machine-readable data. Due to the complexity of the data, such as varied handwriting styles in faded or runny ink, automating this process is very difficult. We will therefore aim to crowdsource the transcription of the data. Yet, large tables are difficult to transcribe, as the location of a value within the table matters, not only the value itself. As such, mistakes are more easily made when transcribing tables as a whole.
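One way to reduce such positional errors, assuming the table grid can be located on the scan, is to present transcribers with individual cell crops rather than the full table. This is a minimal sketch, not the project's actual pipeline, and it assumes an evenly spaced grid; real scans would need the grid lines detected first.

```python
import numpy as np

def split_into_cells(table, n_rows, n_cols):
    """Split a table image (2-D array) into per-cell crops, each
    tagged with its (row, col) position so the table structure is
    preserved even when cells are transcribed independently."""
    h, w = table.shape
    row_edges = np.linspace(0, h, n_rows + 1, dtype=int)
    col_edges = np.linspace(0, w, n_cols + 1, dtype=int)
    cells = []
    for r in range(n_rows):
        for c in range(n_cols):
            crop = table[row_edges[r]:row_edges[r + 1],
                         col_edges[c]:col_edges[c + 1]]
            cells.append(((r, c), crop))
    return cells

# toy example: a 4x6 "image" split into a 2x3 grid of 2x2 cells
img = np.arange(24).reshape(4, 6)
cells = split_into_cells(img, 2, 3)
print(len(cells))           # -> 6
print(cells[0][1].shape)    # -> (2, 2)
```

Because each crop carries its grid coordinates, the transcribed values can be reassembled into the original table afterwards, so a transcriber never has to keep track of where they are in the table.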
In a previous blog post I described how to subset MODIS HDF data. However, that was a rather simple example. Today, a graduate student emailed me asking for help with a subsetting problem she ran into when running the code, or rather, with the lack of an option to extract a region of interest (rather than point data) in the previous example I gave.
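Reading MODIS HDF files themselves would typically go through a library such as GDAL or pyhdf; the sketch below is only the indexing step that turns a lat/lon bounding box into a slice of a regularly gridded array, which is the part the original point-extraction example lacked. All names here are hypothetical and the grid is synthetic.

```python
import numpy as np

def subset_region(grid, lats, lons, lat_bounds, lon_bounds):
    """Extract a lat/lon bounding box from a regularly gridded array.

    grid: 2-D data array (rows = latitudes, cols = longitudes)
    lats, lons: 1-D coordinate vectors matching grid's axes
    lat_bounds, lon_bounds: (min, max) tuples for the region of interest
    """
    lat_mask = (lats >= lat_bounds[0]) & (lats <= lat_bounds[1])
    lon_mask = (lons >= lon_bounds[0]) & (lons <= lon_bounds[1])
    # np.ix_ builds the outer index so row and column masks combine
    # into a rectangular sub-grid rather than paired point indices
    return grid[np.ix_(lat_mask, lon_mask)]

# toy example on a 1-degree grid
lats = np.arange(50, 40, -1)   # 50..41 N, descending as in many products
lons = np.arange(0, 10, 1)     # 0..9 E
grid = np.random.rand(len(lats), len(lons))
roi = subset_region(grid, lats, lons, (44, 47), (2, 5))
print(roi.shape)  # -> (4, 4)
```

The same masks can be reused to subset every band or time slice of the product once the coordinate vectors are known.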