Mendeley grievances and quick hack

I was recently reminded by Mendeley that I had run out of ‘free’ space on my account (i.e. 2 GB of PDF storage). As a previous paying customer (who reverted to a free account when they bumped the free tier from 500 MB to 2 GB) I was more than happy to pay again, or so I thought.

However, in the 3 years since I last paid, their storage offer has not kept up with the times. For \$55/yr you now get a WHOPPING 5 GB of PDF storage space, a Pro plan goes for \$110/yr and an impressive 10 GB, while unlimited storage will cost you \$165/yr! Given that Mendeley is now owned by Elsevier, a dominant if not the biggest scientific publisher around, their database of PDFs can easily be reduced by an order of magnitude (as most files are already stored by Elsevier!). Given this large overlap, their storage overhead should be smaller than before their acquisition by Elsevier, warranting a price cut, not a price increase, relative to other / previous services. In comparison, a Dropbox account costs me \$100/yr for 1 TB! Even better deals are possible with a Google Drive account (however, I like Dropbox’s integration and Linux support better).
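To put these numbers side by side, here is a minimal sketch of the cost per gigabyte per year, using only the prices quoted above. It assumes 1 TB is counted as 1000 GB; the function name `price_per_gb` is just illustrative.

```python
# Yearly price (USD) and storage (GB) as quoted in the post.
# Assumption: 1 TB is counted as 1000 GB.
plans = {
    "Mendeley 5 GB": (55, 5),
    "Mendeley 10 GB": (110, 10),
    "Dropbox 1 TB": (100, 1000),
}


def price_per_gb(price_usd, storage_gb):
    """Return the yearly cost of one gigabyte of storage."""
    return price_usd / storage_gb


for name, (price, size) in plans.items():
    print(f"{name}: ${price_per_gb(price, size):.2f} per GB per year")
```

Under that assumption both paid Mendeley tiers come out at the same price per gigabyte, roughly a hundred times the Dropbox figure.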

In short, given the current state of cheap storage these price points are less than impressive, and it reeks of profit maximization using some user statistics and marketing parameters (at which price point are people likely to pay for a given feature). Sadly, I’m of the opinion that the charge does not reflect the true value of the service.

As such, for anyone who has cloud storage and does not want to pay extortion prices for pdf storage:

  • just disable your pdf syncing feature (click 'Edit Settings' next to the All Documents tab title)
  • create a soft link between wherever your PDFs used to be stored and a location within your cloud storage.

An example below for the standard configuration on OSX:

# move all your files to your cloud storage (e.g. Dropbox)
# on OSX your Mendeley Desktop folder in Documents stores all your
# pdfs

mv ~/Documents/Mendeley\ Desktop ~/Dropbox/

# now create a soft link between the new folder and where the folder
# used to be

ln -s ~/Dropbox/Mendeley\ Desktop ~/Documents/Mendeley\ Desktop

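# Migrate the database to your Dropbox (for convenience)
# the paths below are for Linux, look up the locations for Mac
# and Windows on the Mendeley website
# (on the first computer, close Mendeley and move the database
# folder into Dropbox before linking it)
mv ~/.local/share/data/Mendeley\ Ltd. ~/Dropbox/
ln -s ~/Dropbox/Mendeley\ Ltd./ ~/.local/share/data/Mendeley\ Ltd.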

# now do this for all your computers. Your database will keep up to date
# through Mendeley (and Dropbox), while your files keep in sync
# through your cloud storage service

The above hack will allow you to keep your references in sync using the Mendeley database and platform, while at the same time keeping your files in sync through your cloud storage service. This way you can bypass their rather questionable business model.

Given this hack, and in an ironic twist of fate, they now miss out on my \$55/yr. If they had provided 10 GB I would have paid, as the increase over my current quota is significant and a worthwhile upgrade. Sadly, I now have to resort to tricks, albeit legal, to keep the same functionality (a basic reference manager).

I guess this is one example of software as a service gone wrong. I hope that they will change their business model in the future, so I can become a paying customer again.

DISCLAIMER: You will lose the ability to read your PDFs from within the Mendeley app on iOS or Android (but you could still do so using e.g. the Dropbox app)!


Jungle Rhythms pre-processing

The Jungle Rhythms project seems rather straightforward in its setup on the surface. However, a lot of behind-the-scenes preparation went into the project. One of these tasks was cutting the large tables into yearly sections.

Below you see a picture of one page of the original tables, halfway through pre-processing. Each of these tables was first rectified (making sure that all row and column lines are roughly horizontal and vertical, to the extent possible) and cropped (as shown below).

Next, I marked all column widths and the width of one row, as well as the bottom right corner. Using this information I could calculate the approximate position of all yearly sections (outlined by the red lines).

The yearly sections were then cut out of the original images (with some padding) and saved with additional information on their location (column and row number). The final result of this operation would be an image as shown below and presented to you in the Jungle Rhythms project.
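The position calculation can be sketched roughly as follows. This is not the code used for the project, only a minimal Python illustration under a few assumptions: every yearly section is one column wide and one fixed row height tall, the number of rows is known, and image coordinates are used (y increases downwards). The function name `section_boxes` and all parameter names are hypothetical.

```python
def section_boxes(col_widths, row_height, bottom_right, n_rows, padding=0):
    """Return a dict mapping (row, col) to a crop box (left, top, right, bottom).

    Rows and columns are numbered from the top left, starting at 1.
    Image coordinates are assumed, with y increasing downwards.
    """
    x_right, y_bottom = bottom_right

    # Column edges, accumulated leftwards from the bottom right corner.
    x_edges = [x_right]
    for width in reversed(col_widths):
        x_edges.append(x_edges[-1] - width)
    x_edges.reverse()  # now ordered left to right

    boxes = {}
    for row in range(1, n_rows + 1):
        # Rows are stacked upwards from the bottom edge of the table.
        top = y_bottom - (n_rows - row + 1) * row_height
        bottom = top + row_height
        for col in range(1, len(col_widths) + 1):
            left, right = x_edges[col - 1], x_edges[col]
            boxes[(row, col)] = (left - padding, top - padding,
                                 right + padding, bottom + padding)
    return boxes
```

Each box could then be passed to an image library's crop function, with the `(row, col)` key saved alongside the cut-out as its location.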


Layman’s notes on #EUCON16

This weekend, for the fourth time, the students of the Harvard Kennedy School (et al.) put together the European Conference, a conference dedicated to European politics. First of all I should commend the students for bringing together an impressive set of speakers, among others the former President of the European Commission, José Manuel Barroso, as keynote speaker.

Below are a few observations I made as an outsider and concerned European citizen, albeit with enough ‘little grey cells’.

On optimism and the state of the EU - First of all, the keynote of Barroso underscored the success of the EU despite the setback of the economic crisis. Irrespective of this event, and the rather grim predictions by economists worldwide, the EU grew in numbers instead of falling apart. However, he was correct to note that the current migrant crisis might pose a bigger threat to EU integration than the economic crisis. He warned of growing xenophobia in member states, not least the more recent ones, which often don’t share the common historical context of both World Wars and, to cite Käthe Kollwitz, the general sentiment of “Nie wieder Krieg” (“never again war”; and by extension, never again fascism). In general Barroso’s keynote was thought provoking, yet rather positive. Given the tumultuous state of the EU I hope that Barroso was right in citing Jean Monnet: “People only accept change when they are faced with necessity, and only recognize necessity when a crisis is upon them.”

On TTIP and trade agreements - The panel discussion on the Transatlantic Trade and Investment Partnership (TTIP) provided me with some new insights as well. For one, the panel was unfairly weighted in favour of TTIP, with only Dan Mauer providing some push-back. My most “memorable” moment was the rather gratuitous cop-out by the EU ambassador to the US, David O’Sullivan, on a question regarding the transparency of TTIP. A member of the audience commented on the fact that TTIP is unprecedented in its transparency during negotiations, and asked how this was perceived by negotiating partners. Ambassador O’Sullivan riposted that, indeed, the negotiations have been relatively open, if forced by an initial leak, but that this has little value as most people would only find the documents boring. As such, still no full text is provided, only legally void position papers and summaries. This rather jaw-dropping statement is not only elitist but does injustice to any democratic principles. It is a surprisingly cheap cop-out to a valid question, and to a concern that many EU citizens share. I would have expected a more coherent response from O’Sullivan. This lack of respect for the genuine concern of citizens, as well as the lacklustre effort of the EU to increase transparency, is a testament to what I would call a forced hand, rather than due diligence on transparency. Sadly, I fear that underhanded changes, such as those recently highlighted in TPP, will surely make their way into TTIP without full transparency.

On privacy and Safe Harbor - In a post-Snowden age it’s clear that the US will have to start thinking about privacy as a human right. The panel seemed to agree that this is a demand of both industry and privacy NGOs. The panel was in consensus that this should happen in the near future, although current implementations such as Privacy Shield (Safe Harbor’s replacement) may be equally dead on arrival; in other words, this isn’t the final solution. The main take-home message is that action will be required in the US, if not forced by the EU. Little was mentioned on how this would interface with, for example, TTIP, if at all. Yet, overall the outcome for US citizens will only be for the better.

Anyway, back to the business of the day - modelling ecosystem responses to climate change.


Processing Jungle Rhythms data: intermediate results

After a few days of struggling with R code I have the first results of processed data at my fingertips! Below you see a three-window plot which shows the original image on top, the annotated image in the middle and the final extracted data at the bottom. The middle window gives you an idea of the geometry involved in the calculation of the final data at the bottom. I’ll briefly detail my approach, as I abandoned the idea I proposed in a previous blog post.

Using the most common six coordinates for each yearly section (red dots overlaying the individual markings, shown as green crosses, in the middle window) I calculate the approximate location of the rows, as marked by the red dots on the vertical bold lines (line - circle intersection). All annotations are then projected onto this ideal row (projection of a point onto a line), rendering the coloured lines for different observation types. Finally, with the overall length of each given row (on a half year basis - point to point distance) I calculate the location in time which is occupied by an annotation. Classifying the lines into life cycle event types is done by minimizing the distance to the ideal row.
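The two geometric building blocks named above, projecting a point onto a line and intersecting a line with a circle, are standard. The original processing was done in R; the Python below is only an illustrative sketch of the same operations, and the function names are hypothetical.

```python
import math


def project_point_on_line(p, a, b):
    """Orthogonal projection of point p onto the infinite line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    # Position of the projection along the line, as a fraction of a -> b.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    return (a[0] + t * dx, a[1] + t * dy)


def line_circle_intersections(a, b, center, radius):
    """Intersections of the infinite line through a and b with a circle.

    Returns zero, one or two points, ordered along the direction a -> b.
    """
    dx, dy = b[0] - a[0], b[1] - a[1]
    fx, fy = a[0] - center[0], a[1] - center[1]
    # Quadratic in t from |a + t * (b - a) - center|^2 = radius^2.
    qa = dx * dx + dy * dy
    qb = 2 * (fx * dx + fy * dy)
    qc = fx * fx + fy * fy - radius ** 2
    disc = qb * qb - 4 * qa * qc
    if disc < 0:
        return []  # the line misses the circle
    root = math.sqrt(disc)
    ts = sorted({(-qb - root) / (2 * qa), (-qb + root) / (2 * qa)})
    return [(a[0] + t * dx, a[1] + t * dy) for t in ts]


def nearest_row(point, rows):
    """Index of the row (a pair of end points) closest to the given point."""
    def dist(row):
        q = project_point_on_line(point, row[0], row[1])
        return math.hypot(point[0] - q[0], point[1] - q[1])
    return min(range(len(rows)), key=lambda i: dist(rows[i]))
```

One plausible reading of the row calculation is that a circle centred on a corner point, with a radius equal to a multiple of the row spacing, is intersected with the vertical bold line to find each row's end point; `nearest_row` then mirrors the classification by minimum distance to the ideal row.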

All images are shown to 10 independent citizen scientists, and for each individual annotation these values are summed. Where there is a high degree of agreement among citizen scientists I will see a larger total sum. If there is a unanimous agreement among them the sum would be 10 for a given day of year (DOY). As not all subjects are retired, the value as displayed below only has a maximum count of 8. The spread around the edges of the annotations is due to variability in the classifications.
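The summing step can be illustrated with a small sketch. It assumes each volunteer's annotations have already been converted to inclusive, 1-based day-of-year intervals; the function name `agreement_per_day` and the input layout are hypothetical, and the original analysis was done in R.

```python
def agreement_per_day(annotations, n_days=365):
    """Count, for each day of year, how many volunteers marked it.

    annotations: one list per volunteer of (start_doy, end_doy) tuples,
    inclusive and 1-based. Overlapping intervals from the same volunteer
    are counted only once.
    """
    counts = [0] * n_days
    for volunteer in annotations:
        marked = set()
        for start, end in volunteer:
            # Clip to the valid range before expanding into single days.
            marked.update(range(max(1, start), min(n_days, end) + 1))
        for doy in marked:
            counts[doy - 1] += 1
    return counts


# Three volunteers annotating roughly the same event.
counts = agreement_per_day([[(10, 20)], [(15, 25)], [(15, 18), (17, 22)]])
print(max(counts))
```

Days where all volunteers agree carry the highest count, while the lower counts at the start and end of an event reflect the spread in the classifications.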

You can already notice some patterns in the life cycle events. More on these patterns in a later post, when I can match the data with species names.


Processing Jungle Rhythms data: coordinate transformations and line intersections

I mentioned that I started working on processing some annotations into data. This will give me a feeling for the data quality, but more so get me thinking about how to process the data efficiently.

In this blog post I’ll quickly outline the methodology I’ll use to process the annotations. First I have to give a quick summary of what the data looks like once annotated, and how to deconstruct it into usable data.

Below you see a picture of an annotated yearly section. Red dots outline the most common intersection coordinates for the yearly section (green crosses represent all measurements), while green lines represent the annotated life cycle events within the yearly section. Note the accuracy of all the annotations, rather amazing work by everyone who contributed!

With these key locations (red dots) of the yearly section, mainly the start, middle and end (providing the general orientation of the yearly section within the image), the annotations can be translated into true data (compensating for skewness and warping in the picture).

Each yearly section will be processed half a year at a time, using roughly four steps:

  1. For each year I make sure the bottom axis (bottom left - bottom middle or bottom middle - bottom right) is aligned along the Cartesian x-axis. This is done by rotating the data around the bottom left or bottom middle point, respectively. (In this image the axis is relatively close to optimal!)
  2. Since I only process data within half a yearly section I trim annotated lines to fit each six month period.
  3. After trimming the annotations to fit neatly within the first six months I need to transform all these coordinates to days within a year. I know the spacing between the rows is equal. As such, the total length of a row can be calculated as the length of a line which crosses the beginning and end of half a yearly section.
  4. What remains is to calculate the distance between the beginning and end of an annotated segment relative to the total length to determine the days they cover during the year.
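The four steps above can be sketched as follows. This is a hedged Python illustration rather than the actual implementation; the function names, the representation of an annotation as two end points, and the default of 182.5 days per half year are all assumptions.

```python
import math


def rotate_about(point, origin, angle):
    """Rotate point around origin by angle (radians)."""
    x, y = point[0] - origin[0], point[1] - origin[1]
    c, s = math.cos(angle), math.sin(angle)
    return (origin[0] + c * x - s * y, origin[1] + s * x + c * y)


def annotation_to_days(line, axis_start, axis_end, days_in_half=182.5):
    """Convert an annotated line to (start, end) day offsets in a half year.

    line: two (x, y) end points of the annotation.
    axis_start, axis_end: end points of the bottom axis of the half year.
    days_in_half: assumed number of days covered by half a yearly section.
    """
    # Step 1: the angle that brings the bottom axis onto the x-axis.
    angle = -math.atan2(axis_end[1] - axis_start[1],
                        axis_end[0] - axis_start[0])
    end_rot = rotate_about(axis_end, axis_start, angle)
    # Step 3: total length of a row in this half year (in pixels).
    total = end_rot[0] - axis_start[0]

    # Horizontal offsets of the annotation after rotation.
    xs = sorted(rotate_about(p, axis_start, angle)[0] - axis_start[0]
                for p in line)

    # Step 2: trim the annotation to the six month period.
    x0 = min(max(xs[0], 0.0), total)
    x1 = min(max(xs[1], 0.0), total)

    # Step 4: position relative to the total length, expressed in days.
    return (x0 / total * days_in_half, x1 / total * days_in_half)
```

For the second half of the year the same function would be called with the bottom middle and bottom right points as the axis, and the resulting offsets shifted by the length of the first half.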

Finally, all these data will be combined into a matrix. This matrix will then be linked to the original species data, kept in a separate file.


© 2018. All rights reserved.
