Measuring Research, Making a Difference
In the research world, we’re often told that the only way for our work to have a measurable impact is through the citation of publications based on our work. The mechanisms for this are in place, the stakeholders understand the framework, and everyone is happy… or so we’re led to think.
However, the status quo neglects the many research outputs, including software, that may be generated by a research activity. It certainly doesn’t encourage their sharing and reuse, let alone give proper credit for their production. Current methods rely on publishing a paper on your work and asking people to cite that (which isn’t always simple, as the R community have found).
Many groups are already looking at the issue of measuring impact of data sets and other research outputs, often hand-in-hand with the open science / open research movement. The DataCite consortium are addressing the challenges of making data sets accessible and visible. Alt-Metrics have emerged to suggest alternative views of impact which move away from the more traditional citations and peer opinion, emphasising approaches which make use of tools and semantic information. The Executable Paper Challenge is asking people to examine ways of ensuring verification and scholarly communication for results which make use of software within the constraints of the current publishing model. And of course, the Beyond Impact project is looking to facilitate a conversation between researchers, their funders, and developers about what we mean by the “impact” of research and how we can make its measurement more reliable, more useful, and more accepted by the research community.
Given this wealth of work, particularly on tracking and understanding the impact of research data outputs, why can we not reuse it for software?
In my opinion, there are a number of issues which make software a particularly specialised form of dataset. Not all are specifically software based – probe into the data publishing and reuse world and you can see similar issues – but they are the ones which need to be addressed if we want to give proper credit to software as a research output, and understand its impact.
In general, when a dataset is deposited in a repository, that specifies its granularity – the dataset can be considered as a unique object. It may consist of a collection of pieces of data that have distinct characteristics (e.g. an album is well-defined as a collection of songs) but importantly there is a point in time at which the distinction is made.
Software has become harder to define, in particular as it has made the leap from a single machine to a distributed system. What do you consider a part of the software, and what might you assign an identifier to? What is an “application”: source code, binaries, workflows, manuals, website? Because the boundaries of a piece of software are less distinct than those of a dataset, the dependencies – both explicit and tacit, functional and non-functional – become more important.
There is also great scope for ambiguity in the internal contents of a “software object”. Examples of this include: alternative routines and libraries; downloadable content (DLC) which is accessed post “release”.
When do you consider software has changed enough that it constitutes a new version? Best practice for software development has well-defined processes for versioning, but how do these map to more accessible identifiers for assessing impact? Basically, how can we cite software when software citations are not generally compatible with the speed of evolution in software development?
This is not dissimilar to the current case for datasets: some consider a new version to be created when a new public deposit is made at a repository; others do not change the identifier associated with the dataset, but update the manifest to indicate that the data in the collection has been changed / corrected.
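The two approaches can be contrasted in a minimal sketch. The `Record` type, the `.vN` suffix and the example identifier are all hypothetical, chosen purely for illustration, and do not reflect how any particular repository works:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Record:
    identifier: str                                     # hypothetical identifier string
    manifest: List[str] = field(default_factory=list)   # notes describing each change


def deposit_new_identifier(base: str, deposits: List[str]) -> List[Record]:
    """Approach 1: every public deposit is given its own identifier."""
    return [
        Record(f"{base}.v{number}", [note])
        for number, note in enumerate(deposits, start=1)
    ]


def deposit_same_identifier(base: str, deposits: List[str]) -> Record:
    """Approach 2: the identifier never changes; the manifest lists each change."""
    return Record(base, list(deposits))


changes = ["initial deposit", "corrected units in one column"]
separate = deposit_new_identifier("example-id", changes)
single = deposit_same_identifier("example-id", changes)
```

Under the first approach a citation pins down exactly which deposit was used; under the second, a reader has to consult the manifest to find out.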
A particular issue for software is the increased likelihood that certain versions of software will have particular bugs which affect the quality of derived work. Understanding the evolution and quality of software objects is an important part of versioning – if an instance of a software object is “wrong”, should it be possible to revoke it? Likewise, if software is not “good enough” but has been used to produce results, should it be given an identifier?
As a side note, collections of many versions of a digital object are an important study archive in their own right!
One particular quirk of software development is that it is often hard to define authorship. More specifically, it is hard (and indeed probably bad practice) to attribute authorship of particular pieces of code to particular people. Therefore how do we attribute credit?
Even when you can determine authorship, it is near impossible to define the contribution an individual has had to impact at the time that their contribution is made, and also at the time at which an identifier is created for a particular version of a piece of software. For instance, a small contribution of a bug fix might turn out to have a huge impact in derived research based on a piece of software. How and when do you define who gets what portion of credit?
A probable solution is to divorce the attribution of credit from the identifier for a piece of software. This allows each software project to define its own contribution assignment model. But clearly there should be some way of working collectively towards best practice in this area.
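As a rough sketch of what that separation might look like, assume a release carries only an identifier and a tally of contributions, and each project plugs in its own credit model. Every name here (`SoftwareRelease`, `equal_split`, `proportional_split`, the identifier string, the contributors) is hypothetical and not taken from any existing system:

```python
from dataclasses import dataclass
from typing import Callable, Dict

# A credit model turns contribution counts into shares that sum to 1.
CreditModel = Callable[[Dict[str, int]], Dict[str, float]]


@dataclass
class SoftwareRelease:
    identifier: str                # hypothetical identifier for this version
    contributions: Dict[str, int]  # contributor name -> number of contributions


def equal_split(contributions: Dict[str, int]) -> Dict[str, float]:
    """Every contributor receives the same share, whatever the volume."""
    count = len(contributions)
    return {name: 1.0 / count for name in contributions}


def proportional_split(contributions: Dict[str, int]) -> Dict[str, float]:
    """Shares follow the number of contributions each person made."""
    total = sum(contributions.values())
    return {name: made / total for name, made in contributions.items()}


def credit_for(release: SoftwareRelease, model: CreditModel) -> Dict[str, float]:
    """Apply a project's chosen model; the identifier is not involved."""
    return model(release.contributions)


release = SoftwareRelease("example-id.v1", {"alice": 30, "bob": 10})
equal = credit_for(release, equal_split)
proportional = credit_for(release, proportional_split)
```

The same release, under the same identifier, yields different credit shares depending on the model the project chooses. Neither model addresses the small bug fix that later proves to have a large impact; that would need a model able to revise shares after the fact.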
Looking further afield, other communities have sought to address this issue. The crowdsourced product design site Quirky has the concept of influence, which is their way of attributing future credit shares amongst contributors. As the number of projects increases, and more go to market, the model they use for attributing these shares can be tuned so that it reflects different weights depending on the size of the project, the number of participants and the stage of the contributions. The music industry has a defined process, but no uniform split, for recognising contributions and dividing royalties. Could we do the same for software?
It’s clear that something needs to be done. What is less clear is who should drive it. For datasets, there is a clear need for data publishers, those tasked with providing long term access to datasets (e.g. libraries, data repositories), to make this work. For software, should it be the developers / producers / distributors / archives / funders / users who deal with this? In many cases there is no clear “middle man publisher”: places like SourceForge are infrastructure providers who have no need to understand the impact of the projects they host at the level of detail we seek.
So, as a rallying call to those who do care, I propose the following manifesto as a starting point – thoughts, comments and improvements welcomed – I promise to give credit for contributions!
As those involved in the use and development of software used in research, we believe that:
To enable this, we subscribe to the following principles:
This does not rescind the values of the current credit system, but reinforces them by acknowledging that there are many forms of output that can lead to indicator events.
A version of this post first appeared on the SSI Blog.
Thank you to the following people who have participated in discussions which have led to this article: Tom Pollard and Max Wilkinson, British Library (software citations); Cameron Neylon, STFC (open science, code deposit, and alt-metrics); Michael Feathers, Object Mentor (behavioural economics for software engineering); Greg Wilson, Software Carpentry (software development by researchers); Dirk Roorda, DANS (economic models for software preservation); Brian Matthews and his SoftPres team at STFC (software and software project metadata for software preservation); Steve Bennett, ANDS (software project metadata); Ross Gardler, OSS-Watch (software project metadata), and anyone else I may have missed.