“A rose by any other name would smell as sweet” is a line from William Shakespeare’s play Romeo and Juliet, suggesting that the names of things do not matter, only what things are.
A large part of the discussion at the Beyond Impact workshop has looked at mechanisms for citations of “non-traditional” research outputs and it is clear that their names, their identifiers, really do matter.
These research outputs may be physical or virtual; persistent, evolving or ephemeral. The types of output discussed at the workshop included datasets, software, blog posts, presentations, performances, and teaching.
The reasons for citation normally fall into three categories: enabling access, allowing tracking of impact, and encouraging credit. Likewise, we can split the business of citations into two parts: naming, which enables those who wish to reference an artefact to do so in a consistent way in other outputs (often defined for publications in style guides such as APA style, MLA style or IEEE style), and identifiers, which create interoperable frameworks that allow references (“handles”) to the location of artefacts and their associated metadata to be shared and persisted.
A clear challenge, and one which came up in the majority of sessions at the workshop, was that whilst there was agreement that the influence of non-traditional outputs should be taken into account, there was no overarching best practice for citing or attributing them.
The workshop recognised that there were several initiatives that already look at providing policies and guidance on how to cite some forms of non-traditional outputs. For instance, the UK Data Archive provides guidance for researchers to cite datasets; DataCite are developing their own policy on how best to cite data. Some producers publish their own conditions for reuse and citation of their data. It’s important at present that these citations get included in the reference list of journal articles, where they can be captured and harvested. For instance, Thomson Reuters has a way of capturing DOIs, though it is not currently used. DOIs will be exposed in the new Web of Knowledge version 5.
Our hope is to make useful research outputs visible, reachable and accessible, to allow them to become reproducible and reusable. However, we have to balance a general encouragement to publish soon, fast and openly with the need to be able to cite these outputs in a scalable way.
One concern raised during the workshop involved reference list limits that are imposed by some publishers (historically from print limitations). A potential solution which came from this is the concept of a Beyond Impact Attribution Repository (BIAR), which enables “collections” of references to be given identifiers. This has the additional benefit of allowing metadata describing the way that the different sources are being used together to be recorded, and enabling others to cite the same collection when they make use of the same general process.
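The BIAR was only discussed as a concept, so no identifier scheme exists for it. Purely as an illustration of what “giving a collection of references an identifier” might look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Reference` and `Collection` names, the `biar:` prefix, the choice of a content hash as the identifier, and the placeholder DOIs.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    identifier: str  # e.g. a DOI or URL (placeholders are used below)
    role: str        # how this source is used within the collection


@dataclass
class Collection:
    description: str   # metadata: how the sources are used together
    references: list   # list of Reference objects

    def collection_id(self) -> str:
        """Derive an identifier from the collection's content.

        Hypothetical scheme: a truncated SHA-256 hash of a canonical
        JSON form, so the same references and description always give
        the same identifier, whatever order they were listed in.
        """
        payload = {
            "description": self.description,
            "references": sorted(
                [r.identifier, r.role] for r in self.references
            ),
        }
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
        return "biar:" + digest[:16]


# Placeholder identifiers, not real DOIs.
dataset = Reference("doi:10.0000/example-dataset", "input data")
software = Reference("doi:10.0000/example-software", "analysis tool")

pipeline = Collection("Example analysis pipeline", [dataset, software])
same_pipeline = Collection("Example analysis pipeline", [software, dataset])

# Two authors listing the same sources in a different order
# arrive at the same collection identifier.
print(pipeline.collection_id() == same_pipeline.collection_id())
```

A content-derived identifier is only one possible design; a registry that mints identifiers and stores the metadata alongside them, as DOIs do, would serve the same purpose and would cope better with collections that change over time.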
Another concern is how to provide effective incentives to researchers, both to ensure they cite all research outputs that they consume and to persuade them to use systems that generate useful identifiers for the outputs they themselves produce. Some current policies include ESRC not paying the final part of the award if data is not offered for deposit to the UK Data Archive. However, UKDA is not obliged to take the data. One suggestion was to have a specific license for artefacts that required citation under a particular mechanism when they are used.
Additionally, there is no real mechanism yet for credit to be given to both the producers and consumers of non-traditional outputs when they are reused. The “best” current method is for researchers to have an output described in a traditional publication, but this is often not possible due to differing opinions on novelty. Even then, it’s not always clear whether people are, for instance, citing a paper on the BLAST software because of the algorithm or the software itself. Allowing a researcher to identify the “influence” their entire body of work has had on the rest of the community would be a huge incentive, and one explored by the Total Impact tool developed as a proof-of-concept at the workshop and described elsewhere.
Something which generated significant discussion, if not conclusions, was whether existing identifier systems like DOIs worked more generally for other outputs like software and blog posts. The main issues were around the scope of what would be attached to the identifier (a portion of a blog post, the blog post, the blog itself), how you would give credit, and how you would deal with evolving versions of the artefact (e.g. it is continuously updated, or it ceases to exist). However, as long as you can guarantee some permanence to the identifier, and the metadata describing the artefact, then it should be possible to overcome the other issues.
It’s clear that the current systems such as peer review do not necessarily place enough emphasis on the impact of non-traditional research outputs. By providing better guidance that spans best practice in citations for production (generation of identifiers) and consumption (style guides, use of identifiers), we are able to lay the foundation for a more robust system of metrics that encourages more micro-publication, and lets us construct better measures of a researcher’s influence and impact.