If you’re a researcher, then I’m guessing that you’re looking for the things you need to share your research data and link it up with your other research objects, like preprints and peer-reviewed journal articles. I’d bet you’re also looking for ways to make the most of your new knowledge. Evidence, like the surveys reported here and here, suggests this is true. Maybe you’re focusing on questions about the social infrastructure you need, like knowing when and how to share data, who to share it with, and how to make your data count. Or maybe you’re thinking about the technical infrastructure that makes your data FAIR, like the services you need to prepare your metadata, ingest and curate your data archive, and link it up and cite it. In “Why Share and Cite My Research Data? A Guide to Making Open Research Easier,” Helena Cousijn from DataCite introduces us to a new data citation roadmap and shares insights into what to expect in 2019. For your social and technical researcher infrastructure, the community that comes together as the Research Data Alliance* (RDA) is wiring up the circuit boards we need to power data sharing (and thereby open research). In this post, we talk with Simon Goudie, co-chair of the Data Policy Standardization Interest Group at RDA and Wiley Research Journal Publisher in Melbourne, Australia, about the great work that’s going on in the RDA community.
Q. So Simon, please, start by telling us what RDA is all about. What makes it such a special organization?
A. The RDA “aims to build the social and technical bridges that enable open sharing of data.” That’s actually a huge remit, as it covers not only the practical aspects of how to host and maintain the massive amounts of research data that are now being generated, but also how the world interacts with this data: how to encourage the research community to make data available, how to keep it safe, how to remix and re-use it, and how to ensure that it becomes a vital resource that delivers benefits for years to come. The RDA is a neutral ground where 5,700+ members from around the world agree to principles of openness, consensus, balance and harmonization, with a community-driven and non-profit approach. In RDA working groups, you’ll find people from a wide variety of research organizations and institutions, publishers, governments, platforms and companies working together. These working groups produce outcomes that have been rigorously discussed and tested and are released as robust recommendations, ready to have a positive impact for researchers managing, sharing, and using data.
Q. What kind of impact has RDA made already for researchers?
A. There is a long list of recommendations and outputs from the RDA, available here. Many of these work at a very low level: they are the wiring that makes research data sharing work today and will make future projects possible. The Scholix initiative, for example, is a major step towards sharing and exchanging metadata on the links between research data and published articles. This framework will make it much easier to discover the data underpinning an article you are reading, as well as to see the impact of a dataset on the literature.
Perhaps the most visible output has been the ‘23 Things’ initiative. These are resources for librarians to use within their communities of researchers to increase awareness of data sharing, increase literacy around research data practices, and encourage people to share their practices and get involved with data management.
Q. How’s it going with the data policy standardization work you’re leading? What difference will your work make for researchers?
A. The data policy standardization group aims to bring a unified approach to setting data policies. Many publishers have their own distinct policies, often with “tiers” of requirements (such as the Wiley “encourages/expects/mandates data availability” tiers). For a journal or publisher looking to implement a data policy, it can be confusing to know which model to adopt, or which key items to include in a policy. The group I’m working with hopes to simplify this by providing a resource that details the standard issues around data sharing that should be considered, as well as how these can be put together into a robust data policy for their publication(s). This should help make new policies easier to implement, while also giving existing policies something to be compared against and perhaps reviewed. If you are a researcher, this will make following data policies much easier: rather than having a different policy using different terms with different requirements for each journal or publisher you work with, standardized policies will mean you know what to expect, regardless of where you publish.
Q. Why is the work of RDA so important?
A. Research data management has become a critical issue. Without the capacity to reliably store and share the outputs of research beyond what can be conveyed within the confines of a three-page journal article or presented in a conference paper, so much valuable information has been lost or thrown away over time. This has meant that experiments have had to be re-performed to verify results, or that they have been conducted redundantly by different researchers, unaware that others are working on the same issues. Not having the original data also makes it much more difficult for subsequent researchers to build on the outcomes of earlier studies. We’ve also lost much of the data that has led to null results – just as important as the data behind more significant outcomes and key to reducing the amount of redundant work undertaken. Those days have passed, though. We now have the technical capacity to store and maintain this data and to make it available to all who are interested, informing many more outcomes than may have been possible before. Take the Australian Longitudinal Study on Women’s Health, for example: the data made available from a study of 57,000 women has contributed to no fewer than 765 published papers and informed many policy documents, guidelines, reports and presentations. The RDA is playing a key role in establishing the infrastructure, policies and guidance required to make all of this happen, from informing best practices around unique identifiers and repository structures through to providing advice to researchers on locating and citing data objects.
Q. Thanks for all that, Simon. Let’s close with a crystal-ball moment. So, if you had a prediction for what research data sharing is going to look like in 2025, what would it be?
A. Things are moving very quickly and a lot of the underlying infrastructure is falling into place. By 2025, archiving research data should be an accepted practice and the idea of simply keeping files on a thumb drive in a desk drawer should seem ridiculous. What I’m excited about is how all of this comes together and is used in practice by researchers, publishers, and readers. I’m looking forward to seeing people move seamlessly from the articles that define discoveries to interrogating the data that underpin them, then on to other outcomes that the same data have informed, leading them to combine and re-use the data themselves to advance their own projects. Having access to online data repositories opens up the ability to interact with data programmatically, so data mining will continue to unearth entirely new ways of working with research results – not just drawing on the outcomes of twenty or fifty studies, but thousands, with millions of data points to analyze. It all starts with recognizing research data as a valuable asset and taking the time to manage and store it with care.
Great stuff, Simon. Thank you.
Researchers tell us that preparing an article for publication is a good time for them to think about doing the “admin” required to share data, as well as planning how to get the most impact from their whole research project. Journals with good data-sharing policies are obvious places researchers can go to for help. The Wiley Expects Data policy described in this post is the focus of a great deal of work at Wiley right now: we’re using our Expects Data Toolkit to help journals set up the policy, and to better support researchers who choose to share their data. Simon’s work with RDA is all about building the shared social and technical infrastructure we all need to make sharing research data easier, whether we’re researchers or publishers. If you’re a researcher, then the message for you is that we’re doubling our efforts to help you get the recognition and impact you need for all your research outputs, data included. Let us know what you think in the comments below. And if you’d like to get involved in one of the Wiley open research and research data initiatives, then let us know that too. Thank you!
*Wiley is delighted to be an organizational member and to continue its support for RDA.
About the Author: Chris Graf