Two truths and a lie: data in the humanities and social sciences


Data can be a tricky topic for researchers in any field, but it raises additional questions for those working in humanities and social sciences (HSS). In this blog post, Lois Elliott, Marketing Assistant at F1000, highlights data sharing falsehoods with this HSS spin on the classic game of two truths and a lie. 

There’s no doubt about it: researchers within the humanities and social sciences (HSS) love to engage in debate, especially over what the ‘truth’ is. These arguments can last decades, even centuries.

Yet, there is one topic where the truth shouldn’t be up for debate, and that’s how you, as a researcher within HSS, share your data.

Will you be able to spot all the lies?  

Let’s start with an easy one… 

Round 1: types of data 

The lie: “data is only quantitative, so it doesn’t really apply to my field.”

Don’t fall into the trap of assuming that data equals numbers; it’s a myth that data is only quantitative! In fact, as an HSS researcher, you probably work with a plethora of data but just have a different name for it. Archival documents, interviews, images, newspaper clippings, case notes, and even audio files are all examples of HSS data.

Moreover, HSS data repositories, such as the UK Data Archive and the Harvard-MIT Data Center, have existed since the 1960s. Storing your data in repositories like these is key to ensuring that it meets the FAIR guidelines, and data that adheres to the FAIR criteria is essential if you want to submit to F1000Research.

To recap, HSS data sharing is not as uncommon and unpractised as you may think. Plus, there are plenty of resources and repositories already out there to help you along.

With the basics settled, let’s move on to round 2. 

Round 2: datasets

The lie: “publishing my data first, before my article, risks someone scooping my idea.”

Some researchers worry that by publishing their data first, they will allow other academics to make away with their research idea. While this is a shared concern across disciplines, there is no evidence to support this claim.  

Crucially, no one else knows this data as well as you do. Even with the most detailed metadata and the strictest adherence to the FAIR guidelines, you will still have one key advantage: you’re the original collector. Act on this and make your research paper as detailed and informative as possible. Plus, there is evidence to suggest that publications associated with a dataset have a citation advantage.  

So, don’t fear the scoop; embrace the credit! You might even find that new collaborations come your way.

Round 3: sharing your data and metadata 

The lie: “sharing my data first runs the risk of the data being misinterpreted.”

If you are thorough and document how the data should be interpreted, there is no cause for concern. 

Firstly, before beginning your research, take a look at our FAIR Data Principles so you can set your research off on the right foot and avoid potential misinterpretation.  

Furthermore, if your data is being published under the CC BY 4.0 license, there is even more reason to ensure that your research adheres to these principles. Under this license, researchers are welcome to remix, transform, and build upon the material for any purpose, even commercially.  

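However, this doesn’t mean that your data will be scooped or that you will lose credit: CC BY 4.0 requires anyone who reuses the material to give you appropriate credit. As discussed in round 2, there is no evidence that sharing data first leads to scooping, and publishing data under this license also helps to increase the reproducibility of your research.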

Likewise, to improve the understanding of your data, you can also publish a separate data dictionary. This is a separate file that stores the metadata associated with your overall data, such as units and ranges, and often includes other useful information for interpreting the dataset.  
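To make this concrete, here is a minimal sketch of what a data dictionary could look like when written out as a CSV file. Everything in it is an assumption made for illustration: the variable names, the column headings and the imagined interview study are hypothetical, and they are not a format required by F1000Research or by any repository.

```python
import csv
import io

# Hypothetical variables from an imagined oral-history study.
# None of these names or ranges come from a real dataset.
DATA_DICTIONARY = [
    {
        "variable": "interview_id",
        "description": "Anonymised identifier for each interview",
        "type": "string",
        "units": "",
        "allowed_range": "",
    },
    {
        "variable": "duration",
        "description": "Length of the audio recording",
        "type": "integer",
        "units": "minutes",
        "allowed_range": "5-120",
    },
    {
        "variable": "year_recorded",
        "description": "Year the interview took place",
        "type": "integer",
        "units": "calendar year",
        "allowed_range": "1990-2020",
    },
]

# Column headings are an illustrative choice, not a standard.
FIELDNAMES = ["variable", "description", "type", "units", "allowed_range"]


def write_data_dictionary(rows, handle):
    """Write one row per variable to an open text file or buffer."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


def dictionary_as_csv(rows):
    """Return the data dictionary as CSV text, e.g. for previewing."""
    buffer = io.StringIO()
    write_data_dictionary(rows, buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(dictionary_as_csv(DATA_DICTIONARY))
```

Each row describes one variable, so a reader who has never seen your dataset can still tell what a column means, what units it uses and which values are plausible. A spreadsheet with the same columns would serve the same purpose.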

Top tip: data dictionaries are also a great example of how useful metadata can be when ethical or data protection issues prevent you from sharing the full dataset.

Take a look at this example of a data dictionary associated with an F1000Research article. 

Now, onto the penultimate round… 

Round 4: data relating to tangible sources 

The lie: “publishing data relating to tangible sources will lead to a loss of revenue and a decline in footfall for the host institution.”

This is a common fear. However, Europe’s digital cultural platform, Europeana, found that this isn’t necessarily true. Instead, they discovered that making data relating to tangible sources openly available can bring benefits to the institution. These benefits include: 

  • Increased audience engagement with their collections. 
  • Heightened visibility of the institution by helping them stay relevant in a world where open data is becoming the norm. 
  • New funding possibilities for projects which request freely accessible content and metadata. 

Moreover, it’s true that HSS data has a much wider readership and real-world applicability than academics may originally believe. Both policymakers and educators can make use of open HSS data.  

While you may think that your data isn’t useful right now, you can’t know in advance how useful it might be in the future. For example, historical and social data originating from the Spanish Flu of 1918 has been used to influence academic recommendations for the COVID-19 pandemic. 

Let’s move on to our final round! 

Round 5: F1000Research and HSS data 

The lie: “F1000Research only publishes data associated with the life sciences.”

Over the years, F1000Research has expanded its scope and now welcomes research from all disciplines including the humanities and social sciences, as well as the physical sciences and engineering.

If you’re planning to publish your data as a Registered Report, you can find out everything you need to know over on our instructions for authors page.

That’s a wrap. How did you do? 

At F1000Research, we know that publishing data can seem laborious or intimidating, especially for researchers in the humanities and social sciences. But it doesn’t have to be this way! That’s why we have put together this handy Open Data guide to make data sharing a straightforward process. 

We hope to see you publishing your HSS research and data with us soon!  
