Research data is an important topic in academia. We – the Specialised Information Service Asia (Fachinformationsdienst [FID] Asien) – would like to support you with information about how to handle research data in the social sciences and humanities from and about Asia.
On this page, you will find some introductory information on research data as well as links with further information.
Please feel free to contact us with any questions:
Research data refers to all data used in a research project to address its research question. When we talk about research data here, we refer primarily to digitally available data.
Due to the diversity of research fields and methods in Asia-related humanities and social sciences, we deal with very diverse research data, including text data, bibliographic data, geospatial data, audio, image and video data, numerical or statistical data, but also digital 3D models and program code, e.g. for relational databases, digital tools and analysis scripts. We generate data in the course of the research process: when studying sources, evaluating and annotating texts, working with objects from collections, and through digitisation, recordings and observations, experiments and simulations, qualitative and quantitative surveys, etc. As researchers we create data ourselves, and we also create new data when we further develop or evaluate existing data (our own or re-used).
The German Council for Scientific Information Infrastructures (RfII) and the German Research Foundation’s (DFG) Guidelines on the Handling of Research Data provide relevant definitions of "research data".
Good scientific practice: Research data are an important part of the research process. Securing and preserving research data is therefore essential, among other things to make the research process transparent and to ensure that research results can be reproduced (see the DFG Guidelines for Safeguarding Good Scientific Practice, in German).
Reuse of research data: Research data often has a lasting value beyond the research context in which it was created, and can form the basis for other research questions and projects. This is particularly true for unique data that cannot be reproduced. It is therefore important to keep the data accessible and to curate it, i.e. to enrich it with descriptive and contextualising information that preserves its scientific significance. It is often helpful to use existing standards (data formats, metadata formats, etc.) when creating and describing the data, so that the data remains comprehensible. In addition, it is important to consider the legal aspects (see below).
New research methods: The provision of digital data sets increasingly makes it possible to process such sets using a constantly growing number of digital tools. This opens up completely new ways of research in the field of digital humanities (with respect to research questions, methods, etc.).
We recommend addressing research data management at the early planning stage of a research project and, if necessary, creating a data management plan (DMP). The aim of research data management is to make the data accessible in the long term, independent of the data producer, and to ensure that it is structured, reusable and verifiable as far as possible. The FAIR Data Principles can help: data should be Findable, Accessible, Interoperable, and Reusable. Accordingly, the project design should already specify which data will be collected, how they will be collected, processed, evaluated and described, and which steps are necessary for the transformation, selection and storage of research data.
It is helpful to consider the different stages of the research data life cycle during project planning in order to preserve the scientific validity of the data and thus the possibility of subsequent use:
Forschungsdaten.info, for example, provides extensive information on research data management, covering topics such as planning and structuring research data (including the creation of a data management plan, DMP), organising and working with research data, preparing and publishing research data (including information on metadata standards), preserving and re-using research data, and the legal and ethical issues that should be taken into account when dealing with research data. However, this website is mostly in German. Guides in English are also available, e.g. the handbook Managing and Sharing Data by the UK Data Archive.
DARIAH provides recommendations for data and metadata formats specific to humanities research data, including those concerning cross-disciplinary objects: "Specific recommendations for data and metadata" (in German). FAIRsharing.org provides a curated list of subject-specific data and metadata standards (here the link for humanities and social sciences). In general, when describing the data and choosing the (meta)data formats, the goal should be to provide the data so that it is readable, intelligible and processable by both humans and machines, today and, ideally, in the future as well.
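As a rough illustration of a description that is readable by both humans and machines, the following Python sketch stores a few descriptive fields as JSON. The keys follow Dublin Core element names; the record values and the helper name `to_json` are purely hypothetical and are not taken from any actual dataset, repository or recommendation mentioned on this page.

```python
import json

# Hypothetical example record; the keys follow Dublin Core element names.
record = {
    "title": "Interview transcripts on urban migration (example)",
    "creator": "Example, Researcher",
    "date": "2020",
    "language": "zh",
    "format": "text/plain",
    "rights": "CC BY 4.0",
    "description": "访谈记录 (interview transcripts), illustrative only.",
}


def to_json(metadata: dict) -> str:
    """Serialise a metadata record as human-readable JSON."""
    # ensure_ascii=False keeps non-Latin scripts legible instead of \uXXXX escapes.
    return json.dumps(metadata, ensure_ascii=False, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(to_json(record))
```

Which metadata schema is appropriate depends on the discipline and on the repository chosen; the sketch only shows the general idea of pairing plain-language values with standardised field names.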
In addition to general information on research data management, there are a number of subject-specific recommendations and handouts on how to handle research data, e.g. by the German Research Foundation’s (DFG) Review Boards and various academic societies. For Asian Studies, see, for example, the recommendation of the Review Board 106 "Social and Cultural Anthropology, Non-European Cultures, Jewish Studies and Religious Studies" (in German). Depending on the subject and focus of the research question, other recommendations might be relevant as well: those for sociology (in German) and economics, those of the German Data Forum (RatSWD; in German), those for scientific editions in literary studies (in German), and, in linguistics, those for collecting language corpora (in German) and for the legal aspects of handling such corpora (in German).
More and more research funding organisations expect applicants to consider, already at the proposal stage, how they will handle the research data they create, evaluate and develop in the project. The German Research Foundation (DFG) provides Guidelines on the Handling of Research Data and the European Commission provides Guidelines on FAIR Data Management in Horizon 2020. Forschungsdaten.info has compiled an overview of important requirements and information on guidelines of research funding organisations. Such requirements often include a data management plan.
A data management plan helps to plan and structure the handling of research data in a research project. It describes how the data that is or will be collected, processed, evaluated and described should be handled during and after the project. See also the article on data management plans at forschungsdaten.info (in German) or the UK Data Archive handbook Managing and Sharing Data.
Many research funders expect a DMP along with the application for a project. You will find sample DMPs for meeting institutional and funder requirements on the website of the Humboldt-Universität zu Berlin. In addition, tools for creating DMPs are available, such as DMPonline, with a template for the European Commission’s H2020 project proposals, or RDMO, which is designed to meet the requirements of funding organisations in the German-speaking area, i.e. DMPs for DFG and BMBF project proposals.
Directories of repositories:
Metasearch engines for research data:
Interdisciplinary (research data) repositories:
Subject-specific repositories and data collections:
Social research data:
In addition, we plan to use a joint search to display research data stored in our repository as well as data relevant to Asian Studies stored in repositories of other institutions.
When planning and conceptualising your own research project, consider reusing existing (and published) data sources. This might be relevant, for example, for meta-analyses, for expanding your research focus or for optimising your study design. Research data often include material that was not analysed in the original research. Particularly relevant is data produced in projects that are similar in content and/or methodology, or data from a project on which your own project follows up. Finally, reusing data is economical and saves resources.
When reusing data that originated in other research contexts, it is important to consider the following aspects:
If you reuse data, please remember to cite its source.
We, the FID Asia, will be happy to assist you if you would like to publish your data. If you would like to publish data in the CrossAsia repository, please contact us.
When selecting a repository, bear in mind that your research data, and in particular its metadata and descriptive materials, might contain non-Latin scripts, possibly including scripts that are not Unicode-compatible. Please make sure that these will be displayed correctly and that they are searchable in the repository.
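One way to spot potentially problematic characters before depositing data is to scan the text for code points in Unicode's private use areas or for unassigned code points, which often indicate custom fonts or legacy encodings. The sketch below uses Python's standard `unicodedata` module; the function names are hypothetical, and the result depends on the Unicode version bundled with the Python interpreter. It does not replace testing display and search in the repository itself.

```python
from __future__ import annotations

import unicodedata


def find_nonstandard_chars(text: str) -> list[tuple[int, str]]:
    """Return (position, code point) pairs for private-use or unassigned characters."""
    suspicious = []
    for pos, ch in enumerate(text):
        # "Co" = private use, "Cn" = unassigned in this interpreter's Unicode database.
        if unicodedata.category(ch) in ("Co", "Cn"):
            suspicious.append((pos, f"U+{ord(ch):04X}"))
    return suspicious


def normalise_for_search(text: str) -> str:
    """Apply NFC normalisation so canonically equivalent sequences compare equal."""
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    sample = "漢字 हिन्दी a\ue000b"
    print(find_nonstandard_chars(sample))
```

Normalising both the stored text and the search query to the same form (NFC here, as an assumption) is one common way to make visually identical strings match in a search.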
Before publishing data (openly accessible), it is important to consider some legal questions, in particular if copyright-protected, sensitive and/or personal data is concerned.
In the case of personal data, the consent of the persons concerned must always be obtained. In addition, please clarify whether the data can be anonymised.
If the copyright of others is affected, e.g. if larger parts of a database or of a title are involved, the authors should be contacted in advance.
Forschungsdaten.info has created a document that supports the decision-making process for publishing research data and covers the most important legal aspects (in German).
You are welcome to contact us with these and any other questions at any time:
Staatsbibliothek zu Berlin - PK
East Asia Department
Tel.: +49 30 266-436001
CATS Library / South Asia
Voßstrasse 2, Building 4110
Tel.: +49 (0)6221 54 15047