Research data in Asia-related research

We offer comprehensive advice on all aspects of research data management. Whether you need support with data management plans, assistance with funding proposals, or mentoring for data publication, we are here for you!

Contact: x-asia(at)sbb.spk-berlin.de

Advisory Services

Research Data

Research data are all data used during a scientific project and research process to address the respective question. Contact us for advice on managing your research data.

Data Management Plan

The handling of research data should be documented and planned. Contact us for tailored support for your data management plan (DMP).

 

Publication Platforms

CrossAsia Open Access Repository

CrossAsia Open Access Repository is the central publication platform of the CrossAsia portal and is open to all researchers and research institutions in the Asia-related sciences for scientific open access publishing.

CrossAsia Integrated Text Repository (ITR)

The CrossAsia-ITR provides a variety of licensed electronic resources from and about Asia (around 70 million documents) independent of the originating databases for research, including text/data mining purposes.

Relevant attempts at defining research data can be found in the 2016 report to the Council for Information Infrastructures, in the 2015 guidelines, and in the 2021 checklist from the DFG regarding the handling of research data. Given the large diversity of research fields and methods in the Asia-related humanities and social sciences, research data are naturally heterogeneous, and the understanding of the term is often discipline- and/or project-specific. Research data may include texts, bibliographic data, corpora, geodata, audio, image, and video data, numerical or statistical data, survey or observation data, methodological test procedures, as well as digital 3D models and program code, for example for relational databases, digital tools, and analysis scripts.

Good scientific practice: Research data are an essential part of the research process, so securing and preserving them is important to ensure transparency in the research process and to guarantee the reproducibility and traceability of research results.

Reusing Research Data: Research data often have lasting value beyond their original creation and research context. Therefore, it's crucial to ensure the accessibility of data, curate them, and enrich them with descriptive and contextual information to preserve their scientific value.

New Research Methods: The availability of digital datasets allows the use of an increasing number of digital tools, opening new research possibilities in digital humanities.

It is advisable to address the topic of research data management early in the planning phase of a research project and, if necessary, to create a data management plan (DMP). Research data management aims to make data accessible in the long term, independent of the data producer, and to ensure that they are as structured, reusable, and verifiable as possible. The FAIR principles (2016) have proven helpful in facilitating data exchange and are increasingly expected as a de facto standard by research funders (see also the initiative "How to FAIR"). The data should be:

  •  Findable (discoverable),
  •  Accessible (available),
  •  Interoperable (exchangeable), and
  •  Reusable (usable again).
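As a rough illustration, the four FAIR properties can be translated into simple checks on a dataset's metadata record. The following Python sketch is not an official FAIR assessment: the field names (`identifier`, `access_url`, `format`, `license`), the list of open formats, and the example record are all assumptions made for this example.

```python
# Assumed list of open, documented formats; adapt to your discipline.
OPEN_FORMATS = {"csv", "json", "xml", "txt"}


def fair_gaps(record: dict) -> list[str]:
    """Return the FAIR letters for which the record lacks basic information."""
    gaps = []
    if not record.get("identifier") or not record.get("title"):
        gaps.append("F")  # Findable: persistent identifier and descriptive title
    if not record.get("access_url"):
        gaps.append("A")  # Accessible: a stated way to retrieve the data
    if str(record.get("format", "")).lower() not in OPEN_FORMATS:
        gaps.append("I")  # Interoperable: an open, documented format
    if not record.get("license"):
        gaps.append("R")  # Reusable: a clear usage licence
    return gaps


# Hypothetical record; the DOI and URL are placeholders, not real datasets.
example = {
    "identifier": "doi:10.1234/example",
    "title": "Sample corpus",
    "access_url": "https://example.org/data",
    "format": "CSV",
}
print(fair_gaps(example))  # the licence is missing, so this prints ['R']
```

A check like this only covers the formal minimum; whether data are genuinely reusable still depends on the quality of the description.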

During project design, the technical implementation should also be planned: which data will be collected, processed, analyzed, and described; how metadata will be annotated and the data made searchable; and what steps are necessary for the transformation, selection, and storage of research data.

It is also useful to consider the various stages of the research data lifecycle during project planning to preserve the scientific value of the data and enable reuse:

  1. Planning 
  2. Creating and collecting data 
  3. Processing and analyzing data 
  4. Publishing and sharing data 
  5. Archiving and making data accessible 
  6. Reusing data 

(Phases based on the UK Data Service)

Further information on research data management can be found at, for example, [forschungsdaten.info](https://forschungsdaten.info). This includes various topics such as planning and structuring the handling of research data, creating a data management plan (DMP), organizing and working with research data, preparing and publishing research data (including information on metadata standards), preserving and reusing research data, as well as rights and obligations that should be considered in dealing with research data.

Some recommendations for data and metadata formats specific to humanities research data, including those involving interdisciplinary objects, can be found at DARIAH: “Disciplinary Recommendations for Data and Metadata” (2016). A curated list of disciplinary data and metadata standards can be found at Fairsharing.org, prefiltered for the humanities and social sciences. In general, when describing data and selecting (meta)data formats, it should be ensured that the data is readable, understandable, and processable by both humans and machines, not only today but ideally in the future as well.

In addition to general information on research data management, there are some discipline-specific recommendations and guidelines for handling research data, for example, from the DFG subject-specific committees (2023) and various professional associations. For Asian studies, see, for instance, the guideline from Subject Committee 106: “Social and Cultural Anthropology, Non-European Cultures, Judaic Studies, and Religious Studies.”

Depending on the specific orientation of the research question, other guidelines may also be relevant, such as:

More and more funding agencies expect researchers to explain, already at the time of the grant application, how the research data created, analyzed, and developed during the project will be handled.

The DFG has issued guidelines (2015) and specific requirements (2022) on the handling of research data. Additionally, the European Commission has published the Guidelines on FAIR Data Management in Horizon 2020 (2016).

An overview of the key requirements from funding agencies can be found at Forschungsdaten.info. Such requirements from funding agencies often include the preparation of a data management plan.



A data management plan helps to plan and structure the handling of research data in a research project. It describes how the data, which will be collected, processed, analyzed, and described, will be managed during the project and after its completion. See also the article on data management plans at forschungsdaten.info.

Many funding agencies expect a data management plan to be included when applying for a project. Sample DMPs for various funding organizations can be found on the website of Humboldt University of Berlin.

Additionally, there are tools for creating DMPs, such as:

  • DMPonline: Includes a template for H2020 project proposals by the European Commission.

  • RDMO: More focused on the German-speaking region, supporting DMPs for DFG and BMBF project proposals.



Research data are either stored in the institutional repositories of the research institutions where the respective researchers are located, or in discipline- or topic-specific repositories. Please check the respective terms of use and access information.

Directories of repositories:

  • OpenDOAR – directory of repositories with various content types, including research data
  • Re3Data – Registry of Research Data Repositories, a registry to search for repositories that provide research data. It is possible to filter the result by discipline: humanities and social sciences 
  • Repository Finder – a service provided by DataCite to find repositories in Re3Data meeting the criteria recommended by the Enabling FAIR Data Project

Metasearch engines for research data:

  • Base – Search engine especially for academic web resources
  • Cinii Research – Cross-search and discovery search for academic output, including research data, provided by the National Institute of Informatics (NII), Research Center for Open Science and Data Platform (RCOS)
  • DataCite – Search in metadata of participating Data centres
  • dataOn – Korean National Research Data Platform Service
  • IRBD Institutional Repository Database 学術機関リポジトリデータベース – NII National Institute of Informatics
  • RatSWD – Search for research data provided by centres accredited by the German Data Forum (RatSWD) https://www.ratswd.de/forschungsdaten/su

Interdisciplinary (research data) repositories:

  • GitHub – Software development projects
  • OpenAIRE – freely accessible research results, i.e. publications and data sets, from EU-funded projects (keyword: Open Science).
  • Peking University Open Research Data 北京大学开放研究数据平台
  • Zenodo – research results from all disciplines, i.e. publications, data sets, presentations, etc.

Subject-specific repositories and data collections:

Humanities data:

Social research data:

  • Barometer on China's Development 中国发展数据库 – Universities Service Centre for China Studies, CUHK
  • Beijing City Lab 北京城市实验室
  • Chinese Social Quality Data Archive 中国社会质量基础数据库 – Chinese Academy of Social Sciences
  • CNSDA Chinese National Survey Data Archive 中国学术调查数据资料库 – National Survey Research Center (NSRC), Renmin University of China, and National Natural Science Foundation of China
  • DataHub – contains especially social science and economics data sets
  • Fudan University Social Science Data Repository 复旦大学社会科学数据平台
  • ICPSR – social science data of the Inter-University Consortium for Political and Social Research
  • KOSSDA Korea Social Science Data Archive 한국사회과학자료원
  • PORI Hong Kong Public Opinion Research Institute 香港民意研究所
  • SowiDataNet | datorium – search for social science research data in GESIS, the Leibniz Institute for the Social Sciences (so far contains very few data about Asia)
  • SRDA Survey Research Data Archive 學術調查研究資料庫 – Center for Survey Research, Research Center for Humanities and Social Sciences, Academia Sinica
  • SSJDA Social Science Japan Data Archive SSJデータアーカイブ – Center for Social Research and Data Archives, Institute of Social Science, The University of Tokyo
  • ... and in official statistics and governmental data repositories of the respective countries and regions

In addition, we plan to use a joint search to display research data stored in our repository as well as data relevant to Asian Studies stored in repositories of other institutions.

When planning and conceptualizing your own research project, it is advisable to consider existing (and published) data sources. This can be useful for meta-analyses, expanding the scientific focus, or optimizing your study design. Research data often contain material that has not been thoroughly studied. Relevant data may come from projects with similar content and/or methodology or from predecessor projects. Finally, reusing existing data helps conserve resources.

However, when reusing data created in other research contexts, several aspects should be considered:

  • Data quality and completeness:
    In what context were the data created? Are there details about methodology, quality control, and completeness? Were the research results, within the context of which the data were created, published (e.g., via peer-review)?

  • Data source and reliability:
    Where were the data published? Which institution is behind the repository? Is there information about long-term availability? Are persistent identifiers (DOI, etc.) used?

  • Description and metadata:
    What data formats are used? Are the data sufficiently described using standardized metadata formats accepted in the respective field? Are the project design, methodology, data creation, and processing described? Are data types, variables, etc., defined in so-called data dictionaries?

  • Accessibility and usage conditions:
    Are there defined conditions for accessibility and reuse, for example, through open licenses?
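One of the questions above, whether variables are defined in a data dictionary, can be checked mechanically for tabular data. The following Python sketch assumes a CSV file with a header row and a data dictionary held as a simple mapping; the column names and dictionary layout are hypothetical.

```python
import csv
import io


def undocumented_columns(csv_text: str, data_dictionary: dict) -> list[str]:
    """List CSV columns that have no entry in the data dictionary."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # first row is assumed to hold the column names
    return [name for name in header if name not in data_dictionary]


# Hypothetical data dictionary: variable name -> type and description.
dictionary = {
    "respondent_id": {"type": "integer", "description": "Anonymised respondent number"},
    "province": {"type": "string", "description": "Province of residence"},
}
sample = "respondent_id,province,q1\n1,Sichuan,4\n"
print(undocumented_columns(sample, dictionary))  # prints ['q1']
```

Columns reported here would need to be clarified with the data producers before the data are reused.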

When reusing data, make sure to:

  • Cite the source of the data, just as you would cite other publications, in line with good scientific practice.

  • Ensure appropriate handling of data, particularly to respect the interests of indigenous communities. The CARE principles (Collective Benefit, Authority to Control, Responsibility, and Ethics; 2019) can be helpful in this regard.
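To make the citation recommendation concrete, a data citation can be assembled from a few metadata fields. The sketch below loosely follows the common pattern of creator, year, title, publisher, and identifier; the function name and all example values are hypothetical, and the citation style required by a journal or repository may differ.

```python
def format_data_citation(creators, year, title, publisher, doi):
    """Build a citation string: Creator (Year). Title. Publisher. DOI link."""
    names = "; ".join(creators)
    return f"{names} ({year}). {title}. {publisher}. https://doi.org/{doi}"


citation = format_data_citation(
    creators=["Doe, Jane"],          # placeholder author
    year=2020,
    title="Example Survey Dataset",  # placeholder title
    publisher="Example Repository",  # placeholder repository name
    doi="10.1234/example",           # placeholder DOI, does not resolve
)
print(citation)
```

Many repositories display a ready-made citation for each dataset; where one is provided, it is preferable to use it.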

The FID Asia offers advice on data publication. If you wish to publish data through CrossAsia, please contact us.

When selecting a repository, consider whether non-Latin scripts, or scripts not covered by Unicode, can be used in the metadata and descriptive materials associated with the research data. Ensure that such scripts are correctly displayed and searchable within the repository.
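Before depositing, it can help to check metadata strings yourself. The following Python sketch uses the standard `unicodedata` module to normalise text to NFC, so that visually identical strings compare equal in searches, and to flag private-use characters, which often indicate glyphs that are not encoded in Unicode. The function name and sample strings are assumptions for this example, and the check does not replace testing in the repository itself.

```python
import unicodedata


def check_metadata_text(text: str) -> dict:
    """Normalise text to NFC and report private-use characters."""
    normalized = unicodedata.normalize("NFC", text)
    # Category "Co" marks private-use code points, which have no standard meaning.
    private_use = [ch for ch in normalized if unicodedata.category(ch) == "Co"]
    return {
        "normalized": normalized,
        "changed_by_normalization": normalized != text,
        "private_use_characters": private_use,
    }


# Decomposed Hangul jamo that NFC composes into a single syllable.
decomposed = "\u1112\u1161\u11ab"
print(check_metadata_text(decomposed)["changed_by_normalization"])  # prints True

# A title containing a private-use code point (U+E000) as a stand-in glyph.
report = check_metadata_text("學術調查\ue000")
print(len(report["private_use_characters"]))  # prints 1
```

Private-use characters depend on a specific font and will usually not be displayed or found correctly on other systems, so they should be documented or replaced before publication.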

Some repository options include:

  • CLARIN-D – research data services and centres for linguistic data and language corpora
  • CrossAsia Open Access Repository – Open Access Repository for research concerning Asia
  • DARIAH-DE – research data repository for the humanities and cultural sciences
  • Subject-specific repository, search e.g. via Re3Data
  • GitHub – development platform
  • Institutional, university repository of your institution
  • TextGrid – digital preservation archive for humanities research data (so far the repository does not contain any texts in Asian languages)
  • Zenodo – interdisciplinary repository for scientific data sets; access restrictions can be set

Before publishing data, especially as open access, it is essential to clarify the relevant legal issues. These issues particularly concern copyrighted, sensitive, and/or personal data.

For personal data:

  • Consent from the individuals involved is always required for publication.
  • It must also be determined whether the data can be fully anonymized.

If the copyright of others is affected (e.g., in cases involving significant portions of a (database) work), the copyright holders must be contacted in advance.

Forschungsdaten.info has created a decision-making guide for publishing research data (2019), which addresses the most important legal aspects.



As part of the "FID Asia" project, we are planning to develop a centralized platform for accessing and searching Asia-related research data. Stay updated through our blog.


We are looking forward to your support:

  • Tell us about other important sources and repositories for Asia-related research data.
  • Tell us where you published your data.
  • Tell us what services and support you would like to receive with respect to research data.
  • Ask us if you need support, e.g. in creating a data management plan, planning and describing research data, or archiving them.

You are welcome to contact us with these and any other questions at any time:
x-asia(at)sbb.spk-berlin.de