Programme 2013

Wednesday 17th July 2013, 11:00 - 12:30, Room: No. 16

Research Data Management for Re-use: Bringing Researchers and Archivists closer 2

Convenor Dr Alexia Katsanidou (GESIS - Leibniz Institute for the Social Sciences)
Coordinator 1 Mr Laurence Horton (GESIS - Leibniz Institute for the Social Sciences)
Coordinator 2 Dr Christina Eder (GESIS - Leibniz Institute for the Social Sciences)

Session Details

Research data management includes organizing, documenting and validating data to produce long-term re-usable data. Good research data management practice fulfills the requirement of King et al. (1994: 8) that in social science "procedures are public", so that quality can be verified and replication is possible. Accordingly, the research community requires access to data and contextual documentation to the fullest extent possible. Funding bodies impose archiving requirements on researchers, and data archives establish standards and procedures to ensure data are preservable, discoverable, and comprehensible. Following these practices, large survey programs increasingly make data management plans, collect metadata and document every stage of research, from conception to analysis.

However, in practice surveys can be inadequately documented because of miscommunication between researchers and archivists. The resulting poor planning brings time and resource pressures and leads to poor-quality data. Data and contextual information can remain hidden and vulnerable: they may be stored on researchers' hard drives or websites, metadata may be incomplete or non-existent, variable and value labels may be cryptic, and do-files or syntax files may be nowhere to be found. Thus, despite suggestions and standards, effective implementation of data management plans (where they exist) remains unrealized.

This session brings together two audiences: researchers designing and/or implementing data management plans in survey research, and archivists involved in digital preservation and dissemination of survey data. This session is a forum to discuss and evaluate approaches to research data management, promote common understanding of problems encountered, and discuss means to an end product: reusable data. We have already received expressions of interest in participating from at least seven researchers and archivists from relevant institutions.

We welcome papers from data creators, principal investigators or data managers dealing with theoretical, methodological, and practical problems in research data management in cross-sectional, repeated or longitudinal surveys, as well as papers from archive personnel

Paper Details

Dr Kristine Witkowski (University of Michigan)

The social value of data collections is dramatically enhanced by the broad dissemination of research files and the resulting increase in scientific productivity. Currently, most studies are designed with a focus on collecting information that is analytically useful and accurate, with little forethought as to how it will be shared. Both literature and practice also presume that disclosure analysis will take place after data collection. But to produce public-use data of the highest analytical utility for the largest user group, disclosure risk must be considered at the beginning of the research process. Drawing upon economic and statistical decision-theoretic frameworks and survey methodology research, this study seeks to enhance the scientific productivity of shared research data by describing how disclosure risk can be addressed in the earliest stages of research with the formulation of "safe designs" and "disclosure simulations", where an applied statistical approach has been taken in: (1) developing and validating models that predict the composition of survey data under different sampling designs; (2) selecting and/or developing measures and methods used in the assessments of disclosure risk, analytical utility, and disclosure survey costs that are best suited for evaluating sampling and database designs; and (3) conducting simulations to gather estimates of risk, utility, and cost for studies with a wide range of sampling and database design characteristics.

2. Research Data Management with the Data Sharing Repository DATORIUM - the various Ways Scientists benefit from Data Sharing
Ms Monika Linne (GESIS-Leibniz Institute for the Social Sciences)

One of the current projects for digital data preservation at the Data Archive of GESIS-Leibniz Institute for the Social Sciences is the data sharing repository DATORIUM. By developing a digital data sharing repository, GESIS responds to a changing data landscape in which scientists call for more flexible ways of distributing and re-using research results. For this reason GESIS is expanding its present range of services with a digital data dissemination tool that allows for prompt publishing and sharing of research results with other scientists.
This repository will be a web-based software application that enables researchers to manage, document, archive and publish their data and structured metadata autonomously. Scientists can therefore use DATORIUM to publish their research results together with the corresponding research data and further relevant material. As a result, the visibility and availability of research projects will be increased, long-term preservation of the data and metadata will be ensured, and wide-ranging dissemination possibilities will be provided.
By facilitating access to their research data, scientists can support new research or secondary analysis. Beyond that, they benefit from increased citations of their work, which improves their reputation. In addition, DATORIUM can operate as a working environment that a research group can use jointly to work on the publication of research results.

3. Sharing qualitative Data of Business and Organizational Research - Problems and Solutions
Mr Tobias Gebel (Data Service Center for Business and Organizational Data (DSC-BO), Bielefeld University)
Mrs Iris Nopper (Data Service Center for Business and Organizational Data (DSC-BO), Bielefeld University)

In German empirical organizational research, qualitative methods predominate. It is typical of this field that samples are often very small and sensitive. In addition, the data are often not usable by other researchers. As a consequence, the analytical potential of interesting research data is not exhausted, interviews are repeated, and the field is strained. This causes an ongoing decline in respondents' willingness to participate in interviews. Data sharing can help to relieve overstressed research populations and to exhaust the analytical potential of available data. Nevertheless, data sharing has no tradition in qualitative organizational research.
Our presentation focuses on two central challenges: documentation and data protection for sharing business and organizational data. Organizational analyses are very complex. Individuals and groups are examined as a collective or as parts of organizations, as a result of the division of labor or hierarchical structures. Furthermore, the whole organization or its individual segments are examined as part of the economic or social system, individually or in groups. So first, the documentation of organizational data must account for this complexity and heterogeneity in full.
Second, the data protection interests of the individual respondents as well as of the organizations must be safeguarded. The policy must prevent identification of the respondents based on the information given by the organization, and vice versa.
We will address the specific features of data documentation, as well as the data protection procedure established at the Data Service Center for Business and Organizational Data Bielefeld.

4. Search engines as Key Brokers of Scientific Data
Mr Christoph Thewes (University of Potsdam)
Mr Denis Huschka (German Data Forum & GWI Science Policy and Infrastructure Development GmbH)
Mr Gert G. Wagner (Data Forum, German Institute for Economic Research (DIW Berlin) & University of Technology Berlin)

The paper addresses the fact that more and more data is potentially available for research purposes, but most of this data is invisible and therefore not used for secondary analyses. Search engines should spread any kind of information that forms the basis for knowledge production: the raw data is just as interesting to the research community as the published articles.

However, the raw data is difficult to find because it sits on researchers' computers and is therefore invisible to search engines. This situation can be changed by making search engines aware of the existence of such raw data and letting them search through information about this data.

Researchers want to earn credit by promoting their research, in such a way that each can be personally traced as "author" of the data shared. At the same time, researchers need tools that make generating and sharing these metadata quick and simple.
Numerous metadata standards are under development, all of them complex and time-consuming. The authors of this paper therefore developed a STATA tool which generates a rudimentary "data paper" consisting of a small metadata set within minutes. This data paper is sent automatically to a central platform that publishes it online and makes the data findable by search engines, and the author of the data paper earns credit for the publication.

The presented system is work in progress, and we are currently working towards an integration of persistent identifiers and researcher IDs (DOI/ORCID).