DFG form nfdi10 - 05/20 page 1 of 15

Deutsche Forschungsgemeinschaft Kennedyallee 40 ∙ 53175 Bonn, Germany ∙ Postal address: 53170 Bonn, Germany Tel.: + 49 228 885-1 ∙ Fax: + 49 228 885-2777 ∙ [email protected] ∙ www.dfg.de

Letter of Intent

1 Binding letter of intent as advance notification or non-binding letter of intent

☒ Binding letter of intent (required as advance notification for proposals in 2020)

☐ Non-binding letter of intent (anticipated submission in 2021)

2 Formal details

Planned name of the consortium

Business, Economic and Related Data @ NFDI

Acronym of the planned consortium

BERD@NFDI

Applicant institution

University of Mannheim, Schloss, 68161 Mannheim;

Head: Prof. Dr. Thomas Puhl

Spokesperson

Prof. Dr. Florian Stahl, [email protected], Chair of Quantitative Marketing and Consumer Analytics and Co-Director of the Mannheim Center for Data Science at the University of Mannheim

Co-spokesperson

Prof. Dr. Hartmut Höhle, [email protected], Management Analytics Center and Chair of Enterprise Systems at the University of Mannheim

Co-applicant institution

Ludwig-Maximilians-Universität München, Ludwigstr. 33, 80539 München;

Head: Prof. Dr. rer. pol. Bernd Huber

Co-spokespersons

Prof. Dr. Bernd Bischl, [email protected], Chair of Statistical Learning & Data Science and Director of the Munich Center for Machine Learning (MCML);

Prof. Dr. Göran Kauermann, [email protected], Chair of Statistics - in Economics, Business Administration and Social Sciences

Co-applicant institution

University of Cologne, Albert-Magnus-Platz, 50923 Köln;

Head: Prof. Dr. Dr. h.c. Axel Freimuth

Co-spokesperson

Prof. Dr. Marc Fischer, [email protected], Chair in Marketing Science and Analytics at the University of Cologne

Co-applicant institution

Mannheim University Library, Schloss, 68161 Mannheim;

Head: Dr. Sabine Gehrlein

Co-spokesperson

Dr. Sabine Gehrlein, [email protected], Mannheim University Library at the University of Mannheim

Co-applicant institution

Universität Hamburg, Mittelweg 177, 20148 Hamburg;

Head: Prof. Dr. Dr. h.c. Dieter Lenzen

Co-spokesperson

Prof. Dr. Mark Heitmann, [email protected], Chair of Marketing & Customer Insight at the Universität Hamburg

Co-applicant institution

Institute for Employment Research (IAB), Regensburger Str. 100, 90478 Nürnberg;

Head: Prof. Bernd Fitzenberger, PhD

Co-spokesperson

Prof. Dr. Frauke Kreuter, [email protected], IAB Statistical Methods group and Professorship for Statistics and Methodology at the University of Mannheim

Co-applicant institution

Leibniz Information Center for Economics (ZBW), Düsternbrooker Weg 120, 24105 Kiel;

Head: Prof. Dr. Klaus Tochtermann

Co-spokesperson

Prof. Dr. Klaus Tochtermann, [email protected], ZBW and Digital Information Infrastructures Group at the Christian-Albrechts-Universität zu Kiel

Co-applicant institution

GESIS - Leibniz-Institut für Sozialwissenschaften, B2,1, 68159 Mannheim;

Head: Prof. Dr. Christof Wolf

Co-spokespersons

Prof. Dr. Christof Wolf, [email protected], GESIS and Chair of Social Stratification at the University of Mannheim;

Prof. Dr. Stefan Dietze, [email protected], GESIS and Group Data & Knowledge Engineering at the Heinrich Heine Universität Düsseldorf

Participants

o AG Informationskompetenz des BVB (AGIK BAY), Dr. Fabian Franke

o GSWG – Gesellschaft für Sozial- und Wirtschaftsgeschichte, Prof. Dr. Mark Spoerer

o GUG – Gesellschaft für Unternehmensgeschichte, Dr. Andrea H. Schneider-Braunberger

o Institut für Bank- und Finanzgeschichte e.V., Hanna Floto-Degener

o Leibniz-Rechenzentrum (LRZ) der Bayerischen Akademie der Wissenschaften, Prof. Dr. Dieter Kranzlmüller

o Leibniz Institute for Financial Research SAFE (Sustainable Architecture for Finance in Europe) – Data Center House of Finance, Goethe University Frankfurt, Prof. Dr. Jan Pieter Krahnen, Prof. Dr. Uwe Walz

o Leibniz Institute of Ecological Urban and Regional Development (IOER), Prof. Dr. Marc Wolfram

o Netzwerk-Informationskompetenz Baden-Württemberg (NIK-BW), Dr. Marianne Dörr

o RatSWD – Rat für Sozial- und Wirtschaftsdaten, Prof. Dr. Monika Jungbauer-Gans

o Universitäts-IT der Universität Mannheim, Dr. Alexander Pfister, Kerstin Bein

o VHB – Verband der Hochschullehrer für Betriebswirtschaft e.V. (German Academic Association for Business Research), Prof. Dr. Hans Ulrich Buhl, Tina Osteneck

o Verein für Socialpolitik – Wirtschaftshistorischer Ausschuss, Prof. Dr. Ulrich Pfister

o Rayid Ghani, Carnegie Mellon University

o Prof. Julia Lane, Ph.D., New York University

o Dr. Georg Licht, ZEW – Leibniz-Zentrum für Europäische Wirtschaftsforschung (ZEW)

o Dr. Katrin Moeller, Historisches Datenarchiv Sachsen-Anhalt

o Dana Müller, Institute for Employment Research of the Federal Employment Agency (IAB)

o Prof. Dr. Isabella Peters, Leibniz Information Centre for Economics (ZBW)

o Prof. Dr. Mark Spoerer, University of Regensburg

o Prof. Dr. Jochen Streb, University of Mannheim

o Prof. Dr. Heiner Stuckenschmidt, University of Mannheim

o Dr. Peter Wittenburg, formerly Max Planck Computing and Data Facility; Member of the GOFAIR Foundation Board

3 Objectives, work programme and research environment

Research area of the proposed consortium (according to the DFG classification system)

o 12 Social and Behavioral Sciences

• 112 Economics

− 112-02 Economic Policy and Public Finance

− 112-03 Business Administration

− 112-04 Statistics and Econometrics

− 112-05 Economic and Social History

• 111 Social Sciences

− 111-02 Empirical Social Research

− 111-03 Communication Sciences

− 111-04 Political Science

• 110 Psychology

− 110-03 Social Psychology, Industrial and Organizational Psychology

• 109 Educational Research

− 109-04 Educational Research on Socialization, Welfare and Organizations

Concise summary of the planned consortium’s main objectives and task areas

Social agents are the focal object of study in the social sciences. They can be individuals, as in psychology or sociology, larger aggregates such as organizations (companies in business administration, political parties in political science), or even entire economies, as in macroeconomic research. A key characteristic of the new digital era is that social agents leave traces of their lives and behavior in the form of unstructured, non-standard data from new digital sources such as social media, Google search, or geo-satellite services. By unstructured we refer to data (“big data”) that is available in the form of text, image, voice, or video data and is generated from data sources not primarily constructed for analytical purposes. Existing research data infrastructures are not prepared to handle this new form of data, which differs greatly from traditional structured data in terms of size and required computing capacity. Both traditional structured data and new unstructured data are necessary for scientific progress in the social sciences. By definition, structured data are ready to be used in the empirical researcher’s theory tests and prediction models. Unstructured data, however, need to be processed and analyzed to turn them into a structured form that serves the research question under study. Researchers in business, economics and related fields need efficient processes and tools, an efficient data infrastructure and a comprehensive implementation of pertinent methodological knowledge for their research and teaching. BERD@NFDI is a cooperation between the Universities of Mannheim, Cologne, Hamburg, and Munich, the Institute for Employment Research (IAB), the ZBW – Leibniz Information Center for Economics, and GESIS – Leibniz Institute for the Social Sciences, and is supported by community partners such as the German Academic Association for Business Research (VHB).

In other words, BERD@NFDI brings together leading institutions in business, economics, educational research, psychology, social science, and communication science with method experts in artificial intelligence and machine learning, who intend to contribute their best resources in order to exploit the new types of data for evidence-based empirical research. The consortium will be supported by leading research institutions, such as the Leibniz Institute for Financial Research SAFE, and infrastructure organizations, such as ZEW-FDZ, ZBW, GESIS, the Mannheim University Library, and the Leibniz Supercomputing Centre (LRZ) in Garching. As a structural contribution to NFDI, BERD@NFDI aims to develop and disseminate transparent, FAIR and innovative standards, methods and tools to manage, (pre-)process and archive unstructured and non-standard data as well as to combine and connect them with structured data in economics, business and related research fields. The intended work program comprises seven task areas (TA, see figure 1).

TA1 – BERD@NFDI Community Involvement: As an initiative driven by researchers, BERD@NFDI has community involvement and user-centered design in its DNA. Users will be involved in project steering and requirements analysis. A user-centered design and an agile development methodology ensure that the infrastructure is aligned with the actual needs and requirements of the user community.

TA2 – Creating BERD: In the social sciences in general, and in business and economics in particular, both the number and the types of data sources have proliferated. Making the types of data access and collection approaches transparent will enhance the potential to reuse prior data collection efforts for replication, extension, and application to new research problems.

Figure 1: Work program of BERD@NFDI with seven task areas for the management of unstructured and non-standard research data.

TA3 – Processing BERD: Researchers have to deal with a variety of data and different levels of data quality. BERD@NFDI will support the research community by offering and describing suitable methods for (pre-)processing (e.g. extracting text and semantics from images) and for linking business and economic research data, as well as by documenting data and making them accessible. As a result, users will be able to assess the strengths and weaknesses of the available data.

TA4 – Analyzing BERD: To investigate substantive research questions, researchers not only search for relevant data but also need to apply appropriate machine learning algorithms to transform unstructured data into a form that is amenable to further (causal) analysis. BERD@NFDI will connect research data with algorithms used in business, economics and related areas so that users can exploit the available data to their full potential, find common standards for data processing, and better understand the weaknesses, strengths, and performance characteristics of individual algorithms for applied research purposes.

TA5 – Preserving & Accessing BERD: Preserving digital content and maintaining a sustainable degree of accessibility is one of the main challenges in research data management. BERD@NFDI will provide data preservation and data handling operations and metadata standardization concepts, and will develop data migration strategies. The BERD@NFDI information portal will package all necessary components and provide access to functionalities such as searching, remote access, identity management and persistent identifiers.

TA6 – Re-using BERD: Researchers generating data often face resource or legal constraints when intending to share data. Even when data are shared, they are often located in archives with only a small number of users and have little impact on academic insights. BERD@NFDI will demonstrate the value of sharing to those involved in data production and support them in overcoming existing barriers.

Brief description of the proposed use of existing infrastructures, tools and services that are essential in order to fulfil the planned consortium’s objectives

BERD@NFDI builds on the following existing infrastructures:

o ZBW will bring in its entire portfolio for research data management, including the technologies developed within the DFG-funded project GeRDI as well as the entire portfolio of FAIR data implementation networks of the GoFAIR initiative. ZBW will provide its federation and harvesting technologies for research data repositories as well as its tools for metadata normalization. ZBW’s leading role in this initiative ensures that the latest developments within the FAIR data movement are incorporated.

o GESIS will bring in its entire portfolio and technologies for research data management. Together with ZBW, GESIS will exploit the synergies with the consortium KonsortSWD.

o The DFG Research Group on the impact of social media, headed by the University of Hamburg and the University of Cologne, generates, shares and works with both structured and unstructured data from various online channels.

o The Mannheim Center for Data Science (MCDS) will contribute the results of the BERD@BW project regarding professional training in the area of analysis and management of unstructured (big) data.

o OpenML (https://www.openml.org/), a collaborative and open platform for machine learning, of which Prof. Bernd Bischl is a developer and core team member.

o LRZ, with its world-leading high-performance computing and storage systems, will provide its infrastructure for high-performance services and software for data analytics that are specially tailored to the needs of AI methods.

o The BMBF-funded International Program in Survey and Data Science (IPSDS), established in collaboration between the University of Mannheim and the University of Maryland, with inputs from the IAB, the LMU and the Bundesbank, will provide a platform and starting point for the asynchronous professional training opportunities on the new types of data.

o The research data center of the Mannheim University Library will provide data and infrastructure services (see https://fdz.bib.uni-mannheim.de/), e.g. the automated text, layout and structure extraction from digitized publications as done in the Aktienführer-Datenarchiv (https://digi.bib.uni-mannheim.de/aktienführer/data/index.php). Furthermore, the ontology of German firms will be employed and extended in BERD@NFDI and used for entity identification purposes in (un-)structured data.

o The Coleridge Initiative (https://coleridgeinitiative.org/), in partnership with BERD@NFDI’s co-spokespersons, successfully implemented a secure Administrative Research Data Facility, which holds unstructured data in addition to more structured records as part of a joint data schema. The environment runs within AWS and has been successfully tested to provide a training platform and to allow remote access for researchers. This platform serves as a role model for the planned infrastructure.

Interfaces to other proposed NFDI consortia: brief description of existing agreements for collaboration and/or plans for future collaboration

KonsortSWD: KonsortSWD is an established consortium with a long history and expertise in the integration and management of primarily structured data from standard sources in the Social Sciences. Scientific progress in the social sciences needs a powerful and interconnected infrastructure for handling both structured and unstructured data originating from both standard and non-standard sources. BERD is the complement to KonsortSWD, with which we work closely to offer integrated access to the full breadth of data. Together, the two consortia will build the research data infrastructure of the future, covering all subfields of the social sciences, which will be unique in the world.

NFDI4Memory: Since much historical (economic, business and social) information lies within pictures and unstructured text, we will closely interact with NFDI4Memory. The cooperation will comprise the exchange of relevant metadata for the Data Space (NFDI4Memory) and the data pool (BERD@NFDI) as well as ontologies and vocabularies (e.g. the firm ontology of BERD@NFDI, the standard thesaurus for economics of ZBW, the ontology of job titles of NFDI4Memory). We will also work together on methods for analyzing unstructured data as needed, e.g. automated text, layout and structure recognition.

MaRDI – Mathematical Research Data Initiative: MaRDI and BERD@NFDI will cooperate on interdisciplinary topics of machine learning, led by Bernd Bischl, who is co-spokesperson in both consortia. Both consortia aim at advancing machine learning approaches and at enabling researchers to successfully apply them to specific research questions. While MaRDI contributes its expertise in algorithms, their implementation and empirical benchmarks, BERD@NFDI focuses on the application of these tools to data and research questions in economics and social sciences. Another field of collaboration is the assessment of data quality in heterogeneous and unstructured data.

Text+: Text+ is a consortium in the Humanities that focuses on building an infrastructure for text- and speech-based data, with tools and services applied to this kind of data in the Digital Humanities. We will strive for cooperation on topics regarding the management of and analysis methods for text and speech data as types of unstructured data.

4 Cross-cutting topics

Please identify cross-cutting topics that are relevant for your consortium and that need to be designed and developed by several or all NFDI consortia.

BERD@NFDI supports the Berlin Declaration on NFDI Cross-Cutting Topics. All areas mentioned in the declaration are important; BERD@NFDI sees special relevance in the following topics:

o Teaching & education: The report of the German Council for Information Infrastructures (RfII) and the High-Level Expert Group on the EOSC have identified a clear need for building up new competences, more capacities and new curricula for research data management.

o User involvement: To build a useful and valuable infrastructure for research data that users will contribute to, user involvement, including at the governance level, and user-centered design are crucial, no matter what the underlying research discipline is.

o Legal aspects: Legal questions are also expected to arise in all consortia along the whole data management process. These include, among others, questions on licensing external data from other research institutions, research data centers, companies etc. for their own research purposes, as well as data protection, privacy issues and the proper license for publishing data at the end of the research process.

o Quality assurance: Data quality is crucial for the (re-)usability of data in all research areas. This holds even more for the BERD@NFDI consortium, since sources of unstructured data are often non-standard and not primarily research-focused. Therefore, the authenticity of data is a major issue to be addressed.

o Standardization and harmonization of terminologies: Almost every research community has its own terminologies. In order to gain a common understanding of data, methods and concepts, a certain degree of standardization and harmonization of terminologies across the consortia is essential.

o FAIR metadata: To promote interdisciplinary re-use of data from all NFDI consortia, interoperability has to be ensured with the FAIR (meta)data principles as central guidelines, while incorporating the specific requirements of a particular user community at the same time.

Please indicate which of these cross-cutting topics your consortium could contribute to and how.

BERD@NFDI supports the Berlin Declaration on NFDI Cross-Cutting Topics. The BERD@NFDI consortium will treat these and other possible cross-cutting topics in its cross-cutting topics committees. These committees will actively cooperate with other consortia and feed the results into the strategic planning and work program of BERD@NFDI.

BERD@NFDI will especially contribute to the following topics:

o Teaching and education: BERD@NFDI can build on extensive experience of several participants in the conception and implementation of online courses and workshops about data processing and analysis in the social sciences. The concepts which have proven successful at an international level can also inform the development of similar activities in other disciplines.

o User involvement and user-centered design engineering play an essential role throughout the development of the services of BERD@NFDI, both before and after implementation. The consortium can bring in its extensive expertise in these topics to provide innovative impulses for the NFDI as a whole.

o Concerning legal aspects, BERD@NFDI will closely interact with KonsortSWD to leverage synergies in this area. We will bring in the issues arising in connection with unstructured data and with AI and ML algorithms. BERD@NFDI can contribute to this with its clear disciplinary focus.

o Quality assurance: BERD@NFDI will focus on the development of standards for data quality assessment, normalization, and pre-processing of unstructured data. Furthermore, we will strive to foster the standardized documentation of these processes in research publications. The adoption of such generally accepted standards could significantly advance good scientific practice in the field of empirical economics and social sciences, and serve as a model for other disciplines.

o BERD@NFDI will contribute to the standardization and harmonization of terminologies because it brings in-depth knowledge of the vocabularies and ontologies used in economics. In addition, it develops its own ontology and knowledge graph for company data. This offers numerous opportunities to further develop the existing structures into a semantic web for scientific data.

o FAIR metadata: For the services of BERD@NFDI, a way must be found to bring specific requirements of a particular user community in line with the requirements of interdisciplinary re-use and general guidelines such as FAIR principles. BERD@NFDI will contribute its perspective to the discussion on FAIR metadata for interoperability in NFDI and beyond.