
On Ethical Issues of Digital Humanities Malte Rehbein, Passau

Version: 2015-07-24 (Preprint)

Anecdotal Introduction

I am not an ethicist and had not thought through the moral aspects of my profession as

a Digital Humanities scholar until quite recently.1 A historian by training with a focus on

Medieval Studies, I mostly considered the objects I studied to be beyond the scope of

moral and legal issues: historical figures are long dead and historical events took place

in the past, and my research would hardly ever influence the course of history. After this

training as a (traditional) historian, I went on to work digitally with historical data, like

Digital Humanists do, trying to identify entities, to find correlations, or to visualize

patterns within that very data.

At the big Digital Humanities gathering in Hamburg in 2012, however, rumours circulated

that a secret service agency was recruiting personnel at the conference. This agency,

they said, was interested in competences and skills in analysing and interpreting

ambiguous, heterogeneous, and uncertain data. I had not been approached in person

and to this day do not know whether the story is true or not. Nevertheless, just the idea

that a secret service might be interested in expertise from the Digital Humanities was a

strong enough signal to start thinking about the moral implications of the work we are

doing in this field, and it inspired this essay.

In this light, examples of recent research in Digital Humanities such as psychological

profiling appear at the same time exciting and frightening. We can observe a typical

dual-use problem: something can have good as well as bad consequences according to

its use. Is it not a fascinating asset for research to determine a psychological profile of a

historical figure just through automated analysis of historical textual sources? On the

other hand, what would the consequences be if a psychological profile of anyone, living

1 This contribution is based on the Institute Lecture “On Ethical Aspects of Digital Humanities (Some

Thoughts)” presented by the author at the Digital Humanities Summer Institute, Victoria BC, Canada, 5 June 2015. While the text has been revised and enriched with annotations, its essayistic style has been maintained.

2

or dead, were to be revealed or circulated without her knowledge or her assignee’s

consent?

Ethical considerations are more than just a philosophical exercise. They help us to

shape the future of our society (and environment) into what we want it to be, and

they help us minimize risks of changes that we do not want to happen.

Setting the Stage: Use and Abuse of Big Data Now and Then

Big Data

This essay uses Big Data as a vehicle for considerations about ethical issues of the

Digital Humanities, pars pro toto for the field as a whole with all its facets. Big Data has

been a hyped term worldwide for some time now,2 and its methods have reached the

Humanities.3 Big Data is a collective term for huge collections of data and their analysis,

often characterized by four attributes: volume, velocity, variety, and veracity:

Volume describes vast amounts of data generated or collected. The notion of

“vast” is ambiguous, however. Some define it as an amount of data that is too big

to be handled by a single computer. Digital Humanities and the use-cases

described in this essay will hardly ever reach this amount. However, as Manfred

Thaller has pointed out in his keynote presentation at the second annual

conference of the Digital Humanities im deutschsprachigen Raum,4 Big Data is

characterized especially by the multiplication of all four factors described here.

Since data in the Humanities is often far more complex than, for instance,

engineering data due to its ambiguous nature, data in the Humanities can be

“big” in this way.

Velocity describes the speed at which data is generated and spreads. As people often think of new data generated within seconds or even faster, this is not characteristic of the Humanities, which deal with historical or literary data. But it can become

2 Puschmann, Cornelius, and Jean Burgess. “Big Data, Big Questions. Metaphors of Big Data”, International Journal of Communication 8 (2014): 1690–1709.

3 Schöch, Christof. “Big? Smart? Clean? Messy? Data in the Humanities”, Journal of Digital Humanities, 2013. Online.

4 Thaller, Manfred. “Wenn die Quellen überfließen. Spitzweg und Big Data”, Closing Keynote, Graz, 27 February 2015.

3

relevant for Social Science, for instance in the analysis of so-called social media

data, which can be considered as part of Digital Humanities.

Variety refers to the various types of data processed, multimodal data, data in

different structures or completely unstructured data. This variety is characteristic

of Humanities’ sources, and Digital Humanities offer new methods to interlink

various types of data and to process them in combination.

Veracity questions the trustworthiness of the data to be processed, its quality and

accuracy. The Humanities, especially the historical disciplines, know best of all what it means to critically question the origin, context, and content of data – making veracity a very relevant aspect for the Digital Humanities.

Overall, Big Data collections are too big and/or too complex to be handled by traditional

techniques of data management (databases) and by algorithmic analysis on a single

computer. With this definition of big data in mind, one might think of systems like the

Large Hadron Collider in Geneva, which is considered the largest research

infrastructure and scientific enterprise of all time, or of telecommunication data

produced by mankind – terabytes upon terabytes every second. Compared to this, it might not

be appropriate to speak of Big Data in the context of scholarship in the Humanities at all.

Nevertheless, Big Data can act as a metaphor for one of the current major trends in

Digital Humanities: data-driven, quantitative research based on an amount of data that a single scholar or even a group of scholars can barely survey, let alone compute. Such

data, due to its amount, complexity, incompleteness, uncertainty, and ambiguity,

requires the support of computers and their algorithmic power. For centuries,

Humanities scholars have recognised this aspect of their data, but now they have this

data at hand in a much larger quantity than ever before.

In general, typical applications for Big Data are well known and described. With regard

to ethics, these applications span a broad range of uses and potential implications. This shall be illustrated by the following examples, divided

into three groups.

The first group comprises applications from the Sciences and Humanities (the German term

Wissenschaft fits better here and will be used henceforth). Incredible amounts of data

4

are investigated, for example for research on global climate (e.g. NASA Center for

Climate Simulation), to increase the precision of weather forecasts, to decode the

human genome, or in the search for elementary particles at the Large Hadron Collider in

Geneva. In a positive view of Wissenschaft, these investigations are meant to serve society as a whole.

The second class encompasses applications of Big Data from which particular groups

would benefit but which might interfere with the interests of others. Depending on one’s

perspective, such applications can easily be found in the business world. For example,

on February 16th, 2012, the New York Times published an article “How Companies

Learn Your Secrets”. Taking the example of the US retailer Target, the article describes

how companies analyse data and then try to predict consumer behaviour in order to

tailor and refine their marketing machine. In this context, the statistician Andrew Pole developed a “pregnancy-prediction model”, based on Big Data algorithms, to answer the company’s question: “If we wanted to figure out if a customer is pregnant, even if she didn’t want us to know, can you do that?”5

In the wake of the revelations by Edward Snowden, Glenn Greenwald, and others in 2013,6 a third class of applications has moved more and more into public consciousness.

Current mass surveillance might be the strongest example of Big Data analysis in which

the interests of a very small group are in stark contrast with the values of society at

large.

Big Data Ethics

These three categories form only one, preliminary classification of Big Data applications

from a moral perspective – a classification which is, of course, simplistic and disputable.

Nevertheless, it shall lead us towards the basic question of ethics: the distinction

between right and wrong or good and evil when it comes to deciding between different

courses of action. It should also be clear by now that there is no one answer to this

question, but that different moral perspectives exist: perspectives of those who conduct

5 Duhigg, Charles. “How Companies Learn Your Secrets.” New York Times, 16 February 2012 (http://www.nytimes.com/2012/02/19/magazine/shopping-habits.html?pagewanted=all&_r=0; last accessed 24 July 2015).

6 http://www.theguardian.com/world/series/the-snowden-files

5

Big Data analysis, perspectives of those who do basic research so that others can apply

these methods, and perspectives of the ambient society.

Putting aside the particular scenarios in which Big Data is examined, one might ask what is methodologically typical of it. There are three main methodological areas involved: pattern recognition, linkage of data, and correlation detection. On this basis, people (or machines) begin the process of inference or prediction. In his 1956 short story “The Minority Report”, Philip K. Dick depicts a future society in which a police

department called Precrime, employing three clairvoyant mutants, called Precogs, is

capable of predicting and consequently preventing violent crimes. This world has not

seen a single murder in years. In Steven Spielberg’s movie of the same name from

2002,7 these Precogs are characterized in more detail: thanks to their extrasensory

capabilities they are worshiped by mankind. However, it is said in the film, the only thing

they do is search for patterns of crime in their visions: “people start calling them

divine – Precogs are pattern recognition filters, nothing more”.

Assuredly, Precogs are a fiction. However, modern real-life crime prevention indeed

attempts to find patterns in Big Data collections to predict likely locations of criminal

behaviour; such attempts have been reported, with varying and, some say, doubtful success rates, from the USA, Germany, and probably other countries. The ways our governments and secret services justify (or disguise) their mass surveillance, namely as predicting and preventing terror attacks, are real, too. Big data and its technology (pattern

recognition, data linkage, and inference) serve such predictions: weather prediction –

pregnancy prediction – crime prediction.

This is not new. For instance, the East German secret service Stasi under the direction

of Erich Mielke conceptualized a comprehensive database of human activities and

intentions.8 Its goal: to put together digital data and reports of all of the 16.5 million

citizens of the German Democratic Republic.9 In the historically realistic scenario of the

7 Minority Report, Dir. Steven Spielberg, 2002.

8 Cf. Wolle, Stefan (2013). Die heile Welt der Diktatur, 186.

9 Sebestyen, Victor. Revolution 1989: The Fall of the Soviet Empire. New York 2009, 121. Comparing him with today’s Big Data companies, Andrew Keen concludes: "Mielke war ein Datendieb des 20. Jahrhunderts, der die DDR in eine Datenkleptokratie verwandelte. Doch verglichen mit den Datenbaronen des 21. Jahrhunderts war sein Informationsimperium zu regional und zu klein gedacht. Er kam nicht auf den Gedanken, dass Milliarden Menschen in aller Welt ihre persönlichen Daten freiwillig herausrücken könnten." ["Mielke was a data thief of the 20th century who turned the GDR into a data kleptocracy. Yet compared with the data barons of the 21st century, his information empire was conceived too regionally and on too small a scale. It did not occur to him that billions of people all over the world might hand over their personal data voluntarily."] (Keen, Andrew (2015). Das digitale Debakel. München, 201.)

6

Academy Award-winning film The Lives of Others (orig. “Das Leben der Anderen”),10 the

Stasi possesses a type specimen collection of all typewriters that are in circulation, and

they know the machine favoured by each of the human writers they are observing.

Whenever they come across an anonymous document, they attribute this document to a

particular writer by comparing its typescript with the specimens. This

is not (yet) Big Data in the modern definition, but is a form of pattern recognition. In “The

Lives of Others”, the Stasi does not manage to expose Georg Dreyman as the author of an article published in the West German magazine “Der Spiegel”, in which Dreyman revealed the high statistical suicide rate in East Germany after the suicide of his friend. It fails because Dreyman wrote the draft not with his own but with somebody else’s typewriter: he behaved untypically.

Scenarios and applications like this are not new. What is new is their dimension. And

what brings these briefly introduced examples together, be they fictitious or real, is that

they all rest on so-called probabilistic methods; these do not give us the truth, but the

probability that a particular event or behaviour will take or has taken place. However,

even a prediction accuracy of 99% means that in one out of a hundred cases,

the wrong person will have to suffer the consequences.
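
What such an error rate means in practice becomes clearer with a small back-of-the-envelope calculation. The sketch below is illustrative only; the base rate of one in ten thousand is an assumed figure, not taken from any of the studies discussed here:

```python
# Positive predictive value of a "99% accurate" prediction.
# Assumed for illustration: 1 in 10,000 people actually shows the
# behaviour being predicted (e.g. planning a crime).
base_rate = 1 / 10_000
sensitivity = 0.99   # share of actual cases the method flags correctly
specificity = 0.99   # share of non-cases the method clears correctly

population = 1_000_000
actual = population * base_rate                              # 100 real cases
true_positives = actual * sensitivity                        # ~99 rightly flagged
false_positives = (population - actual) * (1 - specificity)  # ~9,999 wrongly flagged

ppv = true_positives / (true_positives + false_positives)
print(f"flagged: {true_positives + false_positives:,.0f}, "
      f"wrongly flagged: {false_positives:,.0f} ({1 - ppv:.0%} of all flagged)")
# The rarer the predicted event, the more the wrongly flagged outnumber
# the rightly flagged -- here by roughly a hundred to one.
```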

For various reasons, Big Data raises a number of normative questions and issues. On May

30th, 2014, Kate Crawford published a piece in “The New Inquiry” under the title: “The

Anxieties of Big Data. What does the lived reality of big data feel like?” She concludes:

“If the big-data fundamentalists argue that more data is inherently better, closer to the

truth, then there is no point in their theology at which enough is enough. This is the

radical project of big data. It is epistemology taken to its limit. The affective residue from

this experiment is the Janus-faced anxiety that is heavy in the air, and it leaves us with

den Gedanken, dass Milliarden Menschen in aller Welt ihre persönlichen Daten freiwillig herausrücken könnten." (Keen, Andrew (2015. Das digitale Debakel, München, 201). 10

Das Leben der Anderen, Dir. Florian Henckel von Donnersmarck, 2006.

7

an open question: How might we find a radical potential in the surveillant anxieties of

the big-data era?”11

Ethical questions in Big Data have barely been addressed in the research.12 In 2014,

Rajendra Akerkar edited a volume on “Big Data Computing”.13 In 540 pages, however,

neither legal nor ethical questions are discussed. In the chapter on “Challenges and

Opportunities” by Roberto V. Zicari (103-128), for example, opportunities are business opportunities, and challenges are mostly technical challenges.14 The volume does not address individual or organisational, let alone societal, risks and consequences of Big Data Computing. This seems to be symptomatic of hyped technologies such as Big Data and of the technological advancement of our time generally. First, we do it, and then we

handle the consequences.

Much the same holds for the 2013 report on “Frontiers in Massive Data Analysis” issued by

the National Academy of Sciences of the USA. Limitations of data analysis discussed

here are merely of a technical nature. The report states: “The current report focuses on

the technical issues – computational and inferential – that surround massive data,

consciously setting aside major issues in areas such as public policy, law, and ethics

that are beyond the current scope” (5).15 Bollier makes such issues more explicit: “The

rise of large pools of databases that interact with each other clearly elevates the

potential for privacy violations, identity theft, civil security and consumer manipulation”

(33).16

Even in areas where potential ethical issues are more obvious than in the Humanities,

Wissenschaft and the general public are slowly beginning to realize the implications of

11 Crawford, Kate. “The Anxieties of Big Data”, The New Inquiry, 30 May 2014 (http://thenewinquiry.com/essays/the-anxieties-of-big-data/; last accessed 24 July 2015).

12 More recently, a conference at Herrenhausen, “Big Data in a Transdisciplinary Perspective”, discussed legal aspects of Big Data. Its proceedings have not yet been published. A report is available: Kolodziejski, Christoph, and Vera Szöllösi-Brenig. “Big Data in a Transdisciplinary Perspective. Herrenhäuser Konferenz”, 22 July 2015 (http://www.hsozkult.de/conferencereport/id/tagungsberichte-6084; last accessed 24 July 2015).

13 Akerkar, Rajendra (ed.), 2014. Big Data Computing. Boca Raton, Fla.: CRC Press, Taylor & Francis Group.

14 Zicari, Roberto V., 2014. “Big Data: Challenges and Opportunities”, in: Big Data Computing, ed. Rajendra Akerkar, 103-128. Ethical challenges are mentioned (“Ensuring that data are used correctly (abiding by its intended uses and relevant laws)”) but not further discussed (111).

15 Frontiers in Massive Data Analysis, 2013. Online. Last accessed 24 July 2015.

16 Bollier, David, 2010. The Promise and Peril of Big Data. Washington, DC: Aspen Institute, Communications and Society Program. Print.

8

Big Data and to demand action. In June 2014, for example, the University of Oxford

announced a postdoctoral position in Philosophy on the “ethics of big data”: “this pilot project

will formulate a blueprint of the ethical aspects, requirements and desiderata

underpinning a European framework for the ethical use of Big Data in biomedical

research”.17 Earlier, on October 24th, 2012, Stephan Noller called for a general ethics of

algorithms (orig.: Algorithmen-Ethik) in the German newspaper FAZ to promote control

and transparency: "Algorithmen müssen transparent gemacht werden, sowohl in ihrem Einsatz als auch in ihrer Wirkweise" ["Algorithms must be made transparent, both in their deployment and in the way they work"].18 It is clear that a widespread understanding of algorithms is also an urgent necessity.

Technology is not value-free

In a brief survey of current research, one should not overlook a small publication by

Kord Davis from 2012, titled “Ethics of Big Data”. Davis’ analysis runs as follows: “While

big-data technology offers the ability to connect information and innovative new

products and services for both profit and the greater social good, it is, like all technology, ethically neutral. That means it does not come with a built-in perspective on what is right

or wrong or what is good or bad in using it. Big-data technology has no value framework.

Individuals and corporations, however, do have value systems, and it is only by asking

and seeking answers to ethical questions that we can ensure big data is used in a way

that aligns with those values” (8).

While Davis is right in demanding that the discussion of Big Data ethics has to be

embedded in surrounding value systems, he is wrong about the neutrality of technology.

His argument reminds us of Francis Bacon, who dreamed of a value-free Wissenschaft in the 17th century. In the wake of the bombing of Hiroshima and

Nagasaki in August 1945, many, such as Max Born,19 woke up from this dream, recognized the dual-use dilemma of technology, and acknowledged the responsibility of the scientists: "Wir stehen auf einem Scheidewege, wie ihn die Menschheit auf ihrer Wanderung noch niemals angetroffen hat" ["We stand at a crossroads such as mankind has never before encountered on its journey"] (9). Closer to our field, Vannevar Bush, who

17 https://data.ox.ac.uk/doc/vacancy/113435; last accessed 24 July 2015.

18 Noller, Stephan. “Relevanz ist alles. Plädoyer für eine Algorithmen-Ethik”. FAZ, 24 October 2012.

19 Born, Max. 1965. Von der Verantwortung des Naturwissenschaftlers. Gesammelte Vorträge. München: Nymphenburger Verlagshandlung.

9

provided an important milestone for the development of the Digital Humanities with his

seminal publication “As We May Think” from 1945, asked how science can come back

to the track that leads to the growth of knowledge: "It is the physicists who have been

thrown most violently off stride, who have left academic pursuits for the making of

strange destructive gadgets, who have had to devise new methods for their

unanticipated assignments. [...] Now, as peace approaches, one asks where they will

find objectives worthy of their best".20

Technology is not value-free. Scientists and scholars develop it. Together with those

who apply technology in specific use cases, a huge share of responsibility belongs to

them. Computer pioneer Konrad Zuse recognised this. Looking back from the vantage

point of his memoir, he describes the qualms (orig.: “Scheu”) he had at the end of 1944 about developing his machine (the Z4) further. Implementing conditional jumps into it would allow

free control flow: "Solange dieser Draht nicht gelegt ist, sind die Computer in ihren Möglichkeiten und Auswirkungen gut zu übersehen und zu beherrschen. Ist aber der freie Programmablauf erst einmal möglich, ist es schwer, die Grenze zu erkennen, an der man sagen könnte: bis hierher und nicht weiter" ["As long as this wire is not laid, computers can easily be surveyed and mastered in their possibilities and effects. But once free program flow is possible, it is hard to recognize the limit at which one could say: this far and no further"] (77). According to Zuse's memoir, his reputation suffered from this "Verantwortungsbewußtsein des Erfinders" ["inventor's sense of responsibility"] (77).21

There is a second critical aspect of Davis’ ethics. His readers are decision makers in

business enterprises. The value system he discusses refers to corporations and

individuals within the corporate structure. He does not address individuals outside the

corporation, let alone the ambient society and world at large: internal but not external

responsibility. For Wissenschaft, however, it is essential that we address both. The

freedom to study and to investigate always comes with the responsibility to use this

freedom carefully. In Wissenschaft, freedom and responsibility are two sides of the

same coin.

20 Bush, Vannevar. “As We May Think.” The Atlantic, July 1945. Web. 14 April 2014.

21 Zuse, Konrad. Der Computer – Mein Lebenswerk. Mit Geleitworten von F. L. Bauer und H. Zemanek. Berlin-Heidelberg-New York-Tokyo: Springer-Verlag, 1984.

10

Case Studies

Some Thoughts on Digital Humanities

One can easily imagine that Big Data in biomedical research (as seen in the Oxford job

posting) opens the door for ethical considerations. But what about the Digital

Humanities? Why should we bother? In the context of this question, it is helpful to

characterize Digital Humanities as an attempt to offer new practices for the Humanities.

This is mainly facilitated by a) the existence or creation of and access to digital data

relevant to research in the Humanities, b) the possibility of a computer-assisted

operation upon this data, as well as c) modern communication technology, in particular

the internet. Overall, this characterizes the Digital Humanities as a hybrid field,

suggesting two different perspectives within the scholarly landscape. For both

perspectives, ethical discussions play a role.

The first perspective is that of a separate discipline, with its own research questions,

methodology, study programmes, publication venues, and so on, and of course: values.

As a discipline on its own, Digital Humanities needs its Wissenschaftsphilosophie

(philosophy of science), including theory22 and ethics. The second perspective, however,

sees Digital Humanities as a Hilfswissenschaft (auxiliary science) that provides services

for others, which one might compare with the role maths plays for physics and

engineering, or palaeography for history and medieval studies. This perspective on

Digital Humanities is relevant for our ethical discussion, because a Digital Humanist

might be tempted to argue that he is only developing methodologies and hence is not

responsible for the uses that others make of them.

Early Victims of Digital Humanities: William Shakespeare and Agatha Christie

A first case study concerns the work by Ryan Boyd and James Pennebaker on William

Shakespeare. In the context of modern text and language analysis, Pennebaker, a

social psychologist, is known for his method of Linguistic Inquiry and Word Count

22 For Digital Humanities as an emerging academic discipline in its own right, more theoretical foundation seems to be timely. This is particularly true in the context of Big Data analysis, where proponents are announcing an "end of theory" (provocatively: Anderson, Chris. "The End of Theory: The Data Deluge Makes the Scientific Method Obsolete." Wired Magazine 16.07 (2008)). A critical discussion is offered by Kitchin, Rob. "Big Data, New Epistemologies and Paradigm Shifts." Big Data & Society (2014).

11

(LIWC). Applying text analytical methods to the large corpus of the work of William

Shakespeare, Boyd and Pennebaker claim to be able to create a psychological

signature of authors ("methods allowed for the inference of Shakespeare's [...] unique

psychological signatures", 579) and to confirm the broadly accepted characterization of

the playwright as "classically trained" and "socially focused and interested in climbing

higher on the social ladder" (579-80).23 Shakespeare has long been dead, of course,

and most likely, neither he nor any of his kin has to face the consequences of this

research. But the methods employed here are of a general nature and can easily be

applied to anyone, living or dead, whether he wants it or not.

Another prominent “victim” of this kind was Agatha Christie, perhaps the most widely read

English female writer of all time. In 2009, Ian Lancashire and Graeme Hirst published a

study “Vocabulary Changes in Agatha Christie’s Mysteries as an Indication of Dementia:

A Case Study”.24 Lancashire and Hirst analyse the corpus of Christie’s work as follows:

“Fourteen Christie novels written between ages 34 and 82 were digitized, and digitized

copies of her first two mysteries […] were taken from Project Gutenberg. After all

punctuation, apostrophes, and hyphens were deleted, each text was divided into

10,000-word segments. The segments were then analysed with the software tools

Concordance and the Text Analysis Computing Tools (TACT). We performed three

analyses of the first 50,000 words of each novel”.25 The result of this fairly straightforward textual analysis indicated that Christie’s vocabulary declined significantly over the course of her life and that repetition, such as the usage of indefinite words, increased. For Lancashire and Hirst, this is an indication that Agatha Christie developed dementia.
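
The core of such an analysis is easy to reproduce. The following Python sketch follows the description quoted above (first 50,000 words per novel, 10,000-word segments); the file names are hypothetical stand-ins for digitized texts of an early and a late novel:

```python
import re

def tokenize(text):
    # As in the study: delete apostrophes and hyphens, drop punctuation.
    text = text.lower().replace("'", "").replace("-", " ")
    return re.findall(r"[a-z]+", text)

def segments(words, size=10_000, limit=50_000):
    """The first 50,000 words of a novel, divided into 10,000-word segments."""
    words = words[:limit]
    return [words[i:i + size] for i in range(0, len(words), size)]

def type_token_ratio(segment):
    """Distinct word types per token: a simple measure of vocabulary richness."""
    return len(set(segment)) / len(segment)

# Hypothetical file names for an early and a late Christie novel.
for novel in ["christie_1920.txt", "christie_1972.txt"]:
    with open(novel, encoding="utf-8") as f:
        ratios = [type_token_ratio(s) for s in segments(tokenize(f.read()))]
    print(novel, [round(r, 3) for r in ratios])
# A systematic drop in these ratios in the late novels, together with more
# repetition and more indefinite words, is the signal that Lancashire and
# Hirst read as an indication of dementia.
```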

Such techniques operate on text as a sequence of characters. They are agnostic about who wrote these texts and for what purpose. In other words, not only texts by well-known and deceased writers can be examined in such a manner.

23 Boyd, Ryan L., and James W. Pennebaker. “Did Shakespeare Write Double Falsehood? Identifying Individuals by Creating Psychological Signatures With Text Analysis.” Psychological Science 26.5 (2015): 570–582.

24 Lancashire, Ian, and Graeme Hirst (2010). “Vocabulary Changes in Agatha Christie’s Mysteries as an Indication of Dementia: A Case Study.” In: Forgetful Muses: Reading the Author in the Text. Toronto: University of Toronto Press, 207–19.

25 http://ftp.cs.toronto.edu/pub/gh/Lancashire+Hirst-extabs-2009.pdf; last accessed 28 July 2015.

12

Any text can. Lancashire and Hirst are well aware of this fact and of the potential

consequences. Like many technologists, however, their ethics and outlook are strictly

positive: “While few present-day patients”, they conclude, “have a large online

diachronic corpus available for analysis, this will begin to change as more individuals

begin to keep, if only by inertia, a life-time archive of e-mail, blogs, professional

documents, and the like. […We can] foresee the possibility of automated textual

analysis as a part of the early diagnosis of Alzheimer’s disease and similar

dementias”.26

Early diagnosis of diseases or their prediction might be a wonderful “tool” in the future.

Research in this direction aims at something “good”, Lancashire and Hirst would argue.

Their ethics is utilitarian in the tradition of Jeremy Bentham and John Stuart Mill. But

what happens if this data is used against someone, for instance, to deny an insurance

policy? And as textual data becomes more and more easily available, whether we

consciously deliver it, for instance in blogs or Facebook microblogs, or because our e-

mails are intercepted, it becomes almost impossible for the individual to avoid this

situation.

Revealing Your Health Preconditions

Another, related example shall illustrate that not only texts and data that we currently

provide might lead to individual or societal consequences, but also data from the past.

An open question in medical research is whether or not there is a genetic predisposition to Alzheimer’s disease. Neurologist Hans Klünemann and archivist Herbert Wurster now propose that this hypothesis can be tested with historical data.27

Their research uses historical records, parochial death registers from 1750 to 1900,

which were digitized, transcribed and encoded in a database at the archive of the

diocese of Passau. They analyse the data for family relations in order to create family

26 Ibid.

27 Klünemann, Hans, Herbert Wurster, and Helmfried Klein. “Alzheimer, Ahnen und Archive. Genetisch-genealogische Alzheimerforschung.” Blick in die Wissenschaft. Forschungsmagazin der Universität Regensburg 15 (2013): 44–51.

13

trees, and they analyse mortality data to find indicators for Alzheimer’s disease.28

Through this, they hope to identify genetic conditions for the development of

Alzheimer’s disease and they hope, in the future, to be able to predict whether or not

someone belongs to such a risk group.

This is a highly interdisciplinary approach with Digital Humanities at its very heart:

digitization, digital transcription and encoding as well as computer-based analysis of

historical data make this work. If the approach turns out to work, one can foresee great

potential in it. What could be problematic about such research? This data (the digitized

church registers) has been made publicly available, searchable, and analysable. Many

other archives have done or will do the same. Consequently, however, information

about an individual’s family and their causes of death will become public, and this information can be used, for instance, to evaluate the individual risk of a living descendant for a certain disease, even if this individual has not disclosed any personal

information about him or herself. Hence, information about living persons can be

inferred from open historical data.
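
A minimal sketch can make this inference mechanism concrete. The register entries, the period terms, and the simple risk rule below are all illustrative assumptions, not the actual data or method of Klünemann and Wurster (for the period terms, see note 28):

```python
from collections import defaultdict

# Period cause-of-death terms taken as disease indicators (simplified).
INDICATORS = {"gehirnerweichung", "gehirnwassersucht"}

# (person, parent, recorded cause of death) -- hypothetical register entries.
records = [
    ("anna_1790",  None,         "gehirnwassersucht"),
    ("josef_1820", "anna_1790",  "gehirnerweichung"),
    ("maria_1850", "josef_1820", "fieber"),
]

children = defaultdict(list)
affected = set()
for person, parent, cause in records:
    if parent:
        children[parent].append(person)   # reconstruct the family tree
    if cause in INDICATORS:
        affected.add(person)

def descendants(person):
    """All descendants of a person in the reconstructed tree."""
    for child in children[person]:
        yield child
        yield from descendants(child)

# Everyone descended from an affected ancestor inherits a statistical flag --
# eventually including living people who never disclosed anything themselves.
at_risk = {d for p in affected for d in descendants(p)}
print(sorted(at_risk))
```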

In addition to the question of whether individual rights are affected, these case studies

demonstrate typical dual-use problems. On the one hand, family doctors can use the

data and its analysis for the early diagnosis of severe diseases. On the other hand,

potential employers can also use it, for instance, to pick only those individuals that do

not belong to any risk group. There is no easy solution for this problem. Ethical

questions often appear as dilemmas, in Digital Humanities too.

Another Prominent Victim of DH: J. K. Rowling

In 2013, a quite prominent case of authorship attribution made the rounds. A certain

Robert Galbraith published a novel called The Cuckoo’s Calling. Despite positive

reviews, the book was at first only an average success on the book market. However,

three months later, rumours began circulating that the real author of The Cuckoo’s

Calling was J. K. Rowling, who had had such a sweeping success with her Harry Potter

28 As dementia and Alzheimer’s disease were not known as such at the time, other terms serve as indicators for these diseases. “Gehirnerweichung” [“softening of the brain”] and “Gehirnwassersucht” [“dropsy of the brain”] are typical expressions from the sources that Klünemann and Wurster use for their research.

14

series. Patrick Juola and Peter Millican analysed the text of The Cuckoo’s Calling with methods of forensic stylometry and came to the conclusion that it was quite probable that Rowling was indeed its author, which she subsequently admitted.

Especially when it is a “closed game” as in this case, in which one computes the

likelihood with which a text can be attributed to an author candidate (as opposed to the

“open game” where one computes the most likely author of a text), forensic stylometry

is a simple method: “language is a set of choices, and speakers and writers tend to fall

into habitual, or at least common, choices. Some choices come from dialect […], some

from social pressure […], and some just seem to come”.29 This leaves stylistic patterns

that a computer can measure and compare to corpora of texts already attributed, such

as the Harry Potter series. The method has been described and practised since the 19th

century (although computers are a late entrant to the game).
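
In its closed-game form, the approach can be sketched in a few lines. Everything below is illustrative: the file names are hypothetical, and the sixteen function words and the plain Euclidean distance stand in for the richer feature sets and measures (such as Burrows’ Delta) used in actual forensic stylometry:

```python
import math
import re
from collections import Counter

# Habitual, topic-independent choices: common English function words.
FUNCTION_WORDS = ["the", "and", "of", "to", "a", "in", "that", "it",
                  "was", "but", "not", "she", "he", "with", "as", "had"]

def profile(text):
    """Relative frequency of each function word: a crude stylistic fingerprint."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    return [counts[w] / len(words) for w in FUNCTION_WORDS]

def read(name):
    with open(name, encoding="utf-8") as f:
        return f.read()

# Closed game: known texts per candidate author, plus the disputed novel.
candidates = {"rowling": read("harry_potter_corpus.txt"),
              "control": read("comparison_corpus.txt")}
disputed = profile(read("cuckoos_calling.txt"))

for author, text in candidates.items():
    dist = math.dist(profile(text), disputed)  # smaller = stylistically closer
    print(f"{author}: {dist:.5f}")
# The candidate with the smallest distance is the most likely author --
# probabilistic evidence, never proof.
```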

For the Digital Humanities, methods like these are – at first sight – fantastic. They offer

vast opportunities for fundamental research, for example in studying the history of

literature or general history; they allow the testing of existing hypotheses, and they offer new

ones. The moral question, however, is again: at what cost? J.K. Rowling admitted that

she would have preferred to remain undiscovered: “Being Robert Galbraith has been such

a liberating experience […] It has been wonderful to publish without hype and

expectation and pure pleasure to get feedback under a different name”.30 Does

research in Digital Humanities threaten the effectiveness of a pseudonym and hence an

individual’s right to privacy and freedom to publish?

This kind of research affects not only individuals. There are consequences for society as a whole, for the world we live in, and for our social interaction. If one thinks the

idea of authorship attribution through to its very end, then we arrive at a future in which

it is impossible to remain anonymous – even when we try. Proponents of mass

surveillance and leaders of totalitarian regimes will certainly favour such a scenario, but

free-speech advocates will certainly not. We have to carefully evaluate the risk that our

research carries. There is another interesting aspect to this story: we usually speak of

29 Juola, Patrick. Language Log blog, 16 July 2013.

30 Quoted after The Boston Globe, 15 July 2013.

15

technology and Wissenschaft in the same breath as representing progress.

Wissenschaft enhances, it extends, it augments. In the case discussed here, however,

we appear to lose a capability through this scientific progress: we will no longer be capable of hiding.

Psychological Profiling Through Textual Analysis

In 2013, inspired by Pennebaker’s work on the psychological signature of

Shakespeare, John Noecker, Michael Ryan, and Patrick Juola published a study of

“Psychological profiling through textual analysis”.31 This research presumes that the

personality of an individual can be classified with the help of psychological profiles or

patterns. Based on a typology suggested by Carl Gustav Jung in 1921,32 Katharine Briggs and Isabel Myers developed a classification of their own (the Myers-Briggs type

indicator, MBTI)33 in which they classify individuals’ preferences among four

dichotomies: extraversion versus introversion, sensation versus intuition, thinking

versus feeling, and perception versus judging. An individual can be, for instance, an

ISTJ type: an introverted, sensing thinker who makes decisions quite quickly. Although

the validity of this classification as well as its reliance on questionnaires is disputable,

the Myers-Briggs indicator is quite popular, especially in the USA where it is used in

counselling, team building, social skill development, and other forms of coaching.

Noecker, Ryan, and Juola formulate a simple hypothesis: the writing style of an

individual can serve as a measure for this individual’s MBTI and hence, stylometric

methods can be used to determine the type indicator. In other words, they propose that

automated textual analysis can create a psychological classification of the author of a

given text. For their experiments, Noecker, Ryan, and Juola used a corpus of texts by Dutch authors whose MBTI is known (Luyckx and Daelemans’ Personae: A Corpus

for Author and Personality Prediction from Text).34 Noecker, Ryan, and Juola state an

average success rate of 75%. They claim to detect the ‘J’-type (judging) and the ‘F’-type

31 Noecker, John, Michael Ryan, and Patrick Juola. “Psychological Profiling Through Textual Analysis.” Literary & Linguistic Computing 28.3 (2013): 382–387.

32 Jung, Carl Gustav. Psychologische Typen. Zürich: Rascher, 1921.

33 Cf. A Guide to the Isabel Briggs Myers Papers, online: http://web.uflib.ufl.edu/spec/manuscript/guides/Myers.htm; last accessed 28 July 2015.

34 Luyckx, K., and Daelemans, W. (2008). “Personae: A Corpus for Author and Personality Prediction from Text.” Proceedings of the 6th Language Resources and Evaluation Conference. Marrakech, Morocco.

16

(feeling) quite well (91%, 86%). For the ‘P’-types, the perceivers, however, the method

does not perform equally well (56%).35 According to Myers and Briggs, the perceivers

are those individuals who are willing to rethink their decisions and plans in favour of new

information, those who act more spontaneously than others.
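
The setup behind such numbers can be pictured as four independent binary text-classification tasks, one per dichotomy. The sketch below is illustrative only: "personae_corpus.csv" is a hypothetical export of the Personae corpus, and the character-n-gram features and logistic regression are stand-ins for the features and learners the study actually used:

```python
import csv

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Load texts and the fourth MBTI letter (J vs P) per author.
texts, is_judging = [], []
with open("personae_corpus.csv", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        texts.append(row["text"])
        is_judging.append(row["mbti"][3] == "J")

model = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(2, 4)),  # style, not topic
    LogisticRegression(max_iter=1000),
)
scores = cross_val_score(model, texts, is_judging, cv=5)
print(f"J/P accuracy: {scores.mean():.0%}")
# Run the same pipeline for E/I, S/N, and T/F, and any sufficiently long
# text yields a four-letter psychological "profile" of its author.
```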

Again, the texts that these methods are grounded in might be provided consciously and

willingly or unconsciously and unwillingly. Hence, the same moral issue of use and

reuse of scholarly methods arises here and needs to be discussed within the context of

these usages. But what about the researcher who develops but does not necessarily

apply this technology? In this case, Digital Humanities would play the role of an auxiliary

science, providing services for others. As such an auxiliary science, it is tempting to

argue that research is value-free, that its sole goal is the development of methods and

that only those who apply these methods have to consider moral consequences –

whether that be literary scholars working on Agatha Christie or historians interested in

the psychological profiling of historical leaders. However, as argued above, Wissenschaft and technology are never value-free. Everyone who is developing

something is responsible for considering potential risks of its usage. Especially when

Digital Humanities is understood as a discipline in its own right, these issues have to be

addressed and discussed.

Elements of an Ethical Framework – Towards a Wissenschaftsethik for Digital Humanities

Fears of Media Change

With the rough definition of Digital Humanities elaborated above in mind, we next sketch

out some of the changes underway during this computational turn. Media change has always been accompanied by anxiety and outspoken criticism. Well-known examples include Plato’s critique of writing for degrading the human capability of memorizing and, more importantly, of comprehension (the Phaedrus dialogue), the charge that the printing press allowed limitless publication and led to moral decay, Nietzsche’s trouble with the typewriter and how this technology changed his way

35 Noecker, John, Michael Ryan, and Patrick Juola. “Psychological Profiling Through Textual Analysis.” Literary & Linguistic Computing 28.3 (2013): 382–387, here: 385.

17

of thinking,36 the “indoctrination or seize-over of the listener through very close spraying”

of sounds by stereophonic headphones,37 and many others. More recently, the internet

as a new medium has been criticized as leading towards superficiality and the decline of

cognitive capabilities, as Nicholas Carr’s rhetorical question “Is Google Making Us

Stupid?” suggests.38

Let us briefly look at some positive aspects of these changes: writing down knowledge

allowed its increase beyond the memory capability of a single person; the invention of the printing press led to a liberalisation of this knowledge; internet technology and open access might lead to further democratization, de-imperialization, and de-canonization of

knowledge.

In the context of the latter, David Berry emphasizes the ubiquitous access to human knowledge,39 which reminds one of Vannevar Bush’s memory extension system

Memex: “Technology enables access to the databanks of human knowledge from

anywhere, disregarding and bypassing the traditional gatekeepers of knowledge in the

state, the universities, and market. [...] This introduces not only a moment of societal

disorientation with individuals and institutions flooded with information, but also offer a

computational solution to them in the form of computational rationalities, what Turing

(1950) described as super-critical modes of thought” (8-9).

One may regard it as positive or negative,40 but changes in media have always been

followed by a dismissal of the old “gatekeepers of knowledge”: first the authorities of the

classical age, the Christian church and the monasteries, then the publishing houses and

the governmental control in modern history. Progress dismissed them but new

gatekeepers succeeded them. In a way, Vannevar Bush’s vision of the memory

extension by what we would now call a networked database of knowledge seems to

36 Kunzmann, Robert (1982). “Friedrich Nietzsche und die Schreibmaschine.” Archiv für Kurzschrift, Maschinenschreiben, Bürotechnik 3 (1982), 12-13.

37 “Psychoterror durch den Kunstkopf” [“Psycho-terror through the dummy head”], quoted after http://www.heise.de/newsticker/meldung/Vor-40-Jahren-Ein-Kunstkopf-fuer-binaurale-Stereophonie-1946286.html; last accessed 28 July 2015.

38 Carr, Nicholas. “Is Google Making Us Stupid?” The Atlantic, 1 July 2008.

39 Berry, David M. “Introduction.” In: Berry, David M. (ed.). Understanding Digital Humanities. Houndmills, Basingstoke, Hampshire: Palgrave Macmillan, 2012, 1-20.

40 Andrew Keen is one who emphasizes the negative impact of the vanishing of gatekeepers, because it leads to a loss of trust and opens the door for manipulation and propaganda (Andrew Keen, Das digitale Debakel, 184-185).

18

have become reality. Not only do the various types of media converge, but also man

and machine merge. Data and algorithms become more and more important for

everyday life and work, and those who control these algorithms and “gatekeep” the data

wield power. “Code is law“, postulates Lawrence Lessig,41 and in the German

newspaper DIE ZEIT, Gero von Randow follows this up and proclaims: “Whoever controls this process rules the future”.42 Apparently, this leaves the door open for manipulation

and for mistakes.

The Sorcerer’s Apprentice

The 1980s Czechoslovakian (children’s) science-fiction TV series Návštěvníci (The

Visitors)43 depicts a peaceful world in the year 2484. In this world, everything is in

harmony until the Central Brain of Mankind, a computer, predicts the collision of an

asteroid with the Earth leading to the planet’s destruction. The people completely and

blindly rely on this Central Brain and start a mission to rescue mankind. The mission

fails and people are about to evacuate the planet. Then an accidental traveller in time,

from the year 1984, comes into this world. What he finds out is very simple: the people

have built the machine (the Central Brain) onto a crooked surface, which hence caused crooked predictions. The traveller puts the Central Brain back into its upright position, from which it can correct its prediction (Earth is not threatened), and the machine apologizes for causing so much trouble. The visitor from a past time did one thing that

the people of 2484 did not: he critically (one might say naively) approached the

computer and challenged its functionality – a capability that the people of 2484 have lost

or forgotten. They never thought of questioning the computer’s prediction.

The moral of this story is that, in the end, it has to be humans who answer for the consequences of actions. This is very much like what Joseph Weizenbaum has told us.

A computer can make decisions, he would argue, but it has no free choice. In the future

world of The Visitors, one single, central computer steers the fate of mankind. In our

41 Lessig, Lawrence. Code and Other Laws of Cyberspace. New York: Basic Books, 1999.

42 "Wer diesen Prozess steuert, beherrscht die Zukunft." Randow, Gero von. “Zukunftstechnologie: Wer denkt in meinem Hirn?” ZEIT, 7 March 2014.

43 Návštěvníci (1981-1983), dir. Jindřich Polák.

19

present age, it is the ubiquity of computing technology – computers are everywhere –

that affects our daily lives.

Philosopher Klaus Wiegerling discusses ubiquitous computing44 from an ethical

perspective in ways that are highly relevant to (Digital) Humanities.45 If systems,

Wiegerling argues, acquire, exchange, process, and evaluate data on their own, then

the materialization of information can no longer be comprehended by people. Personal

identity, however, is formed through such comprehension, and the making of experiences (of which doubt and resistance are an important part) is essential for it. Hence, ubiquitous

algorithms might lead to a loss of identity and personal capabilities and competences.

Like the people of The Visitors, we start behaving like little children, incapable of determining

reality correctly, losing our identity as an acting subject and limiting our options on how

to act. The “unfriendly takeover” by computers that technology critic Douglas Rushkoff

fears for our present society46 has already taken place in The Visitors, and it is only someone from the past who saves that future world.

We need to engage more critically with the origin of our data and with the algorithms we

are using. One need only look into a university classroom to observe how the role that search engines and smartphone apps play in decision-making is growing. A typical argument one often hears is that some information comes ‘from the internet’. That this information is not challenged (by questioning who provided it and when, with what intention, and on the basis of which sources) illustrates a lack of information literacy. Additionally, the conclusion that because something ‘comes from the internet’ it has to be true (or at least valid) illustrates the danger of this attitude and of such information illiteracy being abused. Consequently, new gatekeepers of knowledge might emerge all too easily. An incapacity for critical thinking can be observed more and more, from the classroom in Digital Humanities to scholarship in general, and in society at large.

44 The term appeared around 1988. Cf. Weiser, Mark, R. Gold, and J.S. Brown. “The Origins of Ubiquitous Computing Research at PARC in the Late 1980s.” IBM Systems Journal 38.4 (1999): 693–696.

45 Wiegerling, Klaus. “Ubiquitous Computing.” In: Grunwald, Armin (ed.), Handbuch Technikethik. Stuttgart: J.B. Metzler, 2013, 374-378.

46 Rushkoff, Douglas. 2013. Present Shock: When Everything Happens Now. New York: Current.

20

Crawford’s observation about Big Data that “If the big-data fundamentalists argue that

more data is inherently better, closer to the truth, then there is no point in their theology

at which enough is enough” leads us to a position that ethicists would call the problem

of the Sorcerer’s Apprentice, named after Goethe’s poem Der Zauberlehrling.47

The poem begins as an old sorcerer departs his workshop, leaving his apprentice with

household chores to be done. Tired of fetching water with a pail, the apprentice

enchants a broom to do the work for him – using magic for which he is not yet fully

trained. The floor is soon awash with water, and the apprentice realizes that he does not

know how to stop the broom:

Immer neue Güsse

bringt er schnell herein,

Ach, und hundert Flüsse

stürzen auf mich ein!

["Ever new torrents he swiftly brings in; ah, and a hundred rivers rush down upon me!"]

The apprentice splits the broom in two with an axe, but every piece becomes a new

broom on its own and takes up a pail and continues fetching water, now at twice the

speed.

Die ich rief, die Geister,

werd‘ ich nun nicht los

["The spirits that I summoned I now cannot get rid of."]

When all seems lost, the old sorcerer returns and quickly breaks the spell. The poem

finishes with the old sorcerer's statement that powerful spirits should only be called by

the master himself.

The analogy to the risks of Big Data is obvious: what initially was useful for handling large amounts of data might get out of control and start ruling us, taking away from us the options that we once had: the normative power of the de facto. At some point, we might

have no choice anymore but to use data analysis or other computer-based methods for

any kind of research in the Humanities. And what would then happen if we do not

47 I am grateful to my colleague Christian Thies, Professor of Philosophy at the University of Passau, for sharing his thoughts on this with me.

21

understand the data and the algorithms anymore and stop challenging the machines

like the people from 2484?

Wiegerling concludes that it is becoming more and more important today to pinpoint the

options available for action, to make transparent the possibilities of intervening in an autonomously operating system, and to enlighten people about the functionality of these

systems (376). This should be a core rationale of any training in Digital Humanities, and

it is essential to shape our tools before these tools shape us.48

Some General Thoughts on Ethics of Science for the Digital Humanities

New technologies have their good sides and their bad sides depending on one’s

perspective. Every change brings forth winners and losers. The big ethical question is how to evaluate, choose, and justify what we are doing. Philosopher Julian

Nida-Rümelin pointed out that for various areas of human conduct, different normative

criteria might be appropriate and ethics cannot be reduced to one single system of

moral rules and principles.49 As we are currently forming Digital Humanities as a

discipline on its own, a definition of its own ethics of science as a complementary

counterpart to its theory of science seems to be timely. Theory and ethics together

make philosophy of science. Their role is to clarify what exactly this Wissenschaft is (its ontological determination) and how Wissenschaft is capable of producing reliable

knowledge.50

Ethics is part of philosophy and is regarded as the discipline that studies morals (the noun), i.e. normative or moral (the adjective) systems, judgements, and principles. This

is not the place to discuss any moral criteria. However, on a more general level, a

framework from which these criteria, or a code of conduct, for Digital Humanities might be

48 Cf. Keen, Andrew. Das digitale Debakel (orig.: The Internet Is Not the Answer), 20, in analogy to the famous Winston Churchill quote: "We shape our buildings; thereafter they shape us."

49 Nida-Rümelin, Julian (1996). “Theoretische und angewandte Ethik: Paradigmen, Begründungen, Bereiche.” In: Nida-Rümelin, Julian (ed.). Angewandte Ethik: Die Bereichsethiken und ihre theoretische Fundierung. Stuttgart: Alfred Kröner Verlag, 3-85, here: 63.

50 Reydon, Thomas (2013). Wissenschaftsethik. Stuttgart: Eugen Ulmer, 16.

22

derived, shall be outlined along three areas following Hoyningen-Huene's

systematization:51

1. Moral issues in specific fields of research and in close relation to the objects of

study

2. Moral aspects of Wissenschaft as a profession

3. The responsibility of an individual scholar as well as of the scholarly community

at large

All these areas are relevant for Digital Humanities. The first area comes into play, for

instance, when one deals with and analyses personal data. Many of the examples

discussed above touch on this question. Consider the authorship attribution and the

case of Rowling. The researchers analyse text, but this text mediates an individual,

who then becomes the object of study. Do we violate Rowling’s right of privacy or

anonymity? Should one (or not) ask this individual whether she objects to this

investigation? If we are capable of inferring an individual’s genetic disposition to certain

diseases by just analysing historical records, should permission be required from this

individual when the historical data of his ancestors is going to be public through

digitization?

Scientific and technological progress seem to go more and more hand in hand with an

increasing readiness to take risks, as Ulrich Beck criticizes.52 He observes that there

are hardly any taboos anymore or that once existing taboos are broken. Societal

scruples seem to disappear with the consequence that society increasingly accepts

once questionable conduct without opposition. Beck’s observation applies not only to the use of technology but also to research as such. Moreover, this research, Beck

argues, is taking place less and less inside the protected environment of a laboratory.

Instead, the world as a whole is becoming a laboratory for research. For the objects that

Beck discusses, for instance genetically modified plants, it is rather obvious how this

‘world as laboratory’ is threatening the world as a whole. For the Humanities, it is less

apparent. However, the tendency might indeed be the same. For instance, Big Data

51 Cf. Reydon, Wissenschaftsethik, 12-15. The fourth area, a Sozialphilosophie der Wissenschaft, is left out here; it addresses the interplay of Wissenschaft with society.

52 Beck, Ulrich (2008). Weltrisikogesellschaft. Frankfurt: Suhrkamp.

23

offers the possibility of studying communicational patterns and behaviours of people at

large by analysing so-called social media such as Twitter. Unlike an experiment in a

laboratory where people are invited to participate as test subjects, the internet, the

virtual world, becomes the new laboratory where participation is often unwitting and

involuntary. In a physical laboratory, we used to ask people for their permission to

participate in an experiment (and usually paid them some compensation). Should we

not do the same and achieve an informed consent when we regard the internet as a

laboratory and use its data? Can we accept the fact that Tweeters are test persons in

an experiment without even knowing it?53

The second area of ethics discusses moral aspects of Wissenschaft as a profession.

We can divide the ethics of science into two dimensions: first, the internal dimension, which deals with issues affecting individuals within a given scholarly community and

this community itself, and second, the external dimension that deals with consequences

for individuals outside this community, for the ambient society, culture and nature. Moral

aspects of Wissenschaft as a profession are of the first dimension. What is understood

here is usually a code of good practice: work lege artis, do not fabricate, do not falsify,

do not plagiarize, honour the work of others, give credit to all who supported you, name

your co-authors, do not publish the same thing twice, and various other guidelines that

many scholarly communities have given themselves.54

53 In fact, research of this kind has already been undertaken in a very problematic manner. Kramer, Guillory, and Hancock manipulated the "News Feeds" of 700,000 Facebook users to study the impact on their mood (Kramer, Adam D. I., Jamie E. Guillory, and Jeffrey T. Hancock. "Experimental Evidence of Massive-Scale Emotional Contagion through Social Networks." Proceedings of the National Academy of Sciences 111, no. 24 (17 June 2014): 8788–90). Facebook was a partner in this experiment, provided access to personal data, and facilitated the data manipulation. Informed consent had not been sought from the users. Many raised ethical concerns about this study, for instance in online comments on the publication (http://www.pnas.org/content/111/24/8788.full?sid=750ad790-21a1-4ebc-ba71-9dc0ac5af3d0) and in other media. In an opinion piece in the same journal, Kahn, Vayena, and Mastroianni ask from a utilitarian viewpoint whether the concept of informed consent "makes sense in social-computing research" and conclude that "best practices have yet to be identified" (Kahn, Jeffrey P., Effy Vayena, and Anna C. Mastroianni. "Opinion: Learning as We Go: Lessons from the Publication of Facebook's Social-Computing Research." Proceedings of the National Academy of Sciences 111, no. 38 (23 September 2014): 13677–79). A more critical opinion is expressed by Tufekci, Zeynep. "Engineering the Public: Big Data, Surveillance and Computational Politics." First Monday, 2014.
54 For a general framework cf. for instance the memorandum of the Deutsche Forschungsgemeinschaft (1998/2013). Sicherung guter wissenschaftlicher Praxis. Empfehlungen der Kommission "Selbstkontrolle in der Wissenschaft" / Safeguarding Good Scientific Practice. Recommendations of the Commission on Professional Self Regulation in Science.


But it is more than that. Robert Merton formulated in 1942 four epistemological dimensions of what distinguishes good from bad science.55 He claimed that scholars should be guided only by the ethics of their profession and not by personal or social values. Between the 1920s and 1940s, he had observed that science was no longer developing autonomously and on its own, but that societal and political forces and their interests were significantly driving it. This led to a loss of trust in the objectivity of scientific results. Although Merton's view on the exclusion of personal and social values no longer holds, today's Wissenschaftssystem exhibits a number of characteristics similar to those Merton observed 70 years ago. These apparently change the way we work, but they also compel our research into particular directions and steer and restrict our choices of research topics and methods. The characteristics of today's Wissenschaftssystem include (among others): a permanent pressure to acquire third-party funding, the "publish-or-perish" principle, a growing necessity to legitimize research (especially in the Humanities), international competition, and a demand to be "visible" as a researcher. It has to be discussed how these conditions affect the objectivity of our research, especially when, at the same time, a huge amount of data is conveniently at hand to produce analytical results quickly – faster than by traditional methods, but perhaps also less well grounded. Merton's principles from 1942 might still serve as guidance. In order to restore the legitimation of and trust in research, he demands four principles:

1. Universalism: all research has to be measured against impersonal criteria, regardless of its origin. Only then can the best results be produced (this is a teleological criterion).
2. Communalism: all research is the result of a communal effort (which refers to Newton's 'standing on the shoulders of giants'), cannot remain individual, and has to be published widely (the modern Open Access, Open Source, and Open Data movements build on this).
3. Selflessness: the behaviour of a researcher has to be guided only by the interest of the scientific community; it is his duty to produce reliable knowledge (this is a deontological criterion).
4. Organized skepticism: it is the researcher's duty to constantly question his own work and the work of others in order to produce the best possible results.

55 Merton, Robert (1973). The normative structure of science. In: Merton, Robert (ed.). The sociology of science: theoretical and empirical investigations. Chicago: University of Chicago Press, 267-278.

The latter is particularly important within an emerging field such as the Digital Humanities.

The third area of ethics is more abstract: it deals with the consequences of our research for the world in which we live. In the 17th century, Francis Bacon formulated his ideal of a Wissenschaft that should serve society in order to improve the living conditions of humankind. Science shall be – teleologically – subordinated to this higher good. In Bacon's time, this aimed especially at understanding nature: knowledge would then empower mankind to master nature.56

Scepticism about this view has been raised by many others, among them, more recently, the philosopher Hans Jonas.57 Technology's control over nature has become excessive, with the consequence that technology no longer leads towards improving living conditions but towards their destruction. Jonas formulates an imperative of a future ethics: "Handle so, daß die Wirkungen deiner Handlung verträglich sind mit der Permanenz echten menschlichen Lebens auf Erden" ("act only according to the maxim that the consequences of your action are in harmony with a permanent existence of true human life on Earth"; translation MR). Jonas does both: he criticizes and he extends (modernizes?) Immanuel Kant's categorical imperative: "Act only according to that maxim whereby you can at the same time will that it should become a universal law without contradiction". Jonas demands from each scholar the duty to take responsibility for future generations and to preserve what makes "echtes menschliches Leben", true human life. "True" indicates that the question of the permanent existence of life goes beyond mere biological existence and procreation; the Zeitgeist and the current value systems of a society probably define what "true human" actually means.

56 Cf. Reydon, Wissenschaftsethik, 82-83.
57 Jonas, Hans (1984). Das Prinzip Verantwortung. Frankfurt: Suhrkamp.


Liberty and privacy could be components of such a system in today's Western world. Any research that threatens the continuity of these values would violate Jonas' imperative.

For research undertaken in the Digital Humanities, questions like these may arise: how does our social behaviour change when we know that we cannot express ourselves without being monitored? What consequences would follow from this for society? What does a society look like in which the history of diseases and the dispositions of individuals can easily be detected from Open Access historical data? Is there a risk that we might create future generations in which values like the right to stay anonymous no longer exist, or is there not? And if there is, shall we take this risk or not? And what measures shall we take to minimize it?

Jonas gives us advice when it comes to finding answers to these questions, and hence to deciding among different options for action. He asks us to think of the worst-case scenario first: his heuristic is determined by fear ("Heuristik der Furcht"),58 and the principle of Jonas' ethics is responsibility, especially for the future. I personally agree with this view and would like to establish the following: as long as the consequences of our research in Digital Humanities are not sufficiently clear, one should be sensitive to the problems that might arise, one should be careful in one's actions, and we as a community should at least have these discussions openly.

Conclusion

Ethics of science refers to all moral and societal aspects of the practice of our Wissenschaft. Nevertheless, it can do nothing more than problematize and make the stakeholders of Digital Humanities sensitive to moral questions. It can suggest different perspectives and set a framework within which arguments take place, but it cannot solve dilemmas. The decisions to be made are always up to the individual scholar or – in terms of a code of conduct – up to the scholarly community.

58 Jonas, Verantwortung, 7-8.