PART 1

Scientific Resources and Data Economy

1

Data Production and Sharing: Towards a Universal Right?

In 1968, Stewart Brand, a biologist associated with the American counterculture, created the Whole Earth Catalog. This tool, which took the form of a travel book, aimed to share knowledge among the hippie communities that read it, and it left behind the hope of a universal spread of knowledge. In 1985, Brand launched an electronic counterpart, the Whole Earth ’Lectronic Link. This early bulletin board system, which worked like a forum, took the idea of the universal spread of knowledge to a whole new level: dematerializing the Whole Earth Catalog removed the territorial constraint of the earlier experiment. We can thus see that, despite the common image of the Internet as a military development, it has continued to be influenced by the American counterculture [TUR 06] and, in essence, carries universalist values.

However, whatever universalist values fed the development of the Internet, the reality today is more mixed. The Web, founded on the principles of freedom and open sharing of resources, has in part taken on the features of classical liberalism, an economic line of thought that has largely supplanted the original ideals. Thus, although the Internet remains a place where knowledge spreads, this knowledge is increasingly treated as a source of profit and is being privatized. Oligopolies are being formed by the concentration of scientific publishing houses, and platforms like Google Scholar dominate the market. This is why the regulation of knowledge remains rudimentary; we are far from any prospect of recognizing a universal right to access knowledge.

Chapter written by Marie BLANQUART, Thomas DESCOUS and Ewen HUET.

The Digital Factory for Knowledge

We have thus decided to consider this right to knowledge and its evolutions with a prospective approach: are we moving towards a universal right?

1.1. The right to knowledge today: between attempts at universalization and “self-regulation” by the GAFA

The Internet, through its deterritorialization, requires new regulations. The first obstacle to the implementation of a universal law of the Web is that the Internet, by its very nature, calls into question the principle of the territoriality of law. Law, as a form of regulation, rests on the idea that it is exercised in a given space, governed by a sovereign power responsible for enforcing it. The global character of the Internet therefore raises a number of questions about how it can be regulated.

In practice, the favored path to regulating the Internet remains the national one. With the Marco Civil da Internet [MAR 14], supported by Dilma Rousseff, Brazil proposed an innovative model for the recognition of Internet rights. We can also cite the digital law supported by the French Secretary of State Axelle Lemaire. In particular, its Article 30 provides that the copyright period should be reduced for publicly funded research, thereby allowing free access to the results of fundamental research. Unfortunately, we can also cite numerous examples in which States have failed to enforce intellectual property rights and to prevent the emergence of platforms offering protected content free of charge: servers need only be hosted in a lenient State for such platforms to stay online.

Ambitious attempts at multilateral regulation have failed on this point, which has led to imbalanced regulation. For example, although the IGF, a forum for Internet governance, was created, it has not played the central, normative role that would have led to the emergence of a universal right to knowledge. Attempts at multilateral regulation have proved to be dead ends, as seen in the failure of ACTA, the Anti-Counterfeiting Trade Agreement, which was rejected by the European Parliament in 2012 [EUR 12]. However, despite the obstacles to a universal law of the Internet, the extraterritorial reach of US law lends weight to the hypothesis of self-regulation by GAFA¹.

1.1.1. Towards the emergence of a universal right to knowledge subject to divergent economic thinking

We observe that the current tendency is regulation through a form of extraterritoriality of US law, which thereby imposes itself as the global law of the Web. We can see this in ICANN (Internet Corporation for Assigned Names and Numbers), created under the Clinton administration in 1998, which regulates the assignment of domain names around the world. Likewise, the US Department of Justice led a large operation to shut down the Megaupload platform in 2012. Although the platform was based in Hong Kong, the US authorities considered themselves entitled to intervene because the data passed through servers located in the United States.

Furthermore, the preferred method of regulation remains soft law, for example the publication of official reports and non-binding recommendations. Very often, however, these recommendations arrive after the fact and merely seek to frame existing practices, which evolve extremely quickly with new technologies. As a result, those who dominate and build the Web, GAFA, become their own regulators [PAR 12].

Maintaining this dynamic could create several significant risks for the regulation and sharing of knowledge. First, it implies the predominance and reinforcement of American control over the Web, which remains a concern after the Snowden affair. Second, the dominance of large groups turns the Web into a “sixth continent” in which oligopolistic businesses are partly placed ahead of States, which poses a serious democratic problem. Finally, the law of GAFA favors the market and the logic of profitability, rather than promoting the development of open access to knowledge and its conception as a common good that should be freely accessible.

¹ This acronym stands for the American businesses Google, Amazon, Facebook and Apple and, by extension, all of the businesses with a strong influence on the world knowledge market.

1.1.2. The recognition of a universal right to knowledge: a “realistic utopia”?

In light of the privatization of the Internet by GAFA, more and more activists are mobilizing so that knowledge will be recognized as a common good. A common good is a non-rival good that is non-excludable because of its public utility. Since the early 2000s, in reaction to neoliberalism, numerous actors have mobilized to defend this concept, and it has spread to various domains, including knowledge.

On 27 July 2015, in Rome, the participants of a conference at the Italian Senate “Universality of human rights for the transition towards the State of law and the affirmation of the right to knowledge” launched an appeal for the recognition of a universal right to knowledge [NON].

In the wake of these activists, numerous alternatives are emerging to counter the vision of a closed, private Web, and some have already become concrete realities. Many universities have begun offering MOOCs (Massive Open Online Courses), publishers are making more and more Open Access content available, and a licensing regime built on copyright has even been created with the Creative Commons. A portion of knowledge is available in open access today, a status that allows its democratization and reappropriation and nourishes the ideal of a recognized universal right to knowledge.

In conclusion, the current state of Internet law is contradictory. Although the need for universal regulation of knowledge is acknowledged, adaptations of national law are still preferred. Despite the failures that we have seen, we are indeed moving towards global regulation. It will remain imperfect and severely limited because it will not be democratic. However, the utopia of a universal right to knowledge could become a concrete reality: it is supported by activists for the common good, and concrete actions show that it is possible. Far from a radical break, we seem to be moving towards a hybrid model, in which an imperfect law of GAFA will coexist with embryonic forms of the right to knowledge, which are in the minority but democratic.

1.2. Platform and scientific community rights: the absence of an upfront legal framework

1.2.1. A system partly caused by the development of the digital sector

The development of the digital sector has allowed the massive creation of new information as well as the improvement of the tools used to process it. This revolution particularly concerns the scientific domain and especially STI (scientific and technical information). There are two categories of STI: the data that form the raw material for research, and publications. STI is thus present at every stage of research, from the starting phases to the final product. It has two primary uses: for researchers, it is a tool, and for laboratories, it provides access to their information. Because STI is practically omnipresent, its importance in the sector is easy to see, as is the new facilitating role that the digital sector has assumed.

The automation of many systems thanks to the development of computer technologies can also be observed. This greatly increases researchers’ capacity to carry out research on larger data corpora, in greater depth and more quickly.

Finally, the development of the digital sector has led to the emergence of the notion of the value of data, i.e. the perception of the purely digital product as having an economic value that can be exploited by shrewd investors. This notion is of interest to both researchers and private businesses, which have sought to benefit from it. Researchers can derive value from their final product and reap economic benefits alongside scientific advances (e.g. by commercializing a scientific discovery), and private businesses have the opportunity to draw on new technologies for greater profit.

All of this leads to an evolution of the research system itself and changes the ways of working of those involved in the sector.

1.2.2. The now-fragile law attempting to protect the results of research

What we call “data” is made up of three distinct layers: the base, the content of the base and the constituent elements of that content. These layers fall under copyright, the sui generis right and the protection specific to the individual elements, respectively.

Figure 1.1. Legal architecture of knowledge: a typology of the levels of defining rights [MAU 15]

1.2.3. Intellectual property rights

The question of copyright must be settled with the various right holders (authors of articles and journal publishers) on a contractual basis before their content is incorporated into the database [MAU 15].

Copyright includes a monopoly on reproduction, including the adaptation of works. Publishers, as holders of the economic rights in the scientific texts that they publish, can consequently forbid third parties, as well as the authors themselves, from partially or fully reproducing a work, and from any translation, adaptation, transformation, arrangement or reproduction by any technique or process [LEG 92b]. The publishing contract between a researcher and a publisher most often takes the form of a contract of adhesion. It provides for the transfer of the researcher’s copyright to the publisher, generally on an exclusive basis and free of charge, for the whole world and for the entire legal duration of the copyright. Numerous testimonies confirm publishers’ practice of requiring a signed copyright transfer form. This contract is written in such a way that it seems that only a lawyer specializing in copyright law could understand it. Researchers often sign it without even reading it, because they lack support from their institution to provide a reasoned opinion and a means of defending their rights as creators. Moreover, speed of publication matters in a context of international competition, and researchers do not always have time to follow an appropriate procedure for validating the contract.

The CNRS’s ethics committee, in a statement “concerning the relationship between researchers and scientific publishing houses” made on 31 January 2011 [CNR 11], describes this situation in the following way:

“The transfer of copyright for an article accepted by the editorial committee of a journal, which may be based in one country or another, on a recommendation from one or more reviewers, is most often requested by the publisher free of charge. If an author refuses to sign the form ceding his/her rights as an author, the article, although it has been accepted by the editorial committee, will generally not be published”.

Some publishers, aware of the importance of making articles available for research and the tendency towards Open Science, give authorization for the article to be uploaded to an open archive after an embargo period (post-print) [CEN 16].

1.2.4. The notion of databases and protection by sui generis law

The notion of databases is defined in Article L112-3 of the French Intellectual Property Code [LEG 92a]: “Databases should be understood as a collection of works, data, or other independent elements, arranged systematically or methodically, and individually accessible by electronic or other means”.

The legal framework protecting data is defined by the European directive on databases of 11 March 1996 [EUR 96], transposed in France by the law concerning the legal protection of databases. This creates a so-called sui generis right in favor of the database’s producer, defined as the person who takes the initiative and bears the risk of the investment. The producer can prohibit:

– the extraction of all or a substantial part of the content of the database;

– the reuse of all or a qualitatively or quantitatively substantial part of the content of the database;

– the extraction or repeated and systematic reuse of qualitatively or quantitatively insubstantial parts of the content of the database when these operations manifestly exceed the normal conditions for using the database.

For example, digital STI is accessible through scientific publishers’ databases. The publisher, in this case, is the producer of a database who can consequently prohibit all qualitatively or quantitatively substantial extraction from his/her database. Digital STI is also available through institutional databases, or even epijournals, Open Access databases. The producers of each of these databases are also holders of sui generis rights, who can prohibit all qualitatively or quantitatively substantial extraction [CEN 16].

The principle of sui generis law is as follows: although data are, in principle, not individually protectable (save for specific cases in which they are covered by a private right such as intellectual property rights, personal data rights or the right to privacy), the aggregation of a significant amount of data can be protected, where applicable, by virtue of the sui generis rights of the database’s producer [CEN 16].

These provisions are unfortunately limited by the legal vagueness surrounding the notions of “data” and “platforms”. Even though the database has a set legal definition, the notion of data is not defined precisely. The 22 December 1981 decision on the enrichment of the French language gives the following definition: “representation of information in a conventional form meant to facilitate its processing”, but this is far from covering every case that arises on the Internet.

The notion of platform has neither a legal status nor a legal regime. This absence of a definition leads to a degree of legal insecurity, already emphasized by the French National Digital Council in its opinion of 13 June 2014 [CNN 16], as well as by the French State Council in its 2014 report, “Digital and Fundamental Rights” [CNN 16].

It was the law for a Digital Republic [LEG] that provided this definition, but indirectly, since a platform is defined through the activities of online platform operators:

1) “the classification or referencing, through computer algorithms, of contents, goods or services offered or placed online by third parties;

2) the bringing together of several parties with a view to selling a good, providing a service, or exchanging or sharing content, a good or a service”.

A loyalty obligation is also imposed upon the platform’s operator.

1.2.5. Problems with the legal status of knowledge

On top of this legal vagueness comes a real problem: researchers themselves are unfamiliar with the law.

According to the survey results presented in Figure 1.2, published in the CNRS report (2016), more than half of the researchers interviewed either do not know whether the data they use are free of rights or knowingly use protected data even though this is illegal.

Figure 1.2. Survey at research units: the perception of the legal risks in connection with data. For a color version of this figure, see www.iste.co.uk/fabre/factory.zip

This demonstrates two important points:

– on the one hand, researchers today do not systematically know the status of intellectual property rights for the data that they use every day as part of their job;

– on the other hand, researchers therefore do not know their own rights over the raw data and, consequently, over the protected output that they may produce (e.g. scientific publications).

The survey also reveals that 65% of those interviewed have never been faced with “legal questions concerning the digitization and uploading of content” and that, consequently, they have never looked into this issue. All of this leads to a significant, and growing, risk of illegal practices arising from simple unfamiliarity with the law, and thus of invalidating much potentially valuable research on legal technicalities.

This lack of knowledge concerning legislation on the parts of people who are nevertheless the primary parties concerned highlights an urgent need to review existing laws. Most importantly, there is a need to educate researchers in the coming generations in order to ensure that they have precise knowledge of legislation.

There is thus great tension in research between the existence of a legal framework for the protection of data and its visible lack of application by researchers themselves. This disconnect clearly raises the question of reforming the legal frameworks of research, which have now been overtaken by the technical changes linked to the digital sector.

1.3. The need to elaborate several types of legislation

1.3.1. Platform rights

Between the terms of use of platforms or databases, intellectual property rights and the agreements established with publishers, there is what is called a “no-go area”: an area where regulation is incomplete and where there is therefore no law to settle possible conflicts. This leads to the negative consequences observed in the previous sections, as well as to problems in the regulation of publications.

In its report on platform neutrality, the CNN (Conseil National du Numérique, the French National Digital Council) advocates reducing these areas by adapting existing laws to the digital field. It also advises redefining the notion of “platform” itself, given that the current definition dates back to 1996, i.e. the very beginnings of the digital sector. It also proposes a loyalty obligation for all platform operators in order to reduce the risk of regulatory drift.

These two suggestions were taken into account in the law for a Digital Republic, adopted by the French National Assembly in 2016. Article 49 of the bill for a Digital Republic includes an official definition of the notion of a platform, which is as follows:

“Activities involving the classification or referencing of contents, goods, or services offered or placed online by third parties, or connecting, by electronic means, several parties with the intent to sell a good, provide a service, even at no cost, or exchange or share a good or service”.

This new definition is precise and encompasses every form that platforms have taken throughout the last 10 years, and therefore allows legal gaps like those previously seen to be avoided. In the digital law, there is also the notion of obligatory loyalty, which will henceforth be imposed on platform operators. It is subject to the following definition:

“All online platform operators are bound to deliver loyal, clear, and transparent information to the consumer concerning the general terms of use of the go-between service that is being offered and on the methods of referencing, classifying, and dereferencing the contents, goods, or services offered or placed online”.

In this manner, the government hopes to put an end to the abuses of certain publication or data platforms, which use complex and unfair terms of service to deprive researchers of their rights without their knowledge. These new regulations in France are therefore a large step towards filling the legal gaps and remedying the general lack of upfront regulation of the new systems for sharing knowledge.

1.3.2. Text and Data Mining: the great new stake

The practice of Text and Data Mining (TDM) is a major issue for science, research and innovation insofar as it allows new research subjects and new knowledge to be extracted and economic, social and societal problems to be addressed. The scientific and economic stakes are all the higher given that TDM is practiced worldwide and is regulated differently from one country to another, including within Europe.

Germany has introduced a right of secondary exploitation of scientific publications. The United States and the United Kingdom have confirmed the right of researchers to carry out TDM operations [CEN 16]. French research must not be placed at a disadvantage compared with its European neighbors, yet France has been slower to act on this issue, risking a situation that would be difficult to reverse and whose consequences could be extremely harmful.

To carry out TDM operations on data corpora, it is necessary to run data or text searches that generally require content to be copied or extracted [CEN 16], yet these acts, in principle, trigger the application of copyright and/or database rights. Traditional exceptions such as quotation, illustration for research purposes and temporary technical copies are poorly suited to the practice of TDM.
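To make the point about copying concrete, here is a minimal Python sketch of one very simple TDM operation, counting term frequencies across a corpus. It is an illustration under stated assumptions, not a description of any tool cited in this chapter: the function name `mine_terms`, the `min_length` parameter and the sample corpus are all hypothetical. Even this trivial analysis has to load each full text and build a normalized working copy of it before anything can be counted.

```python
import re
from collections import Counter


def mine_terms(corpus, min_length=4):
    """Count term frequencies over a corpus given as {document id: full text}.

    Hypothetical helper for illustration only.
    """
    counts = Counter()
    for text in corpus.values():
        # lower() builds a normalized working copy of the whole text:
        # this is the kind of copy the paragraph above refers to.
        working_copy = text.lower()
        tokens = re.findall(r"[a-z]+", working_copy)
        counts.update(t for t in tokens if len(t) >= min_length)
    return counts


# Hypothetical two-document corpus.
corpus = {
    "doc-a": "Open data and open access",
    "doc-b": "Open science",
}
frequencies = mine_terms(corpus)
```

Real TDM pipelines go much further (parsing, indexing, entity extraction), but they share this first step of reproducing the source content in a processable form.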

The absence of a legal status for data mining and the poor fit between the rights of database producers and the dynamic processing of knowledge are a source of legal insecurity that the law must address.

In this context, scientific publishers have expressed a true desire to introduce licenses for TDM. This contractual solution was strongly encouraged across Europe by the Licences for Europe process [EUR 13] in 2013, although this initiative revealed itself to be a failure in the end.

In France, an Open Data policy has also developed, along with a gradual extension of free-of-charge access and of the availability of information under an open license.

Legally, Open Data is a waiver of database rights [MAU 16], under which we are free to:

– reuse information;

– reproduce, copy, publish and transmit information;

– spread and redistribute information;

– adapt, modify, extract and transform using this information, particularly to create derived information;

– exploit the information commercially, e.g. by combining it with other information or including it in our own product or application,

provided that:

– the authorship of the information is acknowledged by mentioning its source (at least the name of the producer) and the date on which it was last updated;

– there are intellectual property rights;

– the producer guarantees that the information is not protected by intellectual property rights belonging to third parties.
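As an illustration of the attribution condition listed above, the following Python sketch builds a credit line containing the producer's name and the date of the last update. The class `OpenDataset`, its fields, the function `attribution` and the wording of the credit line are assumptions made for this example; the licenses discussed here do not prescribe this exact format.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class OpenDataset:
    """Hypothetical record describing a reused open dataset."""
    title: str
    producer: str
    last_updated: date


def attribution(dataset):
    """Return a credit line naming the producer and the last update date."""
    return (
        f"Source: {dataset.producer}, {dataset.title} "
        f"(last updated {dataset.last_updated.isoformat()})"
    )


# Hypothetical dataset used only to show the resulting credit line.
example = OpenDataset("Air quality", "Example Agency", date(2016, 1, 15))
credit = attribution(example)
```

A reuser would typically display such a line wherever the derived information is published.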

Furthermore, the European Union encourages the opening of research data through the Open Research Data Pilot [EUR 13], within the framework of the Horizon 2020 program, by promoting the use of CC BY or CC0 licenses for research data [MAU 15].

1.4. Open Science: an achievable goal?

Open Science is an international movement for the openness and sharing of data, favored by the international scientific community [CEN 16]. Open Data, Open Format, Open Source, Open Access and Open Process are different spheres of data openness whose common philosophy is the sharing and free reuse of data.

Figure 1.3. Open license logo and Open Source logo

Let us take the example of Open Source: it relies on open, more or less permissive licenses that allow data, databases, digital creations or software to be made available to third parties.

The most used licenses, particularly in Open Data, are the following:

– Etalab license;

– ODbL license;

– PDDL license;

– Creative Commons licenses.

Creative Commons licenses were created on the principle that intellectual property is fundamentally different from physical property, and on the idea that current copyright laws are a brake on the spread of culture. Their goal is to provide a legal tool guaranteeing both the protection of the copyright in a work and the free circulation of its cultural content, thereby allowing authors to contribute to a heritage of works freely accessible to everyone.

Figure 1.4. Different combinations and logos corresponding to the types of Creative Commons licenses

1) Attribution

This obliges the users of a work placed under a Creative Commons license to credit its author, without suggesting that the author approves of or supports the party using it.

2) No commercial use

The user cannot use the work for commercial ends. Unless the license also includes the “sharing in the same conditions” or “no modification” conditions, the user remains free to reproduce, spread and modify the work.

3) Sharing in the same conditions

The user can reproduce, spread and modify the work on condition that the result is distributed under the same Creative Commons license that the author has chosen.

4) No modification

The user cannot modify the original work. If he/she wishes to, he/she must contact the author to receive his/her authorization.
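The four conditions above combine into a small set of licenses. The Python sketch below enumerates the combinations under two assumptions that are not spelled out in the text: attribution (BY) is part of every license, as in the standard Creative Commons license suite, and “sharing in the same conditions” (SA) cannot be combined with “no modification” (ND), since a work that may not be modified has no derivatives to share. The abbreviations and the names `valid_licenses` and `permissions` are chosen for this example.

```python
from itertools import combinations

# Optional conditions; Attribution (BY) is assumed to be always present.
OPTIONAL = ("NC", "SA", "ND")


def valid_licenses():
    """Enumerate license codes, skipping the contradictory SA + ND pair."""
    codes = []
    for size in range(len(OPTIONAL) + 1):
        for combo in combinations(OPTIONAL, size):
            if "SA" in combo and "ND" in combo:
                continue  # no derivatives allowed, so nothing to share alike
            codes.append("-".join(("BY",) + combo))
    return codes


def permissions(code):
    """Translate a license code into what a user may or must do."""
    parts = set(code.split("-"))
    return {
        "must_credit_author": "BY" in parts,
        "commercial_use": "NC" not in parts,
        "modification": "ND" not in parts,
        "same_license_for_derivatives": "SA" in parts,
    }


licenses = valid_licenses()
```

Under these assumptions the enumeration yields six codes, from the most permissive (BY) to the most restrictive (BY-NC-ND).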

Throughout this chapter, we have seen that digitization has taken place quickly, changing the system for sharing knowledge and increasing our ability to process data thanks to new techniques that rely on new technologies. This rapid and almost abrupt change in the research sector has left a problematic legal gap, particularly with regard to platform rights and copyright. This gap must therefore be filled by new legislation, which is slowly emerging both nationally and internationally.
