Monday 30 December 2013

The world according to Elsevier

Elsevier is considered by some academics to be the worst of the predatory publishers, and is a favourite target for boycotts. In particular, the negotiations for subscriptions to bundles of journals, which were the subject of my previous post, are especially difficult in the case of Elsevier, leading to much frustration for academic institutions. But what sets Elsevier apart from other academic publishers? Let me give some tentative answers, based on observations of Elsevier's behaviour. I will first give the data, and then try to explain them in terms of a coherent strategy.
  1. Elsevier considers open access to scientific publications as a threat, and fights against it. This is explicitly said (in milder terms) in their 2011 financial review.

Friday 13 December 2013

Imminent capitulation of research institutes in "negotiations" with publishers

Recently, I have read pathetic emails from librarians who are involved in "negotiations" with publishers for the renewal of subscriptions to academic journals -- subscriptions which, if not renewed, will lapse at the end of this year. Subscriptions are "negotiated" at a high level: nationwide consortiums of research institutes on one side, large publishers selling bundles of hundreds of journals on the other side. Here is how such "negotiations" go: the publisher demands an outrageous price, and the consortium agrees to that price, minus a symbolic discount if the publisher wants the consortium to save face.

This outcome is the inevitable result of the basic economics of the game. A big publisher enjoys a de facto monopoly on its journals, and can therefore set prices based on what it thinks the consortium can pay. The only way out for the consortium would be to threaten not to subscribe. But the larger the consortium, the more researchers there are who will say they cannot live without the subscription, and the less credible the threat becomes. In any case, consortiums are built with the specific aim of subscribing, and do not have the option of doing otherwise.

This is why the "negotiations" with publishers are not really negotiations, and why the negotiators are so dispirited.

Saturday 7 December 2013

Bibcac: a Perl script for classifying references in a Bibtex bibliography

In my previous post I proposed a script for updating a Bibtex file with publication data. Here is now a completely independent script for organizing the references in a Bibtex file.

To do this, we will not modify the Bibtex file, but rather create an auxiliary file where we associate categories and comments with a number of the references. For instance, if we want to associate the category "Topological Recursion" and the comment "This article..." with the article whose Bibtex key is "cer12", we add the following text to the auxiliary file:
\ArticleLabel{cer12}
\ArticleCategory{Topological Recursion}

This article ...

Why would we want to have such categories and comments? Categories are particularly useful if they match some classification of the articles, whether as files in a computer, or as physical printouts. If you keep articles in a number of folders, the program will enable you to find in which folder a given article is, and to list the articles in a given folder. As for comments, keeping them in a centralized file is safer than having them as annotations (handwritten or electronic) on a copy of an article, which might well get lost.
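
To make this concrete, here is a minimal Perl sketch of how such an auxiliary file could be parsed -- an illustration of the format only, not the actual Bibcac script -- which prints the category and the comment found for each Bibtex key:

#!/usr/bin/perl
# Illustration only (not the actual Bibcac script): parse an auxiliary file
# containing \ArticleLabel{...}, \ArticleCategory{...} and free-text comments,
# and print what was found for each Bibtex key.
use strict;
use warnings;

my $auxfile = shift @ARGV or die "Usage: $0 auxiliary_file\n";
open my $fh, '<', $auxfile or die "Cannot open $auxfile: $!\n";

my (%category, %comment, $key);
while (my $line = <$fh>) {
    chomp $line;
    if ($line =~ /^\\ArticleLabel\{(.+?)\}/) {
        $key = $1;                                  # start of a new entry
    } elsif ($line =~ /^\\ArticleCategory\{(.+?)\}/ and defined $key) {
        $category{$key} = $1;
    } elsif (defined $key and $line =~ /\S/) {
        $comment{$key} .= "$line\n";                # free-text comment lines
    }
}
close $fh;

for my $k (sort keys %category) {
    print "$k [$category{$k}]\n";
    print $comment{$k} // '', "\n";
}

Run on the example above, this would print the key "cer12" with its category "Topological Recursion", followed by the comment "This article ...".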

Friday 1 November 2013

Bibpup: a Perl script for updating Bibtex files using Inspire

Bibtex is a nice way to manage bibliographies, by collecting the bibliographical data of many articles in one Bibtex file, and then citing only some of these articles in a given document. However, the bibliographical data of an article can change when the article gets published in a journal. If the article was initially entered in the Bibtex file as a preprint, the Bibtex file must then be updated.
Here I propose a Perl script which does this automatically using the search engine Inspire, which is the standard search engine for high-energy physics and related fields.

Of course I do not believe that it is very relevant or useful to know whether an article is published and in which journal, but journals typically require such data to be displayed in bibliographies. In any case, the decision to display publication data or not is made at the level of the bibliography style file -- the Bibtex file itself is only a database, which should be as complete as possible.

So what does the script do? It takes a Bibtex file and, for each entry which has no publication data but does have an eprint field with an Arxiv number, sends a request to Inspire. If publication data are found, the entry is modified to include them. (The rest of the entry, and in particular the key, is not changed.)
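
To illustrate the workflow (this is not the actual Bibpup script), here is a rough Perl sketch of the query step for a single arXiv number. The Inspire URL and the names of the JSON fields are assumptions about the Inspire interface and may need to be adapted; the handling of the Bibtex file itself is omitted.

#!/usr/bin/perl
# Rough sketch of the query step (not the actual Bibpup script): ask Inspire
# for the publication data of a given arXiv number.  The URL and the JSON
# field names are assumptions and may need to be adapted.
use strict;
use warnings;
use LWP::Simple qw(get);
use JSON qw(decode_json);

sub publication_data_for {
    my ($arxiv_number) = @_;
    my $url  = "https://inspirehep.net/api/literature?q=arxiv:$arxiv_number";
    my $json = get($url) or return;                   # network failure: give up
    my $data = eval { decode_json($json) } or return;
    my $hit  = $data->{hits}{hits}[0] or return;      # no record found
    my $pub  = $hit->{metadata}{publication_info}[0] or return;  # still unpublished
    return {
        journal => $pub->{journal_title},
        volume  => $pub->{journal_volume},
        year    => $pub->{year},
        pages   => $pub->{artid} // $pub->{page_start},
    };
}

# Example: look up a (placeholder) arXiv number and print what Inspire knows.
my $info = publication_data_for("1234.5678");
if ($info) {
    printf "journal = %s, volume = %s, year = %s, pages = %s\n",
        map { $_ // '?' } @$info{qw(journal volume year pages)};
} else {
    print "No publication data found.\n";
}

In the real script, the returned fields would then be merged into the corresponding Bibtex entry, leaving the key and the other fields untouched.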

Friday 25 October 2013

Academic publishing reform 3. Institutional evolutions

Institutions such as universities and funding bodies play a major role in research policy, and therefore potentially in scientific publishing. There are many things institutions could do to improve the publishing system, such as:
  • renouncing the use of impact factors and citation counts,
  • evaluating researchers' work on its merits, and not on whether or where it is published,
  • launching and supporting innovative journals,
  • ceasing to pay for Big Deal subscriptions to publishers, and facilitating the buying, hoarding and sharing of individual articles by researchers,
  • ordering the researchers they employ not to work as referees or editors for the worst publishers.
But institutions seldom if ever do these things, and since I want to discuss only observed behaviour, the discussion will be rather brief. Actually, institutions are sometimes more inclined to embrace pernicious innovations, such as the h-index or the publishers' Big Deals, than genuine progress, especially when the pernicious innovations favour the bureaucracy's natural tendency to increase its own size and power.

For example, a common misconception is that large institutions are better than small ones at dealing with big publishers. Actually, given the all-or-nothing nature of the proposed Big Deals, the only available negotiating tool is the threat not to subscribe at all. The smaller the institution, the more credible the threat. People who are specifically mandated to negotiate with a publisher on behalf of a large institution or consortium cannot make such a threat, and must agree to the dictated price. (The publisher may pretend to demand an even more outrageous price at the beginning, and then retreat, in order to hide the fact that negotiations are meaningless.)

A common and significant way institutions try to improve the publishing system is by adopting open access policies. Clear and detailed explanations on the subject of open access to scientific publications are provided in a Unesco document. However, not all open access policies are effective. An effective policy should:
  • ask researchers to deposit their articles in an institutional repository, in order to take control back from the publishers,
  • be mandatory, with only articles deposited in the repository counting when careers are evaluated,
  • allow for publishers' embargos, but render them moot by providing a "request article" button for semi-automatically distributing embargoed articles,
  • avoid funding Gold open access publishing,
  • provide some advantages, not only obligations, to researchers. Beyond the increase in readership which automatically comes with open access, these advantages can include simplified administrative formalities, such as the automatic compilation of publication lists.
Such a policy was adopted by the University of Liège under Bernard Rentier.

Friday 11 October 2013

Academic publishing reform 2. New publishing models

The organization of traditional academic journals arose at a time when their most important task was to disseminate information, and when the printed word was expensive and therefore scarce. The electronic word is no longer expensive, but readers' time still is, and there remains some justification for imposing conciseness in articles. But disseminating articles can now be done for free, and subscription-based journals now hinder the flow of information using paywalls and legal threats, rather than helping it. Not to mention the very useful information which they never disclose -- the reviewers' reports.

There are many ideas on how to improve scientific communication. Articles themselves might be replaced by wikis or something else. Keeping articles as the primary means of communication, the natural idea is to build, as Gowers puts it,
"a cross between the arXiv, a social networking site, Amazon book reviews, and Mathoverflow". However, it is difficult to start a new system from scratch.
Any new system must be able to compete with the established journals, which not only are entrenched in the minds of academics, but also hold sway over their careers. The need to compete limits how radical the innovations can be: the new system must be officially recognized and indexed as a journal, and must play the game of bibliometrics. Of course, it is possible to solve each problem in turn, and to start winning over academics with services such as social networking, before trying to be recognized as a journal.

So, which innovations have been or are being tested?

The simplest idea is to start a traditional journal using modern tools, thereby reducing costs to a minimum. One can use free journal-management software. One can even rely on the arXiv for hosting and distributing the articles, and do only the selecting and reviewing: this is the idea of the overlay journals.

A more ambitious proposition is to start journals with advanced features such as open peer review or a new economic model. This was done by eLife and PeerJ. These examples show that, when given the choice, most authors will make reviewers' reports public. Moreover, PeerJ has introduced a new economic model where authors pay on the order of 100 euros for a lifetime membership -- a negligible cost when compared with the thousands of euros per article charged by typical Gold open access publishers.

Finally, the more radical idea of allowing the public to comment on articles, instead of picking one or more reviewers, is being tested by the Selected Papers Network and Publons. For the time being, these systems are not supposed to replace journals, but they can be used to publicly discuss articles. The community would certainly benefit if email exchanges between readers and authors, as well as reviewers' reports, were thus made public.

Friday 4 October 2013

Academic publishing reform 1. Boycott

Important reforms of academic publishing, such as achieving open access, reforming and opening peer review, and drastically reducing costs, are desirable and feasible. Such reforms will not come from traditional academic publishers, which enjoy de facto monopolies and make huge profits in the existing system.
Reforms have to come from academics and academic institutions. In a series of posts, I will review some of the actions which are being undertaken by individuals and institutions. The first post is about boycotts.

Boycotting the worst established publishers is the most straightforward way for academics to act: it directly addresses the main obstacles to reform, and it can be done by individuals. The most prominent example is the boycott of the publisher Elsevier. While this boycott is no existential threat to Elsevier, it is prominent enough to be noticeable by financial analysts.

Academics can boycott journals as authors, reviewers and editors.

Boycotting a journal as an author can be dangerous for one's career, as there are usually very few journals with a given level of prestige in a given scientific field. Moreover, authors often have coauthors, and dragging them into a boycott can be difficult, and even irresponsible when some of them are junior scientists.

Boycotting a journal as a reviewer is a priori easier, as an anonymous reviewer's work is only known to the editor, and has little or no impact on the reviewer's career. However, most reviewers consider their work an altruistic act benefitting the community, and will not easily be convinced that working for certain journals harms the community by perpetuating obsolete and overly expensive models of publishing. There is little a journal could do if faced with a widespread reviewers' boycott. Paying reviewers might help, but the sums would have to be small. (Assuming profits of 1000 euros per article, out of which 20% are diverted to paying reviewers, an acceptance rate of 50%, and two reviewers per submission, we obtain 50 euros per reviewer.)
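
For completeness, here is the arithmetic behind that figure, written out as a small LaTeX display; the key assumption is that the 200 euros set aside per accepted article must also pay for the reviews of the (on average one) rejected submission:

% 1000 euros of profit per accepted article, of which 20% go to reviewers.
% A 50% acceptance rate means each accepted article corresponds to 2 submissions,
% and two reviewers per submission means 4 reviews to pay for.
\[
  \frac{1000 \times 0.20}{2 \text{ submissions} \times 2 \text{ reviewers}}
  = \frac{200}{4}
  = 50 \text{ euros per reviewer.}
\]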

A journal cannot survive without good editors, and the prestige of the editors as researchers is decisive for the standing of the journal. Editors will not easily decide to resign: they have often put much effort into the journal, and want to continue the good work. Moreover, editors are typically paid. Nevertheless, it has sometimes happened that a whole editorial board resigned, in order to recreate a similar journal independently of the original publisher.

None of these three types of boycott is very widespread. This is not surprising: on the one hand, journals are established participants in the research enterprise, and academics mostly communicate with journals via other academics -- the editors. On the other hand, the publishers' mischief is not directly visible to academics, as journal subscriptions with their abusive prices are negotiated way above their heads. Journals show their nice faces to the academics, and their ugly faces to the hapless librarians. Academics then fail to understand why they should boycott, and librarians fail to understand why academics are reluctant to boycott.

Saturday 21 September 2013

Comment ne pas financer la recherche [How not to finance research]

[This post was originally written in French, because it specifically deals with the French research system.]

A few years ago, I received a phone call from an employee of the Conseil Régional d'Auvergne, who was trying to reach a colleague of mine. The Conseil Régional d'Auvergne, you see, funds research: it issues calls for proposals, and distributes money to the best candidate projects, after asking specialists to evaluate them. It was to evaluate a cosmology project that they were trying to contact my colleague, who is however a specialist in quantum gravity -- the difference may seem subtle, but it is a deal-breaker.
So I referred my caller to another colleague, and I do not know how the story ended. Still, the procedure for selecting the specialists, carried out by staff apparently lacking any scientific expertise, struck me as somewhat amateurish. Moreover, a science policy of regional scope risks having to arbitrate between very distant fields. Comparing projects from distant fields is a nearly impossible exercise: if specialists of each field are consulted, every specialist has an interest in giving a positive opinion, in order to defend his own discipline.

From the researchers' point of view, the multiplication of funding windows -- the existence of many small funding sources -- is a waste of time. And is it really worth spending hours writing a proposal, following specific rules, in the hope of obtaining a tiny grant, which can then only be spent by following yet other specific rules? Many researchers will think not. Those who do apply will not necessarily be the best, and will sometimes have indirect motivations, such as pleasing their hierarchy by attracting external funding.

Some calls for proposals do not directly aim at advancing research, but rather at, for example, creating national or international collaborations, training students, and so on. In fact, not only the funding but also the evaluation of research is increasingly conditioned on derived activities: publications, conferences, collaborations, outreach, teaching. These activities normally accompany research, but it is in fact possible to engage in them without doing any research properly speaking. And this is what some research policies encourage researchers to do, by not focusing directly on research and its results.

Researchers need money for equipment, for conferences and other professional travel, and for hiring students and postdocs.
There are regular needs: personal computers, office supplies, student interns, conferences. There are exceptional needs: expensive equipment, extended stays away, major hirings. It would be logical to fund the regular needs through recurrent funding, and the exceptional needs through calls for proposals. Yet what we observe is the progressive disappearance of recurrent funding, and the proliferation of small calls for proposals.
We risk ending up funding regular needs through calls for proposals. Not only would this be a considerable waste of time, it would also lead funding agencies to micromanage research.
It would be as if, in everyday life, one had to borrow money not only to buy a house or a car, but also to buy groceries, drawing up for that purpose a detailed budget of the planned expenses -- without being sure of obtaining the loan.

Paradoxically, shrinking budgets seem to be driving research institutes into a kind of arms race in the field of communication. It seems that a research institute finds it harder and harder to live without a professionally designed logo, brochures presenting its activities, and full-time communication staff. To a large extent, this is not outreach aimed at the public, but communication meant to make a good impression on funders. Will we one day have marketing directors in our laboratories, in charge of selecting research projects according to how "sellable" they are to funding agencies?

I recently heard a senior administrator say that "the funding of research has changed in a dangerous way", and deplore the decline in budgets.
This decline in budgets would be a lesser evil, if the evolution of the funding model were not at the same time driving the hypertrophy of communication and administrative activities, to the detriment of the very substance of research.

Thursday 29 August 2013

Write for humans, not for robots

The recent San Francisco Declaration On Research Assessment (DORA) aims at improving how scientific research is evaluated. To do this, the declaration wants not only to improve the evaluation process itself, but also to modify how research is reported, in particular by having authors of scientific articles cite original articles in preference to review articles. But this would be bad scientific practice: researchers should not have to worry about the reliability of bibliometrics when they do research or write articles.

As a method of evaluating research, bibliometrics has many well-known shortcomings. The DORA denounces some of these shortcomings, and proposes remedies. Some of these remedies are common-sense ideas, such as stopping using journal impact factors, and evaluating the contents of articles (and other research outputs) rather than relying on bibliometrics.
Some other proposals aim at improving bibliometrics. It is not obvious that this would be good, because bibliometrics will always encourage bad practices, such as trading citations and authorships, and splitting results into small articles. Eliminating some of the flaws at the expense of making metrics more complicated may not be worth the trouble.

And proposals 10 and 16, which want authors of scientific articles to cite original articles in preference to review articles, are downright pernicious. The problem is no longer to improve bibliometrics itself, but to modify how articles are written, with the aim of making bibliometrics more reliable. So the DORA claims that bibliometrics is so flawed that its use should be much reduced, but also that it is so important that great efforts should be made to improve it -- efforts not limited to modifying the metrics themselves. In this sense, the DORA is self-contradictory.

This might not matter too much, if citing original articles were good scientific practice. But it is not. The original literature is a tangled mess of more or less reliable and understandable texts. Review articles play the vital roles of making sense of it, and of promoting common, generally accepted terminology and ideas. Writing review articles is in some cases more useful than doing original research, and should be encouraged. Moreover, researchers often do not learn of existing results from original articles, but from other texts, which may be clearer or more accessible than the original articles. Citing the original articles would often mean citing articles without having read them, which is obviously bad practice.

Remember why articles cite earlier works in the first place: to avoid repeating material which is available elsewhere, and to help readers find the origins and proofs of results which are built upon. To do this efficiently, one should cite as few articles as necessary, and select them based on clarity, reliability and ease of access. (Open access should surely be favoured.) There is no reason to favour original articles. Favouring original articles serves another purpose: as the DORA puts it, to "give credit where credit is due". But the purpose of research articles is not to give credit. An article's contribution to the history of its subject is only a byproduct, and should not take precedence over its primary purpose of reporting scientific ideas.

Bibliometrics was supposed to help with the task of giving credit, but now we are told that it will not work unless we think more about bibliometrics than about readers when we write articles. This should be resisted: write for humans, not for robots.

Tuesday 27 August 2013

This blog's initial intentions

This blog is for discussing practices in scientific research as they are and as they should be, and the tools which can help in doing research efficiently.

The focus is not to describe the practices which researchers should follow for the sake of their careers, but rather to discuss what should be done in order to do good and useful research. As a theoretical physicist, I will define useful research as research whose results are easily accessed, understood and discussed -- so scientific publishing will be a major topic.

It has been obvious for a long time that scientific journals could be replaced with a much more efficient and cheaper system. But opinions vary widely as to whether and when this can happen: from the prediction of the imminent doomsday of commercial science publishing, to despair at the slow progress of open access. It is also not clear how a new system could emerge from the current situation: evolutionarily from existing journals? as an Arxiv overlay such as the Selected Papers Network? or as a new creation such as the journal PeerJ?

In some cases, there is a consensus on what the good and bad practices are, and the question is how to switch from the latter to the former: how do we escape rapacious scientific publishers? how do we stop using impact factors and the h-index? In other cases, the identification of good and bad practices is less clear: is it good practice to use Mathematica? should researchers contribute to Wikipedia? should publicly-employed scientists put their writings in the public domain?  

Some recommended reading: