How open is EU public procurement data?

24 July 2015

Author: Sachiko Muto

A huge amount of data is generated daily by public procurement authorities throughout the EU. This is of course immensely useful, as that data can be mined to identify interesting trends and produce analysis (such as our very own Procurement Monitoring Reports, which aim to document the practice and discriminatory impact of referring to specific brand names when procuring ICT solutions). Most of that information is collected and made available through an online portal called Tenders Electronic Daily (TED).

Accessibility

Recently the Commission announced that the data would also be available through the EU Open Data Portal.

The EU Open Data Portal was set up in December 2012 to facilitate access to and reuse of EU data, and since its inception has collected over 8,000 datasets, with more being added regularly. While in most cases (as for the TED data) the information made available through the platform is not new, this platform makes it more visible and accessible, and in that sense it can be said to increase its openness. After all, what use is a data-set if no one can find or easily exploit it?

Machine-readability

Another point to the Commission’s credit is that all of the data on TED (and now on the EU Open Data Portal) going back to 1993 is downloadable in bulk in XML, a machine-readable format. This is an important requirement for any data to qualify as ‘open’, and greatly facilitates systematic analysis of the data. The other side of the coin, however, is that the XML format does not lend itself well to analysis ‘by hand’, which raises a kind of ‘barrier to entry’ for anyone proposing to re-use the data. At the same time, it should be noted that a subset of TED data (covering the most important fields from each contract award notice) is also made available in the more accessible CSV format. In addition, there have been a number of community initiatives (such as OpenTED) to make that information easier to exploit for the less technology-savvy.
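To give a rough idea of what bridging that gap involves, the sketch below flattens a small XML document into CSV rows using only the Python standard library. It is an illustration under stated assumptions, not a TED parser: the element names (notice, country, authority, value), the sample records and the function names are all hypothetical, and the real TED XML schema is considerably richer.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, simplified notices: the real TED XML schema is far richer
# and uses different element names.
SAMPLE_XML = """\
<notices>
  <notice id="2015-000001">
    <country>DE</country>
    <authority>Example City Council</authority>
    <value currency="EUR">250000</value>
  </notice>
  <notice id="2015-000002">
    <country>FR</country>
    <authority>Example Ministry</authority>
  </notice>
</notices>
"""

FIELDS = ["id", "country", "authority", "value", "currency"]


def flatten_notices(xml_text):
    """Turn each <notice> element into a flat dict, one per CSV row."""
    root = ET.fromstring(xml_text)
    rows = []
    for notice in root.iter("notice"):
        value = notice.find("value")
        rows.append({
            "id": notice.get("id", ""),
            "country": notice.findtext("country", default=""),
            "authority": notice.findtext("authority", default=""),
            # Missing elements become empty cells rather than errors.
            "value": value.text if value is not None else "",
            "currency": value.get("currency", "") if value is not None else "",
        })
    return rows


def to_csv(rows):
    """Serialise the flat rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = flatten_notices(SAMPLE_XML)
csv_text = to_csv(rows)
print(csv_text)
```

The second sample notice deliberately omits its value, which is why the sketch writes empty cells instead of failing: nested, optional fields are exactly what makes XML awkward to inspect by hand and comparatively easy to handle in code.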

Licensing terms

Another important area to look at when assessing the openness of a particular data-set is the licensing terms under which it is available for re-use. TED data, like all information produced by the Commission, is published under a licence set out in Decision 2011/833/EU on the re-use of Commission documents. In many ways, that licence qualifies as an ‘open’ one, considering in particular that it allows re-use without any need to seek individual permission, for both commercial and non-commercial purposes, so long as the source is attributed. However, some fuzziness still remains as to when this licence applies, and as to when and how exceptions can be made. This was the topic of an interesting exchange on Twitter between Open Data expert Friedrich Lindenberg, Mathias Schindler and the Commission’s Open Data team.


One may also wonder why the Commission, which recently recommended the use of Creative Commons licences in a set of guidelines on the implementation of the PSI Directive, does not apply its own advice and use well-established and widely recognised licences in an effort to limit licence proliferation, and the added complexity and legal uncertainty that such proliferation entails.

Comprehensiveness

Finally, this very quick assessment of the openness of EU public procurement data would not be complete without a word about the quality of the data itself. Any analysis can only ever be as useful as the comprehensiveness and consistency of the information contained in the data-set allows it to be. Generally, TED data is very good in that respect, but nevertheless it does vary greatly across (and sometimes within) different countries, due to a number of factors, including the following:

  • Public authorities are only required to publish on TED those legal documents which pertain to contracts whose total value exceeds a certain threshold. Whilst it is considered good practice also to publish on TED those contracts that fall below this threshold, in practice not everyone does.
  • Even when details of a particular contract are published on TED, the amount of available information can vary greatly, contract by contract. Typically, a short summary is provided in all EU languages, and further information can be found in the original language of the contract; however, in many cases the full documentation is only available on a separate platform that requires its own login, which makes that additional information effectively impossible to ‘mine’.
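One simple way to get a feel for this kind of variation is to measure, for each country, what share of notices actually fill in each field. The sketch below does this for a small CSV extract using the Python standard library. The column names (country, authority, value_eur), the sample rows and the fill_rates function are hypothetical and chosen purely for illustration; they are not taken from the actual TED CSV files.

```python
import csv
import io
from collections import defaultdict

# Hypothetical extract: column names and values are illustrative only.
SAMPLE_CSV = """\
country,authority,value_eur
DE,Example City Council,250000
DE,Example Agency,
FR,Example Ministry,
FR,,120000
FR,Example Region,90000
"""


def fill_rates(csv_text, group_by="country"):
    """Return {group: {field: share of non-empty cells}} for each field."""
    totals = defaultdict(int)
    filled = defaultdict(lambda: defaultdict(int))
    reader = csv.DictReader(io.StringIO(csv_text))
    fields = [name for name in reader.fieldnames if name != group_by]
    for row in reader:
        group = row[group_by]
        totals[group] += 1
        for field in fields:
            # Count a cell as filled only if it contains non-blank text.
            if (row[field] or "").strip():
                filled[group][field] += 1
    return {
        group: {field: filled[group][field] / totals[group] for field in fields}
        for group in totals
    }


rates = fill_rates(SAMPLE_CSV)
for country, by_field in sorted(rates.items()):
    print(country, by_field)
```

A check like this only reveals gaps within the notices that were published. It says nothing about contracts that never reached TED because they fell below the threshold, or about documentation held behind a separate login.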