Bernd Stahl

I recently had the honour of being invited to contribute to the conference on “Healthcare in the Era of Big Data: Opportunities and Challenges”, organised by the New York Academy of Sciences on 24 and 25 October 2018. The event brought together about 150 stakeholders from groups including academia, industry, regulators and patient representatives. The aim of the event was to explore how big data technologies have influenced and are expected to further influence medical care.

The event was held at the venue of the NYAS, which is located on the 40th floor of 7 World Trade Centre and offers a stunning view of Manhattan.


Big Data Ethics

The starting point of the event was the recognition that big data offers the promise to lead to new knowledge and insights that can facilitate new interventions. At the same time, this type of big data research raises numerous ethical, legal and social questions.


It quickly became clear that this was going to be an interesting meeting bringing contrary opinions together when Arthur Caplan from NYU School of Medicine suggested in his opening remarks that privacy is dead, a statement whose veracity was repeatedly questioned throughout the event. Privacy was one of the recurring questions that several of the presenters and panelists would return to.


The conference was clearly grounded, both conceptually and in disciplinary terms, in the biomedical sciences. It was therefore not surprising that many of the questions and debates linked to well-established positions and discourses in these areas. Conceptual questions arose in different ways. The most obvious was the definition of big data, with the three Vs (volume, variety, velocity) offered as one answer. This definition also pointed to some of the open questions, for example whether additional Vs, like veracity, might be needed. Beyond these definitional questions, it became apparent that big data raises epistemological challenges for biomedical research. Almost by definition, big data goes beyond the well-structured data typically collected in the gold standard of biomedical research, the randomised controlled trial. Big data can provide important insight in areas where such trials are impractical or impossible. But this raises the question of the quality of insights derived from big data. What standards would big data need to fulfil to count as appropriate evidence in making decisions on diagnoses or treatment?


Such epistemological questions easily shade into ethical ones. If the epistemic status of big data is uncertain, then decisions made on the basis of big data analysis may raise ethical issues. At the same time, however, ignoring such “real world data” may also have negative and thus ethically relevant consequences. In the US system there is a strict distinction between medical and non-medical data, with privacy protection depending to a significant degree on the status of data as being medical. However, in the world of big data, where many data sources can provide insight into health-relevant matters, this distinction between medical and non-medical data is increasingly blurred.


It is therefore probably not very surprising that Harlan Krumholz of Yale University made the point in the first keynote presentation of the event that medicine is emerging as an information science. He convincingly argued that current medical data is relatively superficial. Medical labels often lack depth and the ability to distinguish between relevant cases. The digital transformation of healthcare has the potential to make medicine more humane and personalised. During the second keynote of the event, Amy Abernethy of Flatiron Health gave impressive examples of these potential benefits, drawn mostly from the use of electronic health records for oncology. One example that Amy provided was the radical change in uptake of new drugs in recent years, based on better insights.


While the ethical benefits of big data research are thus starting to materialise, the ethical challenges remain open. A key question is how the downsides of big data analytics can be avoided. A frequently used term that promises a way forward is that of data stewardship, which implies ethical responsibility for data and its beneficial use. Ownership of data was discussed, but it was argued that US law does not grant ownership of medical data in the traditional sense of ownership. It is therefore crucial to ensure that patients’ voices are heard, not least during processes of engagement and recruitment in medical studies.


Privacy Protection and Data Ownership

The panel that I had the pleasure of chairing included Eric Perakslis, Arti K. Rai, Mark Barnes and Nadav Zafrir as panelists. We touched on the requested topic of privacy and ownership but also went beyond this remit.

Panel chaired by Bernd Stahl. Photo from

The panel restated that the current state of dealing with healthcare data is problematic. Security issues, for example, need to be seen in a broader context. Health data is no longer just clinical data but spills over and across the boundaries to other types of data, such as lifestyle data, as may be captured by fitness tracking devices. The focus on protecting healthcare data may therefore be misleading. Similarly, the security of data and physical security are closely related. In addition to questions of security and the possibility of securing health data, a key question the panel discussed was the ownership of data and of big data technologies. One problem in this area is that patents are difficult for tech companies to acquire. These companies therefore often rely on trade secrecy. This makes it even more difficult for the data subjects to understand how their data is used, as companies are highly reluctant to share the details of the technology.


Big data in healthcare highlights some fundamental issues. One of these is the relationship between privacy and open data. There is a tension between these and it was argued that one cannot have it both ways. But what is the solution? Is it to be found in legislation, such as the European General Data Protection Regulation (GDPR)? For me as a European attending a US conference it was striking how much the GDPR was referenced throughout the event. But that does not imply that it was seen as the solution. While the intention behind the GDPR to protect citizens was welcomed, there was also much criticism of this approach. It was felt that the GDPR is not workable, that it creates too much confusion and is not uniformly applied across Europe. But if the GDPR or comparable data protection regulation does not solve the problem, then what does? One suggestion was to legislate in a way that outlaws the misuse of data. This would allow the use of data for research and might address concerns about misuse.


However, the problem may run deeper than concerns about particular data and their use. Ethical issues in healthcare must be understood in the greater societal context. Much of the big data and AI debate assumes a positive future, a utopia where healthcare big data leads to positive and generally accepted outcomes. But alternative futures are also conceivable, e.g. one where the trust in the political system and governance structures is lost. This can lead to risk aversion and stifle discovery and innovation.


The lively discussions of the various panels covered these and many more questions. From my perspective it was good to see that many of the issues we are discussing in Europe when it comes to big data and AI are covered in similar ways in the US. The conference highlighted many questions, but in most cases solutions are not obvious and will require more work. There are solutions worth pursuing, for example through the development of homomorphic encryption.


But underlying key ethical issues remain unresolved. These include equity and the question of how data subjects can adequately benefit from the results of the use of their data. Maybe even more fundamental and more important are questions of power structures and distribution of wealth. While big data in healthcare offers the vista of new treatments, inequalities in the healthcare system remain pervasive. Finding new cures is important, but we are collectively not very good at making use of well-established and proven activities, from increasing exercise to giving up smoking. This is particularly pertinent in the US, where public health is declining in comparison to other countries, despite enormous investments in the area. It is therefore important not to see big data in isolation but to understand it as part of a larger societal context. Power, wealth, distribution and equity are questions that are not confined to big data in health, but without taking them into consideration the scientific advances to be gained from new technology may exclude communities and prove short-lived and hollow. The question of how to balance costs and benefits of new technologies is not a technical one. Unless our political processes find a way to address it in a way that convinces citizens, technical progress may not only fail to lead to the public good, it may very well cause significant social and political backlash.


If you find this short summary interesting, then you may want to watch the videos of the event which are available here:


Bernd Stahl is the Human Brain Project’s Ethics Director and Professor of Critical Research in Technology and Director of the Centre for Computing and Social Responsibility at De Montfort University, Leicester, UK. His interests cover philosophical issues arising from the intersections of business, technology, and information. This includes the ethics of ICT and critical approaches to information systems.

