With increased specialization of healthcare services and high patient mobility, accessing healthcare services across multiple hospitals or clinics for diagnosis and treatment is very common. For cancer patients in particular, transitions of care and care coordination are frequent. Timely sharing of electronic health records (EHR) across providers is essential for prompt care of cancer patients, not only for cancer treatment (including a request for a “second opinion” in a hospital that may be located abroad), but also for post-treatment monitoring, as an up-to-date longitudinal history of a patient plays a critical role in evaluating and optimizing the care delivered. For example, in practice, it may be urgently required to know the radiation dose received during treatment to avoid possible harmful consequences for the patient. The gravity of the chronic condition, the possible inability of patients to manage their own medical history, and the required consent management procedures may further complicate data-sharing, leading to delays in treatment.
Access to a complete history of patients’ data will also empower personalized medicine and improve healthcare quality through machine learning techniques. Yet, healthcare data are highly sensitive: even in retrospect, a history of serious medical conditions can become a basis for discrimination; thus, it is crucial to ensure that patients can control who can access their data, and when. Even in the presence of EHR and ecosystems for health information exchange (HIE), the following question remains open: How can we guarantee that the patient’s data are complete, stored securely, and can be accessed according to the patient’s consent in a fast and convenient manner?
Oncology-specific information systems (in addition to the EHR system) are widely used because oncology data are highly heterogeneous: the data can originate from different data management systems and include specific comprehensive information (laboratory results, pathology reports, etc.) as well as multiple large, high-resolution radiology images and PET/CT scans. Such systems can facilitate oncology-specific comprehensive information and image management and help clinicians manage different types of EHR data, develop oncology-specific care plans, and monitor the radiation dose of patients. Yet, these systems cannot currently address the aforementioned issues related to consent management and access-control policy enforcement, in particular in decentralized settings.
The possibility of using emerging blockchain technology (BCT) for healthcare data management has recently attracted major attention in both industry and academia [2-15]. Blockchain is a peer-to-peer distributed ledger technology that provides a shared, immutable, and transparent append-only register of all the actions that have happened to all the participants of the network. It is secured using cryptographic primitives such as hash functions, digital signatures, and encryption. The data in the form of transactions, digitally signed and broadcast by the participants, are timestamped and grouped into blocks in chronological order. A hash function is applied to the content of the block and forms a unique identifier of the block. This identifier is stored in the subsequent block. Because the hash function is deterministic, one can easily verify whether the content of a block was modified by hashing the block content and comparing the result with the identifier stored in the subsequent block. Many blockchains can execute arbitrary tasks, typically called smart contracts or chaincodes, written in a domain-specific or a general-purpose programming language.
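The hash-linking described above can be illustrated with a minimal sketch. It assumes SHA-256 over a JSON serialization of each block; the names `block_hash`, `append_block`, and `first_invalid_link` are hypothetical and do not correspond to any particular blockchain implementation. Consensus and digital signatures are omitted.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Deterministic SHA-256 digest of a block's content (assumed serialization: sorted JSON)."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain: list, transactions: list, timestamp: int) -> None:
    """Append a block that stores the identifier (hash) of its predecessor."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev_hash": prev, "timestamp": timestamp, "transactions": transactions})


def first_invalid_link(chain: list):
    """Return the index of the first block whose stored prev_hash does not match
    the recomputed hash of the preceding block, or None if all links hold."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return i
    return None


# Illustrative chain of three blocks with placeholder transactions.
chain = []
append_block(chain, ["tx-a"], 1)
append_block(chain, ["tx-b"], 2)
append_block(chain, ["tx-c"], 3)
```

In this sketch, changing the content of a block is detected at the block that follows it, because the recomputed hash no longer equals the stored identifier. The most recent block has no successor, so its integrity has to be protected by other means (in practice, by the consensus protocol).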
To add a new block to the ledger, a consensus protocol is employed. Based on how the identity of a participant and its right to participate in the consensus are defined within a network, one can distinguish between public and private, and between permissioned and permissionless, blockchain systems. In a permissionless system, the participants’ identities are either pseudonymous or even anonymous, and every participant can submit a transaction or participate in the consensus protocol. A permissioned blockchain, in contrast, has means to identify the nodes that can control and update the shared state, and often also has ways to control who can issue transactions. A permissioned blockchain can be public, where anyone can read the ledger but only a predefined set of users can participate in the consensus, or private, where even the right to read the ledger is controlled at the level of membership/identity of the users. Cachin and Vukolić present an overview of consensus protocols used in the context of permissioned blockchains (e.g., Hyperledger Fabric, Tendermint, R3 Corda, and MultiChain).
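The distinction drawn above can be summarized as two independent switches: who may read the ledger and who may take part in consensus. The following is a deliberately simplified sketch; `LedgerPolicy`, `can_read`, and `can_validate` are hypothetical names, and control over who may issue transactions is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LedgerPolicy:
    name: str
    open_read: bool       # anyone may read the ledger
    open_consensus: bool  # anyone may take part in the consensus protocol


PERMISSIONLESS = LedgerPolicy("permissionless", open_read=True, open_consensus=True)
PUBLIC_PERMISSIONED = LedgerPolicy("public permissioned", open_read=True, open_consensus=False)
PRIVATE_PERMISSIONED = LedgerPolicy("private permissioned", open_read=False, open_consensus=False)


def can_read(policy: LedgerPolicy, participant: str, members: set) -> bool:
    """Reading is open, or restricted to identified members of the network."""
    return policy.open_read or participant in members


def can_validate(policy: LedgerPolicy, participant: str, validators: set) -> bool:
    """Consensus participation is open, or restricted to a predefined validator set."""
    return policy.open_consensus or participant in validators
```

For example, with `members = {"hospital-a"}` and `validators = {"hospital-a"}`, an unknown participant can read a public permissioned ledger but cannot validate on it, and can do neither on a private permissioned one.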
BCT can enable a user to have complete control of data and privacy without a central point of control, which makes it highly cost-effective and efficient for building applications for sharing EHR data. This provides a unique opportunity to develop a secure and trustworthy EHR data management and sharing framework using blockchain, which will accelerate the data-sharing process and provide users with full access control over their own EHR data. In the case of chronic diseases such as cancer, this is particularly important because of multiple-medication intake (and therefore reimbursement and prescription management) and because diagnosis and treatment are conducted at multiple hospitals (due to specialization of centers, a required “second opinion,” and mobility of the patient). Moreover, employing BCT can enable fast and secure data access for medical practitioners and researchers, leading to improved cancer treatment with significantly increased efficiency and reduced cost. In addition, multiple improvements can be brought to the different stages of the pharmaceutical supply chain: clinical trials will be accelerated by connecting multiple data providers (data sources), and therefore collecting more data in a shorter period of time, and will be made transparent using BCT; medication production and delivery will benefit from increased integrity, traceability, and visibility, thus enabling optimization of the whole supply chain process. However, despite ongoing academic research and high interest from industry, blockchain-based healthcare data management systems are not in place yet.
The contribution of this work is two-fold: (i) we conduct a systematic literature review (SLR) to analyze the motivations, advantages, and limitations, as well as barriers and future challenges of applying state-of-the-art approaches to employing distributed ledger technology in oncology, (ii) we discuss the outcomes of the SLR and propose the directions of future research.
To provide a thorough study, we adhere to the procedure (Fig. 1) that was first proposed by Calvaresi et al. based on Kitchenham et al. and has already been applied to conduct SLRs of blockchain-based applications in the following domains: multi-agent systems and tourism. Following the Goal-Question-Metric (GQM) approach, the generic free-form question “What are the motivations, approaches, limitations, and barriers when employing distributed ledger technology for data-sharing in oncology?” is formulated and broken down into the following structured research questions (SRQs).
SRQ1: How has the blockchain research and its application in oncology been evolving (e.g., in which year and in which country did the research take place)?
SRQ2: What are the proposed applications of the BCT in oncology?
SRQ3: What is the status of the solutions proposed in the primary studies (i.e., conceptual, design, implemented)?
SRQ4: What are the requirements/motivations behind the employment of BCT? What are the objectives defined by the authors of the primary studies for the BCT implementations in oncology?
SRQ5: What are the strengths/advantages of applying BCT in oncology? Which technologies have been employed before, if any? Which BCT designs and implementations have been proposed by the authors? Are there other technologies that are used in combination with BCT?
SRQ6: What are the limitations of applying BCT in oncology? Do the authors propose to address the identified limitations? Which are the additional future challenges listed by the authors of the primary studies?
Based on the reviewers’ competences in BCT and given the oncology domain, the following keyword query has been defined to search for the relevant primary studies: (“blockchain” OR “distributed ledger”) AND (“oncology” OR “cancer”). The search has been conducted using the following sources: PubMed, IEEE Xplore, ScienceDirect, and Google Scholar.
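The Boolean structure of the keyword query can be expressed as a simple predicate. This is only an illustration of the query logic, assuming case-insensitive substring matching on a title or abstract; each bibliographic database applies its own matching rules, and `matches_query` is a hypothetical helper, not part of the review protocol.

```python
def matches_query(text: str) -> bool:
    """("blockchain" OR "distributed ledger") AND ("oncology" OR "cancer")."""
    lowered = text.lower()
    technology = any(term in lowered for term in ("blockchain", "distributed ledger"))
    domain = any(term in lowered for term in ("oncology", "cancer"))
    return technology and domain
```

A text has to mention at least one technology term and at least one domain term to be retained for screening.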
Initially, 58 papers were collected. This number was then reduced to 12 after performing a further examination of the papers, in particular based on the following inclusion criteria: temporal (2008 or after), purpose and relevance (applying BCT in oncology), format (review papers, if any, are excluded from the analysis and considered separately), singularity/originality (discard papers presenting minor variations), and theoretical foundation (the primary studies should provide at least one of the following elements: [visionary formulation, theoretical definition, system design]). One has to note that several papers among the initially selected ones (and discarded later on) only mentioned applications of the blockchain in oncology referring to some of the selected papers as examples of applying blockchain in healthcare in general.
This section presents the outcomes of the SLR, which are obtained by applying the methodology presented in the previous section, following the research questions presented above.
To answer SRQ1 about the demographics of the primary studies, we looked at when the research work presented in the papers was conducted, and in which country the institution of the corresponding author is located (Fig. 2). The selected works were published in 2016 (1 study), 2017 (6 studies), and 2018 (5 studies). The timing is in line with the fact that the first attempt to import blockchain into the design of a healthcare system was made in 2016 by Yue et al. The authors presented the design of the architecture of a healthcare data gateway application for secure control and sharing of medical data between different entities that may use patient data. The research works were performed in the following countries: USA (4 studies), Switzerland (2 studies), and Germany, Iraq, Taiwan, Italy, China (1 study per country). However, one has to note that, based on the affiliations of the authors, some of the works were done in the framework of international collaboration, such as collaboration between Switzerland and USA as well as between USA and several European countries.
Next, we address SRQ2 and SRQ3. To do so, we identified specific application domains (related to oncology) of the research works presented in the primary studies (SRQ2) and technology readiness level: whether proposed applications are at the level of concept, architecture design, or prototype (SRQ3).
As shown in Table 1, the selected studies propose to employ BCT in the following application domains: data-sharing for primary patient care [2-6, 8], conducting medical research [2-6, 8-11], and optimization of the pharmaceutical supply chain processes aiming to ensure the absence of counterfeit drugs on the market [3, 7, 12, 13]. Most of the proposed approaches are presented as prototypes. This could be attributed to the fact that the works focus on a specific healthcare area or a chronic disease such as cancer, so that domain-specific requirements can be taken into account at the prototyping stage. Some of these studies [2, 4-6] tackle data-sharing for both primary care and research, underlining that it is important to involve the patient in the data-sharing process and to consider the patient’s consent. The work of Mettler tackles all three application domains, but only at the conceptual level.
We summarize the requirements underlined by the authors for each application domain to motivate the application of BCT, and the objectives set up by the authors of the research works (SRQ4).
In the framework of data-sharing for primary patient care, there is a need to help patients keep track of and control where their healthcare data are stored, to facilitate and speed up sharing of the data required for diagnosis and treatment, to consolidate the data from multiple sources, and to ensure access to the full patient data history, including laboratory results, doses, and negative side effects of medications. Zhang et al. noted that enabling patient-controlled data-sharing is especially relevant in the case of regional virtual tumor boards implemented via telemedicine [26, 27] for institutions that have limited oncology expertise and resources.
Conducting medical research involves data-sharing processes and intelligent data management. It is important to resolve the challenges faced by the regulators and return the control over private data, including medical records, back to the patients. This will give patients the possibility to contribute to research and commercial projects while ensuring privacy and security, speed up the recruitment of participants for clinical trials, and make real-time EHR data available in a common data format. Shae and Tsai also emphasize the need for a data-sharing standard. They argue that intelligent healthcare data management cannot be put in place while there is a lack of mechanisms to collect large, integrated, heterogeneous data sets of various ownerships and to enable cross-domain and international collaboration. This is especially true in medical domains, where challenges include the huge size of the distributed data sets, ownership, privacy, and the administrative and government regulations and policies imposed on medical data.
Regarding optimization of the pharmaceutical supply chain processes, a set of highly important requirements targeted against counterfeit drugs is underlined by Mettler and Schöner et al.: to increase trust and transparency, to ensure monitoring of the production processes for drugs, and to provide customers with a possibility to track pharmaceutical products throughout the supply chain.
Based on the identified requirements, the following objectives were defined by the authors of the primary studies:
Meet ONC requirements regarding development of interoperable, privacy-preserving, and secure nationwide health information systems and the promotion of widespread, meaningful use of health IT to improve healthcare.
Facilitate direct, verifiable, and immutable transactions between the patient and different actors in the medical area without relying on a centralized party, while ensuring security and privacy of the distributed sensitive medical data [2, 3, 6].
Accelerate the cross-domain biomedical research and clinical trials, incentivize patients to contribute to the research studies, while providing privacy and security guarantees, transparency, and auditability [5, 8, 10].
Bring integrity, traceability, and transparency to the global drug supply chain.
To answer SRQ5, based on the primary studies, we first list the technologies and approaches that are currently in use, if any, to address the needs identified in the framework of SRQ4. Then, we focus on the specific BCT implementations and other technologies proposed in combination with BCT to achieve the aforementioned objectives. Finally, we describe the advantages of using BCT.
In current practice, for primary care, if patients need to transfer their clinical data from one hospital to another for treatment purposes, they will be required to sign a paper-based consent that specifies what type of data will be shared and identifies the recipient. Ecosystems for HIE aim to ensure that the data from EHRs are securely, efficiently, and accurately shared nationwide. However, HIEs have limited adoption, and there is a lack of standard architecture or protocol to ensure security and enforcement of the access control specified by patients. For medical research, according to Shae and Tsai, centralized datasets that are currently employed, such as The Cancer Genome Atlas (TCGA) for cancer and precision medicine research, are costly and difficult to maintain. They also require major effort for data collection and curation. Regarding current approaches for the pharmaceutical supply chain, most of the processes are peer-to-peer, which creates inefficiencies and leaves room for counterfeit drugs.
To overcome the limitations of the current approaches, the authors of the primary studies mostly propose to use permissioned BCT (e.g., Hyperledger Fabric, R3 Corda, permissioned/private Ethereum testnet) due to the sensitive nature of healthcare data [2, 4, 12, 32].
Mamoshina et al. propose to use a hybrid approach based on Exonum, a framework for building blockchain applications that employs a BFT (Byzantine Fault Tolerance) Bitcoin anchoring algorithm. The algorithm periodically outputs the hash digest of a recent block on an Exonum blockchain, which commits to the entire blockchain state and transaction history, in a transaction on the Bitcoin blockchain. Only the hash is stored on the public blockchain; no sensitive data are revealed.
Similarly, Angeletti et al. propose to use the public Ethereum blockchain to store only the public keys of the IoT devices (to ensure validity of the source) and the hash of the data produced periodically (to ensure immutability and transparency of the trials).
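The common idea in both proposals, publishing only digests on a public chain, can be sketched as follows. This is a minimal illustration under stated assumptions and does not use the Exonum, Bitcoin, or Ethereum APIs: the public chain is replaced by a plain list, and `build_private_chain`, `anchor`, and `verify` are hypothetical names. The BFT agreement among anchoring nodes and the device public keys are omitted.

```python
import hashlib
import json


def digest(obj) -> str:
    """SHA-256 over a sorted JSON serialization (an assumption of this sketch)."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode("utf-8")).hexdigest()


def build_private_chain(payloads: list) -> list:
    """Build a hash-linked private chain; each block commits to its predecessor."""
    chain, prev = [], "0" * 64
    for height, payload in enumerate(payloads):
        block = {"height": height, "prev_hash": prev, "payload_digest": digest(payload)}
        chain.append(block)
        prev = digest(block)
    return chain


def anchor(chain: list, public_ledger: list, every: int = 2) -> None:
    """Publish the digest of every `every`-th block on the public ledger stand-in."""
    for block in chain:
        if (block["height"] + 1) % every == 0:
            public_ledger.append({"height": block["height"], "anchor": digest(block)})


def verify(chain: list, public_ledger: list) -> bool:
    """Check the internal hash links and compare anchored blocks with the public digests."""
    links_ok = all(chain[i]["prev_hash"] == digest(chain[i - 1]) for i in range(1, len(chain)))
    anchors_ok = all(
        a["height"] < len(chain) and digest(chain[a["height"]]) == a["anchor"]
        for a in public_ledger
    )
    return links_ok and anchors_ok


# Placeholder records; only their digests ever leave the private side.
records = [{"record": "r1"}, {"record": "r2"}, {"record": "r3"}, {"record": "r4"}]
private_chain = build_private_chain(records)
public_stub = []
anchor(private_chain, public_stub, every=2)
```

Because each anchored digest commits to all preceding blocks through the hash links, a private chain that is rebuilt with altered history no longer matches the published anchors, while the public entries themselves contain nothing but a height and a digest.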
As noted in almost all the primary studies, use of BCT is only advantageous when combined with other existing technologies, such as off-chain data storage [2, 4, 8, 10], cryptographic primitives [2, 4], and domain-specific healthcare standards for data exchange. Combining principles of machine learning and artificial intelligence with the smart contracts’ functionalities, transparency, and traceability properties of the blockchain is marked as a promising approach for personalized healthcare and enhancing medical research [6, 8, 9].
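A common way to combine off-chain storage with a ledger is to keep the record itself off-chain and record only its digest on-chain. The sketch below illustrates this pattern under simplifying assumptions: both stores are in-memory stand-ins, the names `store_record` and `fetch_verified` are hypothetical, and encryption and access control are left out.

```python
import hashlib

off_chain_store = {}  # stand-in for a hospital database or file store
on_chain_index = []   # stand-in for ledger entries holding digests only


def store_record(record_id: str, content: bytes) -> None:
    """Keep the content off-chain and append its SHA-256 digest to the ledger stand-in."""
    off_chain_store[record_id] = content
    on_chain_index.append({"record_id": record_id, "sha256": hashlib.sha256(content).hexdigest()})


def fetch_verified(record_id: str) -> bytes:
    """Return the off-chain content only if it still matches the latest on-chain digest."""
    content = off_chain_store[record_id]
    entry = next(e for e in reversed(on_chain_index) if e["record_id"] == record_id)
    if hashlib.sha256(content).hexdigest() != entry["sha256"]:
        raise ValueError("off-chain record does not match its on-chain digest")
    return content


# Placeholder content standing in for a large imaging file.
store_record("scan-001", b"PET/CT image bytes")
```

This keeps large or sensitive items, such as radiology images, out of the ledger while still allowing any later modification of the off-chain copy to be detected.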
Among the advantages of using BCT that follow from its properties, and apart from addressing the aforementioned needs existing in the oncology domain, the following are of high importance: creation of an interoperable standards-based architecture and a data-driven marketplace for secure and scalable data exchange, enabling precision medicine, collaborative decision support, and optimization of the pharmaceutical supply chain processes.
To address SRQ6, we list some of the limitations of current BCT-based systems and possible approaches to address them, as well as the future research directions discussed by the authors of the primary studies.
When off-chain data storage (for data storage or for running computations over the healthcare data) and a membership service (in the case of a permissioned BCT implementation) are employed, there is a risk of creating a single point of failure. To address these limitations, the following has been proposed: using cryptographic techniques (including symmetric and asymmetric encryption, digital signatures, threshold encryption, and homomorphic encryption), decentralizing the data storage and membership service, and involving trusted independent parties, such as government agencies like the FDA [2, 5, 8].
When Ethereum is used, limitations such as gas costs are noted by Kurtulmus et al. These can be addressed by improving Solidity (a programming language used for writing smart contracts in Ethereum), creating a new language, or optimizing the code.
Secure key management and digital identity management are challenging. In this regard, coordinating stakeholders (e.g., insurance companies) across the industry and providing patients with easier (and secure) access to their own medical records are proposed by Zhang et al.
Moreover, several directions for future work were discussed by the authors of the selected works, including exploration of distributed federated learning and distributed transfer learning mechanisms within the blockchain, blockchain-based distributed data management mechanisms to integrate data sets originating from multiple sources, and implementation of an EMR system architecture based on BCT in a real-life environment.
In this section we discuss major findings of the SLR and propose several future research directions in the area of applying BCT in the oncology domain.
Applications and Specificity of the Blockchain-Based Data-Sharing in Oncology
According to the conducted SLR, BCT has the potential to enhance data-sharing (for primary care and medical research), as well as to optimize the pharmaceutical supply chain, by bringing properties such as transparency, traceability, and immutability to the applications. In the scope of this work, we analyzed current research work aiming at employing BCT in oncology. Among the primary studies, there are several prototypes that illustrate the advantages of employing BCT in oncology. The growing body of research work [35-39] focusing on employing blockchain in healthcare or in the pharmaceutical supply chain in general, not oncology-specific, was out of the scope of this paper. Some of these approaches to building a blockchain-based EHR or supply chain system can be applied in oncology as well. An interested reader may refer to the recent relevant reviews of using BCT in healthcare [40, 41] and in the supply chain. However, one has to take into account the specifics of the oncology domain, including the chronic nature of cancer and the data volumes due to the need to manage radiology images.
Despite the number of existing prototypes, most of the works lack evaluation and testing in real-world settings. This is most likely due to the existing regulations [44, 45], the sensitive nature of healthcare data, the required consortium of multiple institutions, the lack of interoperability with existing EHR systems, and other limitations summarized in the framework of SRQ6, such as the difficulty of ensuring secure key management and the need to use sophisticated cryptographic techniques. Indeed, BCT itself cannot guarantee data privacy and security. Thus, it is never proposed as a stand-alone technology. Moreover, as noted by the authors of the analyzed primary studies, different blockchain implementations have their own limitations and trade-offs.
In addition, compared with current solutions (i.e., centralized databases in the hospitals and IHE profiles and standards, as well as existing ecosystems for HIE, aiming to ensure interoperability and data security), the following disadvantages of applying BCT shall be considered, among others:
“Mining,” in the case of a permissionless blockchain, can introduce extra costs to the data management process, as well as privacy threats.
In the framework of data-sharing for primary patient care, it is challenging to ensure that patients are able to securely manage their keys and identity when encrypted data are being shared (in contrast with the technical architecture described in the corresponding IHE profile, where the keys are only shared between caregivers participating in the data exchange).
Some of the core principles of BCT (e.g., data immutability) are not compliant with data privacy laws and regulations (e.g., the “right to data erasure”); thus, appropriate data management approaches, policies, and additional mechanisms are required.
However, it is worth mentioning that an approach that combines some of the properties of BCT with the Keyless Signatures Infrastructure (KSI) has been successfully employed in Estonian government networks to ensure the integrity of the data (including EHR) stored in government repositories and to protect them against insider threats. KSI is used to provide time-stamping and server-supported digital signature services. The number of participants in the KSI consensus protocol is limited, which, for instance, allows transaction settlement to occur within one second. However, major drawbacks of such an approach are limited decentralization and the requirement of trust in the participants of the KSI consensus.
Based on the analyzed state-of-the-art research work and its current limitations, we propose the following future research agenda in the area of applying BCT in oncology and in the healthcare domain in general.
Achieving privacy-preserving distribution and globally reachable data. The challenge of ensuring globally reachable data and enforcement of the patient’s access control policy is not trivial: data availability/interoperability requirements can interfere with a patient’s privacy. Is it possible to define a harmonized set of basic rules, built into the healthcare data management architecture and based on international laws and regulations, that preserves different sensitivity levels of the data and ensures adherence to such rules without a centralized entity?
Intelligent data management. How can privacy-preserving hybrid data storage be designed for machine learning tasks and artificial intelligence techniques (e.g., using on-chain storage only for statistical data and avoiding storage of sensitive data on the blockchain)? Can we decouple the query and its execution by defining the queries and parameters to be stored on the blockchain, which will then be executed only by trusted entities or data owners (doctors, patients)?
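One possible reading of the decoupling raised in this question is sketched below: the query definition is registered on a ledger stand-in, and the data owner executes it locally and releases only an aggregate. All names (`register_query`, `execute_locally`, `ALLOWED_AGGREGATES`, the `dose_gy` field) are hypothetical, and releasing aggregates alone is not, in itself, a privacy guarantee (for example, for very small groups).

```python
import hashlib
import json
from statistics import mean

ALLOWED_AGGREGATES = {"count": len, "mean": mean}
query_ledger = []  # stand-in for on-chain entries: query definitions only, no patient data


def register_query(field: str, aggregate: str) -> str:
    """Record a query definition and return its identifier (digest of the definition)."""
    if aggregate not in ALLOWED_AGGREGATES:
        raise ValueError("unsupported aggregate")
    query = {"field": field, "aggregate": aggregate}
    query_id = hashlib.sha256(json.dumps(query, sort_keys=True).encode("utf-8")).hexdigest()
    query_ledger.append({"query_id": query_id, **query})
    return query_id


def execute_locally(query_id: str, local_records: list):
    """Run by the data owner on local records; only the aggregate value is returned."""
    query = next(q for q in query_ledger if q["query_id"] == query_id)
    values = [r[query["field"]] for r in local_records if query["field"] in r]
    return ALLOWED_AGGREGATES[query["aggregate"]](values)
```

For instance, a researcher could register a `mean` query over a dose field, and a hospital holding the records would run `execute_locally` and share only the resulting number.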
Multi-ledger and ledger interoperability. The plethora of existing BCT implementations, distributed ledgers, and different prototypes built on top of the technology can aggravate the problem of the lack of interoperability between healthcare systems. Thus, ensuring interoperability between different BCT implementations is of high priority. Moreover, due to custom privacy preferences and the individual needs and requirements of different patients, one can think of a multiple-ledger design: a patient-specific, or even a case-specific, ledger. Data can then be replicated among multiple ledgers and locations, creating a network of networks. Depending on the context, different requirements will have to be fulfilled to access the data. However, it is still an open question how patients will be able to manage their ledgers, and how such an infrastructure can be set up in real-world settings.
Patients’ involvement and education. Once patients have full control over their data, education mechanisms must be put in place for the patients, in order to provide valuable insights regarding best data and consent management practices compliant with existing laws and regulations. Moreover, mechanisms for “break glass” access to the healthcare data in emergency settings are still to be developed.
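To make the interplay between patient consent and “break glass” access concrete, the following is a hypothetical sketch rather than a mechanism taken from the primary studies. It assumes that a consent names a recipient, a data type, and a validity window, and that an emergency override is allowed only with a stated reason and is always logged; `Consent`, `request_access`, and `audit_log` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consent:
    recipient: str
    data_type: str
    valid_from: int   # timestamps are plain integers in this sketch
    valid_until: int


audit_log = []  # stand-in for an append-only ledger of access decisions


def request_access(consents, requester, data_type, now, emergency=False, reason=""):
    """Grant access by matching consent, or by a logged emergency override with a reason."""
    granted_by_consent = any(
        c.recipient == requester
        and c.data_type == data_type
        and c.valid_from <= now <= c.valid_until
        for c in consents
    )
    if granted_by_consent:
        decision = "consent"
    elif emergency and reason:
        decision = "break-glass"
    else:
        decision = "denied"
    # Every decision, including denials, is recorded for later audit.
    audit_log.append({
        "requester": requester,
        "data_type": data_type,
        "time": now,
        "decision": decision,
        "reason": reason,
    })
    return decision != "denied"
```

Recording every decision, and in particular every override, gives the patient a traceable account of who accessed which data and on what grounds, which is the property that a ledger-based audit trail would be expected to provide.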
Data analysis and research. Having a complete, curated, and trusted data set is critical for ensuring accurate results in analysis and research. Once complete and accurate data of oncology patients’ history are systematically stored with the use of blockchain with consent from the patients, the data can be leveraged in advancing oncology research and treatment options. Analytical, compliance, and research tools are currently actively researched and developed. For example, having a detailed history of drug tolerance and side effects on patients combined with their genetic profiles or markers can help to improve selection of patient treatment options.
This paper presented an SLR of 12 primary studies addressing the following question “What are the motivations, approaches, limitations, and barriers when employing distributed ledger technology for data-sharing in oncology?” We analyzed the motivations, advantages, and limitations of the blockchain-based applications in the oncology domain, as well as barriers, potential approaches to overcome them, and future challenges listed in the state-of-the-art research work. We discussed specifics of the blockchain-based applications in oncology and integration barriers. Finally, we proposed directions of the future work that can help to attain integration and adoption of the BCT for data-sharing, medical research, and the pharmaceutical supply chain in oncology, as well as in healthcare in general.
The authors have no conflicts of interest to declare.
There are no funding sources associated with this manuscript.
All authors contributed equally to the work presented in this manuscript.