How AI is shaping new life in the life sciences and pharmaceutical industry

There’s a massive opportunity for AI to transform the life sciences and pharmaceutical industry. Here’s why.


Monday, February 24, 2020


The pharma and life sciences industry faces increasing regulatory oversight, decreasing R&D productivity, challenges to growth and profitability, and the impact of artificial intelligence (AI) on the value chain. Regulatory changes led by the far-reaching Patient Protection and Affordable Care Act (PPACA) in the US are forcing the industry to change its status quo.

Besides the increasing cost of regulatory compliance, the industry faces rising R&D costs even as health outcomes deteriorate and new epidemics emerge. Customer demographics are also shifting in the wake of these regulatory changes, and growth is increasingly driven by the emerging geographies of APAC and Latin America.

As a result, the pharma and life sciences industry is compelled to focus on these relatively nascent and evolving markets. The infusion of AI into the life sciences industry is enabling companies to rationalise internal costs and to better profile and target clients and medical practitioners.

Here's an AI challenge for enthusiasts and practitioners: develop an integrated, AI-powered solution for early detection of epidemics that enables quick action to mitigate and control them.


Disruption in life sciences

Pharmaceutical organisations can leverage AI in a big way to drive insightful decisions across their business, from product planning and design to manufacturing and clinical trials. Doing so enhances collaboration across the ecosystem, information sharing, process efficiency, and cost optimisation, and drives competitive advantage.

AI enables data mining, data engineering, and real-time, algorithm-driven decision-making solutions, which help in responding to the following key disruptions across the pharmaceutical business value chain:

  • AI-driven drug discovery – Enables scientists to source scientific findings and insights from external labs or internal knowledge to jump-start discovery, reducing product-development cycle times and speeding go-to-market
  • Shorter clinical trial cycles – Through better insights driven by the improved accuracy of machine-learning ensemble algorithms
  • Supply chain transformation – Predictive algorithms built on a combination of internal and external data help reduce unforeseen drug shortages that hurt customer service levels and cause lost sales revenue
  • Product failure prediction – Via root-cause analysis and predictive algorithms applied to product failure and vendor data
  • Risk management – Evaluating the potential risks posed by elemental impurities in a formulated drug product
  • Real-time medical device analysis and visualisation – Leveraging interconnected data from implanted devices and personal care devices
  • Behavioural sciences – Understanding customer perceptions of products more fully, which helps in proactively fixing product issues or managing communication better
  • Enhanced reporting systems – To meet changing regulatory compliance needs more effectively
  • Intelligent insights – A renewed focus on understanding the underlying business data and generating insights using the latest intelligence frameworks
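To make the supply chain point above concrete, here is a minimal sketch of a shortage-risk check. The three-week moving-average rule and all of the numbers are illustrative assumptions, not a production forecasting model:

```python
# Minimal sketch of a shortage-risk check for a drug supply chain.
# The 3-week moving-average rule and all numbers are illustrative
# assumptions, not a production forecasting model.

def forecast_demand(history, window=3):
    """Forecast next-period demand as a trailing moving average."""
    recent = history[-window:]
    return sum(recent) / len(recent)

def shortage_risk(history, stock_on_hand, incoming):
    """Flag a potential shortage when projected demand exceeds supply."""
    return forecast_demand(history) > stock_on_hand + incoming

# Weekly units dispensed for a hypothetical drug.
weekly_demand = [120, 135, 150, 160, 180, 210]

print(forecast_demand(weekly_demand))         # ~183.3 units next week
print(shortage_risk(weekly_demand, 100, 50))  # True: supply falls short
```

A real system would blend external signals (epidemiology, weather, distributor data) into the forecast, which is precisely where machine-learning models earn their keep.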

The human microbiome

Though genomics currently hogs the spotlight, there are plenty of other biotechnology fields putting AI to work. In fact, when it comes to human microbes – the bacteria, fungi, and viruses that live on or inside us – we are talking about astronomical amounts of data. Scientists with the NIH's Human Microbiome Project have counted more than 100 trillion microbes in the human body.

To determine which microbes are most important to our well-being, researchers at the Harvard School of Public Health used unique computational methods to identify around 350 of the most important organisms in these microbial communities. With the help of DNA sequencing, they sorted through 3.5 terabytes of genomic data and pinpointed genetic "name tags" – sequences specific to those key bacteria. They could then identify where and how often these markers occurred throughout a healthy population. This gave them the opportunity to catalogue over 100 opportunistic pathogens and understand where in the microbiome these organisms normally occur. As in genomics, there are also plenty of startups – Libra Biosciences, Vedanta Biosciences, Seres Health, Onsel – looking to leverage new discoveries.
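The "genetic name tag" idea can be illustrated with a toy sketch: find k-mers (short subsequences) that occur in a target organism's sequence but in none of the background organisms. The sequences and the value of k below are made up for illustration:

```python
# Toy sketch of marker discovery: k-mers unique to a target organism.
# Sequences and k are illustrative, not real microbial genomes.

def kmers(seq, k):
    """All length-k subsequences of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def marker_kmers(target, background, k=4):
    """k-mers present in the target but absent from every background genome."""
    markers = kmers(target, k)
    for other in background:
        markers -= kmers(other, k)
    return markers

target = "ATGGCGTACG"
background = ["ATGGCCTTAA", "GGCGTTTACC"]
print(sorted(marker_kmers(target, background)))  # ['CGTA', 'GTAC', 'TACG']
```

Real marker discovery works on terabytes of reads and must handle sequencing errors and near-duplicates, but the set-difference intuition is the same.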

Perhaps the biggest data challenge for biotechnologists is synthesis. How can scientists integrate large quantities and diverse sets of data – genomic, proteomic, phenotypic, clinical, semantic, social etc. – into a coherent whole?

Many AI researchers are working on plausible answers:

• Cambridge Semantics has developed semantic web technologies that help pharmaceutical companies sort and select which businesses to acquire and which drug compounds to license.

• Data scientists at the Broad Institute of MIT and Harvard have developed the Integrative Genomics Viewer (IGV), open source software that allows for the interactive exploration of large, integrated genomic datasets.

• GNS Healthcare is using proprietary causal Bayesian network modelling and simulation software to analyse diverse sets of data and create predictive models and biomarker signatures.
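As a generic illustration of this kind of probabilistic modelling (not GNS Healthcare's proprietary software), Bayes' rule can link a hypothetical biomarker to a disease:

```python
# Two-node Bayesian reasoning sketch: P(disease | biomarker present).
# All probabilities are hypothetical, chosen only for illustration.

def posterior(prior, sensitivity, false_positive_rate):
    """P(disease | positive biomarker) via Bayes' rule."""
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence

# 1% prevalence, 90% sensitivity, 5% false-positive rate.
print(round(posterior(0.01, 0.90, 0.05), 3))  # 0.154
```

Even with a sensitive biomarker, a rare disease yields a modest posterior – one reason these models combine many variables into full networks rather than relying on a single signal.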


Genomics and the role of AI in personalising the healthcare experience

Numbers-wise, each human genome comprises 20,000-25,000 genes built from around three billion base pairs – roughly three gigabytes of data. Consider:

• Sequencing millions of human genomes would add up to hundreds of petabytes of data.

• Analysis of gene interactions multiplies this data even further.
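The arithmetic behind these figures can be checked in a few lines, assuming plain-text storage at one byte per base and taking "millions" to mean on the order of 100 million genomes (a 2-bit-per-base encoding would shrink everything fourfold):

```python
# Back-of-the-envelope check of the genome storage figures above,
# assuming one byte per base (plain-text storage).
BASES_PER_GENOME = 3_000_000_000   # ~3 billion base pairs

bytes_per_genome = BASES_PER_GENOME  # 1 byte per base
gb_per_genome = bytes_per_genome / 1e9
print(gb_per_genome)  # 3.0 -> around three gigabytes per genome

n_genomes = 100_000_000  # on the order of 100 million genomes
total_pb = n_genomes * bytes_per_genome / 1e15
print(total_pb)  # 300.0 -> hundreds of petabytes
```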

In addition to sequencing, massive amounts of information on structure/function annotations, disease correlations, population variations – the list goes on – are being entered into databanks. Software companies are furiously developing tools and products to analyse this treasure trove.

For example, using Google frameworks as a starting point, the AI team at NextBio has created a platform that allows biotechnologists to search life-science information, share data, and collaborate with other researchers.

Meanwhile, the computing resources needed to handle genome data will soon exceed those of Twitter and YouTube, says a team of biologists and computer scientists worried that their discipline is not geared to cope with the coming genomics flood.

By 2025, between 100 million and 2 billion human genomes could have been sequenced, according to a study published in the journal PLoS Biology. The data-storage demands for this alone could run to 2-40 exabytes (1 exabyte is 10^18 bytes), because the data that must be stored for a single genome is roughly 30 times larger than the genome itself, to account for errors incurred during sequencing and preliminary analysis.
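A rough, uncompressed sanity check of these magnitudes, assuming ~3 GB per raw genome and the 30x overhead quoted above – naive figures, so they overshoot the published, compression-aware 2-40 EB range at the high end:

```python
# Naive storage arithmetic under the stated assumptions:
# ~3 GB per raw genome, ~30x that stored per genome, no compression.
GENOME_BYTES = 3e9  # ~3 GB raw genome
OVERHEAD = 30       # stored data ~30x genome size

def storage_exabytes(n_genomes):
    return n_genomes * GENOME_BYTES * OVERHEAD / 1e18

print(storage_exabytes(100e6))  # 9.0   -> low end, 100 million genomes
print(storage_exabytes(2e9))    # 180.0 -> high end, 2 billion genomes
```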

Robust algorithms with massive data engineering capabilities

The extensive data generation in pharma, genomics, and microbiome research is a clarion call: these fields are going to pose severe challenges. Astronomers and high-energy physicists process much of their raw data soon after collection and then discard it, which simplifies later steps such as distribution and analysis. But fields like genomics do not yet have standards for converting raw sequence data into processed data.

The variety of analyses that biologists want to perform in genomics is also uniquely large, the authors write, and current methods for performing these analyses will not necessarily scale as data volumes rise. For instance, comparing two genomes requires comparing two sets of genetic variants. With a million genomes, you're talking about a million-squared pairwise comparisons. Algorithms able to deliver this will require strong data engineering capabilities.
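The quadratic blow-up is easy to see in a sketch: comparing every pair of n variant sets takes n(n-1)/2 comparisons. The tiny variant sets below are hypothetical; real ones hold millions of entries:

```python
# Why pairwise genome comparison scales quadratically: every pair of
# n variant sets must be compared, i.e. n*(n-1)/2 comparisons.
from itertools import combinations

def shared_variants(a, b):
    """Variants (position, allele) present in both genomes."""
    return a & b

# Tiny hypothetical variant sets for three genomes.
genomes = {
    "g1": {(101, "A"), (202, "T"), (303, "G")},
    "g2": {(101, "A"), (404, "C")},
    "g3": {(202, "T"), (303, "G")},
}

pairs = list(combinations(genomes, 2))
print(len(pairs))  # 3 pairs for 3 genomes; ~5e11 pairs for a million

for a, b in pairs:
    print(a, b, len(shared_variants(genomes[a], genomes[b])))
```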

There's a massive opportunity for AI to transform the life sciences and pharmaceutical industry. The disruptions to business value chains mentioned above have already started making inroads, and CXOs in the life sciences industry have realised the virtues of an innovation and transformation regime led by AI. Brace for more AI-led interventions in the life sciences industry.

(Edited by Evelyn Ratnakumar)

Sameer Dhanrajani is a prominent name in the domain of analytics and data sciences.

(Disclaimer: The views and opinions expressed in this article are those of the author and do not necessarily reflect the views of YourStory.)