Improving KOL engagement through unstructured data analytics – Indegene’s perspective at PMSA 2021

25 Jun 2021

And that’s a wrap! We just concluded our participation at the PMSA Virtual Conference 2021. It was an incredible experience, and we thank @PMSA for the opportunity to learn and share insights with our industry friends on key topics such as commercialization during Covid-19, innovative approaches to HCP engagement, the salesforce in the new normal, and analytics at scale, all of which are paving the way for true digital transformation in the life sciences industry.

The @Indegene team was represented by @Tahira, @Digvijay and me. We spoke on “Improving KOL engagement through unstructured data analytics.” For those who missed the conference, I would like to share some highlights from our session and explain why life sciences organizations need to switch gears in KOL management to keep pace with the rapidly changing healthcare landscape.

Life sciences industry dynamics have evolved rapidly over the last few years, and the Covid-19 pandemic has only accelerated the transformation. This has reinforced the need to partner with KOLs effectively on multiple fronts, be it vaccine advocacy or driving the adoption of new therapies.

As we all know, KOLs provide the industry with unparalleled insights into drug development and regulatory requirements, as well as a deep understanding of customers’ needs. They lend credibility to life sciences organizations by influencing the opinions of other healthcare practitioners, authorities, and patient advocacy groups. Needless to say, an effective KOL management strategy can give an organization one of its strongest competitive advantages over its peers. However, many organizations struggle to identify and engage with the right KOLs for their business.

The life sciences industry is now shifting from traditional to modern approaches and is well-positioned to adopt new technology. For example, KOLs as a group have evolved to include digital opinion leaders who can communicate effectively on digital platforms. KOL engagement now involves more digital and virtual channels, rather than relying only on standalone in-person interactions via reps and MSLs.

With this background, we hypothesize that the key to successful KOL partnerships is to engage with KOLs in a more meaningful and personalized manner. A wealth of information about KOLs exists in the public domain. They are active in scientific research, which is made public through research publications and medical journals. KOLs also participate in academic and scientific conferences focused on their areas of specialization, and with the advent of social media, they are constantly engaging in impactful dialogues online with their colleagues and with life sciences organizations.

This wealth of information, held in unstructured data sources, can be harnessed through advanced analytics techniques to generate deep insights into KOLs’ top causes, areas of interest, treatment preferences, and customer needs. These insights can be used to personalize KOL interactions and create curated content for better engagement.

Approach to Mining Unstructured Data Sources

Here’s our approach to mining unstructured data sources for effective KOL engagement. As a first step, we extract relevant information from the available public data sources based on specific therapy areas of interest and specialties. We leverage domain expertise to identify the sources, create the search strings required to extract the information, screen articles for relevance, and validate the results. This helps us identify KOLs of interest as well as their publications and research areas. Once the KOLs are identified, we also mine social media channels such as Twitter, LinkedIn, and blogs to extract social discussions relevant to our therapy area of interest, for example, vaccines or immuno-oncology. These unstructured data sources are then processed with NLP techniques to draw insights.
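To make the extraction and screening step concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not our production pipeline: the `Article` record, the function names, the boolean query format, and the simple keyword screen are all hypothetical stand-ins for the domain-expert search strings and relevancy review described above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Article:
    # Hypothetical minimal record for a publication pulled from a public source.
    title: str
    abstract: str
    authors: list


def build_search_string(therapy_terms, specialty_terms):
    """Combine terms into a boolean query of the form (a OR b) AND (c OR d)."""
    def group(terms):
        return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"
    return group(therapy_terms) + " AND " + group(specialty_terms)


def is_relevant(article, therapy_terms, min_hits=1):
    """Keep an article if at least min_hits therapy terms appear in its text."""
    text = f"{article.title} {article.abstract}".lower()
    hits = sum(1 for term in therapy_terms if term.lower() in text)
    return hits >= min_hits


def authors_by_article_count(articles):
    """Rank authors by how many screened articles they appear on."""
    counts = Counter(author for art in articles for author in art.authors)
    return counts.most_common()
```

In practice the keyword screen would only be a first pass; the validation step still relies on domain experts reviewing what survives it.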

Over the last few years, technology has evolved significantly: large volumes of unstructured data can now be converted into a structured format and then analyzed with specific algorithms. One such example is topic modeling, which enables us to extract the themes and topics being discussed across various data sources. Sentiment analysis is another technique, one that helps us classify the opinions of KOLs with respect to specific topics.
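As a rough illustration of what sentiment classification does, the sketch below scores a piece of text against a tiny word list. This is a deliberately simple lexicon-based stand-in, not the model used in our projects; the `POSITIVE` and `NEGATIVE` word sets, the function names, and the 0.2 threshold are all assumptions made for the example.

```python
import re

# Hypothetical toy lexicon, for illustration only.
POSITIVE = {"safe", "effective", "promising", "benefit", "improved"}
NEGATIVE = {"unsafe", "risk", "concern", "adverse", "ineffective"}


def sentiment_score(text):
    """Return a score in [-1, 1]: (positive - negative) / matched words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    # Text with no lexicon words is treated as neutral.
    return 0.0 if total == 0 else (pos - neg) / total


def classify(text, threshold=0.2):
    """Map the score to a label; the threshold is an arbitrary choice."""
    score = sentiment_score(text)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"
```

A word list like this misses negation and context ("not safe" would score as positive), which is one reason real projects typically rely on trained models instead.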

For example, the Latent Dirichlet Allocation (LDA) model can be used to unearth specific topics from unstructured documents. In one of our projects in the vaccine space, topics such as “Knowledge, attitudes & beliefs on vaccines,” “Strategies to improve the vaccination rate,” and “Safety & effectiveness of vaccines” had the highest share of voice among KOLs, with overall sentiment scores spanning both the positive and negative ends of the scale.
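For readers curious about how Latent Dirichlet Allocation works under the hood, here is a compact sketch that fits the model with collapsed Gibbs sampling in plain Python. It is a teaching example, not the implementation from our project: the function names, the hyperparameter defaults, and the definition of share of voice as the fraction of tokens assigned to each topic are all assumptions for illustration.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    docs is a list of token lists. Returns (doc_topic, topic_word, vocab),
    where doc_topic[d][k] and topic_word[k][w] are raw assignment counts.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Start by giving every token a random topic.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    return doc_topic, topic_word, vocab


def share_of_voice(doc_topic):
    """Fraction of all tokens assigned to each topic (an assumed definition)."""
    totals = [sum(col) for col in zip(*doc_topic)]
    grand = sum(totals)
    return [t / grand for t in totals]
```

The topics that come out are just groups of co-occurring words; labels such as the ones quoted above are assigned afterwards by someone who knows the therapy area. For real workloads, an established library implementation would be the more sensible choice.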

Insights and Outcomes

The output from our analyses feeds into the KOL engagement strategy to drive content personalization. It adds 2 new attributes alongside the traditional KOL parameters: (a) topics of interest and (b) opinions/sentiments. Through these insights, we can derive 2 important outcomes: (1) aligning interactions to the topics and themes that are of interest to KOLs and (2) identifying opportunities to curate effective content for brand awareness and advocacy.
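One way to picture the two new attributes is as extra fields on a KOL profile record. The sketch below is hypothetical: the `KOLProfile` class, its field names, and the `pick_content` helper are assumptions for illustration and do not describe an actual Indegene data model.

```python
from dataclasses import dataclass, field


@dataclass
class KOLProfile:
    # Traditional parameters (illustrative field names).
    name: str
    specialty: str
    publication_count: int
    # The two attributes added by unstructured data analytics.
    topics_of_interest: list = field(default_factory=list)
    topic_sentiment: dict = field(default_factory=dict)


def pick_content(profile, content_by_topic):
    """Return content whose topic matches the KOL's interests, in interest order."""
    picked = []
    for topic in profile.topics_of_interest:
        picked.extend(content_by_topic.get(topic, []))
    return picked
```

With a profile like this, aligning an interaction to a KOL's interests becomes a lookup from topic to content, and the sentiment field can inform how that content is framed.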


To conclude, unstructured data analytics can provide a competitive edge to life sciences organizations by driving more informed KOL management strategies. Potential enhancements to this approach include developing scalable solutions, measuring the impact of our recommendations, measuring the shift in KOL topics of interest that results from driving effective discourse, and extrapolating recommendations to the larger HCP universe, all of which can help organizations become future-ready.


Ritu Kohli