GenAI medical search cuts drug discovery time and costs
We developed a GenAI search platform for a major pharma company, transforming a days-long manual research process across scientific resources into a task that takes about a minute. The solution accelerates drug development cycles, reduces high-value labor costs, and improves patient outcomes by ensuring critical scientific insights are captured and acted upon instantly.
Our client operates internal research facilities and maintains extensive proprietary documentation while monitoring external scientific publications. Medical advisors, researchers, and physicians need to stay current with developments in their therapeutic areas, but the sheer volume of data makes comprehensive coverage impossible.
Challenge: information overload and a manual processing bottleneck
Previously, the company’s workflow relied on manual curation of dense scientific literature. Medical specialists spent hours sifting through new articles and cross-referencing findings across disparate sources, so critical insights often took days to surface.
- Keeping up with emerging scientific data demanded ever more specialist time.
- Tight deadlines left little room for interdisciplinary research.
- Delays in extracting data meant missed research opportunities.
Solution: leveraging GenAI to build a medical search platform
To ensure a focused and viable project, we first conducted an AI Readiness Workshop, which allowed the client to pinpoint this search bottleneck as a high-value opportunity for Generative AI. The workshop delivered a clear implementation plan, moving us swiftly from concept to a functional prototype.
We then developed a web-based summarization service that handles dynamic document uploads, integrates multiple data sources, and supports various file formats used across pharmaceutical operations and databases. The new tool reduces the time required for data preparation and analysis.
How it works:
- User query. The user formulates a natural language question and optionally uploads their own document (e.g., a research paper or a clinical study).
- Retrieval options. The system combines multiple approaches:
- A filter form makes the search more accurate: users can specify publication date, country of research, topic, and document type.
- To search within a single data source (e.g., a user-uploaded document or a book), the user enters a query and the system returns an answer grounded strictly in that source via retrieval-augmented generation (RAG), without drawing on other data.
- Trusted external sources and scientific databases are queried via API, and internal data via SQL. The user can choose which resources to include.
- Data retrieval. The system executes SQL queries on Snowflake, performs vector-based semantic search, and, if permitted by the user, sends API requests to external medical databases in real time.
- Preprocessing. Selected documents and retrieved content undergo segmentation, cleaning, and normalization to ensure quality before entering the language model pipeline.
- Each segment is cleaned of formatting artifacts and normalized for consistent language and terminology, preparing it for accurate semantic indexing and retrieval.
- Relevant text segments are indexed and stored. When a query is made, the most relevant chunks are selected and passed to the LLM as context.
- LLM query. A domain-specific medical prompt is assembled, combining the user’s question with the most relevant pre-selected chunks (see the sketch after this list).
- Result. Responses are provided as a concise summary in natural language:
- A brief summary for quick reference and/or a detailed answer.
- Options for exporting the result to PDF or Word, or sending it via email.
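For illustration, here is a minimal sketch of the retrieval-and-query step in Python. The vector index interface (`index.search`) and the prompt wording are assumptions made for the sketch, not the production Dataiku code; only the gpt-4-turbo call reflects the stack described below.

```python
# Minimal sketch of the "LLM query" step: assemble a grounded medical prompt
# from pre-selected chunks and call gpt-4-turbo. The index.search() interface
# and prompt text are illustrative assumptions, not the production pipeline.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SYSTEM_PROMPT = (
    "You are a medical research assistant. "
    "Only reference data present in the provided documents, "
    "cite the source of every claim, and keep the summary concise."
)

def answer_query(question: str, index, top_k: int = 10) -> str:
    # Vector-based semantic search: pick the most relevant stored chunks.
    chunks = index.search(question, top_k=top_k)  # hypothetical index interface

    # Ground the prompt strictly in the retrieved segments.
    context = "\n\n".join(f"[{c['source']}] {c['text']}" for c in chunks)
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
        max_tokens=400,  # caps output length, which also caps output cost
    )
    return response.choices[0].message.content
```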
In a controlled test, a 1,200-page PDF book was uploaded. The solution processed the user request in 45 seconds: it scanned internal documents for similar topics, retrieved the top 10 articles published in 2022–2023 from an open database, and generated a 200-word summary with 5 cited sources. Analysis time dropped from several hours to 30–60 seconds.
Technical implementation
WaveAccess built the solution as a secure, scalable web application integrated into the client’s private network. OpenAI GPT (gpt-4-turbo) served as the LLM, with targeted prompting techniques to both reduce costs and improve answer precision.
Token usage optimization
To cut LLM costs, we optimized token usage on both sides. On the output side, custom prompts restrict the length of the answer. On the input side, we preprocess users’ uploaded materials and the articles retrieved from external sources.
Materials loaded from external databases are cleaned of non-essential content (e.g., acknowledgments) before being sent to the LLM. An average scientific book contains about 9 million characters, of which only about 7 million are useful. Since OpenAI charges per million tokens, removing the extra 2 million characters per query yields significant savings, especially as the system scales to support 10–15 simultaneous users.
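As a rough illustration of the input-side cleaning, the sketch below drops non-essential sections and measures the token savings with tiktoken. The heading heuristic and the section list are assumptions; the actual cleaning rules are corpus-specific.

```python
# Sketch: strip non-essential sections (acknowledgments, references, etc.)
# before sending text to the LLM, and measure the resulting token savings.
# The heading regex and SKIP_SECTIONS list are illustrative assumptions.
import re
import tiktoken

SKIP_SECTIONS = {"acknowledgments", "acknowledgements", "references", "funding"}

def clean_document(text: str) -> str:
    # Naive split on short capitalized headings; production parsing is structure-aware.
    parts = re.split(r"\n(?=[A-Z][A-Za-z ]{2,40}\n)", text)
    kept = []
    for part in parts:
        lines = part.strip().splitlines()
        heading = lines[0].strip().lower() if lines else ""
        if heading not in SKIP_SECTIONS:
            kept.append(part)
    return "\n".join(kept)

def token_savings(raw_text: str) -> tuple[int, int]:
    enc = tiktoken.get_encoding("cl100k_base")  # encoding used by GPT-4-class models
    return len(enc.encode(raw_text)), len(enc.encode(clean_document(raw_text)))
```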
Solution details
- Data processing is implemented in Dataiku using Python recipes and scenarios.
- Documents (.PDF, .DOCX, .PPTX, .CSV, .TXT) are processed via chunking with overlapping segments to preserve context, followed by normalization and metadata tagging (see the sketch after this list).
- Data warehouse. Snowflake serves as the central data warehouse.
- All documents and query results are stored with role-based access control, ensuring compliance with internal data governance policies.
- Generative AI. OpenAI GPT (gpt-4-turbo) is utilized with custom medical prompts optimized for clinical terminology.
- As an additional security measure, LLM queries pass through an anonymizing proxy to prevent raw data exposure.
- The prompts enforce source attribution and include constraints to reduce hallucinations (e.g., “only reference data present in the provided documents”).
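For reference, a minimal sketch of the overlapping chunking step mentioned above; the 1,000-character window and 200-character overlap are assumed values, tuned per corpus in practice.

```python
# Sketch of chunking with overlapping segments plus metadata tagging.
# Window and overlap sizes are illustrative assumptions, tuned per corpus.
def chunk_with_overlap(text: str, source: str,
                       chunk_size: int = 1000, overlap: int = 200) -> list[dict]:
    """Split text into overlapping character windows, tagging each with metadata."""
    step = chunk_size - overlap
    chunks = []
    for start in range(0, max(len(text) - overlap, 1), step):
        chunks.append({
            "text": text[start:start + chunk_size],
            "source": source,   # e.g., file name or database record ID
            "offset": start,    # position in the original document
        })
    return chunks
```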
The business impact
The solution we helped develop redefines how the company accesses, interprets, and acts on medical knowledge. Beyond raw speed, the platform delivers critical strategic advantages:
- Accelerated R&D cycles: Drastically reduced time from question to insight directly speeds up drug development.
- Optimized labor costs: Freeing up highly-skilled specialists from hours of manual work reallocates expensive talent to high-value innovation.
- Democratized knowledge: Cross-functional medical teams can access validated, summarized data, breaking down silos and fostering interdisciplinary collaboration.
The ability to instantly synthesize research data allows the company to identify and act on new opportunities faster than competitors.
Further plans
Currently, only one request is handled at a time by the LLM, so users are queued. In the near future, efforts will be focused on parallelizing LLM instances within the Dataiku environment. This is expected to significantly reduce response times and improve overall system performance, especially as the number of concurrent users increases.
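As a sketch of that direction (assuming the async OpenAI client; the Dataiku-side orchestration is not shown), concurrent calls would let several user requests run at once instead of waiting in a queue:

```python
# Sketch: serving several user requests concurrently instead of queuing them.
# Assumes the async OpenAI client; Dataiku orchestration details are omitted.
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI()  # assumes OPENAI_API_KEY is set in the environment

async def summarize(question: str, context: str) -> str:
    response = await client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"}],
    )
    return response.choices[0].message.content

async def handle_batch(requests: list[tuple[str, str]]) -> list[str]:
    # All LLM calls run concurrently; one slow request no longer blocks the rest.
    return await asyncio.gather(*(summarize(q, c) for q, c in requests))
```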
Not every data challenge requires LLMs. For a medtech client, we deployed ML-based scripts to automate VoC request categorization. See how data consolidation and Power BI dashboard implementation drive ROI in our previous case study.