In the rapidly evolving landscape of health economics and outcomes research (HEOR), artificial intelligence (AI) is becoming an increasingly valuable tool for systematic literature reviews (SLRs). At Nested Knowledge, we are committed to providing AI that users can employ responsibly to enhance the efficiency and quality of their evidence synthesis projects. Toward that goal, we recognize that ensuring compliance with leading guidelines for developing and employing AI is critical.
This article outlines how to achieve compliance with HEOR and SLR guidelines for employing AI.
At Nested Knowledge, we adhere to three core principles when implementing any new AI feature:
Data Provenance: Where relevant, we provide both the source and the specific data from that source that inform each AI-generated finding. This ensures full traceability of information. In practice, this means that when our AI tools identify a relevant piece of information or make a recommendation, users can easily trace back to the original source document and language. This level of transparency is crucial for maintaining the integrity of the systematic review process and allows researchers to verify the accuracy of AI-generated recommendations.
Methodological Transparency: We offer complete methodological information on how our AI is trained and employed. Where applicable, we also provide validation and accuracy data on its performance. This transparency extends to the algorithms used, the training data sets, and any known biases or limitations in our AI models. By sharing this information, we enable users to make informed decisions about how to best utilize our AI tools within their research workflows. Additionally, this openness fosters trust and allows for continuous improvement based on user feedback and evolving industry standards.
Human Oversight: In tasks where AI takes the place of human effort, we ensure that AI outputs are placed into an oversight workflow so that all AI extractions can be reviewed by a human expert. This maintains the critical balance between efficiency and accuracy. Our AI tools are designed to augment human expertise, not replace it. For example, in the screening process, while our Robot Screener can rapidly process thousands of articles, the Robot’s recommendations are surfaced to a human Adjudicator for oversight. This dual-layer approach combines the speed and consistency of AI with the nuanced understanding and critical thinking of experienced researchers.
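The Human Oversight principle above can be pictured as a simple gate: an AI recommendation is stored alongside each record, but nothing counts as final until a human adjudicator has confirmed or overridden it. The sketch below is a minimal illustration under that assumption. Every name in it (`ScreeningRecord`, `adjudicate`, `final_decisions`, `pending_review`) is hypothetical and is not drawn from the Nested Knowledge software or its API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

VALID_DECISIONS = ("include", "exclude")


@dataclass
class ScreeningRecord:
    """One record in a hypothetical screening queue."""
    record_id: str
    robot_recommendation: str                   # proposed by the AI screener
    adjudicated_decision: Optional[str] = None  # set only by a human adjudicator


def adjudicate(record: ScreeningRecord, human_decision: str) -> None:
    """Store the human adjudicator's decision, which may confirm or override the robot."""
    if human_decision not in VALID_DECISIONS:
        raise ValueError(f"unknown decision: {human_decision!r}")
    record.adjudicated_decision = human_decision


def final_decisions(records: List[ScreeningRecord]) -> Dict[str, str]:
    """Return only human-confirmed decisions; unreviewed records are held back."""
    return {
        r.record_id: r.adjudicated_decision
        for r in records
        if r.adjudicated_decision is not None
    }


def pending_review(records: List[ScreeningRecord]) -> List[str]:
    """List the records that still await a human adjudicator."""
    return [r.record_id for r in records if r.adjudicated_decision is None]


# Hypothetical queue: the robot has made recommendations on two records.
records = [
    ScreeningRecord("rec-1", "include"),
    ScreeningRecord("rec-2", "exclude"),
]
adjudicate(records[0], "include")  # the human confirms the robot's recommendation

print(final_decisions(records))  # {'rec-1': 'include'}
print(pending_review(records))   # ['rec-2']
```

The point of the design is that the robot's recommendation never writes directly to the final decision field, so an unreviewed AI output cannot leak into the results.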
By following these principles, we provide an evidence synthesis solution with AI enhancements that are supported by:
Our commitment to transparency extends to sharing our validation results:
Internal Validation: In results published at ISPOR 2024, Robot Screener demonstrated improved Recall over human reviewers, albeit with lower Precision, across approximately 100,000 decisions in 19 SLRs. This extensive validation process involved comparing the Robot Screener’s performance against that of experienced human reviewers. The results showed that our AI tool was able to identify a significantly higher number of relevant studies (higher Recall) compared to human reviewers. However, it also included more irrelevant studies (lower Precision). This trade-off is often acceptable and even desirable for recommendations and for title/abstract screening, as it’s generally preferable to cast a wider net and then refine the selection in subsequent stages.
External Validation: Also presented at ISPOR 2024, our external validation covered 15 SLRs and showed similar results. This external validation was crucial in demonstrating the consistency and generalizability of our Robot Screener’s performance across different types of systematic reviews and research questions. For these externally validated reviews, however, Robot Screener’s Recall was equivalent to, not better than, human Recall. The alignment between internal and external validation results strengthens confidence in the tool’s reliability and effectiveness in diverse research contexts.
Time Savings: When used in Dual Screening, our Robot Screener saves roughly 45% to 50% of screening time. This significant time reduction is achieved without compromising the quality of the review process. By automating the initial screening of large volumes of literature, researchers can focus their time and expertise on more complex tasks such as data extraction, quality assessment, and synthesis of findings. This efficiency gain is particularly valuable in the context of rapid reviews or when dealing with fields that have a fast-growing body of literature.
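For readers less familiar with the Recall and Precision metrics used in the validation results above, the following is a minimal sketch of how they are computed from screening decisions. The counts are hypothetical and chosen only to illustrate the higher-Recall, lower-Precision trade-off; they are not the ISPOR 2024 results.

```python
from dataclasses import dataclass


@dataclass
class ScreeningCounts:
    """Confusion-matrix counts for include/exclude screening decisions."""
    true_positives: int   # relevant records the screener included
    false_positives: int  # irrelevant records the screener included
    false_negatives: int  # relevant records the screener excluded


def recall(c: ScreeningCounts) -> float:
    """Share of truly relevant records that the screener included."""
    return c.true_positives / (c.true_positives + c.false_negatives)


def precision(c: ScreeningCounts) -> float:
    """Share of included records that were truly relevant."""
    return c.true_positives / (c.true_positives + c.false_positives)


# Hypothetical counts for 100 relevant records (illustration only).
robot = ScreeningCounts(true_positives=90, false_positives=60, false_negatives=10)
human = ScreeningCounts(true_positives=80, false_positives=20, false_negatives=20)

for name, counts in (("robot", robot), ("human", human)):
    print(f"{name}: recall={recall(counts):.2f}, precision={precision(counts):.2f}")
# robot: recall=0.90, precision=0.60
# human: recall=0.80, precision=0.80
```

In this illustration the robot misses fewer relevant records than the human (higher Recall) but passes more irrelevant ones forward (lower Precision), which is the pattern that later screening stages and human adjudication are meant to absorb.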
For a comprehensive understanding of our AI methods across all AI models employed in Nested Knowledge’s software, including Bibliomine and RoboPICO, please check out our AI Methods and AI Disclosures. Our flagship features – Robot Screener for screening assistance, the recently introduced Core Smart Tags for classification, and Custom Smart Tagging Recommendations for extraction – are at the forefront of our AI integration efforts.
In summary, the AI tools in Nested Knowledge cover the major steps in an SLR or evidence synthesis project: Search strategy, Screening, and Extraction (tagging) of key data from sources. Each tool is integrated into a workflow for human oversight, and if you need a reference for full transparency, look no further than our Disclosure.
The National Institute for Health and Care Excellence (NICE) recently published a position statement on the use of AI in evidence synthesis. We’re pleased to note that our approach aligns closely with NICE’s recommendations, as well as those outlined in the key reference, “Generative AI for Health Technology Assessment: Opportunities, Challenges, and Policy Considerations,” by Fleurence et al. Both emphasize methodological transparency and human oversight, which are cornerstones of our AI implementation.
Here’s how users of Nested Knowledge can ensure compliance with key positions from the NICE statement:
Augmentation, Not Replacement: NICE states: “Any use of AI methods should be based on the principle of augmentation, not replacement, of human involvement.”
Use of ML and LLMs in Various Stages: NICE recommends: “Machine learning methods and large language model prompts may be able to support evidence identification by generating search strategies, automating the classification of studies, the primary and full-text screening of records to identify eligible studies, and the visualisation of search results.”
Our implementation:
Data Extraction Automation: NICE notes: “Large language models could be used to automate data extraction,” while acknowledging this as a less established use.
AI Methods Disclosure: NICE requires: “When AI is used, the submitting organisation and authors should clearly declare its use, explain the choice of method and report how it was used, including human input.”
In addition, we read with interest the work of Fleurence et al. in “Generative AI for Health Technology Assessment: Opportunities, Challenges, and Policy Considerations,” which was cited in the NICE position statement and which called out several key areas that are promising for AI involvement in evidence synthesis and SLR with appropriate oversight:
The authors finish their evaluation of AI for SLR with the following statement that matches closely with our own Human Oversight philosophy: “In summary, these early applications show that there is promise in using foundation models to support a range of tasks required in SLRs but this rapid overview indicates that … human verification is necessary.”
At Nested Knowledge, we are committed to leveraging AI responsibly to enhance the efficiency and quality of systematic literature reviews and evidence synthesis generally. Our approach aligns closely with industry standards, including NICE’s position statement, ensuring that HEOR professionals can confidently use our tools for evidence synthesis in HTA submissions.
By designing a balance between the speed of AI and the discretion of human expertise, we are driving forward the field of evidence synthesis while upholding the highest standards of transparency and accuracy as set forth by NICE and other guideline developers. The Nested Knowledge platform is designed from the ground up to handle the time-consuming, repetitive tasks in the systematic review process, allowing researchers to focus their expertise on critical analysis and interpretation of the evidence.
The integration of AI in systematic reviews represents a significant advancement in the field of HEOR. It not only accelerates the review process but also enhances the comprehensiveness and consistency of reviews when used under human oversight. However, we recognize that the responsible use of AI requires ongoing validation, transparent reporting, and a clear understanding of its capabilities and limitations. We invite HEOR professionals and regulatory bodies alike to explore our implementation of AI features and provide feedback directly to us, as well as to conduct external validation studies. We remain committed to working with our users to shape the future of evidence synthesis, ensuring that we can meet the evolving needs of researchers, policymakers, and ultimately, patients.
If you would like a demonstration of our software, fill out the form below, and we will be happy to meet with you individually. Or, if you would prefer to explore our platform on your own, sign up and pilot these AI-assisted tools for free.