Model Name: Smart Search #
Version: 1.0
Overview #
The Smart Search Tool leverages an agent-based large language model (LLM) system to generate optimized PubMed search queries tailored to specific research questions. It employs a generator-critic loop:
- Generator: Proposes a PubMed search query and retrieves results.
- Critic: Evaluates the query results by how close the number of retrieved records is to a desired review size (e.g., 1,000, 3,000, or 5,000 records) and provides feedback to refine the query.
This iterative process continues until the search meets predefined size and relevance criteria.
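The loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the tool's implementation: `generate_query`, `count_hits`, and `critique` are hypothetical stand-ins for the LLM generator, the PubMed lookup, and the LLM critic. The 20% tolerance and the cap of five rounds are assumed values, and only the size criterion is checked (the relevance criterion is omitted).

```python
from typing import Callable, Optional, Tuple

def refine_search(
    question: str,
    target_size: int,
    generate_query: Callable[[str, Optional[str]], str],  # hypothetical generator
    count_hits: Callable[[str], int],                      # hypothetical PubMed lookup
    critique: Callable[[str, int, int], str],              # hypothetical critic
    tolerance: float = 0.2,  # assumed: accept within +/-20% of the target size
    max_rounds: int = 5,     # assumed iteration cap
) -> Tuple[str, int]:
    """Alternate generator and critic until the hit count is near target_size."""
    feedback: Optional[str] = None
    query, hits = "", 0
    for _ in range(max_rounds):
        query = generate_query(question, feedback)
        hits = count_hits(query)
        if abs(hits - target_size) <= tolerance * target_size:
            break  # size criterion met
        feedback = critique(query, hits, target_size)
    return query, hits

# Toy stand-ins with made-up hit counts, for illustration only.
def demo_generate(question: str, feedback: Optional[str]) -> str:
    # Broaden with OR once the critic reports the result set is too small.
    if feedback == "too narrow":
        return "stroke OR thrombectomy"
    return "stroke AND thrombectomy"

def demo_count(query: str) -> int:
    fake_counts = {"stroke AND thrombectomy": 400, "stroke OR thrombectomy": 1100}
    return fake_counts[query]

def demo_critique(query: str, hits: int, target: int) -> str:
    return "too narrow" if hits < target else "too broad"

query, hits = refine_search(
    "Does thrombectomy improve outcomes in stroke?", 1000,
    demo_generate, demo_count, demo_critique,
)
```

In this toy run the first query falls well short of the target, the critic's feedback prompts a broader query, and the second round lands within tolerance.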
Intended Use #
- Primary Purpose: Automate the generation and refinement of PubMed search strategies to assist researchers in systematic review workflows.
- Intended Users: Researchers, healthcare analysts, and systematic review teams.
Training Data #
- Dataset: The tool is not trained on a task-specific dataset; it uses pre-trained LLMs as the underlying agents for query generation and evaluation.
- Validation Dataset: The performance was validated on two data sets:
  - Nested Knowledge Dataset (NK): 20 reviews conducted by Nested Knowledge and its affiliates with corresponding research questions and included studies.
  - Cochrane Review Dataset (CR): Research questions and included studies from 10 randomly selected Cochrane reviews.
- Data Limitations:
  - Heuristics were developed exclusively on English-language abstracts.
Evaluation #
- NK Recall: 0.749
- NK Precision: 0.0181
- CR Recall: 0.768
- CR Precision: 0.00468
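For reference, recall and precision can be read as set overlaps between the records a query retrieves and the studies a review ultimately included. The sketch below assumes that definition, since the formula and any averaging across reviews are not spelled out here, and it uses made-up identifiers rather than real data.

```python
from typing import Set, Tuple

def recall_precision(retrieved: Set[str], included: Set[str]) -> Tuple[float, float]:
    """Return (recall, precision) of a search against a review's included studies."""
    found = retrieved & included
    recall = len(found) / len(included) if included else 0.0
    precision = len(found) / len(retrieved) if retrieved else 0.0
    return recall, precision

# Made-up identifiers for illustration only.
retrieved = {"a", "b", "c", "d", "e", "f", "g", "h"}
included = {"a", "b", "c", "z"}
recall, precision = recall_precision(retrieved, included)
```

Here three of the four included studies are retrieved (recall 0.75) among eight retrieved records (precision 0.375).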
Ethical Considerations #
- Human Oversight: The tool is intended as a supplement, not a replacement, for expert-led search strategy development. Users are expected to refine and verify the generated queries.
- Bias and Limitations of LLMs: As the LLM is a general-purpose model, its biases or inaccuracies may propagate into the search strategy. These risks are mitigated by the critic module, which evaluates results numerically rather than semantically.
- Automation Bias: Users may over-rely on the tool without critically evaluating its outputs.
Limitations #
- Precision: The low precision reflects a tendency toward over-inclusive search strategies. Comparably low precision is typical of human-derived search strategies as well, and low precision is preferable to low recall: it increases screening time rather than losing evidence.
- Scope: The system is currently limited to PubMed and optimized for English-language queries.
- Input: Requires a clearly stated research question.
Planned Improvements #
- Enhanced Precision: Incorporate domain-specific heuristics or secondary validation models to improve the specificity of query results.
- Extended Databases: Expand functionality to other databases such as Embase or Scopus.
Contact Information #
For questions, feedback, or support, please contact support@nested-knowledge.com.
PALISADE Compliance #
Purpose
The Smart Search Tool is designed to optimize search strategies for systematic reviews by leveraging LLM-based query refinement. Its purpose aligns with fair and ethical usage, as it integrates into workflows to assist, not replace, human expertise.
Appropriateness
The tool is appropriate for its intended use, automating a complex and time-intensive aspect of systematic review workflows. The generator-critic approach ensures iterative refinement toward desired search characteristics.
Limitations
- Low precision is a known limitation, reflecting the challenge of highly specific query formulation in an inherently noisy environment like PubMed.
- The tool is only capable of searching PubMed, not other research databases.
- The system’s performance depends on the clarity and specificity of the input research question.
- Limitations of the data:
  - Restricted to English-language abstracts; performance may degrade for ambiguous or poorly structured abstracts.
  - Performance metrics are specific to PubMed and do not generalize to other databases.
Implementation
The tool uses an OpenAI LLM as the core engine for query generation and refinement. The process is computationally intensive and requires web access.
Sensitivity and Specificity
The system demonstrates:
- Recall of 0.749 (NK) and 0.768 (CR), indicating strong sensitivity in retrieving relevant studies.
- Precision of 0.0181 (NK) and 0.00468 (CR), reflecting a tendency for over-inclusive search strategies.
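To make the screening cost of these precision values concrete: the reciprocal of precision is the average number of retrieved records per relevant one. The arithmetic below uses only the reported figures. Treating each reported precision as a single pooled ratio is a simplifying assumption.

```python
def records_per_relevant(precision: float) -> float:
    """Average number of retrieved records for each relevant record found."""
    return 1.0 / precision

nk_burden = records_per_relevant(0.0181)   # about 55 records per relevant study
cr_burden = records_per_relevant(0.00468)  # about 214 records per relevant study
```

Under that assumption, a reviewer would screen roughly 55 records per relevant study on the NK dataset and roughly 214 on the CR dataset.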
Algorithm Characteristics
- Design: Generator-critic loop using LLMs to iteratively refine PubMed search queries.
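One way the size signal used by the critic could be obtained is from the public NCBI E-utilities `esearch` endpoint, which reports a hit count for a PubMed query. Whether Smart Search retrieves results this way is an assumption; the helper names below are hypothetical, and the response is a canned sample rather than live output.

```python
import json
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_count_url(query: str) -> str:
    """Build an esearch URL that asks PubMed for the hit count only."""
    params = {"db": "pubmed", "term": query, "retmode": "json", "retmax": 0}
    return f"{ESEARCH_URL}?{urlencode(params)}"

def parse_count(response_text: str) -> int:
    """Extract the hit count from an esearch JSON response body."""
    return int(json.loads(response_text)["esearchresult"]["count"])

url = build_count_url("stroke AND thrombectomy")

# Canned response with a made-up count, so the sketch runs without network access.
sample_response = '{"esearchresult": {"count": "1234", "idlist": []}}'
count = parse_count(sample_response)
```

In a live setting the URL would be fetched over HTTP, and the parsed count would be compared against the desired review size.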
Data Characteristics
- Evaluation Data: Two validation datasets were used, including internally curated examples and Cochrane review-based queries.
Explainability
The reasoning behind the LLM’s search modifications is not fully transparent, which complicates reproducibility. The log of the “conversation” between the critic and the generator is recorded, but it is not yet available to end users.
Additional Notes on Compliance #
The Smart Search Tool securely shares necessary data with OpenAI to process requests. However, OpenAI does not store this data, and it is not used to train OpenAI’s models, as outlined in our Data Processing Addendum.