RAISE Compliance Guide

Responsible Use of AI in Evidence Synthesis #

This guide summarises the RAISE framework (April 2026) and explains how Nested Knowledge’s AI tools and expert-curation workflows are designed to align with it. It is intended as a practical reference for users planning to deploy NK in RAISE-compliant systematic literature reviews.


Disclaimer #

This brief guide is an incomplete and high-level overview based solely on our interpretation of the RAISE standards. Our independent review and understanding of the publicly available framework may not align with the intended messages or practices of RAISE, and our team are making good-faith interpretations based only on our existing knowledge. We have no affiliation with, endorsement from, or formal relationship with RAISE or its authors.

This material is provided for informational and methodological guidance only and does not represent an official or authoritative interpretation of RAISE. Users should read and interpret the original RAISE publications and resources directly and in full to verify requirements and ensure full compliance with the framework.


RAISE Principles for Authors #

RAISE places ultimate responsibility for review integrity on the authors, regardless of the AI tooling used:

  1. Ultimate Author Accountability: Responsibility for the integrity, accuracy, and completeness of a review lies fully with the authors.
  2. Transparency in Methods and AI Use: Document how AI tools were used across the review cycle, including prompts, configurations, and human oversight.
  3. Fit-for-Purpose Methods: Align with PRISMA, Cochrane, and similar standards, and use only fit-for-purpose AI tools.
  4. Human Oversight: Critical steps (study selection, data accuracy, interpretation) require active human judgement.
  5. Bias Awareness, Risk Management and Limitations: Mitigate biases introduced by AI tools and communicate limitations clearly in both methods and findings.

RAISE Use-Acceptability Categories #

RAISE classifies AI use into five tiers, which appear throughout the stage-by-stage guidance below.

Acceptable for Use: AI outputs may be used directly within the review workflow so long as limitations and biases are acknowledged and accounted for.
Human Verification Required: AI outputs must be checked by humans. The degree of checking varies by task but typically requires reading and amending the output in its entirety.
In-Review Validation: AI outputs may be used without full verification if and only if in-review validation finds them adequate (e.g. comparable to humans).
Exploratory and Supplementary Use: AI may develop ideas or provide a starting point; all outputs should be extensively refined by humans, or used solely as additional approaches (adding to, not replacing, existing processes).
Not Acceptable for Use: At present, AI outputs have such serious limitations that they should not be relied upon.

RAISE Principles by Review Stage — Summary #

Search: Scoping purposes or as a supplement to traditional search.
Screening: Use should be validated within the review.
Data Extraction: Outputs should be reviewed; context-specific evaluation should be present.
Critical Appraisal: Supplement to human appraisal, validated within the review. High risk of hallucination.
Qualitative Synthesis: Suggestions for coding should be validated by a human. Summarisation is not synthesis.
Quantitative Synthesis: Not acceptable for synthesising across studies via LLMs. AI-generated code must be documented.
Text Generation: Best suited for summarisation of individual elements within a study.

Mapping Nested Knowledge AI Tools to RAISE Standards #

The sections below cover each stage in turn: the relevant RAISE recommendations, the corresponding manual process, and the NK AI tool plus the expert curation needed to align with RAISE.

Search #

AI search tools (LLMs, vector matching) do not always indicate where studies were retrieved from or how comprehensive the search is, and some have been shown to return fewer than half of the studies captured by a traditional search.

RAISE recommends AI search be used for scoping or as a supplement to a traditional search to identify missed studies, and that sensitivity of the AI tool be considered.

Manual Process: Drafting full Boolean strings based on a Research Question.
NK AI Tool: Smart Search — a validated, iterative agentic workflow for search strategy development.
Expert Curation: Fully scrutinise or treat as supplemental.
RAISE Guidance: Exploratory and Supplementary Use.
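
Where an AI-assisted search is used as a supplement, RAISE's point about sensitivity can be checked directly by comparing the records it retrieves against the traditional Boolean search for the same question. The sketch below assumes two hypothetical vectors of record identifiers (e.g. PMIDs); it illustrates one way to quantify sensitivity and supplementary yield, and is not a prescribed NK or RAISE procedure.

```r
## Hypothetical record IDs retrieved by each approach (e.g. PMIDs)
traditional <- c("331", "478", "512", "603", "718", "845", "902")
ai_search   <- c("331", "512", "718", "845", "990")

## Sensitivity of the AI search relative to the traditional search:
## what share of traditionally retrieved records did the AI search also find?
overlap     <- intersect(ai_search, traditional)
sensitivity <- length(overlap) / length(traditional)

## Records the AI search added beyond the Boolean strategy (its supplementary value)
additional  <- setdiff(ai_search, traditional)

sensitivity  # 4 / 7, roughly 0.57, in this toy example
additional   # "990"
```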

Screening #

RAISE notes that prioritisation-screening stopping criteria are “largely overconfident and fail to prevent missing relevant records,” and that safer supervised approaches that maximise recall reduce time savings. LLM screening performance depends on prompt quality and example classifications.

RAISE recommends that ML and AI use in screening be validated within the review, that LLM usage for complex screening include structured examples or training, and that prioritised screening not be used without noting its limitations.

Manual Process: Screening at Abstract and Full Text stages (Dual or Single).
NK AI Tool: Smart Screener — a validated, fully traceable LLM approach for Criteria-Based Screening (AB and FT).
Expert Curation: Either 100% verify or validate as adequate.
RAISE Guidance: Need to Validate within Review.
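
In-review validation can be as simple as dual human screening of a random sample of records, then comparing the AI decisions against those human decisions. The sketch below assumes a hypothetical validation data frame (the column names are not an NK export format) and computes recall and specificity; the sample size and the thresholds judged "adequate" would be pre-specified by the review team.

```r
## In-review validation sketch: human decisions on a dual-screened sample
## versus the AI screener's decisions on the same records.
## Column names are hypothetical, not an NK export format.
validation <- data.frame(
  human = c(1, 1, 1, 0, 0, 0, 0, 1, 0, 0),  # 1 = include, 0 = exclude
  ai    = c(1, 1, 0, 0, 0, 1, 0, 1, 0, 0)
)

recall      <- with(validation, sum(ai == 1 & human == 1) / sum(human == 1))
specificity <- with(validation, sum(ai == 0 & human == 0) / sum(human == 0))

recall       # 3 of 4 human inclusions recovered = 0.75 in this toy sample
specificity  # 5 of 6 human exclusions confirmed, roughly 0.83
```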

Tagging (Data Extraction) #

RAISE considers data extraction an area where present tools can perform to an acceptable level. Rule-based algorithms have accuracy concerns due to terminology variation; ML can be more comprehensive but is dependent on the training corpus.

RAISE recommends hybrid use of rule-based and ML approaches (e.g. rule-based to identify candidate entities, then ML to filter), human review of LLM outputs for hallucinations and missing data, and context-specific evaluation when applying any AI extraction tool to a new domain (a Study Within a Review).

Manual Process: Expert extraction of all qualitative and quantitative values.
NK AI Tool: Adaptive Smart Tags — a validated, fully traceable LLM approach for qualitative and quantitative extraction (text, tables, figures).
Expert Curation: Either 100% verify or validate as adequate.
RAISE Guidance: Need to Validate within Review.
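
As one illustration of the hybrid pattern RAISE describes (rule-based candidate identification followed by a filtering step), the sketch below uses a regular expression to flag candidate sample-size mentions and a simple plausibility rule standing in for the ML filter. The text, pattern, and range are hypothetical, and any retained value would still require the human verification or in-review validation described above.

```r
## Minimal hybrid-extraction sketch: rule-based candidate identification
## followed by a lightweight filter, mirroring RAISE's hybrid recommendation.
abstracts <- c(
  "We randomised 120 patients (n = 120) to treatment or placebo.",
  "Mean follow-up was 24 months in this cohort of 58 participants."
)

## Step 1: a rule-based pass flags every "n = <number>" or "<number> patients/participants".
pattern    <- "(n\\s*=\\s*\\d+)|(\\d+\\s+(patients|participants))"
candidates <- regmatches(abstracts, gregexpr(pattern, abstracts, ignore.case = TRUE))

## Step 2: a filter (here a plausibility rule standing in for an ML classifier)
## drops candidates outside a plausible enrolment range.
extract_n <- function(x) as.integer(gsub("\\D", "", x))
filtered  <- lapply(candidates, function(cand) cand[extract_n(cand) %in% 10:100000])

filtered  # every retained value still requires human checking per RAISE
```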

Critical Appraisal & Certainty of Evidence #

Automation of critical appraisal is challenging because inter-rater reliability is low, which makes gold-standard training data hard to assemble. ML models tied to specific appraisal tools, and ensemble approaches, have shown some success; LLM prompting with detailed instructions shows initial promise. Tools that apply GRADE are still in development.

RAISE recommends LLM prompts include detailed instruction (including handling of incomplete information and connotation-laden words), that LLMs remain a supplement to human-driven checking, and that use be validated within the review. Hallucination risk is high — verify carefully.

Manual Process: [Dual or Verified] Expert appraisal using study-type-specific Risk of Bias tools.
NK AI Tool: Smart Critical Appraisal — a validated, traceable LLM completing existing Critical Appraisal tools.
Expert Curation: Either 100% verify or validate as adequate.
RAISE Guidance: Need to Validate within Review — High Risk of Hallucination.
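
RAISE's recommendation that appraisal prompts carry detailed instruction can be made concrete with a short example. The fragment below, written as an R string purely for illustration, is a hypothetical instruction covering incomplete information and connotation-laden wording; it is not an NK prompt or a RAISE-endorsed template, and any output it produced would still need the verification described above.

```r
## Hypothetical prompt fragment illustrating RAISE's "detailed instruction"
## recommendation for appraisal tasks; the wording is illustrative only.
appraisal_instruction <- paste(
  "Assess the randomisation domain using only statements quoted verbatim",
  "from the provided full text. If the information needed for a judgement is",
  "missing or incomplete, answer 'No information' rather than inferring it.",
  "Do not treat connotation-laden words (e.g. 'rigorous', 'robust') as",
  "evidence of low risk of bias. Return the supporting quotation alongside",
  "each judgement so a human appraiser can verify it."
)
cat(appraisal_instruction)
```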

General Text Generation #

Summarising information across multiple studies using AI is not recommended. Generative AI is best used for generating templates from human instructions as a starting point, or for summarising simple elements within a single study.

RAISE recommends AI not be used for interpretation of results or summarisation across studies. Best-suited current uses: summarisation of limited individual elements (e.g. interventions), plain-language summaries of human syntheses, and translation.

Manual Process: Interpret and summarise or synthesise overall results and findings.
NK AI Tool: None to date — general text generation is exploratory in RAISE and general review practice.
Expert Curation: Fully scrutinise or treat as supplemental.
RAISE Guidance: Exploratory and Supplementary Use.

Qualitative Synthesis #

RAISE draws a sharp distinction between summarisation (which LLMs can do) and synthesis (which they cannot): synthesis requires considering whether studies can validly be combined and testing the robustness of an analysis, neither of which an LLM does when summarising text.

RAISE recommends all LLM suggestions for qualitative coding be validated by a human, and that LLMs only support qualitative coding — synthesis must remain human-centred.

Manual Process: Concept-by-concept summarisation of specific qualitative elements.
NK AI Tool: Smart Insights — a highly constrained and traceable text summarisation on a concept-by-concept basis.
Expert Curation: Either 100% verify or validate as adequate.
RAISE Guidance: Need to Validate within Review.

Quantitative Synthesis #

Rule-based platforms can provide automated meta-analysis when statistics are available, including “live” updates of existing analyses. LLMs can generate analysis code, but RAISE states they are “not acceptable for use” for quantitative synthesis across studies.

RAISE recommends that LLMs not be used to synthesise results across studies, that statistical transformations be documented, that the reviewer have the analytical knowledge to review AI-generated code, and that LLM prompts for meta-analysis include example code. The quality of imported data should be evaluated before use.

Manual Process: Code-based network meta-analysis using a standard statistical package.
NK AI Tool: Quantitative Synthesis — a rule-based, open-source fork of the R meta package.
Expert Curation: Recommended, not required.
RAISE Guidance: Acceptable for Use; Document Statistical Parameters.
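
To illustrate the documentation RAISE expects around code-based synthesis, the sketch below runs a random-effects meta-analysis with the open-source R meta package (which NK's Quantitative Synthesis is described as forking) and records the statistical parameters explicitly. The effect sizes and study labels are invented for illustration, and the call does not represent NK's interface.

```r
## Minimal sketch of a documented random-effects meta-analysis using the
## R meta package; data below are invented for illustration.
library(meta)

dat <- data.frame(
  study = c("Alpha 2021", "Beta 2022", "Gamma 2023"),
  TE    = c(-0.42, -0.18, -0.30),   # standardised mean differences
  seTE  = c( 0.15,  0.12,  0.20)    # their standard errors
)

m <- metagen(
  TE = TE, seTE = seTE, studlab = study, data = dat,
  sm = "SMD",             # summary measure: documented per RAISE
  method.tau = "REML"     # heterogeneity estimator: documented per RAISE
)

summary(m)   # pooled estimate, confidence interval, tau^2, I^2
forest(m)    # forest plot for the synthesis
```

The substance of the RAISE guidance is less the specific call than that the summary measure, model, and heterogeneity estimator are recorded, and that a reviewer with the analytical knowledge to read the code actually checks it.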

In summary: NK’s Screening, Tagging, Critical Appraisal, and Qualitative Synthesis features are validated and traceable — experts should either verify for 100% accuracy or validate as adequate within the review. NK’s Quantitative Synthesis is rule-based, so curation is recommended but not required. NK’s Search and general Text Generation outputs should be treated as supplemental or fully scrutinised.


Validation Evidence #

RAISE requires that AI use at the Screening, Extraction, Critical Appraisal, and Qualitative Synthesis stages be validated within the review. Nested Knowledge maintains a continually updated catalogue of internal and independent third-party validation studies covering Smart Search, Robot Screener, Smart Screener, Core and Adaptive Smart Tags, Smart Critical Appraisal, Smart MA Extraction, and Smart Insights — including an independent ISPOR EU 2025 end-to-end evaluation that found ~85% time savings and 90%+ recall across the full AutoLit AI workflow. For validation highlights, detailed study descriptions, and source citations, see:

Validation Studies of AI Tools in Nested Knowledge

Cochrane CESAR Platform Study (March 2026). Nested Knowledge was selected, alongside Laser AI, as one of two AI tools from a pool of 48 submissions to participate in Cochrane’s CESAR platform study evaluating AI tools for evidence synthesis. The selection process was aligned with RAISE principles. Cochrane has been clear that selection is not a formal endorsement; pilot testing with Cochrane review author teams is underway, with interim analyses expected mid-2026.


Further Reading #
