Inclusion Rate Modeling at Nested Knowledge

Inclusion Rate Modeling
This blog post has been authored by Kelly Wichmann and Karl Holub from Nested Knowledge.

When constructing a literature review, not all search results are equally useful or equally likely to be included. In this post, we’ll walk you through trends in screening decisions and how our meta-analytical software, AutoLit, leverages this information to optimize your literature reviews. We’ll be retrospectively peeking into our recently completed review of Flow Diverters, a leading minimally-invasive therapy for intracranial aneurysms.

Flow Diverter Literature Review

This literature review sought to characterize the efficacy of flow diverters for treatment of aneurysms in a meta-analysis. To identify research on this topic, searches were executed across a variety of keywords, MeSH topics, and device names on PubMed. Studies with a patient population greater than 5 reporting clinical outcomes for aneurysm treatment with flow diverters were included in the review.

Summary of the Screening Process

Our search identified 1,851 total studies. 399 studies were included after screening, yielding an overall inclusion rate of 21.6%. Below is a table of various exclusion reasons:
Exclusion Reason | Number of Studies | Percentage (%)
not related | 573 | 31.0
included | 399 | 21.6
case report | 272 | 14.7
review, editorial letters | 143 | 7.7
meta-analysis | 139 | 7.5
non-human | 103 | 5.6
in silico | 73 | 3.9
arm size <5 patients | 56 | 3.0
non-english | 38 | 2.1
in vitro | 19 | 1.0
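The percentages in the table above are simply each count divided by the 1,851 studies identified. Here is a minimal sketch of that arithmetic; the counts are copied from the table, while the variable and function names are our own illustrative choices:

```python
# Screening counts copied from the table above.
TOTAL_STUDIES = 1851
counts = {
    "not related": 573,
    "included": 399,
    "case report": 272,
    "review, editorial letters": 143,
    "meta-analysis": 139,
    "non-human": 103,
    "in silico": 73,
    "arm size <5 patients": 56,
    "non-english": 38,
    "in vitro": 19,
}

def percentage(count, total):
    """Share of `total`, as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)

shares = {reason: percentage(n, TOTAL_STUDIES) for reason, n in counts.items()}
inclusion_rate = shares["included"]
listed_total = sum(counts.values())

print(f"Overall inclusion rate: {inclusion_rate}%")
print(f"Studies covered by the listed rows: {listed_total} of {TOTAL_STUDIES}")
```

Note that the listed rows sum to 1,815, so the table does not account for every one of the 1,851 studies.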
Feature Exploration

The above screening was performed by NK’s analysts; retrospectively, what characteristics of studies most reliably predict their likelihood of inclusion? Let’s visually explore the data!

Number of Authors

Studies were grouped by the number of authors on the paper, and the inclusion rate within each group was computed. In this review, inclusion rate correlated directly with author count: studies with more authors were more likely to be included.

Publication Date

Next, we view the inclusion rate of studies by month of publication, observing a higher inclusion rate among recent publications.

Number of Pages

Inclusion rate was found to be highest in the range of 4 to 12 pages. This suggests that atypical publications (namely, those with very few or very many pages) are unlikely to be included.

MeSH & RoboPICO

Below is a table of MeSH descriptors for studies in the search. The descriptors “Cerebral Angiography”, “Embolization, Therapeutic”, and “Endovascular Procedures” are all strong indicators of relevant studies, sporting inclusion rates well above the overall inclusion rate of 21.6%.
Descriptor | Number of Studies | Inclusion Rate (%)
Angiography | 11 | 45.5
Blister | 11 | 45.5
Anterior Cerebral Artery | 20 | 45.0
Self Expandable Metallic Stents | 35 | 42.9
Cerebral Angiography | 111 | 36.9
Vertebral Artery Dissection | 17 | 35.3
Embolization, Therapeutic | 547 | 34.9
Endovascular Procedures | 408 | 34.6
Ophthalmic Artery | 18 | 33.3
Postoperative Complications | 113 | 32.7
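Per-descriptor rates like those in the table above come down to a group-by: count the studies carrying each descriptor, count how many of those were included, and divide. The sketch below shows one way to do that. The record layout, the function name `inclusion_rate_by_group`, and the `min_studies` cutoff are all assumptions made for illustration, and the four toy records are invented rather than taken from the review.

```python
from collections import defaultdict

def inclusion_rate_by_group(studies, key, min_studies=1):
    """Map each group to (number of studies, inclusion rate in %).

    `key` returns the groups a study belongs to (for example, its MeSH
    descriptors). Groups seen in fewer than `min_studies` studies are dropped.
    """
    totals = defaultdict(int)
    included = defaultdict(int)
    for study in studies:
        for group in key(study):
            totals[group] += 1
            included[group] += 1 if study["included"] else 0
    return {
        group: (n, round(100 * included[group] / n, 1))
        for group, n in totals.items()
        if n >= min_studies
    }

# Invented toy records, for illustration only (not data from the review).
studies = [
    {"mesh": ["Cerebral Angiography", "Endovascular Procedures"], "included": True},
    {"mesh": ["Endovascular Procedures"], "included": False},
    {"mesh": ["Cerebral Angiography"], "included": True},
    {"mesh": ["Animals"], "included": False},
]

rates = inclusion_rate_by_group(studies, key=lambda s: s["mesh"], min_studies=2)
for descriptor, (n, rate) in sorted(rates.items(), key=lambda kv: -kv[1][1]):
    print(f"{descriptor}: {n} studies, {rate}% included")
```

A minimum-count cutoff keeps rare groups, whose rates rest on only a handful of studies, from dominating the ranking.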
Nested Knowledge uses an NER (Named Entity Recognition) model, which we named RoboPICO, to identify PICO elements in study abstracts. In this review, 26,797 PICO elements were automatically identified in 1,666 of the 1,851 studies. 196 PICO elements occurred in 15 or more studies; among these most frequent elements, the following had the highest inclusion rates:
Entity | Entity Type | Number of Studies | Inclusion Rate (%)
effectiveness | outcome | 19 | 78.9
procedural complications | outcome | 18 | 77.8
follow-up | outcome | 26 | 76.9
clinical outcome | outcome | 24 | 75.0
aneurysm size | outcome | 65 | 72.3
institutions | population | 18 | 72.2
web | intervention | 36 | 69.4
modified rankin scale | outcome | 22 | 68.2
web | population | 31 | 67.7
Unsurprisingly, common clinical measures and devices show high inclusion rates, again well above the overall rate of 21.6%.

Applications

The above study metadata and contents (along with any other useful information provided by the PubMed API) are used to train a predictive logistic model for every review done on AutoLit. We use these models to produce inclusion likelihoods, which in turn:
  • Help predict the utility of other searches you may wish to run (for example, on other databases)
  • Inform removal of low-quality searches and results
  • Optimize the order in which screeners review references
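To make the last application concrete, here is a rough sketch of how inclusion likelihoods from a logistic model could reorder a screening queue. This is not AutoLit's actual implementation: the three features (scaled author count, a 4-12 page flag, a high-inclusion MeSH flag) are hypothetical picks inspired by the exploration above, the training rows are invented, and the plain gradient-descent fit is just one simple way to train such a model.

```python
import math

def predict_proba(weights, features):
    """Inclusion likelihood from a logistic model; weights[0] is the bias."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, learning_rate=0.1, epochs=2000):
    """Fit logistic-regression weights by batch gradient descent on the log loss."""
    weights = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for features, label in zip(X, y):
            error = predict_proba(weights, features) - label
            gradient[0] += error
            for j, value in enumerate(features):
                gradient[j + 1] += error * value
        weights = [w - learning_rate * g / len(X) for w, g in zip(weights, gradient)]
    return weights

# Hypothetical features per study:
#   [author count / 10, 1 if 4-12 pages else 0, 1 if it has a high-inclusion MeSH descriptor]
# Invented toy data for illustration only, not data from the review.
screened_X = [
    [0.8, 1, 1], [0.6, 1, 1], [0.7, 1, 0], [0.2, 0, 0],
    [0.1, 0, 0], [0.3, 1, 0], [0.2, 0, 1], [0.9, 0, 1],
]
screened_y = [1, 1, 1, 0, 0, 0, 0, 1]  # 1 = included, 0 = excluded

weights = fit_logistic(screened_X, screened_y)

# Score references that have not been screened yet (identifiers are made up).
unscreened = {"ref-A": [0.1, 0, 0], "ref-B": [0.8, 1, 1], "ref-C": [0.4, 1, 0]}
likelihoods = {ref: predict_proba(weights, x) for ref, x in unscreened.items()}

# Screen the most promising references first.
screening_order = sorted(likelihoods, key=likelihoods.get, reverse=True)
print(screening_order)
```

The same likelihoods could also be averaged over all the results of a search to gauge how useful that search is likely to be.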
Figure: Heat map of term frequency

We are continuing to optimize our inclusion modeling and will update the scientific community as we create tools that may be useful for reviews across the clinical literature.
