This blog post was authored by Kelly Wichmann and Karl Holub of Nested Knowledge.
When constructing a literature review, not all search results are equally useful or equally likely to be included. In this post, we’ll walk you through trends in screening decisions and how our meta-analytical software, AutoLit, leverages this information to optimize your literature reviews. We’ll be looking back at our recently completed review of flow diverters, a leading minimally invasive therapy for intracranial aneurysms.
Flow Diverter Literature Review
This literature review sought to characterize the efficacy of flow diverters for treatment of aneurysms in a meta-analysis. To identify research on this topic, searches were executed across a variety of keywords, MeSH topics, and device names on PubMed. Studies with a patient population greater than 5 reporting clinical outcomes for aneurysm treatment with flow diverters were included in the review.
Summary of the Screening Process
Our search identified 1,851 total studies. 399 studies were included after screening, yielding an overall inclusion rate of 21.6%. Below is a table of various exclusion reasons:
| Exclusion Reason | Number of Studies | Percentage (%) |
|---|---|---|
| review, editorial letters | 143 | 7.7 |
| arm size <5 patients | 56 | 3.0 |
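The percentages in this section follow directly from the reported counts. As a quick arithmetic check in Python, using only the figures stated in this post (the variable and function names are ours, chosen for illustration):

```python
# Counts reported in this post
total_studies = 1851
included = 399
excluded_by_reason = {
    "review, editorial letters": 143,
    "arm size <5 patients": 56,
}

def pct(count, total):
    """Share of `total` as a percentage, rounded to one decimal place."""
    return round(100 * count / total, 1)

inclusion_rate = pct(included, total_studies)
exclusion_pcts = {
    reason: pct(n, total_studies) for reason, n in excluded_by_reason.items()
}

print(inclusion_rate)  # 21.6
print(exclusion_pcts)
```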
The above screening was performed by Nested Knowledge’s analysts. Looking back, which characteristics of a study most reliably predict its likelihood of inclusion? Let’s visually explore the data!
Number of Authors
Studies were grouped by the number of authors on the paper, and the inclusion rate within each group was computed. In this review, inclusion rate rose with the number of authors.
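The grouping step can be sketched in a few lines of Python. This is a minimal illustration, not AutoLit's code: the record layout, the field names `n_authors` and `included`, and the toy records are all assumptions made for the example.

```python
from collections import defaultdict

def inclusion_rate_by_group(records, key):
    """Return {group: (n_studies, inclusion_rate_percent)} grouped by `key`."""
    counts = defaultdict(lambda: [0, 0])  # group -> [n_studies, n_included]
    for rec in records:
        group = rec[key]
        counts[group][0] += 1
        counts[group][1] += int(rec["included"])
    return {
        group: (n, round(100 * inc / n, 1))
        for group, (n, inc) in sorted(counts.items())
    }

# Toy records for illustration only; these are not data from the review.
toy = [
    {"n_authors": 1, "included": False},
    {"n_authors": 1, "included": False},
    {"n_authors": 4, "included": True},
    {"n_authors": 4, "included": False},
    {"n_authors": 8, "included": True},
]
rates = inclusion_rate_by_group(toy, "n_authors")
print(rates)
```

The same helper would work for any other grouping key, such as a publication month or a page-count bin, provided each record carries that field.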
Publication Date
Next, we view the inclusion rate of studies by month of publication and observe a higher inclusion rate among more recent publications.
Number of Pages
Inclusion rate was highest for studies of 4 to 12 pages. This pattern suggests that atypical publications (namely, those with very few or very many pages) are unlikely to be included.
MeSH & RoboPICO
Below is a table of MeSH descriptors for studies in the search. The descriptors “Cerebral Angiography”, “Embolization, Therapeutic”, and “Endovascular Procedures” are all strong indicators of relevant studies, sporting inclusion rates well above the overall inclusion rate of 21.6%.
| Descriptor | Number of Studies | Inclusion Rate (%) |
|---|---|---|
| Anterior Cerebral Artery | 20 | 45.0 |
| Self Expandable Metallic Stents | 35 | 42.9 |
| Vertebral Artery Dissection | 17 | 35.3 |
Nested Knowledge uses an NER (Named Entity Recognition) model, which we named RoboPICO, to identify PICO elements in study abstracts. In this review, 26,797 PICO elements were automatically identified in 1,666 of the 1,851 studies. Of these, 196 PICO elements occurred in 15 or more studies. Among these most frequent elements, the following have the highest inclusion rates:
| Entity | Entity Type | Number of Studies | Inclusion Rate (%) |
|---|---|---|---|
| modified rankin scale | outcome | 22 | 68.2 |
Unsurprisingly, common clinical measures and devices show high inclusion rates, again well above the overall rate of 21.6%.
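The filter-then-rank step described above (keep entities found in enough studies, then sort by inclusion rate) might look like the following sketch. The input structure, the function name `top_entities`, and the toy data are hypothetical; only the threshold of 15 studies comes from this post.

```python
def top_entities(study_entities, included_ids, min_studies=15, top_n=10):
    """Rank entities found in at least `min_studies` studies by inclusion rate.

    study_entities: {study_id: set of entity strings} (hypothetical structure)
    included_ids:   set of study ids that passed screening
    """
    studies_by_entity = {}
    for study_id, entities in study_entities.items():
        for entity in entities:
            studies_by_entity.setdefault(entity, set()).add(study_id)

    ranked = []
    for entity, ids in studies_by_entity.items():
        if len(ids) >= min_studies:
            rate = 100 * len(ids & included_ids) / len(ids)
            ranked.append((entity, len(ids), round(rate, 1)))
    ranked.sort(key=lambda row: row[2], reverse=True)
    return ranked[:top_n]

# Toy data for illustration only, with a lowered threshold so it yields rows.
toy_entities = {
    1: {"aneurysm", "mrs"},
    2: {"aneurysm", "mrs"},
    3: {"aneurysm"},
    4: {"rare term"},
}
ranking = top_entities(toy_entities, included_ids={1, 2}, min_studies=2)
print(ranking)
```

A minimum-frequency threshold keeps rarely seen entities, whose inclusion rates rest on a handful of studies, out of the ranking.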
The study metadata and contents described above, along with any other useful information provided by the PubMed API, are used to train a logistic regression model for every review done on AutoLit. We use these models to produce inclusion likelihoods, which in turn:
- Help predict the utility of other searches you may wish to run (for example, on other databases)
- Inform the removal of low-quality searches and results
- Optimize the order in which screeners review references
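To make the idea of inclusion likelihoods concrete, here is a minimal sketch of a logistic model fit by gradient descent and then used to order unscreened references. It is not AutoLit's implementation: the two binary features, the toy training labels, the learning rate, and every function name are assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    """Predicted inclusion likelihood for one feature vector."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical features: [has_relevant_mesh_descriptor, is_review_article]
X_screened = [[1, 0], [1, 0], [1, 1], [0, 1], [0, 0], [0, 1]]
y_screened = [1, 1, 0, 0, 0, 0]  # 1 = included, 0 = excluded (toy labels)

w, b = fit_logistic(X_screened, y_screened)

# Order unscreened references so the likeliest inclusions are reviewed first.
unscreened = {"A": [1, 0], "B": [0, 1], "C": [1, 1]}
order = sorted(unscreened, key=lambda k: predict_proba(w, b, unscreened[k]),
               reverse=True)
print(order)
```

In practice a library implementation with regularization and a much richer feature set would be a more natural choice; the hand-rolled version here only keeps the example dependency-free.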
Figure: heat map of term frequency.
We are continuing to optimize our inclusion modeling and will update the scientific community as we create tools that may be useful for literature reviews across clinical research.