Lesson 3: Search Strategy

  • How can you turn your P, I, and O into a structured Boolean query?
  • What automated tools can help expand a search?
  • What are the key operators in search terms?
  • Introduction to MeSH and Automatic Term Mapping


  • Create a Boolean Search on PubMed for your review
    • In that search, ensure that your P, I, and O are represented
    • Use at least one truncation, one quotation, and one MeSH term
    • Adjust your search until it returns fewer than 200 records, at least three of which you think are includable

Kevin Kallmes: Hello, this is Kevin Kallmes with Nested Knowledge, here with Lesson Three in our course on How to Systematically Review the Medical Literature. Today, we’re going to be covering the Holy Grail of a systematic review, which is A Good Literature Search. Before we get started though, let’s go over what we learned last week. If you recall, we drafted the protocol for our systematic review, that is we took the research question that we created in Lesson One and the patient population, the intervention and comparator information, and the outcomes of interest, and we built them out into a full protocol on the topic of Basilar Artery Stroke. We also planned out who we’d be working with and who would take on which specific tasks. We planned our starter search, which will also be the starter point for today’s lesson, and the inclusion exclusion criteria, so the methods that we’re going to use to separate the wheat from the chaff in the screening process.


KK: And lastly, we built out our hierarchy of both qualitative and quantitative concepts that we want to tag and extract in underlying articles. But before we actually execute on that protocol, before we collaborate and complete this review, we need a good search strategy because that’s effectively the set of articles that are the starting point for our review, and it’s also the Holy Grail of our review, and I’ve mainly one reason for that. You are going to be searching on a database of 30 million or more studies and using only a few key terms and a few operators, you need to bring back with extremely high fidelity articles that answer your research question while keeping out articles that do not.


KK: So the only way to avoid screening thousands or even millions of articles is to let your search strategy do the heavy lifting, by crafting a search strategy that maximizes comprehensiveness, so finding as many includable studies as possible, while also minimizing your workload, so finding as few studies that you’re eventually going to need to exclude as possible. This also sets you up with good transparency, so another reviewer or a reader can come through and see your search strategy and how you landed on the records that you did, and I also think that it enhances the quality of your review. I generally find when I am including one of every five or 10 articles rather than one out of every hundred or thousand, that my rate of finding the right articles to include generally goes up. So if you can maximize comprehensiveness and minimize your own work, you’re also setting yourself up with a transparent and quality search strategy you can then bring forward to the screening stage.


KK: To get started, let’s go through the starter query that we built and dissect it a little bit. So if you recall, we were reviewing Basilar Artery Strokes, which we broke out into the terms Basilar Artery and Acute Ischemic Stroke. We identified interventions that we care about, so thrombectomy, which is the endovascular therapy for stroke, and thrombolysis, which is the clot-busting drug therapy that’s used for stroke. We also identified mortality and neurological outcome as the outcomes we wanted to collect in our study, but we also noticed that the modified Rankin Scale score is a physician-created scale that actually measures neurological outcomes, so it represents a synonym. But we kind of ignored the actual operators that we were using in our starter query, and so today, let’s start by going through the exact function of not only the operators that we used in this search, but all of the operators that you can use in a PubMed search.


KK: To start, I’m going to go through some basic operators. These are actually not PubMed specific, so they are likely going to be usable on any database that you’re searching on. Their functions are simple: AND is a connector that requires both of the terms of interest, OR means either of the terms of interest, and NOT means you want to find studies that do not contain your term of interest. That part is relatively straightforward. It gets more complicated when you layer in parentheses, which set the order of operations of these ANDs, ORs and NOTs; double quotes, which you put directly around your phrase of interest if you want to search for the exact phrase rather than exploding, as we’re about to go through; and then truncation, where you start a word and then put an asterisk to mean that you are okay with any ending or suffix to that word.
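
The way these operators combine can be sketched in a few lines of Python. This is only an illustration: the helper names `quote`, `any_of`, and `all_of` are hypothetical, and the starter query shown is a reconstruction from the terms named in this lesson, not necessarily the exact query used on screen.

```python
def quote(phrase: str) -> str:
    """Wrap a multi-word phrase in double quotes for exact-phrase searching."""
    return f'"{phrase}"' if " " in phrase else phrase


def any_of(terms) -> str:
    """Join synonyms with OR; parentheses keep them together as one unit."""
    return "(" + " OR ".join(terms) + ")"


def all_of(groups) -> str:
    """Join required concepts with AND."""
    return " AND ".join(groups)


# Assumed reconstruction of the starter query: two population terms,
# two intervention terms, and three outcome terms.
population = "(" + all_of([quote("basilar artery"), quote("acute ischemic stroke")]) + ")"
intervention = any_of(["thrombectomy", "thrombolysis"])
outcome = any_of(["mortality", quote("neurological outcome"), quote("modified Rankin Scale")])

starter_query = all_of([population, intervention, outcome])
print(starter_query)
```

Keeping each concept in its own parenthesized group makes it easy to see that the P, the I, and the O are each represented before anything is pasted into a database.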


KK: A PubMed specific search filter that we’re going to explore today is MeSH. MeSH is a librarian-created hierarchy of concepts that are tagged in PubMed on articles that are potentially of interest to you, and you can also use these MeSH terms, both to explode your search so to drill down on potential synonyms or nested terms of interest within your topic or non-exploded, so search at a certain level in the MeSH hierarchy. We’re also going to go through some of the filters that PubMed enables, so putting in your search query and then adding filters for something like date, for author, for the type of publication, for affiliations, all the way down to saying, “I only want to search this term in the title, or I only want to search this term in the title or the abstract.”


KK: Let’s actually take that starter query and start building it out, each time taking on one of those pieces: our Boolean operators, our MeSH and our filters. Starting with Boolean operators. When we take our starter search and we want to go toward a final search, our first task is to really examine our parentheses, our quotes and our ANDs, ORs and NOTs. So here, you can see that we required both the term Basilar Artery and Acute Ischemic Stroke, because we don’t want just any acute ischemic stroke and we don’t want just any lesion of the basilar artery, we require both. Then we ANDed that together with our phrase for our interventions of interest. Generally, you’re going to want studies to contain your P and your I and your O, so unless you’re returning effectively no articles from your search, I generally recommend connecting those sorts of phrases with ANDs, while commonly using ORs for synonyms or for terms that effectively will bring back similar articles.


KK: Once we’ve looked over those parentheses, double quotes, ANDs, ORs and NOTs, we can reduce the complexity of this and just look at our phrase and also ask ourselves, “Are any of these terms the type that might have a suffix or a different ending, and therefore that we might want to search for only the start of the word?” And I think the best way to explain truncation is by doing it, so you can actually truncate thrombectomy into thrombect* and thrombolysis into thrombol*, because they can be put in different word forms like thrombolytic. So you could search for a thrombolytic drug or for thrombolysis, and those would be synonyms. And also while you’re truncating, you might notice that in a term like Basilar Artery, what we’re really searching for is the Basilar part; artery is going to be in many papers. So rather than searching for the term basilar artery, we might wanna reduce the complexity of that down to just search for basilar. So after examining our ANDs, ORs and NOTs, our parentheses, our quotation marks and now our truncations, we have reduced the complexity of our initial starter query. Now, I think we’re probably ready to make it more complex.
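
To see what a truncated term can pick up, here is a small Python sketch that uses the standard library's `fnmatch` wildcard matching as a stand-in for truncation. It only illustrates the idea; PubMed applies its own matching rules, and the word list is an assumption chosen for the example.

```python
from fnmatch import fnmatchcase


def matches(pattern: str, word: str) -> bool:
    """Case-insensitive wildcard match, where * stands for any ending."""
    return fnmatchcase(word.lower(), pattern.lower())


# Illustrative word forms a searcher might expect in titles and abstracts.
word_forms = [
    "thrombolysis", "thrombolytic", "thrombolytics",
    "thrombectomy", "thrombectomies", "thrombus",
]

for pattern in ("thrombol*", "thrombect*"):
    hits = [word for word in word_forms if matches(pattern, word)]
    print(pattern, "->", hits)
```

Note that neither stem matches "thrombus", which is the point of choosing the stem carefully: cut too short and the term sweeps in unrelated words, cut too long and word forms are missed.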


KK: So the way that we want to expand our query is generally by putting in any synonyms that would bring back articles that might not otherwise be found, but that might be on topic. The first way to do this is to find synonyms in our own brains, on Google or in general research. So if you happen to know synonyms for stroke, for basilar arteries, for thrombectomy, for thrombolysis, it’s generally best to add those to your search as well. So I can be the expert on this one, I’ll bring back the synonyms that I happen to know. One of the synonyms for thrombectomy is embolectomy; taking out a thrombus is effectively the same as taking out an embolus. And then there are several synonyms for thrombolysis. Thrombolytic drugs are generally tissue plasminogen activators, so they’re drugs that activate plasminogen into plasmin, which breaks down the clot, and those terms are both abbreviated as well. So on top of synonyms, you need to add the acronyms. The acronym for intravenous thrombolysis is IVT, and the acronym for tissue plasminogen activator is tPA, usually written IV tPA when it is given intravenously.


KK: Then you can see a synonym for mortality might be death, that one should be very straight forward for you. And then I also searched and found that there is another synonym for neurological outcome in the stroke literature. People often conflate it with functional outcome. And then I also found that there was a commonly used acronym for Modified Rankin Scale Score, which is mRS. So our first task is complete, we have found and added synonyms and acronyms for our terms of interest.
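
The synonym and acronym step can be sketched as a lookup table that is folded into each OR group. The table below contains only the synonyms named in this lesson; the function names `or_group` and `expand` are hypothetical, and the assembled query is an illustration rather than the exact query used in the course.

```python
# Synonyms and acronyms named in the lesson, keyed by the original term.
synonyms = {
    "thrombect*": ["embolectomy"],
    "thrombol*": ['"tissue plasminogen activator"', "IVT", '"IV tPA"'],
    "mortality": ["death"],
    '"neurological outcome"': ['"functional outcome"'],
    '"modified Rankin Scale"': ["mRS"],
}


def or_group(terms) -> str:
    """Join terms with OR inside one set of parentheses."""
    return "(" + " OR ".join(terms) + ")"


def expand(terms) -> str:
    """Follow each term with its known synonyms, then OR them all together."""
    expanded = []
    for term in terms:
        expanded.append(term)
        expanded.extend(synonyms.get(term, []))
    return or_group(expanded)


intervention = expand(["thrombect*", "thrombol*"])
outcome = expand(["mortality", '"neurological outcome"', '"modified Rankin Scale"'])
query = f'(basilar AND "acute ischemic stroke") AND {intervention} AND {outcome}'
print(query)
```

Keeping the synonyms in one table also gives a record of which terms were added by hand, which helps when the search strategy has to be reported later.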


KK: However, our next task is to find synonyms more systematically. And for this, we’re going to go to MeSH’s controlled vocabulary. So as I said earlier, PubMed maintains librarians who tag underlying articles using this hierarchical controlled vocabulary, and it can be extraordinarily helpful in your search, even if you don’t know it. The way that this can happen without your knowledge is Automatic Term Mapping, Automatic Term Mapping is completed within PubMed on any generalized term that you put in. So if you just enter a word like stroke and you don’t put those double quotation marks on it, Automatic Term Mapping will assist you in your search. You are also going to want to do some MeSH specific searches if you want to actually use the controlled vocabulary, not using the Automatic Term Mapping but using your own manual research and term mapping.


KK: We’re going to examine the concept of explosion versus non-explosion, which will help you control which pieces of that controlled vocabulary you’re going to use in your search, and then we’re also going to go over some wrinkles on Automatic Term Mapping when you’re using Nested Knowledge specifically. So first, Automatic Term Mapping or ATM. PubMed will apply Automatic Term Mapping to any uncontrolled search term, so a search term that doesn’t have a filter applied, that is not truncated and that does not have double quotes around it. It will not apply Automatic Term Mapping if the term has quotation marks, truncation or one of the filters. Effectively, ATM can do some of the work for you on alternate spellings, so British and American spellings; on word forms, so singular and plural; and on other synonyms.
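
The rule for when Automatic Term Mapping applies can be written down as a small check. This is a sketch of the rule as stated in the lesson (no quotes, no truncation, no field tag), not a reimplementation of PubMed's parser; the function name `atm_eligible` is hypothetical.

```python
import re

# A trailing field tag such as [tiab], [pt] or [MeSH Terms].
FIELD_TAG = re.compile(r"\[[^\]]+\]\s*$")


def atm_eligible(term: str) -> bool:
    """Return True if a single search term would be left open to ATM."""
    term = term.strip()
    is_quoted = len(term) >= 2 and term.startswith('"') and term.endswith('"')
    is_truncated = "*" in term
    has_field_tag = bool(FIELD_TAG.search(term))
    return not (is_quoted or is_truncated or has_field_tag)


for term in ["stroke", '"acute ischemic stroke"', "thrombol*", "basilar[tiab]"]:
    status = "ATM applies" if atm_eligible(term) else "ATM is turned off"
    print(f"{term}: {status}")
```

Running a draft query's terms through a check like this is a quick way to spot which of them have been taken out of ATM's hands, so that their synonyms can be added manually.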


KK: But the main work that ATM does is actually based on that MeSH hierarchy. So it searches a table of MeSH terms, a journal translation table, and then also an index of authors and collaborators to try to find similar articles using these other methods of effectively synonym matching and of finding similar articles. The basic takeaway for Automatic Term Mapping is it’s generally the friend of the inexperienced searcher. If you just go in with your starter query, PubMed might be able to do some of the heavy lifting for you by taking your general uncontrolled term and doing Automatic Term Mapping effectively searching MeSH for you, searching for British spellings, singular plural word forms for you and allowing you to move forward with mostly the topics on top of mind.


KK: However, if you want to control the term mapping within the MeSH vocabulary, rather than trusting in the Automatic Term Mapping, you can also do MeSH specific searches by putting in your term of interest and attaching, in brackets, MH or MeSH or MeSH Terms, so [MH], [MeSH] or [MeSH Terms]. This will turn off Automatic Term Mapping and will search within the MeSH hierarchy for the term at your level, and then all the different terms that are nested below it. So let’s search for stroke in the MeSH hierarchy, and I’m actually going to go out and show you what that looks like. If we go to MeSH and we enter stroke, we can actually see the hierarchy that the medical librarians have built out for stroke, which is one of their major topics. So we just enter stroke, we could have also searched for acute stroke, we go to the MeSH topic, and it will display for us the entire tree of MeSH categories above stroke, all the way down to stroke. And then if we were to adopt the MeSH header as our term for this search, it will also automatically bring back any article that has been tagged with brain infarction, with cerebral infarction, with hemorrhagic stroke, with any of these terms of interest within the MeSH hierarchy at or below the level of stroke.


KK: Let’s hop back in. You can also search by major topic, so MeSH terms are classified into major topics and sub-headings. If you only want the higher level studies, so the studies that address major topics, then attach MeSH Major Topic, [majr], to your term, and if you only want lower level studies, so studies that address the specific sub-headers below a term like stroke, then put in MeSH Subheading, [sh], as your term alteration. And we’ve already examined stroke as a MeSH major topic.


KK: Then explosion and non-explosion. If you are going this route and doing manual MeSH term mapping, it works a lot like the exploder in our inspector search. So if you’ve used inspector search in Nested Knowledge, you know that clicking and un-clicking the asterisk in our tag search will basically change it from searching for, in this case, things tagged with procedural outcomes or below, versus things that are tagged very specifically with the tag procedural outcomes. So when you’re looking at the MeSH hierarchy, consider explosion to mean extending downward in the hierarchy, whereas non-explosion means searching at the exact level in the hierarchy. So if you’re using exploded, which is the default, you’re just going to use MH or MeSH or MeSH Terms as your field limiter, and if not exploded, you wanna use the specific field limiter [MeSH:NoExp], so no explosion on your MeSH terms.
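
Explosion can be pictured as walking down a tree. The sketch below uses a toy hierarchy built from the headings mentioned in this lesson; it is an assumption for illustration and not a copy of the real MeSH tree. The function names `explode` and `mesh_term` are hypothetical, while the `[MeSH Terms]` and `[MeSH:NoExp]` tags are PubMed's own.

```python
# Toy hierarchy for illustration only; the real MeSH tree is much larger.
toy_tree = {
    "Stroke": ["Brain Infarction", "Hemorrhagic Stroke"],
    "Brain Infarction": ["Cerebral Infarction"],
    "Cerebral Infarction": [],
    "Hemorrhagic Stroke": [],
}


def explode(heading: str, tree: dict) -> list:
    """Return the heading plus every heading nested below it."""
    found = [heading]
    for child in tree.get(heading, []):
        found.extend(explode(child, tree))
    return found


def mesh_term(heading: str, exploded: bool = True) -> str:
    """Format a heading as an exploded (default) or non-exploded MeSH search term."""
    tag = "[MeSH Terms]" if exploded else "[MeSH:NoExp]"
    return f'"{heading}"{tag}'


print(mesh_term("Stroke"), "covers", explode("Stroke", toy_tree))
print(mesh_term("Stroke", exploded=False), "covers", ["Stroke"])
```

The exploded form reaches every heading below the starting point, while the non-exploded form stays at the exact level, which is the trade-off between coverage and control described above.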


KK: And then a quick how to on actually using MeSH. You can effectively use MeSH without interacting with it at all, putting in general terms and letting Automatic Term Mapping do the work. That said, that might bring back much more than you want if it’s exploding and finding a lot of studies related to terms lower in the hierarchy, or it might not bring back all that you want because your general term might have no MeSH term related to it, and therefore have no explosion or Automatic Term Mapping based on synonyms. That leaves you option two of replacing your general terms with exploded MeSH terms, to ensure that you find all MeSH synonyms and to restrict your general search, so to keep your search from finding general words that are related to or close to what you’re searching, or to make sure you’re finding all actual MeSH synonyms in the hierarchy, you can use manual MeSH searching. And then option three, you can also drill down on major topics, on sub-headings, or use non-exploded searches if you’re trying to limit the extent or coverage of your search.


KK: Now, the wrinkles for Automatic Term Mapping in Nested Knowledge. We use the PubMed API, or Application Programming Interface, which is a way to get data from the database and download it directly into your Nest, but the PubMed API is still using old PubMed. The big difference between old and new PubMed is that the Automatic Term Mapping of old PubMed is not fully updated. So if you’re searching general terms on Nested Knowledge, you should expect to get fewer Automatic Term Mapping results than if you’re searching directly on PubMed. If that’s a concern to you, you will want to use new PubMed and then import NBIB files, and for that, I direct you to our documentation, which has a full outline on how to import NBIB files. In effect, if you wanna get the exact Automatic Term Mapping that you get by searching in PubMed, make sure that you’re using NBIB import rather than the direct search on Nested Knowledge.


KK: Okay, with that caveat aside, let’s go back to our MeSH terms. So if we wanted to take our Query which we’ve now built out with synonyms that we knew ourselves, how would we make this a manual MeSH search, so not depending on Automatic Term Mapping, but putting in our terms of interest, we could search for the MeSH heading, stroke, and I also checked, and there is actually a MeSH heading for mortality, so we could modify our mortality search term by searching for only that MeSH header. Then, now that we’ve examined Boolean operators, we have added manual synonyms and acronyms, we have examined Automatic Term Mapping as well as manual walking of the MeSH hierarchy, including things like explosion, non-explosion and the relevance of Automatic Term Mapping and MeSH to Nested Knowledge, let’s go through additional filters that you can add to your search. So these are going to be terms that don’t just modify your content-related topics, you’re actually going to be adding these as new terms appended to the end of your search, usually with an AND. And some of the major ones that are used in searches are publication date, so limiting your search to only from, say, 2015 to now.


KK: There are also author-limited searches, if you wanted to find articles by a specific author, as well as filters for publication type, affiliations and journal titles. And then if you want to search a word within the title, or within only the title and abstract, you can also use filters for that. There are a whole bunch of other filters and I’m putting a link at the bottom of your page, so you can see all of the filters that you can use in PubMed, but my personal recommendation is to focus on three. Publication date filters, which will help limit you to relevant recent evidence. Publication type filters, which can throw out articles that are generally on your topic, but not the right type of study, like an animal or in vivo study, or an in vitro or petri dish study. And then title and abstract filters, which can help you limit where you want the terms to be searched. If we’re gonna add filters to our recently built-out MeSH enhanced search, some that we might recommend would be searching some of our terms only within the title and abstract, so if we wanted to search basilar only within the title or abstract, we could append the field filter [tiab] to it. If we wanted to search only for publications from 2015 to now, we would search from 2015 date of publication to 3000 date of publication, using the [dp] tag, which is the usual way in PubMed to indicate up to the present. And then we could also limit by publication type, so if we wanted only clinical trials, we could put Clinical Trial with the field limiter [pt] for publication type.
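
Appending filters to a finished content query can be sketched as follows. The function names `title_abstract` and `with_filters` are hypothetical; the `[tiab]`, `[dp]` and `[pt]` tags are PubMed field tags, and the date range is written in the `start:end[dp]` form with 3000 as the open-ended upper bound described above.

```python
def title_abstract(term: str) -> str:
    """Restrict a term to the title and abstract fields."""
    return f"{term}[tiab]"


def with_filters(query: str, start_year=None, publication_type=None) -> str:
    """AND optional date and publication type filters onto a content query."""
    parts = [f"({query})"]
    if start_year is not None:
        # 3000 stands in for "up to the present".
        parts.append(f"{start_year}:3000[dp]")
    if publication_type:
        parts.append(f'"{publication_type}"[pt]')
    return " AND ".join(parts)


content = f'{title_abstract("basilar")} AND "Stroke"[MeSH Terms]'
print(with_filters(content, start_year=2015, publication_type="Clinical Trial"))
```

Because the filters are ANDed on at the end, each one can be dropped or changed without touching the content part of the query, which makes it easier to see how much each filter is narrowing the results.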


KK: Okay, so we have finished out a query and you can examine it compared to the starter query, which effectively was just two terms for our population, two terms for our interventions and then three for our outcomes. And let’s look at the results. So our starter query brought us back 105 results, which is not bad. That’s generally about five hours if you’re doing quality controlled screening, and out of those, three studies were includable in our Nest, and you’re gonna have to trust me on that part, I completed the screening myself. The final query that we just constructed together, so using MeSH headings, using title abstract limiters, adding a date filter, adding the clinical trial filter, and then building out synonyms and truncation and acronyms, brought back 12 studies, which is about half an hour of work in the screening process, and found the same three includable studies. This is a relatively small review, so this might not be representative of your work. Generally, I find that a more typical query will bring you down from several thousand results to 200 to 500 in your finalized query, and then you’re generally going to be including somewhere between six and 20 of those articles in a typical final systematic review search.
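
The comparison between the two searches comes down to simple arithmetic on the figures quoted above (105 records and about five hours versus 12 records and about half an hour, with three includable studies each). A short sketch, with hypothetical function names:

```python
def inclusion_rate(included: int, retrieved: int) -> float:
    """Share of retrieved records that end up included."""
    return included / retrieved


def minutes_per_record(hours: float, retrieved: int) -> float:
    """Average screening time per retrieved record, in minutes."""
    return hours * 60 / retrieved


# (records retrieved, records included, estimated screening hours)
searches = {"starter": (105, 3, 5.0), "final": (12, 3, 0.5)}

for name, (retrieved, included, hours) in searches.items():
    rate = inclusion_rate(included, retrieved)
    pace = minutes_per_record(hours, retrieved)
    print(f"{name}: {rate:.1%} included, about {pace:.1f} minutes per record")
```

The time per record is roughly the same in both cases, so the saving comes almost entirely from the final query returning fewer records to exclude: the inclusion rate rises from about 3% to 25% while the same three studies are found.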


KK: There are a couple of warnings that I wanna add as well. First, remember if you truncate or use double quotation marks, or if you use any of the field limiters that we’ve gone over today, you are not going to benefit from Automatic Term Mapping. It’s turned off on any of the specific terms that are truncated, quoted or field limited. Secondly, as you recall, Nested Knowledge is using the API to PubMed. So old PubMed’s Automatic Term Mapping is used. If you want to get new PubMed’s Automatic Term Mapping, search directly on PubMed and use an NBIB import.


KK: Next, as you recall, the basic syntax and operators of Boolean search are cross database. Everything that I went over regarding MeSH and field limiters is PubMed specific and would need to be changed if you’re going to search on multiple databases. And the last one, which I haven’t mentioned so far because I built out the queries for you guys, is that you definitely wanna debug your search code. By that, I mean run your search on PubMed, see what the results are, and if they’re incredibly off-base, go back and examine, line by line, what you’ve put into your query to find where the errors might be. Only move forward with the query once you’ve confirmed that, number one, it doesn’t have any basic errors, and number two, it’s bringing back things that are on topic for your search.
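
Some basic errors can be caught before the query is ever run. The sketch below checks for unbalanced parentheses, unpaired double quotes, and curly quotes pasted in from a word processor, and then builds a request URL for NCBI's E-utilities ESearch endpoint so the result count can be checked. The function names are hypothetical and the checks are a small assumed subset of what can go wrong; the URL is only constructed here, not fetched.

```python
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def lint_query(query: str) -> list:
    """Return a list of basic syntax problems found in a Boolean query."""
    problems = []
    depth = 0
    for ch in query:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                problems.append("closing parenthesis without an opening one")
                depth = 0
    if depth > 0:
        problems.append(f"{depth} unclosed parenthesis(es)")
    if query.count('"') % 2:
        problems.append("odd number of double quotes")
    if any(ch in query for ch in "\u201c\u201d"):
        problems.append("curly quotes found; use straight quotes")
    return problems


def esearch_url(query: str) -> str:
    """Build an ESearch URL that asks only for the result count (retmax=0)."""
    params = {"db": "pubmed", "term": query, "retmode": "json", "retmax": 0}
    return ESEARCH + "?" + urlencode(params)


draft = '(basilar AND "acute ischemic stroke" AND (thrombect* OR thrombol*)'
print(lint_query(draft))
print(esearch_url('basilar AND "acute ischemic stroke"'))
```

A check like this only catches mechanical mistakes. Whether the results are on topic still has to be judged by reading what the search brings back.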


KK: Okay, with those warnings out of the way, let’s summarize what we learned today. To go from your starter query to your final query, to achieve the Holy Grail of systematic reviewing, you should start out by entering your P, I and O from your research question. Make sure you scrutinize how you’re using your Boolean operators, and add in any truncations or direct quotation marks that you think should be used to limit down your search. Find synonyms and acronyms yourself, but you can also use MeSH to do so, using either Automatic Term Mapping or manual walking of the MeSH hierarchy.


KK: You can also add filters such as date or publication type or title or title abstract or author filters. And then lastly, you should be debugging your code based on the results it brings back and based on finding errors in a line-by-line review of your query. If you do so, then you should come back with a search that has fewer bugs, that maximizes includable articles while limiting the number you’re going to have to exclude, and that takes advantage of the hard work that the MeSH librarians have already put into building out hierarchies to help you find synonyms. And lastly, you get greater control over the articles that are coming back from your search. So the enlightened searcher takes their P, I and O, uses Boolean operators, uses MeSH, uses all the tricks in their bag to gain control over the number and type of the results that are coming back from their search.


KK: With that said, I have a couple of resources. You can see these at the bottom as well, on how to search specifically using Nested Knowledge, how to use PubMed specific tools like search filters in Nested Knowledge, when new PubMed will be available, and then also PubMed’s full filter list. And with that, I think we have covered the Holy Grail of your systematic review, and I look forward to next week going over the entire study life cycle in a Nested Knowledge review. Thanks so much and see you next time.
