Lesson 5: Interpretation and Write-up

  • Scrutinizing Research Questions and Search Strategies
  • Topics to cover in Methods and Results
  • Odds Ratios, confidence intervals, and metrics of statistical significance
  • How to draft the perfect (easy) Discussion Section
  • How to read Forest Plots
  • Basilar Artery Strokes: What did we find? Is minimally invasive therapy effective at saving lives and brain matter?


  • Start a draft of the manuscript for your review
    • Introduction: One sentence identifying the area of interest, one sentence explaining your Research Question
    • Methods: Identify your Search Strategy, Inclusion/Exclusion criteria, and P, I, and O
    • Results: Preliminarily outline the sections where you will report your outcomes
    • Discussion: Preliminarily outline the sections where you will discuss the implications of your findings

Speaker 1: Hello, this is Kevin Kallmes with Nested Knowledge, bringing you lesson five in our course on how to systematically review the medical literature. Today, we’re going to go over how to analyze and interpret your results and write them up as a publishable systematic review and meta-analysis. If you recall, after building out our research question, our study design, and our finalized search strategy, last lesson we finally got to see how a study flows through the review process. First, it is screened, and if it’s included, it’s also tagged for qualitative concepts of interest. It’s then extracted to reflect the quantitative data of the underlying study.


S1: And that entire study life cycle is reflected in a PRISMA diagram, which you include in your publishable review to show that flow, from the databases that you searched or the studies that you added as expert additions, all the way through to what you included and extracted data from. Then we also went over Study Inspector, where you can, in the Nested Knowledge software, review that entire process for a specific study and also edit any part of a review, including using bulk actions to change the content of multiple studies at once. But once you’ve actually completed that process of searching, screening, tagging, extracting, and then reviewing and quality controlling your review, what is left for you? Well, in effect, it’s analyzing your data and writing up your findings, and so let’s hop right in.


S1: As we’re starting, the first thing that you’re going to want to think about as you’re writing up your methods is what the methods section is for. The methods section is for allowing an independent researcher to replicate your study. So, your systematic review should be completed and written up in such a way that an unknown researcher reading it could recapitulate exactly what you did and arrive at your same findings. So as we’re scrutinizing methods, let’s imagine that we’re actually a different researcher reading our own review and seeing if we can replicate it. The first thing we’d have to do is ask ourselves, what is their research question? And if you recall, we structured our research question very carefully with P, I, and O. So: for patients with basilar artery stroke, how do thrombectomy and thrombolysis compare with respect to mortality and neurological outcome?


S1: I think it’s about as concrete and well spelled out as we can get. So the next thing to examine is where they searched and what concepts they searched for on that database. And for this, if you recall, we searched PubMed only, so we are only dealing with PubMed records. On PubMed, we had a Boolean search that captured basilar artery and acute ischemic stroke. We also appended some further terms that helped us restrict by study type and built out further synonyms for thrombectomy and for thrombolysis. Also, when you write up a publication, make sure that you’re including your exact search string so that someone could replicate it if they were completing your review again.
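
As a rough illustration of the Boolean structure described above, here is a minimal Python sketch that ORs together the synonyms for each concept and ANDs the concepts together. The helper names (`or_group`, `build_query`) and the synonym lists are assumptions made for this example; they are not the published search string, which you should always report verbatim.

```python
def or_group(terms):
    """Join the synonyms for one concept with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(concept_groups):
    """AND together one OR-group per concept."""
    return " AND ".join(or_group(group) for group in concept_groups)


# Hypothetical synonym lists, for illustration only (not the published search).
concepts = [
    ["basilar artery", "basilar"],
    ["acute ischemic stroke", "stroke"],
    ["thrombectomy", "endovascular therapy", "thrombolysis"],
]

query = build_query(concepts)
print(query)
```

Keeping the search as data like this makes it easy to paste the exact string into your Methods section, but the string actually run against the database is the one that belongs in the paper.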


S1: You should also reflect things outside of the P, I, and O that you use to select studies. So include any inclusion criteria, like study design, that will help your reader and any further researcher distinguish which studies are in fact eligible. In our case, we included randomized controlled trials or prospective studies as well as registries, as long as they directly compare thrombectomy to thrombolysis. And if you recall, we also required things like that they be published after 2015 because of changes in standard of care. Then, you want to make sure that we can tell what the interventions being compared are, and what outcomes they’re being compared with respect to.


S1: And last week we went over our entire hierarchy of all the tags and data elements that we were extracting, but in brief review, it is thrombectomy compared to thrombolysis, where the primary outcome is modified Rankin Scale score at 90 days. But we also collected mortality as well as a few other data points, including things like symptomatic intracranial hemorrhage, or brain bleed caused by the procedure. And then lastly, what statistical methods did they use to perform their analysis? If you’re using the Nested Knowledge software, you’re in luck: we have already written up a full meta-analytical methods description in our documentation, which you can find and read, and which you can also cite in your publication.


S1: This describes the methods used in our network meta-analytical statistics that are calculated on quantitative syntheses. If you’re not using Nested Knowledge, then make sure that you’re writing up the exact data that’s being analyzed, the analytical methods, how estimates are made, and even the software that you’re using to actually complete that analysis. Okay. So, when you’re scrutinizing methods for replicability, you’re generally going to be examining the key questions or research questions, search strategy, study design, interventions, outcomes, and statistical methods, and those translate into sections of your paper. So, your research question: I wish it were a section in your publication, but unfortunately, it’s usually going to be reflected in the purpose statement of your abstract.


S1: It is also often the last sentence in your introduction. So here you can see we are effectively saying, “We performed a systematic review and meta-analysis to better understand whether EVT or thrombectomy is beneficial for PCLVO,” which is the term we had to use for basilar artery strokes. Then, your search strategy should be fully spelled out, including the database that you searched and the methods you used to search it (in our case, using the API on AutoLit), and then the exact search terms that you used, so someone can recapitulate the search based on it. We also spell out our inclusion criteria and our screening methods. So not only do we note, you know, “must be published after 2014,” we also note every exclusion reason that we could have applied and, where appropriate, include a justification for that exclusion criterion.


S1: Then we also describe the methods that we used. We actually used dual screening in this nest, so two independent reviewers rated each study as include or exclude, and then a third independently adjudicated any disagreements. So make sure that not only the inclusion-exclusion criteria, but also the inclusion-exclusion methods that you used, are reflected when you’re writing up your study. Then, for data collection, the key things to include are, number one, what actual patient characteristics and outcomes were collected, and whether they were dichotomized. For mRS, or modified Rankin Scale score, we actually collected it at two dichotomized levels: mRS score of 0 to 3 and mRS score of 0 to 2 were both collected in our case. That should be spelled out in your methods, so no one’s surprised when they get to the results that mRS is reported in two different ways.
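
The dual-screening workflow described above (two independent reviewers, with a third adjudicating disagreements) can be sketched as a small decision rule. This is only an illustration under assumed names; `resolve_screening` and the "include"/"exclude" labels are hypothetical and do not represent the Nested Knowledge software’s internals.

```python
VALID_DECISIONS = {"include", "exclude"}


def resolve_screening(reviewer_a, reviewer_b, adjudicator=None):
    """Return the final screening decision for one study.

    Agreement between the two reviewers is final; a disagreement
    requires a decision from a third, independent adjudicator.
    """
    for decision in (reviewer_a, reviewer_b):
        if decision not in VALID_DECISIONS:
            raise ValueError(f"unknown decision: {decision!r}")
    if reviewer_a == reviewer_b:
        return reviewer_a
    if adjudicator not in VALID_DECISIONS:
        raise ValueError("reviewers disagree; adjudication is required")
    return adjudicator


print(resolve_screening("include", "include"))             # agreement
print(resolve_screening("include", "exclude", "exclude"))  # adjudicated
```

Whatever tool you use, the Methods section should state this rule in words so a reader knows how disagreements were settled.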


S1: Then you should also be reporting how these were collected. In our case, these were collected by one author and then independently reviewed by two.


S1: Lastly, your statistical analysis section should build out… And first of all, it can cite to the Nested Knowledge documentation, but it should also go through the methods in a way where someone who is using both your citations, including the Nested Knowledge documentation and your publication could replicate exactly what you did down to how you calculated your 95% confidence intervals on any pooled estimates that you made. So, moving forward from methods to results. When you are writing up your results, what you’re generally going to report is, for every intervention, the effect size estimates… So, if it’s a continuous variable, you’re going to report the mean or the median. For dichotomous variables, you’re going to report the number of patients that achieved that.


S1: So with mortality, it would be the number of patients with mortality out of the total patient population, and so on. And then for each of those, you’re going to also include a confidence interval. For means and medians, it’ll be a confidence interval on that mean or median. For your dichotomous variables, it will be the range of expected rates of that given outcome. Then, between interventions, you’re going to report an odds ratio, which represents the odds of an outcome with one intervention compared to the odds with a different intervention. It comes with its own confidence interval, representing the range of confidence on that odds ratio, not on each underlying intervention, as well as a p-value for that comparison. Between the odds ratio showing how one intervention compared to another, the confidence interval showing the range on that, and the p-value showing whether or not that finding is significant, you should be able to tell how intervention one compared to intervention two, and intervention three, and so on.


S1: As you’re doing this, we’re now delving into statistically significant findings. I’m sure you’ve heard these warnings before, but I’ll give them again: when you are reporting results, be careful of noting significant results that have no meaningful difference. If the effect sizes are very similar between two interventions with respect to an outcome, make sure that you’re not letting the fact that the difference happens to be significant mislead you into thinking that it is meaningful. Then, I know that 0.05 is an arbitrary line drawn in the sand with respect to p-values when you’re determining significance. But do be careful of language like “trending towards significance,” or generally framing any result that wasn’t significant as close to significant or as significant.
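
To make the odds ratio, its confidence interval, and its p-value concrete, here is a minimal sketch for a single two-arm comparison. It assumes a simple Wald interval on the log odds ratio and a two-sided z-test, which is one common textbook approach and not necessarily the method used in the Nested Knowledge analysis. The function name and the patient counts are hypothetical.

```python
import math


def odds_ratio(events_a, total_a, events_b, total_b, z=1.96):
    """Odds ratio of arm A vs arm B with a Wald 95% CI and two-sided p-value.

    Assumes no cell of the 2x2 table is zero.
    """
    a, b = events_a, total_a - events_a  # arm A: events, non-events
    c, d = events_b, total_b - events_b  # arm B: events, non-events
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_low = math.exp(log_or - z * se)
    ci_high = math.exp(log_or + z * se)
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(log_or / se) / math.sqrt(2))
    return math.exp(log_or), ci_low, ci_high, p_value


# Hypothetical counts: 20 of 100 patients vs 10 of 100 patients.
# The point estimate is (20 * 90) / (80 * 10) = 2.25.
or_, low, high, p = odds_ratio(20, 100, 10, 100)
print(f"OR {or_:.2f} (95% CI {low:.2f} to {high:.2f}), p = {p:.3f}")
```

In this made-up example the point estimate is well above one, yet the interval still includes one, which is exactly the situation where language like "trending towards significance" is tempting and should be avoided.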


S1: Generally, it’s better to err on the side of caution and give people the data and let them interpret it rather than trying to inculcate your interpretation into their brain that a nearly significant result is in fact so. Then, in a systematic review even more than in a primary study, because you have this heterogeneous population drawn from multiple studies that might have had different standards of care, you should be very careful of attributing all of the differences that you see to the interventions of interest. So population differences, differences in procedural practice, randomness… There are many factors that impact these outcomes. So make sure that you are not attributing the differences in any given outcome directly to the intervention without taking those things into account. So in a systematic review more than elsewhere, make sure that you are not asserting that correlation means causation.


S1: And then lastly, there are some inherent limits of systematic review even beyond this that you should take into account. One of the most important is that you are collecting population-level and not patient-level data. In a primary study, they are able to complete analyses to see if any given background characteristic is leading directly to the findings. You aren’t necessarily going to be able to complete these same regressions. You’re not going to be able to drill down on things like whether the background characteristics definitively correlate with the outcomes, because you are only working with what you find in published articles, which will not be every single patient’s full dataset. Instead, it will be the total patient population, and then the population for each arm that achieved the given outcome or that had a certain characteristic, but no correlation among characteristics or outcomes.


S1: And that’s a misconception I see all over the place. So be very careful to note that when you are creating a statistical analysis on top of a systematic review dataset, what you are really doing is combining heterogeneous populations and the differences in treatment effect that are found among those populations and not tracking individual patient data down and seeing their performance on an individual level. So, I know that those are likely repeats for you, but do be careful of all of those ways in which you can overstate or misrepresent the results of your study. But with all caveats out of the way, I’m sure that you guys are excited to learn about the findings in the basilar artery stroke review. And along the way, learn how to present and interpret results.


S1: So obviously, your methods outlined the way that you searched, but in the results you should actually outline what you found in your search. In effect, you should outline the database that you searched and how many studies came back from it, and also how many you found by other means. You should also include your PRISMA chart, often as figure number one in your review. We went over how to interpret this PRISMA chart last lesson, but you should also make sure that you’re including a caption that explains the flow of studies to your readers. I also generally recommend outlining, in the text of the article as well, the key exclusion criteria that led to the exclusion of the studies that you did not end up reporting on. Then, for studies that you did include, you should also note the total patient population. So in our case, we found three studies with 1,248 patients.


S1: And then you should also break that out by each arm. So in our case, it was 860 thrombectomy patients and 388 thrombolysis patients. Study design details, or any other biases or study characteristics that you wanna report, should also be noted. In our case, the major one is that we found three studies; one was a registry and two were randomized controlled trials. And then you can move on to actually starting to report your outcomes, usually primary first and then secondary. So again, briefly, report your search results and your screening findings, especially your PRISMA chart and exclusion reasons. Then, for included studies, report the general patient characteristics and general study characteristics before jumping into outcomes. But I can now start going over those.
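
Before reporting arm-level populations, it is worth checking that they add up to the stated total. The sketch below uses the figures quoted above (860 thrombectomy and 388 thrombolysis patients, 1,248 in total); the helper name `arm_shares` is an assumption for illustration.

```python
def arm_shares(arm_counts):
    """Return the total population and each arm's share of it."""
    total = sum(arm_counts.values())
    shares = {name: n / total for name, n in arm_counts.items()}
    return total, shares


# Arm sizes as quoted in the lesson.
arms = {"thrombectomy": 860, "thrombolysis": 388}
total, shares = arm_shares(arms)

# The arms should account for every included patient.
assert total == 1248

for name, share in shares.items():
    print(f"{name}: {arms[name]} of {total} patients ({share:.1%})")
```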


S1: Our primary outcome for this review was modified Rankin Scale score, but it was actually dichotomized at zero to three. So, zero to three meaning good neurological outcome or 4 to 6 meaning bad neurological outcome. We found that in 39.9% of thrombectomy patients, good neurological outcome was achieved compared to just under 25% for thrombolysis patients. But we also included a 95% confidence interval for this estimate, and you can see that it’s actually quite wide. So for thrombectomy, the confidence interval is from 30.6 to 50.1, and for thrombolysis, it’s from 9.6 to 49.8. So, what we’re really seeing is not just an effect size estimate but also the range on that estimate. It is extremely wide in this case, probably driven by the small number of patients included in this three study review.
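
To see why a small sample produces a wide interval on a rate, here is a minimal sketch of a confidence interval for a single proportion. It assumes the Wilson score interval, a standard choice for one group; the intervals quoted above come from a pooled meta-analytic model, so this will not reproduce them. The counts are hypothetical.

```python
import math


def wilson_interval(events, n, z=1.96):
    """Wilson score 95% confidence interval for a proportion events / n."""
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Same observed rate (40%), hypothetical sample sizes of 10 and 100.
for events, n in [(4, 10), (40, 100)]:
    low, high = wilson_interval(events, n)
    print(f"{events}/{n}: {low:.1%} to {high:.1%}")
```

The observed rate is identical in both rows, but the interval for the smaller sample is much wider, which is the pattern to look for when a review includes only a few studies.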


S1: We also should examine this as a forest plot, so let’s go through how to read a forest plot and generate odds ratios. A forest plot usually displays an odds ratio between two therapies. Here, you can see it’s between endovascular therapy and standard medical therapy, with the odds ratio depicted along the bottom and the individual studies displayed as each row. These estimates show the size of the patient population, as well as the odds ratio estimate and the confidence interval on that estimate, for every underlying study. So for the Langezaal study that we examined in the previous lesson, the odds ratio for endovascular therapy patients achieving good neurological outcome compared to thrombolysis patients was below two and had a confidence interval crossing one. This means that we are not confident, based on that study alone, that good neurological outcome is more likely in thrombectomy patients than thrombolysis patients, despite that large difference in the basic estimate that we saw up front.


S1: If you combine all those studies, you also get a total estimate that gives you your total odds ratio. The odds ratio that we found was 2.14, with a range from 0.95 to 4.8, as represented by the last row in this forest plot showing the combined estimate. You can see that this estimate also crosses one. And for odds ratios generally, if your confidence interval crosses one, that means you cannot be confident that one therapy outperformed the other. Even though the evidence does seem to suggest that thrombectomy is outperforming standard medical therapy, we cannot necessarily say so if the confidence interval on our odds ratio crosses one.
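
The combined row of a forest plot can be approximated with inverse-variance pooling on the log odds ratio scale. The sketch below assumes a fixed-effect model and recovers each study’s standard error from its reported 95% interval. That is a simplification chosen for illustration; the method actually used for the review is the one described in the Nested Knowledge documentation, and it may differ (for example, by using random effects). The study values are hypothetical.

```python
import math


def pool_fixed_effect(studies, z=1.96):
    """Pool (odds_ratio, ci_low, ci_high) tuples by inverse-variance weighting."""
    weights, log_ors = [], []
    for or_, ci_low, ci_high in studies:
        # A 95% CI spans 2 * z standard errors on the log scale.
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
        weights.append(1 / se**2)
        log_ors.append(math.log(or_))
    pooled_log = sum(w * x for w, x in zip(weights, log_ors)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - z * pooled_se),
        math.exp(pooled_log + z * pooled_se),
    )


# Hypothetical per-study rows, each with an interval that touches or crosses one.
rows = [(2.0, 1.0, 4.0), (2.0, 1.0, 4.0), (1.5, 0.6, 3.75)]
pooled_or, pooled_low, pooled_high = pool_fixed_effect(rows)
print(f"Pooled OR {pooled_or:.2f} (95% CI {pooled_low:.2f} to {pooled_high:.2f})")
```

Pooling narrows the interval relative to any single study, which is why a combined estimate can be more decisive than each row on its own.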


S1: Okay. So for secondary outcomes, we can look at mortality. Unfortunately, mortality was experienced by 42% of thrombectomy patients and nearly 53% of thrombolysis patients. Even though there was about a 10 percentage point difference in that raw estimate of effect size, you can see that the 95% confidence intervals for this definitely overlap. And if we open up a forest plot, we can see that even though the odds ratio for endovascular therapy compared to standard medical therapy with respect to mortality was about 0.6, the confidence interval yet again crosses one, meaning that we are not necessarily confident that mortality is lowered by thrombectomy when compared to thrombolysis. And then lastly, I wanted to include symptomatic intracranial hemorrhage for two reasons.


S1: First of all, as an interesting comparison between an interventional and a drug therapy, where this is one of the outcomes that can happen when you undertake an endovascular intervention. We actually found that this was present in nearly 7% of patients treated with thrombectomy, but in 0.7% of patients treated with thrombolysis. And you can see the confidence intervals as well; this is a much larger difference. So if we open up that forest plot and look at the odds ratio, we can see that the odds of symptomatic intracranial hemorrhage are 7.5 to 1 for thrombectomy compared to thrombolysis. And for this, even though some of the underlying studies had odds ratios crossing one, for the study as a whole, the odds were actually significant, so…


S1: The way that we would state this in our paper would be that, endovascular therapy was significantly more likely to lead to SICH when compared against thrombolysis. And we should give the exact odds ratio, 95% confidence interval, and then also a p-value on that calculation as well. Those are the major results of our basilar artery review. You can read the freely available full text of that review in the link that I’m including here. But I also wanted to zoom out and at the higher level, go through those result sections and how you should be writing them up. So if you recall, you wanna actually lead off with your search results where your main goal is to communicate which records came from which databases. And then go straight into screening where you’re reflecting after you’ve pulled back those studies from databases, which ones were included and when excluded, what reasons were applied.


S1: And generally, I recommend including your PRISMA chart as a figure, and then also discussing the main exclusion criteria that you used in the text as well. Then, you can jump into your study characteristics, where study type and design are the center of your analysis. So generally, you’re gonna wanna start off by, in our case, saying that we found two randomized controlled trials and one registry, and then go into any potentially biasing procedural practices or differences in the methods of those studies before jumping into the actual patient characteristics. Then, and only then, should you start discussing your primary results. And generally, I like to try to keep a parity between my primary results and the first paragraph of my discussion, which we’re about to go into, so that there’s a mental parity for my reader where they remember the result and they’re ready for it to be discussed.


S1: And then the same with secondary results, match them against your second discussion paragraph. And for both primary and secondary results, make sure that you include your effect size estimates with confidence intervals, and then for all comparisons include a p-value with 95% confidence intervals for all intervention one against intervention two comparisons in the results. So that when you get to the discussion, you don’t need to report any further data or analysis. Then, we can proceed to the discussion section. I find this is the hardest part of a review to write, so I’m gonna lead off with some errors that I see commonly, and then we can go into a good structure.


S1: The first error that I see very commonly is restating results; your effect size estimates, your comparisons, your p-values, your confidence intervals, all of those belong in the results section, they should not be reported in the discussion at all. You should be referring up to your result section for each of those pieces of information. And secondly, the second most common error I find in discussion section is that the authors think that they need to educate their audience on the entire field rather than contextualizing their results. So again, the goal of your discussion section is to contextualize your results, not to teach your audience about your field. So generally, avoid going off on any tangents that are not directly related back to the findings in your study.


S1: With that, let’s go through the structure for a perfect discussion section. And by perfect, by the way, I mean it is minimally difficult for you to get started and then put together your actual findings, and it includes nothing that your reader won’t want to read about your study. So, first paragraph, as I mentioned before: discuss your primary findings in context. And by in context, I mean tell us what they change about current interpretations with respect to your therapies and outcomes. In our field, thrombectomy is generally preferred over thrombolysis for most types of stroke, but for basilar, it’s relatively uncertain. So in our case, we wanna say that our findings show that there might be a trend toward improved neurological outcome, but the jury is really still out.


S1: And we should not repeat our results. We should not repeat our p-values, we should not repeat our confidence intervals; we should just state that we found an effect size difference in mRS 0 to 2, but that it was not significant, and so the community should still remain uncertain between these two interventions. In paragraph two, you should go through your secondary findings, though I do wanna note that your reader’s attention span is shorter than yours as the writer, so make sure that you’re not going into any outcomes in the discussion that won’t be of interest to your reader.


S1: In paragraph three, I think you can go into an overview of previous research that’s similar to yours or that’s on topic. This is in order to provide that context of how the current interpretations came about. This can mean discussing some of the studies you included in your review, so it could be, “After the BASICS trial, physicians thought that perhaps thrombectomy could show improvements over thrombolysis,” and so on. Or it could be previous reviews: “After previous reviews, the community interpretation was such and such.” You can also go through guidelines or standard of care in your field for the therapies or for the disease states of interest.


S1: Lastly, you should include interpretations of how your findings have now updated our understandings. So how is your work better or confirmatory or different than each of these earlier publications that you’re discussing. And then lastly, you should be including the limitations of your methods, and since this is a systematic review, there are a couple of limitations that you can include in most cases. And those are… So methodological limitations, you’re generally going to have the limitation of not being able to access patient-level data because you are only able to get the published versions of these studies, so that will limit your ability to find any correlation between patient characteristics and outcomes, which in turn makes us less certain of the impact of our interventions.


S1: Then, you should also note that in almost all cases in a systematic review, you are combining heterogeneous populations, so populations that are drawn from different cities, possibly with different inclusion criteria. It’s generally going to be a finding of your study that your limitations include a heterogeneous population. Then, because of those methodological limitations, it is generally difficult to attribute differences in outcomes to differences in treatment, because there may have been differences in the patient population or differences in the treatment between studies. So, make sure to note that in almost any limitations section in a systematic review.


S1: Then, though in our study we found two RCTs, we also did include a registry, so we should make sure to note the study quality and any limitations that may come from including sub-optimal study types in our review. And then, though calls for further research are probably seen too often in the scientific literature, generally, it’s actually a good thing to include in your limitations section for a systematic review. That’s because while systematic reviews are often seen as the highest level of evidence in a field, they often are also a call to action. A systematic review’s findings, because of these methodological limitations and these uncertainties, are often the jumping-off point for new RCTs that are based on the updated findings that you have in your review.


S1: So a call for further research where… Especially in our case, where we found differences in effect sizes that weren’t significant, we think a properly powered randomized controlled trial may be the distinguishing factor in actually determining whether or not thrombectomy should be adopted as standard of care for basilar artery strokes. And after that, you should also include a brief conclusion statement, it can be as little as a sentence, just listing your major findings that can include both the result and the interpretation. So in your conclusions, it is perfectly appropriate to mash up your main result with your main conclusion, but keep it at that. And once you’ve drafted that, you actually are done drafting your article. Though we are not done with this lesson, I have a bonus and an update.


S1: Well, for you it was just a 10-minute wait between the results that we presented earlier and right now. But we did call for further research in our review, and actually two further randomized controlled trials have been published between when we published our basilar artery review and this lesson. Very recently, the results from the ATTENTION randomized controlled trial, led by Raul Nogueira, and the BAOCHE randomized controlled trial, led by Tudor Jovin, were presented. And these RCTs actually alter the trends that we found in our publication. If we go back, we can see the results with respect to modified Rankin Scale score: the odds of mRS 0-3 were 2.14 to 1, but it wasn’t quite significant, so we couldn’t discuss it as a significant increase in the rate of good neurological outcome.


S1: And similarly for mortality, the odds were 0.6 to 1, but that total confidence interval crossed one as well, so we couldn’t necessarily assert that endovascular therapy improved patient survival. So, with those understood and with the results of the ATTENTION and BAOCHE trials in, we can actually see how these randomized controlled trials might alter the results of our publication. This is an updated forest plot, including the ATTENTION trial, Nogueira et al., and the BAOCHE trial, Jovin et al. They found very similar results with respect to mRS 0 to 2, which actually pushed our odds ratio up to 2.38 to 1. And now, with this additional sample size, our results are significant. So with two further randomized controlled trials, the results of our systematic review are actually pushed from on the edge into significantly better.


S1: So based on the updated five-study review, we can actually say that endovascular therapy outperforms thrombolysis with respect to neurological outcome. A major finding, and you are among the first to hear it. And then secondly, with mortality, the Nogueira findings and the Jovin findings also pushed this odds ratio, which is still around 0.56 to 1, into significance. Now the 95% confidence interval is from 0.38 to 0.81, and so we can actually say that endovascular therapy outperformed thrombolysis with respect to mortality in basilar artery strokes.
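
The rule applied throughout this lesson, that an odds ratio is only treated as significant when its 95% confidence interval excludes one, can be written as a one-line check. The sketch below applies it to two intervals quoted in the lesson: 0.95 to 4.8 for the original neurological outcome estimate, and 0.38 to 0.81 for the updated mortality estimate. The function name is an assumption for illustration.

```python
def excludes_null(ci_low, ci_high, null_value=1.0):
    """True when the interval lies entirely on one side of the null value."""
    return not (ci_low <= null_value <= ci_high)


# Intervals as quoted in the lesson.
intervals = {
    "mRS 0-3, original three-study review": (0.95, 4.8),
    "mortality, updated five-study review": (0.38, 0.81),
}

for label, (low, high) in intervals.items():
    verdict = "significant" if excludes_null(low, high) else "not significant"
    print(f"{label}: 95% CI {low} to {high} -> {verdict}")
```

An interval entirely below one favors the first therapy for a harmful outcome such as mortality, while an interval entirely above one favors it for a beneficial outcome such as good neurological function.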


S1: And since those are the two major data points that a physician would look at in selecting a therapy, even with the higher rates of hemorrhage in the intervention group as compared to thrombolysis, many physicians will look at results like this and say, “If I’m going to improve my patient’s neurological outcome or their chance of survival, then I think that I should be reaching for my thrombectomy catheter rather than my thrombolytic syringe.” So, with updated results from the ATTENTION and the BAOCHE RCTs, you are among the first to hear that thrombectomy may in fact soon be adopted as the standard of care for basilar artery strokes. I will also note that for SICH, the findings continued, though they were slightly mitigated: with the Nogueira and Jovin results in, the odds are 6.79 to 1, and those results are still significant for the rate of SICH.


S1: With that bonus information, let’s get to our summary. When you’re writing up your findings, you want your methods to reflect your research question, your search strategy, your inclusion criteria, the methods that you used in collecting data, and the methods that you used in analyzing your findings. Your results should show your primary and your secondary findings, which will set up your discussion to contextualize the actual numerical data that you’re reporting in your results section. You should be careful about using the word “significance” around each of your findings, and always include p-values and confidence intervals on any comparison. And then, we also hammered home that for systematic reviews, there are limitations of comparing populations rather than patient-level data, and of combining populations. The review strategy of combining different publications will generally lead to certain limitations that you should include in your discussion section.


S1: And with that, I am including the link to the open access publication on the basilar artery review as well as to our methods. So, thanks so much for your attention, and I’ll see you in Lesson Six.