AI Prompt: Evaluating the Reliability of Scientific Evidence

1. Access the study you want to analyse: Find and download the scientific study you want to evaluate. The full text gives the AI far more to work with than an abstract alone, so get the complete paper where possible.

2. Upload the study: Open ChatGPT or any AI tool of your choice and upload the study (if the platform supports file uploads). This lets the AI analyse the content in detail. (If you would rather script this workflow, a sketch follows the score interpretation guide below.)

3. Use the evaluation prompt: Copy and paste the prompt below, adjusting it with the study’s details. This will guide the AI to score the study across critical criteria like sample size, reproducibility, and funding source.

4. Review the AI’s analysis and score: The AI will provide a score out of 100, reflecting the study’s overall reliability. Use the score interpretation guide:

90-100: Highly reliable

70-89: Generally trustworthy

50-69: Mixed reliability

Below 50: Questionable reliability

This structured approach enables a comprehensive evaluation of research quality, helping you judge whether the study’s findings are reliable.
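If you would rather script these four steps than work in a chat interface, the minimal sketch below shows one way to do it. It assumes the third-party openai and pypdf Python packages; the model name and file path are placeholders, and EVALUATION_PROMPT stands in for the full prompt given in the next section.

```python
# A minimal sketch of steps 1-4, assuming the third-party "openai" and
# "pypdf" packages. Model name and file path are placeholders.
from openai import OpenAI
from pypdf import PdfReader

EVALUATION_PROMPT = "..."  # paste the full evaluation prompt from the next section

def extract_text(pdf_path: str) -> str:
    """Pull the raw text out of the downloaded study's PDF (step 1)."""
    reader = PdfReader(pdf_path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)

def evaluate_study(pdf_path: str) -> str:
    """Send the study text plus the evaluation prompt to the model (steps 2-3)."""
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4o",  # example choice; any capable model works
        messages=[
            {"role": "system", "content": EVALUATION_PROMPT},
            {"role": "user", "content": extract_text(pdf_path)},
        ],
    )
    return response.choices[0].message.content

def interpret(score: int) -> str:
    """Map the returned score onto the interpretation guide (step 4)."""
    if score >= 90:
        return "Highly reliable"
    if score >= 70:
        return "Generally trustworthy"
    if score >= 50:
        return "Mixed reliability"
    return "Questionable reliability"
```

The interpret helper simply encodes the score bands from the guide above, so the scripted workflow and the manual one read the final score the same way.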

Prompt:
“Please analyze the following scientific study, [Insert study name or website here], and assess its reliability. Provide a reliability score out of 100, with an explanation based on the factors below. The analysis should be thorough and informed by a comparison with broader research in the field. Read the entire paper carefully and consult the wider literature for context. Key Factors to Analyze:
  1. Study Design and Hierarchy of Evidence: Identify where the study fits in the pyramid of evidence. Studies higher in the hierarchy, such as meta-analyses and randomized controlled trials (RCTs), are considered more reliable. Lower-tier studies, like case reports and animal trials, provide weaker evidence. RCTs are the “gold standard” because they control for bias by randomizing participants and often use blinding to reduce researcher and participant influence. Meta-analyses combine data from multiple RCTs, giving them even greater strength, but only if the individual studies are of high quality. Be cautious with observational studies like cohort or case-control studies; these can show associations but generally cannot establish causation.
  2. Sample Size and Statistical Power: Evaluate the study’s sample size. Larger samples increase statistical power and reduce the likelihood that random chance explains the results. Small samples produce imprecise estimates and often cannot be generalized to larger populations. Check whether the study reports confidence intervals and p-values. Confidence intervals should be narrow, indicating precision, and a p-value below 0.05 suggests the result would be unlikely if there were no real effect. However, p-values alone do not establish that the effect matters.
  3. Effect Size: Analyze the effect size, which is crucial in determining how meaningful the results are. Even if a result is statistically significant, a small effect size may have little practical importance. For instance, if a drug reduces cholesterol by only 1%, that may have no meaningful clinical impact.
  4. Consistency with Existing Research: Compare the findings of this study with the broader body of research. Are the results consistent with what other high-quality studies have found? A study that contradicts established knowledge may not be reliable unless it presents strong evidence and is later replicated. Meta-analyses are especially helpful here, as they consolidate findings from multiple studies to provide a more robust conclusion.
  5. Bias and Confounding Variables: Examine the study for bias (e.g., selection bias, confirmation bias) and confounding factors. Were participants randomized properly, and were important factors like age, gender, and health status controlled? Studies that fail to control for these variables may produce misleading results. Non-randomized trials and observational studies are more prone to bias because the assignment of participants may not be neutral, potentially skewing results.
  6. Reproducibility: Has the study been replicated by other researchers? Reliable studies should yield similar results when replicated by independent teams. The more a study’s results are repeated across different contexts, the more trustworthy the findings become.
  7. Funding and Conflicts of Interest: Consider the funding source. Was the study sponsored by an entity that stands to benefit from the results, such as a pharmaceutical company? While this doesn’t automatically discredit a study, it raises the importance of scrutinizing the methodology and results more carefully.
  8. Internal vs. External Validity: Check for internal validity (how well the study was conducted) and external validity (whether the results can be applied to a broader population). High internal validity means the study was carefully designed to minimize errors, while high external validity means the findings are generalizable. For instance, if a study on a new drug was done only on young, healthy men, the results may not apply to older adults or women.
  9. Correlation vs. Causation: Make sure the study distinguishes between correlation and causation. Just because two things happen together (e.g., eating ice cream and more drownings in the summer) doesn’t mean one caused the other. Establishing causation requires well-designed experiments, often using RCTs, to rule out confounding factors.
  10. Context and Publication: Ensure the study is published in a reputable, peer-reviewed journal. Peer review adds a layer of scrutiny to the research, ensuring that other experts in the field have evaluated the methodology and results before publication.
Final Assessment: After analyzing all of these factors, provide a reliability score out of 100 for the study, explaining your reasoning in detail. Remember to encourage readers to read the full study themselves for a more complete understanding, as AI analysis is only one step in evaluating scientific evidence.”
As a stricter rule of thumb when weighing the final score:
  • A score over 95 is a good sign, but you should still read the study yourself to be sure.
  • A score under 80 is questionable and warrants digging further.
  • Anything under 70 points to a poor study.
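Two of the prompt’s factors are easy to sanity-check yourself whenever a paper reports group means, standard deviations, and sample sizes: statistical power and precision (factor 2) and effect size (factor 3). The sketch below assumes the third-party scipy and statsmodels packages and uses invented example numbers, not data from any real study.

```python
# A sketch of the factor 2 and factor 3 checks, assuming the scipy and
# statsmodels packages. The group summaries are invented example numbers.
import math
from scipy import stats
from statsmodels.stats.power import TTestIndPower

# Invented summary statistics: treatment vs. control.
m1, s1, n1 = 5.2, 1.9, 40   # mean, SD, sample size (treatment)
m2, s2, n2 = 4.4, 2.1, 38   # mean, SD, sample size (control)

# Factor 3: Cohen's d from the pooled standard deviation.
pooled_sd = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
d = (m1 - m2) / pooled_sd

# Factor 2: Welch's t-test p-value, computable from summary statistics alone.
t_stat, p_value = stats.ttest_ind_from_stats(m1, s1, n1, m2, s2, n2,
                                             equal_var=False)

# 95% confidence interval for the difference in means (Welch-Satterthwaite df).
se = math.sqrt(s1**2 / n1 + s2**2 / n2)
df = se**4 / ((s1**2 / n1)**2 / (n1 - 1) + (s2**2 / n2)**2 / (n2 - 1))
t_crit = stats.t.ppf(0.975, df)
ci = ((m1 - m2) - t_crit * se, (m1 - m2) + t_crit * se)

# Sample size per group needed to detect this effect with 80% power.
n_needed = TTestIndPower().solve_power(effect_size=d, alpha=0.05, power=0.8)

print(f"Cohen's d = {d:.2f}, p = {p_value:.3f}, "
      f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"Sample size per group for 80% power: {math.ceil(n_needed)}")
```

If the sample size required for 80% power is far larger than the study’s actual groups, or the confidence interval is wide, treat that as a red flag under factor 2 even when the p-value clears 0.05.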