How to spot reliable scientific evidence

Dinner: 500 grams of beef, 6 eggs, broccoli and kale, a cup of Greek yoghurt, milk on the side.

mum: “You’re eating way too much protein!”
me: “How do you know how much protein is too much?” 

She used her phone to find a handful of studies that supported her point and showed me. 

me: “Well, if it’s on the internet, it must be true.”
mum: “They are all from scientific journals. You can’t argue with that!”

Surprised, I looked deeper into this and saw that she had a point: reputable journals were coming to contradictory conclusions.

You can find a study to support almost anything, whether it’s in the news, on social media, or during a debate. This can be a problem because people tend to believe things more when they’re backed by studies, even if the study might not be reliable.

This made me curious. From my Biomedical Science degree, I knew that studies could come to different conclusions based on all sorts of factors, but I thought they’d always mention those variables. Turns out, it’s not that simple—especially if you don’t have a science background.

I decided to look into this for myself and pull together what I learned. Studies can feel like they’re written in another language, especially if you’re not a specialist, so I’m sharing my notes for anyone who’s curious about how to check if a study is legit.

And guess what? I’ll show you how you can use AI to help you! 

But more on that in a bit.

Science works in probabilities, not certainties.
It’s always about how likely something is.

One thing that became really clear: science never deals in absolutes. It’s not about “proving” things in black and white, but about figuring out what’s most likely true based on the data. Good studies try to strip out bias (basically any assumptions or unfairness) and just look at the facts.

But even then, nothing is 100% certain—it’s all about probabilities.

I did the research. Here’s what I learned to watch out for in scientific studies.

1. What’s the Hypothesis?

Scientists start with something called the null hypothesis. It’s like saying, “Nothing’s going on here,” until the evidence says otherwise. For example, if you want to test whether eating carrots helps you see better at night, the null hypothesis would be: carrots don’t make a difference. They then run tests to see if the data lets them reject that starting assumption.

Same thing with testing whether too much protein is bad for your kidneys. The null hypothesis would be: protein doesn’t affect kidney health. Then they collect data to see whether they can reject it. This is the starting point every study needs to move beyond.
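If you like to tinker, here’s a toy Python simulation of that idea (all numbers invented): assume the null hypothesis that a coin is fair, then check how surprising 60 heads in 100 flips would be if that were true.

```python
import random

random.seed(42)

# Null hypothesis: the coin is fair (heads probability = 0.5).
# Observation: 60 heads in 100 flips. How surprising is that under the null?
observed_heads = 60
n_flips = 100
n_sims = 10_000

# Simulate 10,000 experiments assuming the null hypothesis is true,
# and count how often we see a result at least as extreme as ours.
extreme = 0
for _ in range(n_sims):
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    if heads >= observed_heads:
        extreme += 1

p_value = extreme / n_sims
print(f"p-value: {p_value:.3f}")
```

A small p-value means the data would be unusual if “nothing’s going on” were true, and that’s the kind of evidence scientists use to reject the null hypothesis.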

2. The Hierarchy of Scientific Evidence

Not all studies are created equal. Imagine a pyramid. At the top, you’ve got meta-analyses and systematic reviews—these look at tons of studies and combine the results to figure out what all the evidence points to. These are the gold standard.

Below that are randomized controlled trials (RCTs). These are solid because people are randomly assigned to groups to test treatments, which helps cut out bias. Then you’ve got cohort studies (they follow groups of people over time) and case-control studies (comparing people with and without a condition). Further down, you get animal trials and in vitro studies (done on cells in the lab).

At the bottom are things like case reports and opinion papers. These don’t have as much weight because they usually involve just a few people or no proper testing at all.

The higher up something is on the pyramid, the more reliable it is because it’s based on stronger, more rigorous research.

3. Is it Correlation or Causation?

Correlation just means two things happen together. It doesn’t mean one causes the other. Like, more ice cream is eaten in the summer, and more people drown in the summer. But obviously, ice cream isn’t causing the drownings.

To prove cause and effect, scientists need stronger evidence, like a controlled experiment that tests if one thing directly leads to the other, while ruling out other factors. Without this, a simple link isn’t enough to show cause and effect.

They might have two groups: one group eats ice cream and the other doesn’t, while other factors like swimming and water activities are kept the same. If the ice cream group actually drowns more, and the researchers rule out other explanations, then they could say ice cream might cause drowning. Without that kind of clear, controlled test, they can’t.

4. How Big Was the Study?

This one is huge. Bigger studies give more reliable results by reducing sampling error (the chance that the results were just random). If only 10 people took part, the results could just be luck. But if 10,000 people were involved, the results are more likely to reflect reality. Sample size isn’t the only thing that matters, though. How the study was designed is just as important.

Why Having Enough Participants Matters
Power is a fancy way of talking about how good a study is at finding a real effect if there is one. A study with more participants has more power, meaning it’s better at spotting real results.

Think of it like this: if we’re studying whether eating oats makes marathon runners faster, a study with only a few runners might miss a real effect, but with a bigger group we’re more likely to find out if oats really help. In studies with low power (not enough participants), real differences can get missed. That’s why having enough people in a study is so important: it makes the results more reliable.
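Here’s a rough Python sketch of power (effect size and noise are invented): simulate the same study many times with a genuinely real effect, and count how often a simple z-test at the 5% level actually detects it.

```python
import random
from statistics import mean

random.seed(0)

def detects_effect(n, true_diff=2.0, sd=5.0):
    """One simulated study: two groups of n people, with a REAL difference of true_diff."""
    control = [random.gauss(0, sd) for _ in range(n)]
    treated = [random.gauss(true_diff, sd) for _ in range(n)]
    se = (2 * sd**2 / n) ** 0.5               # standard error of the difference
    z = (mean(treated) - mean(control)) / se
    return abs(z) > 1.96                      # two-sided test at the 5% level

def power(n, sims=2000):
    """Fraction of simulated studies that detect the (real) effect."""
    return sum(detects_effect(n) for _ in range(sims)) / sims

p_small = power(10)    # underpowered: often misses the real effect
p_large = power(100)   # well powered: usually finds it
print(f"power with 10 per group:  {p_small:.2f}")
print(f"power with 100 per group: {p_large:.2f}")
```

The effect is real in every simulated study, yet the small version misses it most of the time. That’s exactly what “low power” means.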


5. How Big Was the Effect?

Even if a study finds a result that’s “statistically significant,” it doesn’t always mean it matters in real life. Statistical significance just means the result probably didn’t happen by chance; it doesn’t tell you whether the effect is big enough to care about. For example, if a new drug lowers blood pressure by just 1 mmHg, that might be statistically significant, but it’s not going to make much of a difference to anyone’s health.

So, you always need to ask: is the effect big enough to actually be useful?
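To see how “significant” and “useful” can drift apart, here’s a quick Python sketch (hypothetical numbers): the same tiny 1 mmHg drop isn’t significant in a modest trial but becomes highly significant in a huge one, while being equally meaningless in both.

```python
from statistics import NormalDist

def two_sided_p(diff, sd, n):
    """Approximate p-value from a z-test comparing two groups of size n each."""
    se = (2 * sd**2 / n) ** 0.5      # standard error of the difference in means
    z = abs(diff) / se
    return 2 * (1 - NormalDist().cdf(z))

# Same tiny effect (1 mmHg drop, sd 15 mmHg) at two sample sizes.
p_small_trial = two_sided_p(1.0, 15.0, 100)
p_huge_trial = two_sided_p(1.0, 15.0, 10_000)
print(f"n=100 per group:    p = {p_small_trial:.3f}")
print(f"n=10,000 per group: p = {p_huge_trial:.6f}")
```

With enough participants, almost any difference becomes statistically significant. That’s why you have to look at the size of the effect, not just the p-value.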

6. Watch Out for Bias and Confounding Factors

Bias happens when researchers’ choices or mistakes mess with the results, even if they didn’t mean for it to happen. Confounding factors are things the researchers didn’t account for that could be skewing the results.

For example, if a study doesn’t control for the fact that some patients are older or sicker than others, the results might not be accurate. You want to check if the study accounted for things like age, gender, or health status to avoid these issues.


7. Can the Study Be Repeated?

A good study should be reproducible, which means if other scientists do the same experiment, they should get the same results. This shows that the findings are reliable and not just a fluke. If a study hasn’t been repeated by other researchers, or if no one else has gotten the same results, it might not be trustworthy.

8. Who Were the Participants?

Who was involved in the study? If a study only includes young, healthy men, the results might not apply to women, older people, or anyone with health issues.

For example, if a medicine works well in young men, that doesn’t mean it will work the same way for older women or people with heart disease. Studies need to include different types of people to make sure the results apply to everyone, not just a specific group.

9. Does It Match What We Already Know?

If a study comes up with results that totally contradict what’s already known, it’s not necessarily wrong—but it does need more scrutiny. Big claims need big evidence. If something sounds too good to be true, it probably needs more testing.

10. Who Paid for the Study?

Always check who funded the research. If a drug company paid for a study showing their drug works great, it could mean there’s a conflict of interest. It doesn’t make the study wrong, but it’s something to consider.


This might sound like a lot to remember at once. That’s why I started using AI to help me sort through it all. I’ve built an AI prompt that makes it easy to evaluate any study.

The AI will give a score out of 100 based on how reliable the study is and explain why it gave that score. 

AI won’t replace your judgement, but it can definitely help make sense of the details faster.

Want My AI Prompt?

Drop your email, and I’ll send you my custom AI prompt that’ll help you evaluate any study in no time. With just a few clicks, you’ll know if the evidence is solid or if it needs a closer look.