Critical Evaluation
Develop skills to assess AI outputs and improve them iteratively
Why Critical Evaluation Matters
AI is confident, articulate, and often wrong. It presents information with the same assured tone whether it's accurately citing research or completely fabricating facts. This makes your evaluation skills essential.
The Hallucination Problem
AI systems can “hallucinate”—generate plausible-sounding but false information, including fake citations, invented statistics, and incorrect facts. The only defence is your critical evaluation.
The VERIFY Framework
Use this framework to evaluate any AI output:
Validate Facts
Check specific claims, statistics, and citations. AI frequently invents plausible-sounding sources.
If AI cites a study, search for it. If it quotes a statistic, verify the original source.
Examine Logic
Does the reasoning hold together? Are conclusions supported by the evidence presented?
Look for leaps in logic, unsupported assumptions, or circular reasoning.
Review for Bias
What perspectives might be missing? Is the output balanced or one-sided?
Consider whose voices are absent and what alternative viewpoints exist.
Inspect Relevance
Does this actually fit your context? Generic content may not suit your specific situation.
A lesson plan for 'Year 5' might not match your Year 5 in terms of curriculum or abilities.
Filter for Quality
Is this good enough? Just because AI produced it doesn't mean it meets your standards.
Would you be proud to use this? Would a colleague be impressed or disappointed?
Your Expertise
Apply your professional judgment. Does this align with what you know to be true and effective?
Your experience with students trumps AI's theoretical knowledge.
Common AI Errors to Watch For
Fabricated Citations
AI often invents academic references that don't exist, including realistic-sounding author names, journal titles, and publication years.
Defence: Always verify citations through Google Scholar or library databases. If you can't find it, it probably doesn't exist.
Outdated Information
AI knowledge has cutoff dates. Information about recent policies, current events, or updated research may be wrong or missing entirely.
Defence: For time-sensitive information, verify with current official sources.
Confident Incorrectness
AI presents wrong information with the same confidence as correct information. It doesn't flag its own uncertainty.
Defence: Never trust confidence as an indicator of accuracy. Verify independently.
Oversimplification
Complex topics may be reduced to overly simple explanations that miss important nuance or exceptions.
Defence: Ask yourself: “Is this too neat? What complexity is being glossed over?”
Context Mismatch
AI may apply approaches from different educational systems, grade levels, or cultural contexts without recognising the mismatch.
Defence: Always check whether suggestions fit your specific context and students.
The Iterative Improvement Process
Rarely is the first AI output ready to use. Here's how to refine systematically:
Evaluate the First Output
Read critically using the VERIFY framework. Note what's good, what's wrong, and what's missing.
Provide Specific Feedback
Tell AI exactly what to change. “Make it better” doesn't work. “The reading level is too high—rewrite for Year 3 vocabulary” does.
Request Targeted Revisions
Ask for specific improvements: “Remove the first two questions and add more application-based questions instead.”
Verify Changes
Check that revisions actually addressed your concerns and didn't introduce new problems.
Apply Final Judgment
Decide when it's good enough—or when you need to start over or finish manually.
Quick Evaluation Questions
Ask these questions for every AI output:
- Can I verify every specific claim, statistic, and citation?
- Does the reasoning hold together from evidence to conclusion?
- Whose perspectives or viewpoints are missing?
- Does this fit my specific context, curriculum, and students?
- Does it meet the standard I would hold my own work to?
- Does it align with my professional judgment and experience?
When to Reject AI Output Entirely
Sometimes the right answer is to start over or do it yourself. Reject and restart when:
- The output contains verifiable factual errors
- The approach fundamentally misunderstands your context
- Revising it would take longer than doing the work yourself
- The tone or style is irreparably wrong
- It would require rewriting more than half of the output
- Your professional judgment says it's not right
Key Takeaways
- AI confidence is not an indicator of accuracy—verify everything
- Use the VERIFY framework systematically: Validate, Examine, Review, Inspect, Filter, Your expertise
- Iterate through specific, targeted feedback—not vague requests
- Know when to reject and start over instead of endless revision
- Your professional judgment is the final authority
Interactive Lab
VERIFY Evaluator
Practice critical evaluation of AI outputs using the VERIFY framework
SQL Injection Attack Summary for Cybersecurity 101

What is SQL Injection?
SQL injection is when attackers insert malicious SQL code into user inputs to manipulate databases. It's been the #1 web vulnerability since 2005.

How It Works:
1. Attacker finds a login form or search box
2. Instead of normal input, enters SQL code like: ' OR '1'='1
3. The database executes the malicious query
4. Attacker gains unauthorized access or extracts data

Example Vulnerable Code:
query = "SELECT * FROM users WHERE username='" + input + "'"

Prevention Methods:
- Use parameterized queries (100% effective)
- Input validation
- Web Application Firewalls
- Disable detailed error messages

Practice This: Try using sqlmap on DVWA or HackTheBox - it's how real pentesters find SQLi.

Fun Fact: The 2017 Equifax breach was caused by SQL injection!
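To make the contrast between the vulnerable concatenation and the parameterized query concrete while evaluating this output, here is a minimal sketch using Python's standard-library sqlite3 module and a hypothetical in-memory users table (the table, rows, and input string are illustrative assumptions, not part of the sample above):

```python
import sqlite3

# Illustrative in-memory database with one hypothetical row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

malicious_input = "' OR '1'='1"

# Vulnerable: string concatenation lets the input rewrite the query logic,
# turning the WHERE clause into a condition that is always true.
unsafe = "SELECT * FROM users WHERE username='" + malicious_input + "'"
print(conn.execute(unsafe).fetchall())  # returns every row

# Parameterized: the placeholder keeps the input as data, never as SQL,
# so the literal string "' OR '1'='1" matches no username.
safe = "SELECT * FROM users WHERE username=?"
print(conn.execute(safe, (malicious_input,)).fetchall())  # returns no rows
```

Running a sketch like this is itself a VERIFY step: it validates the sample's central claim about parameterized queries rather than taking the AI's word for it.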
Validate Facts
What facts in this output need to be verified? Are there any claims that might be incorrect?
Check specific claims, statistics, and dates against reliable sources like OWASP or CVE databases.