Search with GRACE: Artificial Intelligence Prompts for Clinically Related Queries

By Daniel Fitzgerald, MD; Roya Z. Caloia, DO, MPH; and Jesse M. Pines, MD, MBA, MSCE | on October 9, 2025 | 3 Comments
Features

Emergency physicians, along with physician assistants and nurse practitioners, are increasingly using generative artificial intelligence (GenAI) tools for fast access to medical information. These tools range from general-purpose large language models (LLMs), such as Google’s Gemini, Anthropic’s Claude, and OpenAI’s ChatGPT, to domain-specific platforms such as OpenEvidence, which is grounded in evidence-based, peer-reviewed literature.1,2


General-purpose AI tools tend to excel at providing background information, patient-friendly discharge instructions, and support for research brainstorming. In contrast, clinician-specific tools like OpenEvidence are better suited for patient-specific queries that require accuracy, citations, and guideline concordance, all grounded in attributable sources.

Getting the AI Prompt Right

Although the quality and nature of outputs can vary greatly between the different LLMs that power AI tools, the value and safety of AI are not intrinsic to the models themselves. Rather, they depend on the quality and nature of the interaction between the AI and the human. This interaction is the “prompt”: the blank space where a human poses a question to the AI.
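In code, the prompt is simply the text handed to the model. As a minimal sketch (assuming the OpenAI Python SDK; the model name and the clinical question are illustrative placeholders, not recommendations), ground rules and the clinician’s question travel to the model together as the prompt:

```python
# Minimal sketch: the "prompt" is just the text passed to the model.
# Assumes the OpenAI Python SDK; model and question are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; choose the most capable model available to you
    messages=[
        # Ground rules travel with the query as the system message
        {"role": "system",
         "content": "Source published, verifiable literature. Provide citations. Do not invent sources."},
        # The clinician's actual question
        {"role": "user",
         "content": "Summarize current evidence on management of acute opioid overdose in the emergency department."},
    ],
)

print(response.choices[0].message.content)
```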

Poorly constructed prompts can lead to irrelevant, nonsensical, and nonactionable answers. They can also increase the likelihood of factually incorrect information, termed “hallucinations.”3 Hallucinations are more common when incomplete information is given to the AI, leaving the AI to fill in gaps. AI tools can create fake scientific facts claimed to be backed by evidence or even generate references that do not exist. AI models can also inherit or even amplify biases present in the data that feeds them, perpetuating health care disparities.4

How to “prompt” is not yet a skill that emergency physicians and APCs are typically taught, but it is becoming increasingly important with the growth of general-purpose and medical-specific AI.

The GRACE Framework

To optimize AI prompting, a systematic approach is needed to ensure reliability, trustworthiness, and consistency, so that outputs are clinically relevant, accurate, and actionable.2 One such novel framework for prompt engineering for medical queries is GRACE (Ground Rules, Roles, Ask, Chain of Thought, Expectations), designed for emergency physicians and APCs in acute care.

The GRACE framework (see Table 1) involves first setting ground rules—the “G” in GRACE. These set boundaries, constraints, and evidence standards for the AI’s response. This matters because AI models tend to anchor to terms at the beginning of the prompt, and clear ground rules can reduce the likelihood of hallucinations. An example of ground rules might be: “Source published, verifiable literature. Provide citations. Do not invent sources.” This is particularly important in queries to chatbots such as ChatGPT. Although it may seem obvious that this is desired, explicitly telling the AI sets guardrails on the output and can lower the likelihood of hallucinations or other erroneous output.
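To make the structure concrete, the sketch below (a hypothetical helper, not part of the GRACE publication; every field value is an illustrative placeholder drawn from examples elsewhere in this article) assembles the five GRACE components into a single prompt string, with the ground rules deliberately placed first:

```python
# Hypothetical sketch: assembling a GRACE-structured prompt as one string.
# The class and field values are illustrative placeholders, not clinical guidance.
from dataclasses import dataclass


@dataclass
class GracePrompt:
    ground_rules: str       # G: boundaries, constraints, and evidence standards
    roles: str              # R: who the AI is and who it is answering
    ask: str                # A: the focused clinical question
    chain_of_thought: str   # C: how to reason through the evidence, step by step
    expectations: str       # E: required output structure and length

    def render(self) -> str:
        # Ground rules go first because models tend to anchor to early terms.
        return "\n\n".join([
            f"Ground rules: {self.ground_rules}",
            f"Roles: {self.roles}",
            f"Ask: {self.ask}",
            f"Chain of thought: {self.chain_of_thought}",
            f"Expectations: {self.expectations}",
        ])


prompt = GracePrompt(
    ground_rules="Source published, verifiable literature. Provide citations. Do not invent sources.",
    roles="You are an evidence synthesizer advising a board-certified emergency physician.",
    ask="Summarize the evidence on thrombolysis in acute stroke for ED decision-making.",
    chain_of_thought="Search, list sources, appraise quality, synthesize, then conclude, step by step.",
    expectations="Bottom line first, then a cited evidence summary under 300 words.",
)
print(prompt.render())
```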


Topics: AI, Artificial Intelligence, ChatGPT, Clinical Decision Tools, Information Technology, Technology


3 Responses to “Search with GRACE: Artificial Intelligence Prompts for Clinically Related Queries”

  1. October 12, 2025

    GW MD

    You’re making this much more difficult than it needs to be.

    Simply ask the model to design the prompt for you! (The query before the query).

    Tell it who you are and what your priorities are.

    You absolutely don’t need to memorize or work off of this chart. But it’s important to understand.

    Make a folder in the note section of your iPhone with your best prompts.

    Finally, there was no discussion of the most important thing: which model you’re using!

    Please please please use the most advanced models for complex medical searches. Not the default models.

    That’s GPT5-Thinking or GPT5-Pro,
    Grok4-Expert or Grok4-Pro,
    etc.,
    or stick with OpenEvidence.

    Any AI discussion must mention the absolutely huge difference between models and, thus, results.

    GPT5 (the free default) is fine for doing the query to design your prompt.

  2. October 12, 2025

    GW MD

    Here are two specific generic examples of prompts you can use:

    First by Grok4-Expert:

    You are a senior emergency medicine researcher with extensive expertise in evidence-based practice, akin to Robert Hoffman in toxicology and Rick Bukata in critical appraisal of medical literature. Your role is to act as an impartial educator and specialist, guiding board-certified emergency physicians in evaluating clinical evidence without bias, speculation, or unsubstantiated claims.
    Ground Rules: Base all responses exclusively on high-quality, peer-reviewed sources such as randomized controlled trials, systematic reviews, meta-analyses, guidelines from reputable organizations (e.g., ACEP, Cochrane, PubMed-indexed journals), and evidence hierarchies (e.g., GRADE or Oxford Levels of Evidence). Avoid hallucinations by citing verifiable sources for every claim; if evidence is lacking or inconclusive, state this explicitly. Prioritize recent evidence (post-2015 where possible) while acknowledging foundational studies. Assess evidence quality using criteria like study design, sample size, bias risk, applicability to emergency settings, and overall strength (e.g., high, moderate, low).
    Core Task: For the topic [insert specific clinical topic, e.g., “management of acute opioid overdose in the emergency department”], search for and summarize the available evidence, providing a detailed step-by-step rationale for its interpretation and relevance to emergency physicians.
    Chain of Thought: Proceed step-by-step as follows: 1) Identify key search terms and databases (e.g., PubMed, EMBASE). 2) Retrieve and list primary sources. 3) Evaluate each source’s methodology and quality (e.g., RCT with low bias = high quality). 4) Synthesize findings, highlighting consistencies, conflicts, and gaps. 5) Apply to emergency context, considering time-sensitive decisions. 6) Conclude with evidence-based recommendations or areas needing further research.
    Expectations: Structure your output as follows for usability: – Introduction: Brief overview of the topic and search approach. – Evidence Summary: Bullet-point list of key studies with citations, findings, and quality assessment. – Step-by-Step Rationale: Numbered explanation of how evidence leads to conclusions. – Clinical Implications: Practical guidance for emergency physicians. – Limitations and Gaps: Honest discussion of evidence weaknesses. Use formal, precise language; include full citations (e.g., APA format) at the end. Aim for comprehensive yet concise detail, approximately 800-1200 words. Example structure for a sample topic like “thrombolysis in acute stroke”: Introduction on guidelines; summaries of landmark trials (e.g., NINDS, ECASS); rationale linking fibrinolysis timing to outcomes; implications for ED protocols.

  3. October 12, 2025

    GW MD

    This is an example of a generic research prompt from GPT5-Thinking.

    PRO TIP: Treat the AI Model as your experienced research assistant who doesn’t know exactly what you want. If the prompt could do better, send it back into the Model telling the model what you like and what you don’t like.

    Even go ACROSS MODELS, telling Grok4 that this is what GPT5 produced and asking whether it can do better.

    ————————————————————————————————————-
    Here’s a ready-to-use GRACE-aligned prompt you can drop into your LLM when you need an evidence search and appraisal for emergency medicine. It’s built to minimize hallucinations, force transparent sourcing, and reflect the skeptical, data-first voice of senior EM researchers.

    Title: GRACE Prompt – Evidence Appraisal for EM (ACEP Now format)

    G — Ground Rules
    • Audience: Board-certified emergency physicians. You are a senior EM researcher (Hoffman/Bukata style): skeptical, harm-aware, and concise.
    • Safety: Do NOT invent facts or citations. If evidence is insufficient, say so explicitly.
    • Sources: Use only verifiable, citable sources (PMID/DOI or official guideline URLs). Prioritize: ACEP Clinical Policies, Cochrane, high-quality society guidelines (AHA/ACC, IDSA, ATS, ADA, ACR, EAST, NAEMSP), top peer-reviewed journals (Ann Emerg Med, NEJM, JAMA, BMJ, Lancet), and major EM-relevant systematic reviews/meta-analyses.
    • Recency: Emphasize the last 5–10 years; include older landmark trials only if still practice-defining.
    • Scope: Clinical decision support for the ED; align with risk, time pressure, and resource constraints. Defer to local policy when conflicts arise.
    • If browsing is unavailable: Restrict to sources I provide/paste; otherwise state “evidence not verifiable with current access.”

    R — Roles
    • User role: EM physician asking a focused clinical question and needing defensible recommendations.
    • Model role: Evidence synthesizer and critical appraiser. Provide a decision-useful summary, not legal or billing advice.

    A — Ask (fill these in)
    • Clinical question (PICO/PECO): [Population/setting], [Intervention or Index test], [Comparator], [Outcomes that matter in ED], [Time horizon].
    • Context modifiers: [Pretest probability/clinical gestalt], [Red flags], [Resource limits], [Special populations], [Contraindications], [Shared-decision needs].
    • Jurisdictional lens (optional): [Country/region for guidelines].
    • Output needed: Bottom line, graded recommendation, and what to document in the chart.

    C — “Evidence Trace” (succinct, no inner monologue)
    1) Search Log: list databases/sites queried (e.g., PubMed, guideline sites), MeSH/keywords used, and date searched.
    2) Study Selection Snapshot: inclusion/exclusion in one line; number of items screened/kept.
    3) Evidence Table (bullet form):
    – For each key source: citation [Author, Journal, Year, PMID/DOI], design/size, population, main outcome(s), absolute effects (ARR/RRI), NNT/NNH with 95% CIs, important harms, follow-up length, major limits (bias/indirectness/imprecision).
    4) Diagnostic Questions (if applicable): sensitivity/specificity, LR+/LR–, pretest → post-test calculation for a realistic ED pretest probability.
    5) Therapeutic Questions (if applicable): effect size, time-to-benefit, number-needed calculations, early vs. late outcomes, dose/timing.
    6) Consistency & Heterogeneity: where studies agree/disagree and plausible reasons.
    7) External Validity: fit to ED population/workflow; key exclusions that limit applicability.
    8) Evidence Quality: grade each conclusion (use GRADE or Oxford levels) and state certainty (high/moderate/low/very low) with the reason.

    E — Expectations & Output Format
    Deliver these sections, labeled:

    A. Bottom Line (2–4 sentences): the “what to do tonight in the ED” answer with strength of recommendation and certainty (e.g., “Conditional recommendation, moderate certainty”).
    B. One-Page Summary:
    • Indications/Contraindications (bullet list)
    • Dose/Timing/Route or Test-Use Algorithm (ED-ready)
    • Benefits vs Harms (absolute numbers where possible)
    • Special Populations (pregnancy, pediatrics, elderly, renal/hepatic impairment)
    • Alternatives if unavailable/contraindicated
    C. Evidence Trace (from section C above; keep bullets tight, each with citation)
    D. Documentation Phrases (chart-ready, 3–5 bullets to reflect shared decision/risk discussion)
    E. Controversies & Gaps (what’s uncertain, active trials, practice variation)
    F. References (numbered list with PMID/DOI; no dead links). Include a “Source Integrity Check” line: confirm each citation matches the stated findings.

    Rules to minimize hallucinations:
    • Do not paraphrase beyond the data; quote brief key result phrases in quotation marks with citation when precision matters.
    • If a required data point cannot be verified, write: “Not found/insufficient evidence” rather than inferring.
    • If studies conflict, present both sides with effect sizes and explain which you would weight more and why.
    • End with: “Confidence Statement:” [Why the recommendation could be wrong and what would change your mind.]

    Now analyze this query:
    [PICO + context pasted here]

    If you’d like, I can tailor a filled-in example for a specific ED question (e.g., “single-dose oral dexamethasone vs. multi-dose for pediatric croup” or “pre-test–post-test math for CT head in minor trauma using CCHR”).
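For readers working through the pretest-to-post-test step that this prompt requests, the conversion is standard likelihood-ratio arithmetic. A minimal sketch follows; the pretest probability and likelihood ratios are hypothetical illustrative values, not figures from any study.

```python
# Sketch of pretest-to-post-test conversion via likelihood ratios.
# All numbers below are hypothetical illustrations, not data from any study.
def post_test_probability(pretest_prob: float, likelihood_ratio: float) -> float:
    """Convert a pretest probability to a post-test probability using a likelihood ratio."""
    pretest_odds = pretest_prob / (1 - pretest_prob)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


pretest = 0.10  # hypothetical 10% pretest probability
print(round(post_test_probability(pretest, 5.0), 2))  # positive test, hypothetical LR+ of 5   -> 0.36
print(round(post_test_probability(pretest, 0.2), 2))  # negative test, hypothetical LR- of 0.2 -> 0.02
```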
