Mind the 9 Blocks: a diagnostic canvas for clinical AI accountability

Johnson, Mo

FRAMEWORK BRIEF №01

Clinical AI Governance Accountability Frameworks

Abstract

Mind the 9 Blocks™ is a diagnostic canvas that surfaces and scores the clinical AI accountability gap inside a health system. It arranges the accountability requirements of any clinical AI deployment into nine blocks across three bands, Diagnose, Assess, and Close, and pairs that canvas with a Gap Score™ that rates the three governance layers a clinical AI decision must satisfy, AI Governance, Clinical AI Governance, and Decision Accountability, each from 1 to 5 for a total out of 15. The canvas diagnoses what a deployment requires; the score prices whether it is ready to sign off. The framework operationalizes the Named Owner Principle, which requires two named owners in the audit trail, a Governance Owner at the institutional level and a Decision Owner at the bedside.

Mind the 9 Blocks™ is a diagnostic canvas that surfaces and scores the clinical AI accountability gap inside a health system. It arranges the accountability requirements of any clinical AI deployment into nine blocks across three bands, Diagnose, Assess, and Close, and it produces a single number, the Gap Score™, that an executive can defend in a board meeting. The reader it is built for is the Chief Medical Officer, the Chief Medical Information Officer, the General Counsel, and the board member who carries the deployment on the institutional record. Each of the nine blocks is either substantively filled, with documented and owned governance infrastructure behind it, or it is empty. The Gap Score™ reads across the three governance layers a clinical AI decision must satisfy, AI Governance, Clinical AI Governance, and Decision Accountability, scoring each from 1 to 5 for a total out of 15: a deployment is defensible above 12 and holds unpriced liability below 9.

The canvas is run by three operations. The governance Cascade architects it, building the four layers of governance that a clinical AI deployment depends on. The Evaluate framework scores it, converting the three convergence layers into the Gap Score™. The CLOSE discipline runs it, walking each deployment through Categorize, Locate, Own, Score, and Evaluate. Most institutions, assessed honestly for the first time, score low. That is not a flaw in the instrument. It is the measurement the instrument exists to produce, and it is the first accurate picture of accountability exposure most health systems have ever had.

Every clinical AI deployment carries a distance between the accountability an institution can demonstrate and the accountability it owes. That distance is The Accountability Gap™ (TAG™), and closing it is what this framework is for. The canvas measures TAG™. The Gap Score™ prices it. Closure belongs to the architecture the score certifies: two seats named, six functions held, and the Handoff between the owners documented. The CLOSE discipline runs that closure, deployment by deployment.

Mind the 9 Blocks™ operationalizes the Named Owner Principle, which holds that every clinical AI deployment requires two named persons in the audit trail: a Governance Owner at the institutional level and a Decision Owner at the bedside. The principle states who must answer. The canvas states what they must answer for, across all nine blocks, and how close the institution stands to being able to answer at all.

Why a canvas, and why now

The Named Owner Principle states who must answer when a clinical AI recommendation reaches a patient and the outcome is questioned. It does not, by itself, tell an institution how far it stands from being able to answer. A principle names the requirement. An institution needs an instrument that measures the distance between the requirement and its own current state, deployment by deployment, in a form it can act on. Mind the 9 Blocks™ is that instrument.

The case for measuring now rests on three findings that are no longer in dispute. Clinical AI is already making decisions inside health systems: on Stanford’s MedAgentBench, a benchmark of clinically derived tasks run inside a realistic electronic health record environment under strict first-attempt scoring, the leading models reached roughly 70% task success, which leaves the best clinical AI agents wrong on close to one task in three. The errors concentrate where they are most dangerous: the NOHARM benchmark, developed by the ARISE Network in a Stanford and Harvard collaboration and released in January 2026, found that across 31 large language models the potential for severe harm from a model’s recommendation occurred in up to 22.2% of cases, and that 76.6% of those harmful errors were errors of omission, the model failing to raise a diagnosis or a necessary next step. And the institutions deploying these systems cannot yet see the gap: in a Black Book Research survey of 182 United States hospital leaders conducted in late 2025, only 22% reported high confidence that they could produce a complete, auditable AI explanation for regulators or payers within 30 days, and 33% cited unclear internal ownership as a top barrier to audit readiness.

That last figure is the one the canvas is built to move. A third of the surveyed leaders already report that they cannot say who owns the AI in their own workflows. Unclear ownership is not a documentation problem to be solved with a better template. It is a structural absence, and it persists because no instrument has made it visible and measurable at the level where it actually fails, which is the individual deployment reaching the individual patient.

The cost of leaving it unmeasured is no longer theoretical, because the parties whose business is pricing risk have already moved. Effective January 1, 2026, the Insurance Services Office introduced Form CG 40 47, a Commercial General Liability endorsement that excludes injury arising out of generative artificial intelligence. W. R. Berkley Corporation introduced Form PC 51380, an absolute artificial intelligence exclusion written for the directors and officers, errors and omissions, and fiduciary liability lines, the coverage that protects the executives who approve deployments. When an exclusion applies, the institution holds the AI risk itself, unindemnified. The renewal cycle has become an audit, conducted by underwriters, of exactly the accountability the institution has not yet been asked to demonstrate by anyone else.

This is why the instrument is a canvas rather than a checklist. A checklist confirms that items were completed. A canvas maps a structure, shows which parts of it are load-bearing, and reveals where the structure is hollow. Clinical AI accountability fails structurally, not item by item: a deployment can satisfy every line of a compliance checklist and still have no named owner at the bedside, no traceable record from recommendation to outcome, and no answer when the underwriter asks. The canvas is built to make that hollowness visible and to give it a number, so that the institution sees its exposure before an adverse event, a regulator, or a renewal forces the accounting. The instrument serves two functions at once: an executive can hold it, walk their own institution through it, and arrive at a score they can take to a risk committee; and the field can reach for its vocabulary as the common language for what clinical AI accountability actually requires.

The canvas: three bands, nine blocks

The canvas is read as a whole before it is read block by block. Nine blocks sit in three bands. The first band, Diagnose, asks whether the deployment is even set up to be governed as clinical AI. The second, Assess, asks what is happening at the bedside and whether the institution can see it. The third, Close, asks who answers for the decision and whether the answer holds over time. The bands are sequential by accountability logic: an institution cannot meaningfully assess a deployment it has not first distinguished as clinical AI, and it cannot close a gap it has not first assessed.

Each block is read twice. First as a diagnostic question, which surfaces whether the block is filled or empty. Then as a governance requirement, which states what filling it demands. A block is substantively filled only when a documented, owned, and current governance artifact stands behind it. A block that is technically present but unowned, or documented but stale, counts as empty.
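The filled-or-empty rule can be sketched as a small check. This is a minimal illustration, not part of the framework itself: the `Artifact` fields, the `is_filled` helper, and the 365-day window are assumptions made for the example, since the brief does not define how "current" is measured.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Hypothetical review window; the brief does not say how "current" is measured.
MAX_AGE = timedelta(days=365)


@dataclass
class Artifact:
    """A governance artifact standing behind one canvas block (illustrative fields)."""
    documented: bool               # a written artifact exists
    owner: Optional[str]           # named accountable person, or None if unowned
    last_reviewed: Optional[date]  # date of the most recent review, if any


def is_filled(artifact: Optional[Artifact], today: date) -> bool:
    """A block is filled only when its artifact is documented, owned, and current."""
    if artifact is None or not artifact.documented:
        return False
    if not artifact.owner:
        return False  # technically present but unowned counts as empty
    if artifact.last_reviewed is None:
        return False
    # Documented but stale counts as empty.
    return today - artifact.last_reviewed <= MAX_AGE


today = date(2026, 3, 1)
stale = Artifact(documented=True, owner="CMIO", last_reviewed=date(2024, 1, 15))
fresh = Artifact(documented=True, owner="CMIO", last_reviewed=date(2025, 11, 1))
print(is_filled(stale, today), is_filled(fresh, today))
```

The check is deliberately binary, matching the canvas: there is no partial credit for a document that exists but has no owner or has gone stale.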

Mind the 9 Blocks™ — The Clinical AI Accountability Canvas™

Diagnose. Assess. Close.

← the build · incurs cost | the exposure · what comes back →

Governance Cascade: Are all four layers built?

Clinical AI Distinction: Clinical, not general?

Affected Populations: Who does it affect?

The Clinical Decision: The protected call. The hinge. Clinician / Shaped / Parallel.

The Named Owners: Governance + Decision Owner

Accountability Trail: Recommendation → outcome

Clinical Promise: What's promised, to whom?

Governance Actions & Partners: The cost of governance: policy, board, vendor, carrier, counsel

Risk Exposure & Gap Score™: What you're exposed to; the score that prices it, out of 15

Band I — Diagnose

Block 1 — Clinical AI Distinction. What it governs. Whether the institution governs this deployment as clinical AI at all, distinct from general enterprise AI. Clinical AI governance protects the patient decision at the bedside. General AI governance protects the institution’s operational decisions. Most institutions apply the general discipline to a clinical system and never build the distinct layer the bedside requires. Who owns it. The Chief Medical Officer or Chief Medical Information Officer, whoever holds clinical governance authority. The distinction is a clinical judgment, not an IT classification. What empty looks like. The deployment is logged in the institution’s general AI inventory. It passed an IT and data review. No one has asked whether a system that shapes a clinical recommendation needs governance that an operational system does not, and no separate clinical governance attaches to it. What filled looks like. The deployment is formally classified as clinical AI, governed under a clinical AI policy distinct from the general AI policy, with the bedside decision named as the thing being protected. The institutional challenge. For each AI system that touches a clinical decision in your institution, is it governed as clinical AI, or is it sitting in the same governance bucket as your scheduling and billing tools?

Block 2 — Governance Cascade. What it governs. Whether the four layers a clinical AI deployment depends on are actually built: Data Governance, AI Governance, Healthcare AI Governance, and Clinical AI Governance. Each layer depends on the one above it. The fourth layer, at the bedside, is the one most institutions have not built, and it is the one the Named Owner sits in. The four layers do not sit in sequence only. They converge, and the convergence point is where all four must meet for the patient decision to hold. When the clinical layer has no named owner, the center is exposed, and the accountability gap lives in that exposed center. Who owns it. The Chief Medical Information Officer, with the Chief Data Officer accountable for the layers beneath. What empty looks like. The institution has a data governance program and a general AI governance program, built years apart under different pressures, and treats the existence of those two as coverage for clinical AI. The fourth layer is named in no policy and operated by no one. What filled looks like. All four layers exist, each documented, each with named accountability, and the dependency between them is explicit, so that a failure at the clinical layer is not masked by maturity at the data layer. The institutional challenge. Your data governance is mature and your enterprise AI governance is funded. Can you produce the fourth layer, the one that governs the moment an AI-shaped recommendation reaches a patient, as a distinct and operated structure?

Block 3 — Affected Populations. What it governs. Who the deployment affects, at the level of the institution’s actual patient population rather than the vendor’s training cohort. The demographic gap between the data a model was validated on and the patients it now informs decisions about, and whether anyone is accountable for detecting disparate performance. Who owns it. The Chief Medical Officer, the Chief Nursing Officer, and the institution’s equity authority, jointly. Differential performance is a clinical and equity obligation, not a technical footnote. What empty looks like. The vendor’s documentation was reviewed and a pilot ran with acceptable aggregate performance. No subgroup analysis was performed, no written definition of the affected population exists, and no one is named to monitor for disparate impact after go-live. What filled looks like. A written affected-population definition per deployment, signed by clinical leadership, comparing validation data to the institution’s demographic profile, with a named owner for ongoing disparate-impact monitoring. The institutional challenge. For each clinical AI system you run, can you produce a signed document defining the affected population, showing how the model’s validation data compares to your patients, and naming who is accountable when performance diverges across groups?

Band II — Assess

Block 4 — Clinical Promise. What it governs. What the deployment is expected to deliver clinically, defined by the institution rather than the vendor. The vendor describes what a system can do in general. The clinical promise documents what this institution expects it to do specifically, in this patient population, against this baseline, over this horizon, measured by these named metrics. Who owns it. The Chief Medical Officer and the clinical service line that sponsors the deployment. The promise must be owned by a clinician who understands what improvement looks like in this context, not by IT and not by the vendor. What empty looks like. The deployment was approved on the strength of vendor case studies, peer adoption, or a conference demonstration. No baseline was established, no success metric was defined, and no clinician signed a statement of what improvement was expected. When performance is later questioned, there is no institutional standard to measure against. What filled looks like. A written clinical promise produced before go-live, containing a baseline, a defined metric, a timeline, and a named clinical sponsor accountable for whether the system delivers it. The institutional challenge. For each deployment, where is your institution’s written clinical promise, with a baseline, a metric, a review date, and a clinician’s name on it? Not the vendor’s stated use case. Yours.

Block 5 — The Three Decisions. What it governs. Whether the clinical record can distinguish three patterns that look identical in the chart but are not the same decision. A Clinician Decision, made on clinical judgment alone. A Shaped Decision, made on judgment that an AI recommendation influenced. A Parallel Decision, where the clinician and the AI reached the same conclusion independently. All three produce the same order. Only the documentation can tell them apart, and in most deployments it cannot. Who owns it. The Chief Medical Information Officer, who owns how the recommendation is captured in the workflow, with clinical informatics accountable for the data structure. What empty looks like. The AI recommendation appears in the workflow, the clinician acts, and the chart records only the action. Whether the recommendation shaped the decision, was overridden, or merely coincided with the clinician’s own judgment is invisible after the fact. What filled looks like. The recommendation is captured at the point of use, the clinician’s response is documented as acceptance, override, or independent concurrence, and the three patterns are distinguishable in the record without reconstruction. The institutional challenge. If an outcome were questioned tomorrow, could your chart show whether the AI shaped the decision, was overruled, or simply agreed with a judgment the clinician had already reached? If those three look the same in your record, the most consequential fact about the decision is missing.
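One way to picture the record Block 5 asks for is a structured response field that lets the three patterns be told apart without reconstruction. The sketch below is illustrative only. The `Response` and `Decision` enums and the `classify` function are hypothetical names, and the mapping of an override to a Clinician Decision is an assumption; the brief lists the three documented responses but does not state that mapping.

```python
from enum import Enum
from typing import Optional


class Response(Enum):
    """Clinician response documented at the point of use."""
    ACCEPTANCE = "acceptance"
    OVERRIDE = "override"
    INDEPENDENT_CONCURRENCE = "independent concurrence"


class Decision(Enum):
    CLINICIAN = "Clinician Decision"
    SHAPED = "Shaped Decision"
    PARALLEL = "Parallel Decision"


def classify(recommendation_shown: bool,
             response: Optional[Response]) -> Optional[Decision]:
    """Return the decision pattern, or None when the chart cannot tell them apart."""
    if not recommendation_shown:
        return Decision.CLINICIAN   # no AI recommendation reached the clinician
    if response is None:
        return None                 # only the action was charted: the empty block
    if response is Response.ACCEPTANCE:
        return Decision.SHAPED
    if response is Response.INDEPENDENT_CONCURRENCE:
        return Decision.PARALLEL
    # Assumption: an override is treated as resting on clinical judgment.
    return Decision.CLINICIAN


print(classify(True, None))
print(classify(True, Response.INDEPENDENT_CONCURRENCE))
```

The `None` return is the point of the example: when a recommendation was shown and no response was captured, the pattern is unrecoverable from the record.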

Block 6 — Accountability Trail. What it governs. The governed record that links an AI recommendation to the clinical decision to the patient outcome, in a form reviewable without a technical reconstruction. A log records what a system did. The Accountability Trail records what the institution did, who was accountable, what they knew, what they decided, and what followed, in a form that demonstrates governance rather than mere activity. Who owns it. The Chief Medical Officer, the General Counsel, and Risk Management, jointly. The trail is clinical, legal, and quality infrastructure at once. What empty looks like. The recommendation may be logged in the AI system, and the decision is documented in the clinical record, but the two live in separate systems with no link between them and no connection to the outcome. When an adverse event is reviewed, reconstruction takes weeks and is often incomplete. What filled looks like. The recommendation is logged in the clinical record at the moment of use, the clinician’s response and rationale are captured as structured data, the outcome is linked to the decision node, and the whole trail is reviewable by governance within a defined window rather than reassembled under subpoena. The institutional challenge. If a patient were harmed after an AI-influenced decision today, how long would it take to produce a coherent trail from recommendation to decision to outcome, and would it survive a reviewer who was not satisfied by a system log?
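The difference between a log and a trail can be sketched as a record that carries its own links. This is a minimal illustration under stated assumptions: the `TrailEntry` fields and the `missing_links` helper are hypothetical, and a real trail would live in the clinical record rather than in a standalone structure.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class TrailEntry:
    """One recommendation-to-outcome chain (illustrative field names)."""
    recommendation_id: str                    # logged at the moment of use
    recorded_at: datetime
    clinician_response: Optional[str] = None  # captured as structured data
    rationale: Optional[str] = None
    outcome_id: Optional[str] = None          # link from decision to outcome


def missing_links(entry: TrailEntry) -> List[str]:
    """List the links a reviewer would have to reconstruct by hand."""
    required = {
        "recommendation": entry.recommendation_id,
        "clinician response": entry.clinician_response,
        "rationale": entry.rationale,
        "outcome": entry.outcome_id,
    }
    return [label for label, value in required.items() if not value]


entry = TrailEntry("rec-001", datetime(2026, 1, 5, 9, 30),
                   clinician_response="override")
print(missing_links(entry))
```

An empty list means the chain from recommendation to decision to outcome is reviewable as recorded; anything else names what would have to be reassembled after the fact.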

Band III — Close

Block 7 — The Named Owners. What it governs. The two named persons the Named Owner Principle requires, carried on this deployment’s record. A Governance Owner at the institutional level, accountable for the decision to deploy, who signs the charter and answers to the board, the General Counsel, the underwriter, and the regulator. A Decision Owner at the bedside, the treating clinician accountable for the individual patient call. Both names must appear. Neither substitutes for the other.

Each seat holds three functions. The Governance Owner holds Charter, which sets the boundaries of what the deployment is authorized to do; Commission, which authorizes it into the clinical workflow; and Cover, which holds institutional responsibility when the deployment is challenged. The Decision Owner holds Decide, which owns the clinical call at the bedside; Document, which records the decision and its reasoning; and Defend, which stands behind the call when it is reviewed, audited, or litigated. A seat that is named but holds none of its functions is a name on a chart, not an owner.

Who owns it. The Governance Owner seat is held by the Chief Medical Officer, the Chief Medical Information Officer, or the Chief of Clinical Service. The Decision Owner seat is held by the treating clinician at the moment the recommendation is acted on. The block is owned, in effect, by whoever verifies that both seats are filled and current. What empty looks like. The charter names an institutional sponsor but says nothing about who owns the bedside decision, or the chart records the clinician who acted but nothing about the authority that approved the system. One seat filled, or neither, with the gap between them undocumented. What filled looks like. Both seats named, both names recorded in the audit trail, and both current as of the deployment’s present scope. A Governance Owner who still holds the role, and a Decision Owner who reflects the clinician actually on service. The institutional challenge. For each deployment, can you name the institutional executive accountable for the decision to deploy and the clinician accountable for the patient decision, and are both names current today rather than left over from a launch that has since rotated past them?
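The two seats and their six functions lend themselves to a simple structural check. The sketch below is an illustration, not the framework's own tooling: `Seat`, `seat_is_owned`, and `named_owners_filled` are hypothetical names, and requiring all three functions per seat is an assumption. The brief states only that a seat holding none of its functions is not an owner. The check also leaves out whether the names are current.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Optional, Set

GOVERNANCE_FUNCTIONS: FrozenSet[str] = frozenset({"Charter", "Commission", "Cover"})
DECISION_FUNCTIONS: FrozenSet[str] = frozenset({"Decide", "Document", "Defend"})


@dataclass
class Seat:
    name: Optional[str] = None
    functions_held: Set[str] = field(default_factory=set)


def seat_is_owned(seat: Seat, required: FrozenSet[str]) -> bool:
    """Assumption: a seat is owned when it is named and holds all required functions."""
    return bool(seat.name) and required <= seat.functions_held


def named_owners_filled(governance: Seat, decision: Seat) -> bool:
    """Both seats must be owned; neither substitutes for the other."""
    return (seat_is_owned(governance, GOVERNANCE_FUNCTIONS)
            and seat_is_owned(decision, DECISION_FUNCTIONS))


governance = Seat("Chief Medical Officer", {"Charter", "Commission", "Cover"})
name_only = Seat("Treating clinician")  # named, but holds no functions
print(named_owners_filled(governance, name_only))
```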

Block 8 — Governance Actions and Partners. What it governs. Two accountabilities that share a block because they share a failure mode: the formal institutional actions that create a governance record, and the external relationships that share accountability when something goes wrong. The actions are policies adopted, reviews documented, board visibility established, with dates and signatories. The partners are the vendors, regulators, malpractice carriers, and counsel whose accountability the institution must secure before an event rather than activate during one. Who owns it. The Medical Executive Committee and Board Quality Committee for the actions; the General Counsel, Chief Risk Officer, and Chief Medical Officer for the partners. What empty looks like. Governance is informal, made in IT steering committees and email threads, with no medical staff policy and no board-level visibility. The vendor contract is silent on AI accountability, the malpractice carrier does not know the scope of AI use, and counsel was not engaged in governance design. When an event occurs, the institution discovers it holds the accountability alone. What filled looks like. A clinical AI governance policy adopted by the medical staff, documented review cadence, board reporting on the record, and vendor contracts, carrier briefings, and legal engagement that allocate accountability explicitly before it is tested. The institutional challenge. What formal governance actions has your institution taken on clinical AI in the last twelve months, and if a recommendation contributed to harm tomorrow, which of your partners, the vendor, the carrier, the regulator, has already agreed in writing to share the accountability?

Block 9 — Risk Exposure & Gap Score™. What it governs. The synthesis. The institution’s aggregate view of its clinical AI risk: every deployment catalogued, each block assessed, and the Gap Score™ that states where the institution stands. This block cannot be filled until the other eight have been assessed with substance, because it is the measure of them. Who owns it. The Chief Risk Officer, the General Counsel, and the Chief Medical Officer, with the score reported to the Board Quality Committee. What empty looks like. Each deployment is governed, or not, in isolation. No enterprise register exists, the Gap Score™ has never been calculated, and clinical AI does not appear as a named category in the institution’s enterprise risk reporting. Leadership governs in the dark, deployment by deployment, with no aggregate picture. What filled looks like. An enterprise clinical AI risk register, a Gap Score™ calculated per deployment and in aggregate, reviewed on a defined cadence, with a board-level summary and a remediation roadmap sequenced by exposure. The institutional challenge. What is your institution’s aggregate Gap Score™ today? If you cannot state the number, you are operating clinical AI without a risk baseline, and every deployment decision and governance claim you make rests on a foundation you have not measured.
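A remediation roadmap sequenced by exposure can be sketched as a sort over the register. The example assumes that a lower total means higher exposure, which follows from the thresholds the brief gives for the Gap Score™. The deployment names and scores are invented for illustration, and the aggregate score is not computed because the brief does not specify how it is derived.

```python
from typing import Dict, List, Tuple

# Hypothetical register: deployment -> layer scores, each from 1 to 5, in the order
# (AI Governance, Clinical AI Governance, Decision Accountability).
register: Dict[str, Tuple[int, int, int]] = {
    "sepsis-alert": (4, 3, 2),
    "imaging-triage": (5, 4, 4),
    "discharge-summary-drafting": (2, 2, 1),
}


def remediation_order(reg: Dict[str, Tuple[int, int, int]]) -> List[Tuple[str, int]]:
    """Return (deployment, total) pairs with the lowest total first."""
    totals = [(name, sum(scores)) for name, scores in reg.items()]
    return sorted(totals, key=lambda pair: pair[1])


for name, total in remediation_order(register):
    print(f"{name}: {total}/15")
```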

The method layer: how the canvas is run

The nine blocks describe what accountability requires. They do not, on their own, tell an institution how to build it, how to score it, or how to keep it current. Three operations do that work. They are the verbs that move an institution through the nine nouns.

The Cascade architects the canvas. The four-layer governance cascade, Data to AI to Healthcare AI to Clinical AI, is how an institution builds the Diagnose band. The cascade is not a separate exercise from the canvas; it is the construction sequence for the first three blocks. An institution that has not built the fourth layer cannot fill Block 1 or Block 2 honestly, because the distinction and the layered structure those blocks assess are exactly what the cascade produces. The cascade is the reason the canvas begins with Diagnose: an institution cannot assess or close what it has not first architected.

The Evaluate framework scores the deployment. Evaluate is the operation that produces the Gap Score™, rating each of the three convergence layers, AI Governance, Clinical AI Governance, and Decision Accountability, from 1 to 5 for a total out of 15. It is treated in full in the next section, because the score is the instrument’s central output and earns its own treatment. For the purpose of the method layer, the point is structural: Evaluate does not score the nine blocks directly. It scores the three layers the blocks feed, which is why the canvas and the score are two instruments rather than one.

The CLOSE discipline runs the canvas, deployment by deployment. CLOSE is the repeatable loop an institution applies to each clinical AI system: Categorize the deployment as clinical AI, Locate the point where the recommendation meets the clinical decision, Own both named seats, Score the deployment against the three layers, and Evaluate the trail the deployment produces. CLOSE is what makes the canvas an operating discipline rather than a one-time assessment. An institution does not fill the canvas once. It runs CLOSE on every deployment and re-runs it whenever the deployment changes.

Underneath the three operations sits a test that separates a substantively filled block from one that is merely documented. The VALUE conditions, Validate, Assign, Link, Uncover, and Embed, are the five conditions a deployment must meet for a block to count as filled rather than staged. Validate that the system performs for this population. Assign a named owner to the accountability. Link the recommendation to the decision and the outcome. Uncover the gaps the deployment introduces rather than waiting for them to surface. Embed the governance in the workflow rather than bolting it on. A block whose paperwork exists but whose VALUE conditions do not hold is an empty block with a document in front of it. The distinction matters because the most common failure of governance is not absence; it is the appearance of presence.
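The VALUE test reads naturally as an all-or-nothing check over five conditions. The following is a minimal sketch under that reading. The `failed_conditions` and `counts_as_filled` helpers and the evidence dictionary are hypothetical; how an institution establishes each condition is outside what the code shows.

```python
from typing import Dict, List

VALUE_CONDITIONS = ("Validate", "Assign", "Link", "Uncover", "Embed")


def failed_conditions(evidence: Dict[str, bool]) -> List[str]:
    """Return the VALUE conditions that do not hold; a missing entry counts as failed."""
    return [c for c in VALUE_CONDITIONS if not evidence.get(c, False)]


def counts_as_filled(evidence: Dict[str, bool]) -> bool:
    """A block counts as filled, rather than staged, only when all five hold."""
    return not failed_conditions(evidence)


# Paperwork exists and an owner is assigned, but nothing links to outcomes.
staged = {"Validate": True, "Assign": True, "Link": False, "Embed": True}
print(counts_as_filled(staged), failed_conditions(staged))
```

Treating an unrecorded condition as failed mirrors the brief's warning about the appearance of presence: the burden is on the institution to show each condition holds.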

The reason the method layer matters now, rather than as a future refinement, is that the systems being governed are already acting. In a December 2025 Deloitte survey of United States health care technology executives, 61% reported already building or implementing agentic AI initiatives, beyond pilots and into deployment. And the autonomy of those systems is not hypothetical: McKinsey’s 2026 AI Trust Maturity Survey of roughly 500 organizations found that 80% had already encountered risky behaviors from AI agents, including improper data exposure and unauthorized system access. The operative governance question has shifted from whether a model is accurate to who is accountable when the system acts. The method layer exists to answer that question on a schedule the deployment sets, not on the schedule of the next adverse event.

The Gap Score™

The Gap Score™ is the instrument that converts a deployment’s accountability from a matter of judgment into a defensible number, produced before go-live rather than reconstructed after an outcome. Where the nine blocks diagnose what a deployment requires, the Gap Score™ prices whether the deployment is ready to sign off. It is scored across the three layers that must converge for a clinical AI decision to hold: AI Governance, Clinical AI Governance, and Decision Accountability.

Each layer answers one question, and each is scored from 1 to 5.

AI Governance, the enterprise layer, asks whether the enterprise can defend this. The readiness test is whether a documented enterprise governance trail exists for this specific deployment, not a policy that covers AI in general. A broad governance policy is not the same as a documented accountability trail for this model in this workflow, and the difference is the entire exposure the moment a regulator asks. This layer is scored on the governance trail, the audit documentation, and the regulatory alignment behind the specific deployment.

Clinical AI Governance, the patient safety layer, asks whether the deployment protects the patient. The readiness test is whether the training data has been validated against the actual patient population the model will serve. Edge cases, underrepresented populations, and clinical variable definitions must be audited before go-live, because a model trained on the wrong population fails the right one silently. This layer is scored on population validation, edge case testing, and clinical standard alignment.

Decision Accountability, the named owner layer, asks whether the owners are named. The readiness test is whether both named owners exist before the first output reaches a clinician: a Governance Owner accountable for the decision to deploy, and a Decision Owner accountable for the individual patient call. Not a committee. Not a vendor contact. Two names, one deployment, one accountability record that carries both. The layer scores the functions as well as the names: Charter, Commission, and Cover held at the institutional seat; Decide, Document, and Defend held at the bedside. A seat that is named but holds none of its functions scores as unfilled. The Named Owner Principle holds that neither seat substitutes for the other, so this layer scores low when only one is filled and fails when neither is. If those names do not exist before go-live, the accountability gap is already open. This is the layer that carries the Named Owner Principle into the score. It is also the convergence point: the position the framework says the owners must hold.

The three scores sum to a total out of 15, and the total sets the institution’s position.

A total above 12 is defensible. The institution can name the owners, trace the decision, and defend the governance trail when it matters most. This is the convergence the canvas calls the Sweet Spot: named, traceable, and defensible at the center where all three layers overlap.

A total below 9 is unpriced liability. The institution is holding risk it has not measured and cannot defend, at exactly the point in the workflow where an underwriter has begun to price the exposure out of the policy. A total from 9 through 12 is the contested middle, defensible only with intervention before go-live.

The discipline is a gate, not a gauge. If the institution cannot complete the score across all three layers before go-live, the deployment is not ready. The Gap Score™ is scored before the first output reaches a clinician, because its entire purpose is to surface the gap while it can still be closed rather than after a patient outcome has made it visible.
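The scoring arithmetic and the gate can be sketched in a few lines. This is an illustration under stated assumptions: `gap_score` and `position` are hypothetical names, and totals of exactly 9 and 12 are read as falling in the contested middle, since the brief places "defensible" above 12 and "unpriced liability" below 9.

```python
from typing import Dict, Optional

LAYERS = ("AI Governance", "Clinical AI Governance", "Decision Accountability")


def gap_score(scores: Dict[str, Optional[int]]) -> Optional[int]:
    """Sum the three layer scores; return None if any layer is unscored (the gate)."""
    total = 0
    for layer in LAYERS:
        value = scores.get(layer)
        if value is None:
            return None  # score incomplete: the deployment is not ready
        if not 1 <= value <= 5:
            raise ValueError(f"{layer} must be scored from 1 to 5, got {value}")
        total += value
    return total


def position(total: Optional[int]) -> str:
    """Map a total out of 15 to the institution's position."""
    if total is None:
        return "not ready"
    if total > 12:
        return "defensible"
    if total < 9:
        return "unpriced liability"
    return "contested middle"  # assumption: 9 through 12 inclusive


scores = {"AI Governance": 4, "Clinical AI Governance": 3, "Decision Accountability": 2}
print(gap_score(scores), position(gap_score(scores)))
```

Returning `None` rather than a partial sum keeps the gate behavior explicit: an incomplete score is not a low score, it is no score.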

Each way the three layers fail to converge has a name, because each is a distinct exposure. When the enterprise and clinical layers are governed but no owner is named, the institution sits in the Legal Liability Gap: the framework exists, but accountability is exposed. When an owner is named but the governance trail will not hold, it sits in the Operational Gap: the decision is owned but not defensible. When the framework is present but accountability is absent at the bedside, it sits in the Regulatory Gap: the framework exists, accountability does not. Only the convergence of all three layers closes every gap at once, and that convergence is what a Gap Score™ above 12 certifies.

The nine blocks and the three-layer Gap Score™ are one framework with two instruments. The canvas is how an institution diagnoses its accountability, block by block, across the full surface a deployment exposes. The Gap Score™ is how it prices that accountability for the people who will ask, scored on the three layers that must converge at the bedside. Each scored layer is satisfied by a defined region of the canvas: the AI Governance score rests on the governance and trail blocks, the Clinical AI Governance score rests on Affected Populations and Clinical Promise, and the Decision Accountability score rests on the Named Owners block, where both the Governance Owner and the Decision Owner must be named. An institution fills the canvas to know where it stands and scores the three layers to defend it. Most institutions, scored honestly for the first time, land below 9. That is not a failure of the instrument. It is the first time the exposure has been priced before an adverse event, a regulator, or a renewal priced it for them.

The nine blocks surface the gap. The Gap Score™ prices it. Neither closes it. Closure is the work of TAG™: name the Governance Owner and the Decision Owner, activate the six functions across the two seats, and document the Handoff, the junction between the two owners where accountability either transfers cleanly or disappears. The canvas is the instrument. TAG™ is the outcome. The mechanics of the Handoff are treated in full in Working Paper №01.

What the canvas is not

Mind the 9 Blocks™ is frequently mistaken for instruments an institution already has. Each of those instruments is real. None of them is the canvas, and the difference is the point.

Not a compliance checklist. A checklist confirms that items were completed. The canvas measures whether a structure holds. A deployment can satisfy every line of a compliance checklist and still leave the bedside owner unnamed, the trail unreviewable, and the three layers short of convergence. The canvas exists to find the hollowness a checklist cannot see, which is why a block counts as filled only when the VALUE conditions hold behind it, not when a document exists in front of it.

Not general AI governance relabeled. The first three governance layers, Data, AI, and Healthcare AI, are largely general-purpose disciplines that apply to any AI deployment in any sector. The canvas operates at the fourth layer, where an AI-shaped recommendation reaches a patient and a clinician acts. An institution can hold a mature enterprise AI governance program and still have no answer at the bedside, because the fourth layer governs a decision the first three never reach. A scoping review of 77 healthcare AI governance frameworks, published in npj Digital Medicine, found that most were not applicable to real-world settings and that oversight mechanisms were the least common component, present in under a fifth of frameworks. The frameworks exist. The layer that operates them at the bedside is the one that does not, and that layer is what the canvas governs.

Not a single named owner. The Named Owner Principle requires two seats, a Governance Owner accountable for the decision to deploy and a Decision Owner accountable for the individual patient call, and neither substitutes for the other. The canvas is not a search for one accountable person. Concentrating both accountabilities in one name recreates the failure the principle exists to prevent: an institutional decision resting on a single clinician, or a bedside decision resting on an executive who was never in the room. The Decision Accountability layer scores low when only one seat is filled precisely because one owner is not the principle satisfied at half scale. It is the principle broken.
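The two-seat requirement can be expressed as a simple check. The sketch below is illustrative only: `seats_satisfied` is a hypothetical name, and comparing normalized name strings is an assumed stand-in for whatever identity record an institution actually keeps.

```python
from typing import Optional


def seats_satisfied(governance_owner: Optional[str],
                    decision_owner: Optional[str]) -> bool:
    """True only if both seats are filled, and by different names."""
    g = (governance_owner or "").strip()
    d = (decision_owner or "").strip()
    if not g or not d:
        # One seat filled is the principle broken, not half satisfied.
        return False
    # Assumption: a case-insensitive name match means the same person.
    return g.casefold() != d.casefold()
```

Under this check an empty seat and a single person holding both seats fail in the same way, which mirrors the point that neither seat substitutes for the other.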

Not a substitute for clinical judgment. The Decision Owner is accountable for the call, not relieved of it. The canvas records who exercised clinical judgment on an AI-shaped recommendation. It does not replace that judgment with the model’s output, and it does not lower the standard the clinician is held to. A filled canvas makes a clinician’s judgment auditable. It does not make it optional.

Not satisfied by the score alone. The Gap Score™ prices a deployment’s accountability, but a number is not the accountability itself. An institution that produces a defensible score on paper while the underlying blocks fail the VALUE test has manufactured a false score, and a false score is more dangerous than no score, because it converts an unmeasured gap into a documented claim the institution cannot back. The score is only as honest as the assessment behind it, which is why the discipline is to score before go-live, when the gap can still be closed, rather than after an outcome has made the false score visible.

Not a one-time assessment. The canvas is run, not filled. Named owners rotate, deployments expand beyond the scope their owners were named for, and a model updated without notice can invalidate the validation behind the Clinical AI Governance layer. A deployment that scored above 12 at launch does not announce the day its score goes stale. The CLOSE discipline exists to be re-run at every rotation, every expansion, and every workflow change, because a score that was defensible once describes accountability that may no longer exist.
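The re-run rule lends itself to a small staleness check. This is a sketch under stated assumptions rather than part of the CLOSE discipline as published: the `Event` enum and `score_is_stale` are hypothetical names, treating a model update as a trigger is an inference from the paragraph above, and `OTHER` is a placeholder for events the brief does not list.

```python
from enum import Enum
from typing import Iterable


class Event(Enum):
    OWNER_ROTATION = "owner rotation"
    SCOPE_EXPANSION = "scope expansion"
    WORKFLOW_CHANGE = "workflow change"
    MODEL_UPDATE = "model update"  # assumed trigger, see lead-in
    OTHER = "other"                # placeholder, not a trigger


# Events after which the last score no longer describes the deployment.
RERUN_TRIGGERS = frozenset({
    Event.OWNER_ROTATION,
    Event.SCOPE_EXPANSION,
    Event.WORKFLOW_CHANGE,
    Event.MODEL_UPDATE,
})


def score_is_stale(events_since_last_score: Iterable[Event]) -> bool:
    """True if any logged event since the last score calls for a re-run."""
    return any(e in RERUN_TRIGGERS for e in events_since_last_score)
```

A deployment with no logged events keeps its score. A single rotation, expansion, or workflow change is enough to mark the score stale, whatever the score was at launch.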

Not a vendor’s responsibility. The vendor can warrant the model, indemnify the contract, and support the deployment, but the vendor is not in the room when the recommendation reaches the patient and is not accountable for the clinical call. The canvas measures the institution’s accountability, not the vendor’s product. Contractual vendor accountability does not fill the Decision Accountability layer, because the named owner the layer requires is an institutional name, not a support contact.

Frequently asked questions

What is Mind the 9 Blocks™? Mind the 9 Blocks™ is a diagnostic canvas that surfaces and scores the clinical AI accountability gap inside a health system. It arranges the accountability requirements of any clinical AI deployment into nine blocks across three bands, Diagnose, Assess, and Close, and it pairs that canvas with a Gap Score™ that prices a deployment's readiness before go-live. The canvas diagnoses; the score defends.

Who is the framework for? It is built for the Chief Medical Officer, the Chief Medical Information Officer, the General Counsel, and the board member who carries a clinical AI deployment on the institutional record. These are the people who answer when a regulator, an underwriter, or a plaintiff asks who approved the system and who was accountable for the decision it shaped.

How is Clinical AI Governance different from general AI Governance? General AI Governance protects the institution's operational decisions. Clinical AI Governance protects the patient decision at the bedside. They are different disciplines governing different decisions, and most institutions apply the general one to a clinical system and never build the fourth layer the bedside requires. The first block of the canvas exists to force that distinction before anything else is assessed.

What is the Named Owner Principle, and why two owners? The Named Owner Principle holds that every clinical AI deployment requires two named persons in the audit trail: a Governance Owner at the institutional level, accountable for the decision to deploy, and a Decision Owner at the bedside, accountable for the individual patient call. Neither substitutes for the other. One owner alone leaves either an institutional decision resting on a single clinician or a bedside decision resting on an executive who was never in the room.

How does an institution score its gap? The Gap Score™ rates the three layers a clinical AI decision must satisfy, AI Governance, Clinical AI Governance, and Decision Accountability, each from 1 to 5, for a total out of 15. A deployment scored above 12 is defensible: the institution can name the owners, trace the decision, and defend the governance trail. A deployment below 9 is unpriced liability. The score is taken before go-live, because its purpose is to surface the gap while it can still be closed.

How does an institution close the gap once it is scored? Through the CLOSE discipline, applied to each deployment: Categorize it as clinical AI, Locate the point where the recommendation meets the clinical decision, Own both named seats, Score the three layers, and Evaluate the trail the deployment produces. CLOSE is a loop, not a one-time fix, re-run whenever owners rotate, deployments expand, or workflows change.

What is the cost of leaving the gap unmeasured? The parties whose business is pricing risk have already moved. Effective January 2026, the Insurance Services Office introduced an endorsement excluding injury arising from generative AI, and W. R. Berkley introduced an absolute AI exclusion for the directors-and-officers, errors-and-omissions, and fiduciary lines that protect the executives who approve deployments. When an exclusion applies, the institution holds the AI risk itself, unindemnified. The renewal cycle has become an audit of accountability the institution has not yet been asked to demonstrate by anyone else.

Funding

None declared.

Conflicts of interest

Mo Johnson is the founder of GPe Research. The Clinical AI Accountability Canvas™, Mind the 9 Blocks™, the Gap Score™, the Named Owner Principle, and MedicoVigilance™ are works and marks of GPe Research. The author has a commercial interest in the adoption of the frameworks described in this brief.

Published under CC-BY-4.0. Free to share and adapt with attribution.

How to cite

Johnson, M. (2026). *Mind the 9 Blocks™: A diagnostic canvas for clinical AI accountability* (Framework Brief №01). GPe Research Publications. https://publications.gperesearch.com/papers/mind-the-9-blocks

@techreport{johnson2026mind9blocks,
  author      = {Johnson, Mo},
  title       = {Mind the 9 Blocks: A Diagnostic Canvas for Clinical AI Accountability},
  institution = {GPe Research Publications},
  type        = {Framework Brief},
  number      = {№01},
  year        = {2026},
  month       = {5},
  url         = {https://publications.gperesearch.com/papers/mind-the-9-blocks}
}
