When AI Agents Talk to Each Other: Governance Questions Nonprofit Boards Haven't Asked Yet
Multi-agent AI systems are spreading across the nonprofit sector, with automated systems delegating tasks to one another and little human visibility into what those systems are deciding, accessing, or communicating. Most nonprofit boards are not asking the right questions, and the governance gaps this creates carry real liability, mission, and fiduciary consequences.

Most nonprofit leaders understand AI as a tool that answers questions or drafts documents, something a staff member uses and then reviews before acting. Multi-agent AI systems are categorically different from this model. In a multi-agent architecture, several AI components each perform specialized tasks: one researches donor prospects, one drafts outreach messages, one updates the CRM, and one schedules follow-ups. These agents delegate instructions to each other in real time, producing outputs and initiating actions with no human step between them. The conversation happens between machines, and unless specific governance structures are in place, no one at the organization may know exactly what was decided or why.
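For readers who want to see the shape of this concretely, the following is a minimal sketch, in Python, of how such a pipeline hands work from agent to agent with no human step in between. The agent names, data, and functions are hypothetical stand-ins, not any particular vendor's framework; in a real deployment each function would wrap an AI model or API call.

```python
# Illustrative sketch of a multi-agent pipeline: four hypothetical agents hand
# work to one another with no human review step between them. Agent internals
# are stubbed out; in practice each call would be an LLM or API call.

def research_agent(donor_name: str) -> dict:
    # Gathers prospect data (stubbed).
    return {"donor": donor_name, "giving_history": "unknown", "notes": "stub data"}

def drafting_agent(prospect: dict) -> str:
    # Turns prospect data into an outreach message (stubbed).
    return f"Dear {prospect['donor']}, thank you for your past support..."

def crm_agent(prospect: dict, message: str) -> str:
    # Writes the draft into the CRM and returns a record id (stubbed).
    return f"crm-record-for-{prospect['donor']}"

def scheduling_agent(record_id: str) -> str:
    # Books a follow-up task against the CRM record (stubbed).
    return f"follow-up scheduled for {record_id}"

def run_pipeline(donor_name: str) -> str:
    # Each agent's output becomes the next agent's input. Note what is missing:
    # no human checkpoint, and nothing records why each step decided what it did.
    prospect = research_agent(donor_name)
    message = drafting_agent(prospect)
    record_id = crm_agent(prospect, message)
    return scheduling_agent(record_id)

print(run_pipeline("Jordan Example"))
```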
According to the 2026 Nonprofit AI Adoption Report, 92% of nonprofits now use AI in some capacity, yet fewer than 10% have formal AI governance policies at the board level. This gap was tolerable when AI meant a staff member using a chatbot. It becomes legally and ethically significant when AI agents are making or influencing decisions about donor communications, service eligibility determinations, or financial reporting, without a human in the loop and without a board that knows it is happening.
Gartner projects that 40% of enterprise applications will include task-specific AI agents by the end of 2026. The nonprofit sector is part of this trajectory. Organizations that have moved into multi-agent territory, whether knowingly or through the gradual accumulation of connected AI tools, are operating systems that carry governance questions their boards have not yet considered. This article is designed to surface those questions, explain the risks behind them, and help nonprofit leaders build the oversight structures that responsible AI deployment now requires.
This piece connects to broader questions of AI risk management for boards, board-level AI oversight frameworks, and the specific operational mechanics of multi-agent workflow patterns that nonprofit programs are beginning to deploy.
What Changes When Agents Start Talking to Each Other
The fundamental governance challenge of multi-agent AI is that the decision chain becomes invisible. When a staff member uses an AI tool to draft a grant report and then reviews and submits it, the human is the final accountable actor. When an AI grant research agent pulls funder data, passes it to a drafting agent that writes a proposal, which a review agent checks for compliance before routing it to a submission agent, the human may see only the finished output, if they see it at all. The intermediate decisions (what information was prioritized, how the compliance check was framed, what assumptions the drafting agent made) are buried inside agent-to-agent communications that may not be logged anywhere.
This architectural shift matters for boards because it changes what "oversight" means. Traditional AI governance assumes that humans are in the loop at consequential decision points. Multi-agent systems often eliminate those checkpoints by design, since removing human bottlenecks is part of their efficiency value. The question boards must ask is not just whether humans are reviewing AI outputs, but whether the organization has identified which decisions require human review and built that requirement into how its agent systems actually function.
The State of AI Agent Security 2026 Report found that only 14.4% of organizations have full security or IT approval for their entire agent fleet, and that 47.1% of organizations actively monitor or secure fewer than half of their deployed AI agents. In the nonprofit context, this means the majority of organizations running multi-agent systems have significant portions of those systems operating without any logging, security review, or governance oversight. The agents are acting; the board does not know it.
Single AI Tool (Traditional)
Human-in-the-loop at each step
- Staff member prompts the tool
- Tool produces output for review
- Human decides whether to act
- Clear accountability at each decision point
- Logs capture what the human did
Multi-Agent System (New Reality)
Agents delegate to agents autonomously
- Human initiates a task, then steps back
- Multiple agents communicate between themselves
- Actions may be taken before any human review
- Accountability chain is blurred or absent
- Agent-to-agent communications often unlogged
The Governance Questions Boards Are Not Asking
The following categories of questions, drawn from emerging governance frameworks published by the World Economic Forum, Forvis Mazars, ISACA, and Singapore's IMDA, are the ones most nonprofit boards have not yet asked about their AI agent deployments. These are not hypothetical concerns. They are the questions regulators, funders, and courts are beginning to require organizations to answer.
Questions About Agent Identity and Authorization
Who or what is acting, and who said it could?
Most multi-agent systems treat authorization as a one-time setup task: a staff member or IT administrator connects the tools, grants permissions, and the system runs. But what happens as agents evolve, as vendors update capabilities, or as one agent delegates tasks to a sub-agent that did not exist when the original permissions were granted? The State of AI Agent Security 2026 Report found that 45.6% of organizations still rely on shared API keys for agent-to-agent authentication. When agents share credentials, it becomes impossible to determine which agent took which action, a fundamental problem for any post-incident investigation or audit. A sketch of what per-agent identity and provisioning could look like follows the questions below.
- Does each AI agent have its own verifiable digital identity, or do agents share credentials?
- Who authorized each agent to act, and is that authorization documented?
- When one agent instructs another, does any human see that instruction?
- Can agents grant each other permissions that exceed what any human has approved?
- When ephemeral sub-agents are spun up to complete a task, are they provisioned and deprovisioned with audit logs?
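For organizations whose technical staff want a starting point, here is a minimal sketch of what per-agent identity could look like: each agent receives its own identifier and scoped permissions, and every provisioning and deprovisioning event is logged against the named human who authorized it. The function names and fields are hypothetical, not a reference to any specific identity product.

```python
# Illustrative sketch: each agent gets its own identity and scoped permissions,
# and every provisioning or deprovisioning event is written to an audit log.
import uuid
from datetime import datetime, timezone

AUDIT_LOG: list[dict] = []

def provision_agent(name: str, authorized_by: str, permissions: list[str]) -> dict:
    agent = {
        "agent_id": str(uuid.uuid4()),   # unique identity, never a shared key
        "name": name,
        "authorized_by": authorized_by,  # the named human who approved it
        "permissions": permissions,
    }
    AUDIT_LOG.append({
        "event": "provisioned",
        "agent_id": agent["agent_id"],
        "authorized_by": authorized_by,
        "permissions": permissions,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return agent

def deprovision_agent(agent: dict, reason: str) -> None:
    AUDIT_LOG.append({
        "event": "deprovisioned",
        "agent_id": agent["agent_id"],
        "reason": reason,
        "at": datetime.now(timezone.utc).isoformat(),
    })

# An ephemeral sub-agent: created for one task, retired afterward, both logged.
sub_agent = provision_agent("grant-data-fetcher", "jane.doe@example.org", ["read:grants"])
deprovision_agent(sub_agent, "task complete")
print(AUDIT_LOG)
```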
Questions About Decision Chains and Accountability
When something goes wrong, who is responsible?
ISACA notes that agentic AI systems often lack traceability in their decision processes. When an agent makes a decision that causes harm, an inaccurate grant report, a wrongly denied service request, or a communication sent to a major donor that should never have gone out, the organization faces an accountability question it may not be able to answer: how did that decision happen, and who is responsible for it? Under the traditional duty of care, board members cannot claim willful ignorance of how the organization operates, and legal analysis from Forvis Mazars in early 2026 concludes that willful ignorance of how the organization uses AI, particularly when it involves donor data, beneficiary services, or public communications, could expose directors to personal liability if harm results. A sketch of how a decision chain could be reconstructed from a well-kept log follows the questions below.
- When an AI agent makes a decision affecting a beneficiary, who is accountable?
- Can the organization reconstruct, step by step, how any particular decision was reached?
- If an agent acts on incorrect information from another agent, who is responsible for the outcome?
- Are there decisions being made autonomously that the board assumed required human sign-off?
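As a rough illustration of what traceability makes possible, the sketch below assumes every agent action is logged with the decision it belongs to, the agent that took it, and the basis for it; reconstructing the chain then becomes a simple query. The log format and field names are hypothetical.

```python
# Illustrative sketch: if every agent action is logged with the decision it
# belongs to, the chain behind any outcome can be reconstructed on demand.
events = [
    {"decision_id": "d-101", "step": 1, "agent": "research", "action": "pulled funder data", "basis": "funder database query"},
    {"decision_id": "d-101", "step": 2, "agent": "drafting", "action": "wrote proposal", "basis": "used research output"},
    {"decision_id": "d-101", "step": 3, "agent": "review", "action": "approved compliance", "basis": "matched eligibility rules"},
]

def reconstruct(decision_id: str, log: list[dict]) -> list[str]:
    """Return the step-by-step chain behind a given decision."""
    steps = sorted((e for e in log if e["decision_id"] == decision_id), key=lambda e: e["step"])
    return [f'{e["step"]}. {e["agent"]}: {e["action"]} (basis: {e["basis"]})' for e in steps]

for line in reconstruct("d-101", events):
    print(line)
```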
Questions About Data Flows and Beneficiary Privacy
What are the agents accessing, combining, and transmitting?
When multiple AI agents operate across an organization's systems, each handling different data types and connecting to different platforms, the cumulative data exposure can be far larger than any individual tool's access would suggest. An agent that reads case notes to generate a service summary and passes it to a scheduling agent, which logs it in a CRM that a third-party vendor also accesses, has created a data flow that may not match what the organization disclosed in its privacy policy or donor data stewardship commitments. This is particularly acute in organizations serving vulnerable populations, where case files, health information, or immigration status may flow through agent pipelines. A sketch of how such flows could be checked against a declared policy follows the questions below.
- What beneficiary data are agents accessing, combining, or transmitting to each other?
- Is any of that data being sent outside organizational systems, to third-party APIs, or model providers?
- Do donor records, clinical notes, or case files flow through agent pipelines?
- Does the organization's privacy policy accurately reflect how agents are using constituent data?
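One lightweight pattern, sketched below with hypothetical agent names and data categories, is to declare what each agent may access and where it may send data, then flag any flow that exceeds that declared scope. This is an assumption-laden illustration, not a substitute for a real data-protection review.

```python
# Illustrative sketch: declare which data categories each agent may handle and
# which destinations are approved, then flag any flow that exceeds that scope.
ALLOWED = {
    "case-summary-agent": {"categories": {"case_notes"}, "destinations": {"internal_crm"}},
    "scheduling-agent":   {"categories": {"appointments"}, "destinations": {"internal_crm"}},
}

def check_flow(agent: str, category: str, destination: str) -> str:
    policy = ALLOWED.get(agent)
    if policy is None:
        return f"FLAG: {agent} is not in the approved inventory"
    if category not in policy["categories"] or destination not in policy["destinations"]:
        return f"FLAG: {agent} sending {category} to {destination} exceeds its approved scope"
    return "ok"

# A scheduling agent forwarding case notes to a third-party API should be flagged.
print(check_flow("scheduling-agent", "case_notes", "third_party_api"))
```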
Questions About Scope Creep, Autonomy, and Failure
What are the agents doing beyond their original mandate?
The Cooperative AI Foundation's 2025 technical paper on multi-agent risks identifies three primary failure modes: miscoordination (agents produce results no human intended), conflict (competing agent objectives generate harmful outputs), and collusion (agents coordinate in ways that serve emergent system goals rather than organizational mission). These are not theoretical abstractions. They are structural tendencies of systems whose agents optimize independently, without shared human-defined constraints. When agents encounter situations outside their original parameters, they often extrapolate in ways their designers did not anticipate, and without human checkpoints those extrapolations can propagate through the system before anyone notices. A sketch of an escalation guard that stops extrapolation at the parameter boundary follows the questions below.
- Are any agents doing more than they were originally authorized to do?
- Has any agent been granted expanded capabilities or data access since the board last reviewed AI systems?
- What happens when an agent encounters a situation outside its original parameters?
- Does the organization have a tested emergency stop mechanism for its AI pipelines?
- Has the organization defined what "harm" looks like in its specific AI context?
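One way to blunt the extrapolation problem, sketched below with a made-up disbursement example, is to give each agent explicit parameter boundaries and route anything outside them to a human queue instead of letting the agent guess. The limits and function names are illustrative only.

```python
# Illustrative sketch: instead of letting an agent extrapolate when a request
# falls outside its authorized parameters, route the request to a named human.
HUMAN_REVIEW_QUEUE: list[dict] = []

def handle_disbursement(amount: float, approved_limit: float = 500.0) -> str:
    if amount > approved_limit:
        # Outside original parameters: stop and escalate rather than guess.
        HUMAN_REVIEW_QUEUE.append({"type": "disbursement", "amount": amount})
        return "escalated to human review"
    return f"processed disbursement of {amount:.2f}"

print(handle_disbursement(120.00))   # within parameters
print(handle_disbursement(2500.00))  # escalated, not extrapolated
```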
Four Accountability Gaps Specific to Nonprofits
The governance challenges of multi-agent AI are not unique to nonprofits, but several accountability gaps are particularly acute in the nonprofit context because of the sector's specific legal structures, constituent relationships, and resource constraints.
The Fiduciary Duty Gap
Nonprofit board members operate under a duty of care that requires informed decision-making. This duty does not disappear when decisions are made by AI agents; it shifts to requiring that board members understand how AI is being used and whether adequate oversight exists. Forvis Mazars' 2026 analysis of nonprofit AI governance notes that boards must now treat AI governance as a fiduciary responsibility, not a technical matter delegated entirely to staff. The standard being established is that willful ignorance of how the organization uses AI, particularly when that AI affects donor data, beneficiary services, or public communications, can expose directors to personal liability if harm results and the board never inquired.
The practical implication is that board members who have never been briefed on what AI agents their organization is running, what decisions those agents influence, and what oversight exists are not in compliance with their fiduciary obligations under an emerging standard of care. This is not about technical expertise; it is about whether the board is asking the right questions of management.
The Grant Reporting Liability Gap
AI agents used to assist with grant reporting create a specific form of liability that nonprofit boards rarely consider. When an agent researches program outcomes, synthesizes data, and drafts a funder report, it may introduce hallucinated statistics, fabricated quotes attributed to program participants, or inaccurate descriptions of how restricted funds were used. If the organization submits a grant report containing AI-generated inaccuracies and the funder discovers this, the consequences can include grant clawback, termination of the funding relationship, and reputational damage in a funder community that shares information extensively.
The board bears fiduciary responsibility for the accuracy of grant reports, and that responsibility is not satisfied by the existence of a human who nominally reviewed an AI-drafted document. It requires that the review process includes specific verification of any data claims, citations, or outcome statistics that an AI system generated or synthesized. Organizations that have not established this verification requirement as a documented policy are operating with an unaddressed liability.
The Beneficiary Harm Gap
When AI agents make or inform eligibility decisions for services, housing placements, case prioritization, or resource allocation, biases embedded in training data or optimization objectives can systematically disadvantage the populations the nonprofit exists to serve. Unlike a biased human decision-maker, a biased AI agent applies its bias consistently and at scale, potentially affecting every person who flows through a particular process. The board may have no visibility into whether this is happening, particularly if no bias audit has been conducted and no reporting mechanism exists.
State-level regulation is beginning to address this directly. Colorado's AI Act, effective June 2026, requires organizations using AI in consequential decisions affecting Colorado residents to conduct impact assessments and take reasonable care to avoid algorithmic discrimination. California's AB 316, effective January 1, 2026, explicitly precludes organizations from using an AI system's autonomous operation as a defense to liability claims. The board's exposure in these cases is not hypothetical; it is increasingly statutory.
The Vendor Liability Shift Gap
Baker Donelson's 2026 AI Legal Forecast identifies a significant trend in vendor contracts: liability for autonomous agent actions is increasingly being shifted onto the deploying organization rather than the AI provider. The vendor that built the model or the agent framework typically includes broad disclaimers and indemnification provisions that exclude the vendor's liability for how the customer chooses to deploy the system. This means the nonprofit that deploys an agent, regardless of who built the underlying model, is legally responsible for what that agent does.
Most nonprofit boards have not reviewed their AI vendor contracts through this lens. When did the board last ask whether its AI vendor contracts include indemnification for autonomous agent actions, hallucinations, or data exposure? What is the organization's contractual right to audit the vendor's AI systems? Who is named as liable when a vendor's agent causes harm to a program participant? These are fiduciary questions that require board attention, not just management judgment.
What "Human Oversight" Must Actually Mean in an Agentic World
The phrase "human oversight" appears in nearly every AI governance framework, but it is often left undefined in ways that allow organizations to claim compliance while their agents operate with minimal actual human involvement. Singapore's IMDA Model AI Governance Framework for Agentic AI, released in 2025, introduces a more precise concept called Meaningful Human Control (MHC), which it defines as the unity of three elements: human understanding of what the AI is doing, the capacity to intervene in real time, and traceability of responsibility for agent actions.
This definition is more demanding than most organizations realize. It is not enough for a human to be available to stop an agent if something goes wrong; the human must also understand enough about what the agent is doing to recognize when something has gone wrong. It is not enough to log agent outputs; the logs must capture the reasoning behind decisions, not just the actions taken. And it is not enough to name a staff member as responsible for AI oversight; that person must have the actual capacity, time, authority, and information needed to exercise meaningful control.
For nonprofit boards, this reframes what they should be asking management to demonstrate. The question is not "do we have human oversight of our AI?" but rather "can management demonstrate meaningful human control, as defined by the capacity to understand, intervene, and assign responsibility, for each AI agent system we operate?" A sketch of one such control point, an approval gate on irreversible actions, follows the lists below.
What Meaningful Control Requires
- Real-time ability to halt agent actions, not just review them afterward
- Traceability that assigns every agent action to a named human who authorized it
- Human approval required before irreversible actions (emails sent, records changed)
- Audit logs capturing the "why" behind decisions, not just the "what"
- Separate verifiable identity for each agent, no shared credentials
- Documented and tested emergency shutdown procedure for all agents
What Is Not Sufficient
- A staff member who is nominally responsible but lacks time or information to exercise control
- Logging outputs without logging the reasoning behind them
- Claiming human review occurs when the review is of final outputs only
- Relying on the vendor's own audit tools as the sole source of oversight
- Annual policy review without continuous monitoring of agent behavior
- Shared credentials or API keys across multiple agents
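For teams looking for a concrete starting point, the sketch below shows one form a control point on irreversible actions could take: nothing is sent or changed until a named human approves it, and the action, the rationale, and the approver are all logged. The names and fields are hypothetical.

```python
# Illustrative sketch of a human approval gate: an irreversible action (sending
# an email, changing a record) executes only after a named human approves it,
# and the approval decision is logged along with the reasoning.
from datetime import datetime, timezone

APPROVAL_LOG: list[dict] = []

def gated_action(action: str, rationale: str, approver: str, approved: bool) -> str:
    # 'approved' stands in for whatever real channel the approver uses
    # (ticket, dashboard, email reply); the point is that nothing irreversible
    # happens until a named human has said yes, and that answer is recorded.
    APPROVAL_LOG.append({
        "action": action,
        "rationale": rationale,
        "approver": approver,
        "approved": approved,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    if not approved:
        return f"'{action}' held for human revision"
    return f"'{action}' executed after approval by {approver}"

print(gated_action("send donor email", "agent-drafted acknowledgment",
                   "dev.director@example.org", approved=False))
```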
Governance Frameworks Boards Can Apply Now
Several governance frameworks have emerged in 2025 and 2026 that nonprofit boards can draw on to structure their oversight of agentic AI. None of these require deep technical expertise to apply at the board level; they are governance frameworks, designed for organizational accountability rather than engineering decisions.
The NIST AI Risk Management Framework remains the leading U.S. standard for AI governance. Updated with agentic AI guidance for 2026, it provides a four-function structure (Govern, Map, Measure, Manage) that organizations can apply to their AI systems regardless of technical complexity. The Govern function is particularly relevant for boards: it establishes that accountability, policies, processes, and a risk-aware organizational culture must be in place before AI systems are deployed, not after an incident occurs.
The TM Forum AI Autonomy Governance Framework offers a 10-pillar structure applicable to any organization deploying agentic AI: risk boundaries, human accountability, technical safeguards, user literacy, data governance, transparency and auditability, continuous monitoring, ethical design, regulatory compliance, and organizational culture. Boards can use these pillars as a checklist for the questions they should be asking management to address, without needing to understand the underlying technical implementation.
For resource-constrained nonprofits, a simplified three-tier risk approach is emerging as a practical starting point. Low-risk agents, those that draft documents for human review or analyze data without taking action, require baseline logging and monitoring. Medium-risk agents, those that send communications or update records on behalf of the organization, require enhanced controls and regular audits. High-risk agents, those that influence eligibility decisions, financial transactions, or public-facing representations, require mandatory human-in-the-loop checkpoints and impact assessments before deployment.
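A simple way to operationalize that tiering, sketched below in Python with tier names and control lists paraphrased from the paragraph above, is to record each agent's tier and derive the minimum required controls from it.

```python
# Illustrative sketch of the three-tier approach: each agent is assigned a risk
# tier, and the tier determines the minimum controls required before deployment.
RISK_TIERS = {
    "low":    {"examples": "drafts for human review, analysis only",
               "required": ["baseline logging", "monitoring"]},
    "medium": {"examples": "sends communications, updates records",
               "required": ["enhanced controls", "regular audits"]},
    "high":   {"examples": "eligibility, financial, public-facing decisions",
               "required": ["human-in-the-loop checkpoint", "impact assessment before deployment"]},
}

def controls_for(agent_name: str, tier: str) -> str:
    required = ", ".join(RISK_TIERS[tier]["required"])
    return f"{agent_name} ({tier} risk) requires: {required}"

print(controls_for("benefit-eligibility-assistant", "high"))
```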
Eight Actions Boards Should Take Now
The following actions represent a practical governance agenda that nonprofit boards can begin implementing now, without requiring technical expertise or significant new resources. Each is calibrated to what boards can actually do: ask questions, require reporting, and establish policies that management must implement.
Commission an AI Inventory
Request that management produce a complete list of every AI tool and agent the organization uses, what data each accesses, what decisions each influences, and whether each has been reviewed by IT or security. This baseline is necessary before any other governance action is meaningful.
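A minimal inventory can live in a spreadsheet; for organizations that prefer a structured record, the sketch below shows one possible shape, with hypothetical field names. The useful output is the list of tools that have no security review or no named owner.

```python
# Illustrative sketch of an AI inventory: one record per tool or agent,
# capturing what it accesses, what it influences, and whether it was reviewed.
from dataclasses import dataclass, asdict

@dataclass
class AIInventoryRecord:
    name: str
    vendor: str
    data_accessed: list[str]
    decisions_influenced: list[str]
    security_reviewed: bool
    accountability_owner: str  # a named staff member, or "unassigned"

inventory = [
    AIInventoryRecord(
        name="grant-report-drafter",
        vendor="Example Vendor (hypothetical)",
        data_accessed=["program outcomes", "restricted fund ledger"],
        decisions_influenced=["funder report content"],
        security_reviewed=False,
        accountability_owner="unassigned",
    ),
]

# Records with no review or no owner are exactly the gaps the board needs to see.
print([asdict(r) for r in inventory if not r.security_reviewed or r.accountability_owner == "unassigned"])
```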
Add AI Governance as a Standing Board Agenda Item
The shift from annual policy review to continuous oversight is now a governance expectation, not just best practice. At a minimum, each meeting should include a brief update on new AI deployments, incidents or near-misses, and whether vendor contracts have changed.
Establish a Human-in-the-Loop Policy for Beneficiary Decisions
Any AI-assisted decision affecting service eligibility, benefit determination, housing placement, or resource allocation must require documented human review before taking effect. This policy should be written, approved by the board, and verified by management on a regular basis.
Require Audit Trail Certification for High-Stakes Systems
For every AI system handling donor data, beneficiary records, or financial transactions, management must certify in writing that a complete, attributable audit trail exists and that it captures not just what the agent did, but why the agent made the decisions it did.
Review AI Vendor Contracts for Liability Provisions
Every AI vendor contract should be reviewed for indemnification clauses covering autonomous agent actions, hallucinations, and data exposure. The board should understand whether the organization, rather than the vendor, is carrying the primary liability for agent behavior.
Require Bias Audits for Agents Influencing Beneficiary Outcomes
Any AI system that influences who receives services, at what level, or in what priority should undergo regular bias testing, with results reported to the board. This is not optional in jurisdictions covered by Colorado's AI Act or similar state regulations.
Name Accountability Owners for Each AI System
For each AI agent or system, a specific named staff member must be responsible for monitoring, incident escalation, and reporting to the board. Diffuse responsibility is the equivalent of no responsibility when something goes wrong.
Test the Emergency Stop
Ask management to demonstrate that the organization has a documented and tested process for immediately halting all AI agent activity in the event of an incident. If this process does not exist or has never been tested, establish a timeline for creating and testing it.
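As a rough illustration of what a tested stop looks like, the sketch below uses a single flag that every agent checks before acting, plus a drill that engages the stop and confirms agents are actually blocked. A real deployment would need the flag to live in shared infrastructure the agents cannot bypass; the code here is a hypothetical stand-in.

```python
# Illustrative sketch of an emergency stop: a single flag every agent checks
# before acting, and a drill that proves the stop actually works.
EMERGENCY_STOP = {"halted": False}

def halt_all_agents(reason: str) -> None:
    EMERGENCY_STOP["halted"] = True
    print(f"EMERGENCY STOP engaged: {reason}")

def agent_act(agent_name: str, action: str) -> str:
    if EMERGENCY_STOP["halted"]:
        return f"{agent_name}: blocked, emergency stop is engaged"
    return f"{agent_name}: {action}"

# The drill the board should ask to see: engage the stop, confirm agents are blocked.
print(agent_act("outreach-agent", "sending scheduled donor emails"))
halt_all_agents("quarterly emergency-stop test")
print(agent_act("outreach-agent", "sending scheduled donor emails"))
```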
Questions to Ask at Every Board Meeting
Beyond periodic reviews and policy work, effective AI governance requires that boards build AI accountability into their standing meeting practice. The following questions, if asked consistently, create an accountability rhythm that prevents the governance gaps described in this article from developing.
- Has any AI agent taken an action in the last quarter that no human reviewed before it occurred?
- Have any agents been granted new capabilities or data access since our last meeting?
- Have there been any AI-related incidents, near-misses, or anomalies?
- Are our vendor contracts current with 2026 liability standards, particularly for autonomous agent actions?
- What percentage of our AI agents are covered by security monitoring and audit logging?
- Has any AI system influenced a decision affecting beneficiary eligibility or resource allocation without documented human review?
Conclusion
The governance gap in multi-agent AI is not primarily a technology problem. It is an organizational accountability problem, and it sits at the board level. When AI agents communicate with each other, delegate tasks to each other, and take actions that affect beneficiaries, donors, and funders, the organization's traditional governance structures do not automatically extend to cover those actions. Boards must actively build oversight structures that match the actual architecture of the AI systems the organization is running.
The good news is that meaningful governance does not require technical expertise. It requires asking the right questions, establishing clear accountability requirements, reviewing vendor contracts with new eyes, and building AI reporting into the rhythm of board meetings. The frameworks exist. The legal standards are emerging. The cost of not acting is rising. Boards that take AI governance seriously now will be better positioned to protect their organizations, their beneficiaries, and themselves when the governance failures of this period of rapid agentic AI adoption eventually become apparent.
For organizations looking to connect this board-level governance work to operational implementation, the related discussions on AI risk registers, audit trail compliance, and communicating AI risks to the board provide the practical scaffolding that governance intentions require to become governance realities.
Build the Governance Structures Your AI Needs
One Hundred Nights helps nonprofit boards and leadership teams understand their AI exposure and design governance frameworks that match the reality of how their AI systems actually operate.
