This post uses the 0.x package names and APIs. For the current release, see the 1.x docs and the migration guide.
@eigenpal/docx-editor-agents is the integration layer between AI agents and .docx documents. Fourteen tools (read, find, comment, suggest, format, scroll) in OpenAI function-calling format, anchored against the stable Word paraId so coordinates survive multi-step loops. One catalog, three transports: live inside a React <DocxEditor>, headless against a parsed buffer in Node, or exposed over the Model Context Protocol.
This post covers the supported integration shapes, the paraId addressing model, and the framework adapters.
Common integrations
The integration surface is the same regardless of agent role; what differs is which tools the deployment exposes and what the system prompt tells the agent to do.
- Contract review: read + add_comment + suggest_change. The agent leaves a tracked redline; an attorney accepts or rejects it in Word, Google Docs, or the embedded editor.
- Compliance scans: read-only (include: ['read_document', 'find_text', 'read_comments']). The agent flags PII, policy violations, or missing clauses as comments anchored to the offending phrase.
- Document copilots: full catalog. A chat panel next to the user's draft, scoped via include/exclude to whatever subset of the surface fits the product.
- Word add-in alternative: for developers already on Office.js. The toolkit mirrors Range.insertComment, comment.reply, body.search, and range.scrollIntoView on the web; most call sites port across without rewriting.
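The role-to-tools mapping described in the list above can be sketched as plain data. The tool names come from the catalog in this post; the profile names and the isReadOnlyProfile helper are hypothetical, written only to show how one allow-list per role might look.

```typescript
// Tool names are taken from the catalog table in this post. The grouping
// into named profiles is an illustrative assumption, not a shipped preset.
const READ_TOOLS = [
  "read_document", "read_selection", "read_page", "read_pages",
  "find_text", "read_comments", "read_changes",
] as const;

const MUTATE_TOOLS = [
  "add_comment", "suggest_change", "apply_formatting",
  "set_paragraph_style", "reply_comment", "resolve_comment",
] as const;

type AgentToolName =
  | (typeof READ_TOOLS)[number]
  | (typeof MUTATE_TOOLS)[number]
  | "scroll";

type AgentRole = "contractReview" | "complianceScan" | "documentCopilot";

// Hypothetical per-role allow-lists, each usable as an `include` value.
const ROLE_PROFILES: Record<AgentRole, readonly AgentToolName[]> = {
  contractReview: [...READ_TOOLS, "add_comment", "suggest_change"],
  complianceScan: ["read_document", "find_text", "read_comments"],
  documentCopilot: [...READ_TOOLS, ...MUTATE_TOOLS, "scroll"],
};

// True when a profile exposes no tool from the Mutate group.
function isReadOnlyProfile(profile: readonly AgentToolName[]): boolean {
  const mutators: readonly string[] = MUTATE_TOOLS;
  return profile.every((tool) => !mutators.includes(tool));
}

console.log(isReadOnlyProfile(ROLE_PROFILES.complianceScan)); // true
console.log(isReadOnlyProfile(ROLE_PROFILES.contractReview)); // false
```

Keeping the profiles as data makes it easy to review, in one place, which deployments can change a document at all.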
The tools
Fourteen of them. Exported as both raw definitions and ready-to-use schemas in OpenAI function-calling format. Anthropic tool use, the Vercel AI SDK, and anything else that takes OpenAI-shape tools accept the schemas as-is.
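For readers who have not worked with the format, here is a minimal sketch of what one entry in OpenAI function-calling shape looks like. The outer envelope (type, function.name, function.description, function.parameters as JSON Schema) is the standard format; the parameter names paraId, search, and text and the parseToolArguments helper are assumptions for illustration, not the package's published schema.

```typescript
// Minimal subset of the OpenAI function-calling tool shape.
interface FunctionToolSchema {
  type: "function";
  function: {
    name: string;
    description: string;
    parameters: {
      type: "object";
      properties: Record<string, { type: string; description: string }>;
      required: string[];
    };
  };
}

// Hypothetical schema for add_comment; the real parameter names may differ.
const addCommentSchema: FunctionToolSchema = {
  type: "function",
  function: {
    name: "add_comment",
    description: "Attach a comment to a paragraph, optionally anchored to a phrase.",
    parameters: {
      type: "object",
      properties: {
        paraId: { type: "string", description: "Word paraId returned by a locate tool." },
        search: { type: "string", description: "Optional phrase inside the paragraph." },
        text: { type: "string", description: "Body of the comment." },
      },
      required: ["paraId", "text"],
    },
  },
};

// Models return tool arguments as a JSON string; check required keys before use.
function parseToolArguments(
  schema: FunctionToolSchema,
  rawArgs: string,
): Record<string, unknown> {
  const parsed: unknown = JSON.parse(rawArgs);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Tool arguments must be a JSON object");
  }
  const args = parsed as Record<string, unknown>;
  for (const key of schema.function.parameters.required) {
    if (!(key in args)) throw new Error(`Missing required argument: ${key}`);
  }
  return args;
}

const parsedArgs = parseToolArguments(
  addCommentSchema,
  '{"paraId":"5E6F7A8B","text":"Define the notice period."}',
);
console.log(parsedArgs.paraId); // 5E6F7A8B
```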
| Group | Tools | Purpose |
|---|---|---|
| Locate | read_document, read_selection, read_page, read_pages, find_text, read_comments, read_changes | Return paragraphs tagged with paraId. |
| Mutate | add_comment, suggest_change, apply_formatting, set_paragraph_style, reply_comment, resolve_comment | Take a paraId and an optional search phrase. |
| Navigate | scroll | Live transport only. |
Locate tools return paragraphs keyed by Word paraId; mutate tools take that paraId. This gives the agent a stable paragraph anchor across multi-step tool calls instead of relying on raw character offsets.
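To make the difference from positional addressing concrete, here is a small self-contained sketch. It models paragraphs as an in-memory array rather than using the package; the TaggedParagraph shape, the resolveAnchor helper, and the ID values are assumptions for illustration. The point is that a paraId still resolves to the same paragraph after another paragraph is inserted above it, while a remembered index does not.

```typescript
// Illustrative in-memory model; field names and ID values are assumptions.
interface TaggedParagraph {
  paraId: string;
  text: string;
}

interface ResolvedAnchor {
  index: number; // current position of the paragraph in the document
  start: number; // character offset of the anchor within the paragraph
  end: number;
}

// Resolve a paragraph by paraId, then an optional phrase inside it.
function resolveAnchor(
  paragraphs: readonly TaggedParagraph[],
  paraId: string,
  search?: string,
): ResolvedAnchor | null {
  const index = paragraphs.findIndex((p) => p.paraId === paraId);
  const paragraph = paragraphs[index];
  if (!paragraph) return null; // unknown paraId
  if (search === undefined) return { index, start: 0, end: paragraph.text.length };
  const start = paragraph.text.indexOf(search);
  if (start === -1) return null; // phrase not present in this paragraph
  return { index, start, end: start + search.length };
}

const draftParagraphs: TaggedParagraph[] = [
  { paraId: "1A2B3C4D", text: "This Agreement begins on the Effective Date." },
  { paraId: "5E6F7A8B", text: "Either party may terminate with 30 days notice." },
];

// Step 1 of a loop: the agent locates its target.
const beforeEdit = resolveAnchor(draftParagraphs, "5E6F7A8B", "30 days");

// Between steps, a paragraph is inserted above the target.
draftParagraphs.unshift({ paraId: "9C0D1E2F", text: "Definitions." });

// Step 2: the same paraId still resolves, at a new position.
const afterEdit = resolveAnchor(draftParagraphs, "5E6F7A8B", "30 days");

console.log(beforeEdit?.index, afterEdit?.index); // 1 2
```

An agent that had stored index 1 would now be pointing at the first clause instead of the termination clause; the paraId lookup is unaffected.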
Building a custom agent
A custom agent picks the subset of the catalog the deployment needs, adds whatever domain-specific tools it owns, and writes the system prompt that drives the loop. The wiring is a single React hook plus a streaming chat client.
The Roastmaster demo embedded below is a worked example. It reads the document, chooses three to five passages, and leaves a comment anchored to each. It uses read and comment tools only; it cannot edit document text.
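Step 1: scope the tools. Roastmaster never edits text, so useDocxAgentTools is given an include allow-list that keeps the comment tools (add_comment, reply_comment, resolve_comment) and drops the tools that change text or formatting (suggest_change, apply_formatting, set_paragraph_style):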
```ts
import { useDocxAgentTools } from "@eigenpal/docx-editor-agents/react";

const { tools, executeToolCall, getContext } = useDocxAgentTools({
  editorRef,
  author: "Roastmaster",
  include: [
    "read_document", "read_selection", "find_text",
    "read_comments", "read_changes", "scroll",
    "add_comment", "reply_comment", "resolve_comment",
  ],
});
```

The same include mechanism can expose a read-only compliance scanner, a full-catalog copilot, or a redliner with read + add_comment + suggest_change. Custom tools merge in the same way: pass them under tools, and they appear in the catalog alongside the built-ins.
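The scoping and merging behavior can be pictured with a short stand-alone sketch. This is not the hook's implementation: scopeCatalog, CatalogTool, and the lookup_clause_library custom tool are hypothetical, and the sketch assumes that include and exclude filter only the built-ins and that custom tools are appended afterwards.

```typescript
// Hypothetical minimal tool record; the real definitions carry full schemas.
interface CatalogTool {
  name: string;
  description: string;
}

interface ScopeOptions {
  include?: readonly string[]; // allow-list of built-in tool names
  exclude?: readonly string[]; // deny-list of built-in tool names
  tools?: readonly CatalogTool[]; // custom, domain-specific tools
}

// Assumption: filters apply to built-ins only; custom tools are appended.
function scopeCatalog(
  builtIns: readonly CatalogTool[],
  options: ScopeOptions = {},
): CatalogTool[] {
  const { include, exclude = [], tools: custom = [] } = options;
  const scoped = builtIns.filter(
    (tool) =>
      (include === undefined || include.includes(tool.name)) &&
      !exclude.includes(tool.name),
  );
  return [...scoped, ...custom];
}

const BUILT_IN_SAMPLE: CatalogTool[] = [
  { name: "read_document", description: "Return paragraphs tagged with paraId." },
  { name: "add_comment", description: "Comment on a paragraph." },
  { name: "suggest_change", description: "Propose a tracked change." },
];

const scopedNames = scopeCatalog(BUILT_IN_SAMPLE, {
  include: ["read_document", "add_comment"],
  tools: [{ name: "lookup_clause_library", description: "Hypothetical custom tool." }],
}).map((tool) => tool.name);

console.log(scopedNames.join(", "));
// read_document, add_comment, lookup_clause_library
```

How the real hook treats a custom tool whose name collides with a built-in is not covered here; check the tool catalog reference before relying on either behavior.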
Step 2: write the system prompt. The prompt defines the agent behavior. Roastmaster's prompt caps the agent at five comments per turn, tells it to anchor each comment to a unique phrase from the paragraph, and specifies the review tone. A runnable end-to-end setup is in examples/agent-chat-demo.
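As a rough illustration of those three constraints, a prompt can be assembled from a few parameters. The wording below is hypothetical and is not Roastmaster's actual prompt; buildReviewPrompt and isUniquePhrase are names invented for this sketch. The uniqueness check shows what "a unique phrase from the paragraph" means in practice: the phrase occurs exactly once.

```typescript
interface ReviewPromptOptions {
  persona: string;
  maxCommentsPerTurn: number;
  tone: string;
}

// Hypothetical prompt template covering the cap, the anchoring rule, and the tone.
function buildReviewPrompt(options: ReviewPromptOptions): string {
  return [
    `You are ${options.persona}, reviewing the open document.`,
    "Read the document before commenting.",
    `Leave at most ${options.maxCommentsPerTurn} comments per turn using add_comment.`,
    "Anchor each comment to a short phrase that appears exactly once in its paragraph.",
    "Do not rewrite document text; comment only.",
    `Tone: ${options.tone}.`,
  ].join("\n");
}

// True when the phrase occurs exactly once in the paragraph text.
function isUniquePhrase(paragraphText: string, phrase: string): boolean {
  if (phrase.length === 0) return false;
  const first = paragraphText.indexOf(phrase);
  return first !== -1 && paragraphText.indexOf(phrase, first + 1) === -1;
}

const roastPrompt = buildReviewPrompt({
  persona: "Roastmaster",
  maxCommentsPerTurn: 5,
  tone: "blunt but constructive", // placeholder, not the demo's real tone
});

console.log(roastPrompt.split("\n").length); // 6
console.log(isUniquePhrase("the cat and the dog", "the")); // false
console.log(isUniquePhrase("the cat and the dog", "cat")); // true
```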
Step 3: run the loop. useChat from @ai-sdk/react handles the streaming, and executeToolCall from the hook runs each tool call client-side against the live editor. The server route emits schemas and receives chat/tool-call text; it does not receive the DOCX buffer.
```ts
const chat = useChat({
  transport: new DefaultChatTransport({
    api: "/api/agent-chat",
    prepareSendMessagesRequest: ({ messages }) => ({
      body: { messages, context: getContext() },
    }),
  }),
  sendAutomaticallyWhen: lastAssistantMessageIsCompleteWithToolCalls,
  onToolCall: ({ toolCall }) => {
    const result = executeToolCall(toolCall.toolName, toolCall.input);
    chatRef.current?.addToolResult({
      tool: toolCall.toolName,
      toolCallId: toolCall.toolCallId,
      output: result.success ? String(result.data) : result.error ?? "",
    });
  },
});
```

getContext() returns the user's current selection and page; piping it through prepareSendMessagesRequest lets the system prompt know what the user is looking at without an extra tool round-trip. sendAutomaticallyWhen: lastAssistantMessageIsCompleteWithToolCalls keeps the loop running until the model writes a non-tool reply.
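The control flow that useChat and sendAutomaticallyWhen provide can be pictured as a plain loop: call the model, run any tool calls it requests, feed the outputs back, and stop at the first reply that contains no tool calls. The sketch below simulates that with a scripted model and a stub executor. Every name in it is hypothetical except the success/data/error result shape, which matches the snippet above; the maxSteps cap is an added safety guard, not something the post describes.

```typescript
// Result envelope as used in the snippet above: success plus data or error.
interface ToolResultEnvelope {
  success: boolean;
  data?: unknown;
  error?: string;
}

interface PendingToolCall {
  toolCallId: string;
  toolName: string;
  input: unknown;
}

interface ModelTurn {
  text?: string;
  toolCalls: PendingToolCall[];
}

// Stand-ins for the model and for executeToolCall.
type ScriptedModel = (toolOutputs: readonly string[]) => ModelTurn;
type ToolExecutor = (toolName: string, input: unknown) => ToolResultEnvelope;

function runAgentLoop(
  model: ScriptedModel,
  execute: ToolExecutor,
  maxSteps = 8, // assumption: a guard against runaway loops
): { reply: string; steps: number } {
  const outputs: string[] = [];
  for (let step = 1; step <= maxSteps; step++) {
    const turn = model(outputs);
    // A turn without tool calls is the final, user-facing reply.
    if (turn.toolCalls.length === 0) return { reply: turn.text ?? "", steps: step };
    for (const call of turn.toolCalls) {
      const result = execute(call.toolName, call.input);
      outputs.push(result.success ? String(result.data) : (result.error ?? ""));
    }
  }
  throw new Error("Agent loop exceeded maxSteps");
}

// Scripted model: read, then comment, then answer in plain text.
const scriptedModel: ScriptedModel = (toolOutputs) => {
  if (toolOutputs.length === 0) {
    return { toolCalls: [{ toolCallId: "c1", toolName: "read_document", input: {} }] };
  }
  if (toolOutputs.length === 1) {
    const input = { paraId: "5E6F7A8B", text: "Vague notice period." };
    return { toolCalls: [{ toolCallId: "c2", toolName: "add_comment", input }] };
  }
  return { text: "Left 1 comment.", toolCalls: [] };
};

// Stub executor: succeeds for two tools, fails for anything else.
const stubExecute: ToolExecutor = (toolName) =>
  toolName === "read_document" || toolName === "add_comment"
    ? { success: true, data: `${toolName} ok` }
    : { success: false, error: `Unknown tool: ${toolName}` };

const loopResult = runAgentLoop(scriptedModel, stubExecute);
console.log(loopResult.steps, loopResult.reply); // 3 Left 1 comment.
```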
The full source for the example below, including the matching server route, is on the Live editor docs page.

```tsx
"use client";

import { useMemo, useRef, useState } from "react";
import { DocxEditor, AgentChatLog, AgentComposer } from "@eigenpal/docx-js-editor";
import { useDocxAgentTools, getToolDisplayName } from "@eigenpal/docx-editor-agents/react";
import { toAgentMessages } from "@eigenpal/docx-editor-agents/ai-sdk/react";
import { useChat } from "@ai-sdk/react";
import { DefaultChatTransport, lastAssistantMessageIsCompleteWithToolCalls } from "ai";

export function EditorWithAgent({ buffer }: { buffer: ArrayBuffer }) {
  const editorRef = useRef(null);
  const { executeToolCall, getContext } = useDocxAgentTools({ editorRef, author: "Agent" });

  // onToolCall is created before `chat` exists, so it reaches the chat through a ref.
  const chatRef = useRef(null);
  const chat = useChat({
    transport: new DefaultChatTransport({
      api: "/api/chat",
      // Send the user's current selection and page with every request.
      prepareSendMessagesRequest: ({ messages }) => ({
        body: { messages, context: getContext() },
      }),
    }),
    // Keep looping until the model writes a non-tool reply.
    sendAutomaticallyWhen: lastAssistantMessageIsCompleteWithToolCalls,
    // Tool calls run client-side against the live editor.
    onToolCall: ({ toolCall }) => {
      const result = executeToolCall(toolCall.toolName, toolCall.input);
      void chatRef.current?.addToolResult({
        tool: toolCall.toolName,
        toolCallId: toolCall.toolCallId,
        output: result.success ? String(result.data) : (result.error ?? ""),
      });
    },
  });
  chatRef.current = chat;

  // Recompute only when the messages or the streaming status change.
  const messages = useMemo(
    () => toAgentMessages(chat.messages, chat.status),
    [chat.messages, chat.status],
  );
  const [input, setInput] = useState("");

  return (
    <DocxEditor
      ref={editorRef}
      documentBuffer={buffer}
      agentPanel={{
        title: "Agent",
        render: () => (
          <>
            <AgentChatLog messages={messages} humanizeToolName={getToolDisplayName} />
            <AgentComposer
              value={input}
              onChange={setInput}
              onSubmit={() => {
                chat.sendMessage({ text: input });
                setInput("");
              }}
            />
          </>
        ),
      }}
    />
  );
}
```

The matching server route uses getAiSdkTools() from /ai-sdk/server and Vercel AI SDK's streamText({ tools }). Full source on the Live editor docs page.
Try it
The demo below is the same component, hosted inside this docs site. The Roastmaster agent has read and comment tools only; it cannot edit document text. Pick a suggestion or type a prompt to see it operate on the sample document.
Scope
- Tracked-change acceptance is human-only. There are no accept_change or reject_change tools; the human keeps the final commit on every revision.
- Formatting verbs. apply_formatting covers bold, italic, underline, strike, color, highlight, font size, and font family. Paragraph-level mutators (alignment, spacing) are not yet wired through the toolkit.
Get started
- Live agent demo: opens the editor with the Roastmaster agent panel enabled. Click a suggestion chip to add a comment to the sample document.
- Agent API documentation: the full reference, including the live-editor transport, the headless reviewer, and the MCP server.
- Tool catalog: every tool's input schema, output shape, and behavior.
- Word JS API parity: the mapping table for developers migrating from Office.js.
- Components reference: the React UI kit (AgentPanel, AgentChatLog, AgentComposer, AgentSuggestionChip, AgentTimeline).
- examples/agent-chat-demo: a runnable Next.js app demonstrating the live transport.
- examples/agent-use-demo: the server-side review pattern.
The source is on GitHub. Issues and pull requests are welcome.
Related reading: Track Changes in a React DOCX Editor and Real-time Collaboration in docx-js-editor.