Plan to implement

Comment Anchoring via ProseMirror Marks

Context

Comments are currently anchored by positional paragraph index (p-0, p-1). When paragraphs are
inserted or deleted above a comment, its index now points at a different paragraph, so the
comment drifts onto the wrong content. This is fundamentally broken.

The industry-standard solution for ProseMirror editors is mark-based anchoring: a ProseMirror
mark wraps the commented text with a threadId attribute. Since marks are part of the document
model, they move with the text automatically — edits, undo/redo, copy/paste, and collaboration
all preserve the association. This is how Tiptap Comments, Remirror, and The Guardian's
prosemirror-noting work.
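
To make the difference concrete, here is a toy sketch (plain JavaScript, not ProseMirror) that contrasts the two anchoring styles. The run-based document shape, the function names, and the thread ID `a1b2c3d4` are all hypothetical and exist only for this illustration.

```javascript
// Toy document: each paragraph is a list of runs. A run carrying a threadId
// stands in for text wrapped by a comment mark.
const doc = [
  [{ text: 'Intro paragraph.' }],
  [
    { text: 'Second paragraph with ' },
    { text: 'an important claim', threadId: 'a1b2c3d4' },
    { text: '.' },
  ],
];

// Positional lookup: "p-1" means "whatever is currently the second paragraph".
function textAtParagraphIndex(paragraphs, anchor) {
  const index = Number(anchor.slice(2));
  const para = paragraphs[index];
  return para ? para.map((run) => run.text).join('') : null;
}

// Mark-style lookup: find the runs tagged with the thread, wherever they are.
function textForThread(paragraphs, threadId) {
  const parts = [];
  for (const para of paragraphs) {
    for (const run of para) {
      if (run.threadId === threadId) parts.push(run.text);
    }
  }
  return parts.length ? parts.join('') : null;
}

// Simulate a user inserting a new paragraph at the top of the document.
const edited = [[{ text: 'New opening paragraph.' }], ...doc];
```

After the insertion, `textAtParagraphIndex(edited, 'p-1')` resolves to the intro paragraph, while `textForThread(edited, 'a1b2c3d4')` still resolves to the originally commented text.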

Approach: Custom commentMark in Milkdown

How it works:

A ProseMirror mark named comment wraps the highlighted text range. The mark carries a
threadId attribute (8-char hex).
Comment metadata (author, text, timestamp) lives in Document.Comments[] array, keyed by the
same threadId.
The mark serializes to markdown as an inline HTML span, <span data-thread="THREAD_ID">text</span>. Goldmark
already has html.WithUnsafe() enabled, so these spans pass through to rendered output.
When text is deleted, the mark disappears with it. Orphaned comments (threadId in Comments
array but no matching mark in doc) are cleaned up on save.
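
The orphan check described above could look roughly like the following. This is a minimal sketch: `pruneOrphanedComments` is a hypothetical helper name, and it assumes the caller has already collected the thread IDs of the marks still present in the document.

```javascript
// Splits comments into those whose threadId still has a mark in the document
// and those that no longer do (orphans).
function pruneOrphanedComments(comments, markedThreadIds) {
  const live = new Set(markedThreadIds);
  const kept = [];
  const orphaned = [];
  for (const comment of comments) {
    if (live.has(comment.threadId)) {
      kept.push(comment);
    } else {
      orphaned.push(comment);
    }
  }
  return { kept, orphaned };
}
```

Returning the orphans instead of silently dropping them leaves the caller free to delete them on save or to surface them in the UI.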

Key files to modify:

milkdown-entry.js — export $markSchema, $markAttr from @milkdown/kit/utils
static/js/comment-mark.js — new file: defines commentMarkSchema using Milkdown's $markSchema
API
templates/document_edit.html — import comment mark, register with editor, rewrite comment UI
to use text selection + marks
internal/model/models.go — update EmbeddedComment (replace ParagraphID with ThreadID)
internal/handler/handler.go — update CommentCreate/CommentList (threadId instead of
paragraphId)
static/css/editor.css — style for comment highlights and sidebar
Dockerfile — bundle new JS file (or inline in milkdown-entry.js)

Existing code to reuse:

milkdown-entry.js already exports ProseMirror APIs; link mark in
@milkdown/preset-commonmark/src/mark/link.ts is the exact pattern to follow
CommentCreate/CommentList handlers already do read-modify-write on owner's document — just
change the identifier from paragraphId to threadId
Comment sidebar HTML/CSS already exists — update the rendering JS
randomID() helper in handler.go already generates 8-char hex IDs
Goldmark html.WithUnsafe() already enabled — no renderer changes needed

Implementation Tasks

Task 1: Define the comment mark plugin

Create static/js/comment-mark.js (bundled via esbuild alongside milkdown.js):

import { $markSchema, $markAttr } from '@milkdown/kit/utils';

export const commentAttr = $markAttr('comment');

export const commentSchema = $markSchema('comment', (ctx) => ({
  attrs: {
    threadId: { default: null, validate: 'string|null' },
  },
  inclusive: false, // new text typed at mark boundary is NOT marked
  parseDOM: [{
    tag: 'span[data-thread]',
    getAttrs: (dom) => ({ threadId: dom.getAttribute('data-thread') }),
  }],
  toDOM: (mark) => ['span', {
    ...ctx.get(commentAttr.key)(mark),
    'data-thread': mark.attrs.threadId,
    class: 'comment-highlight',
  }, 0],
  parseMarkdown: {
    match: (node) => node.type === 'html' && typeof node.value === 'string'
      && node.value.startsWith('<span data-thread='),
    runner: (state, node, markType) => {
      // This won't work directly: HTML nodes in remark are opaque.
      // Instead, use a remark plugin to transform the HTML spans.
    },
  },
  toMarkdown: {
    match: (mark) => mark.type.name === 'comment',
    runner: (state, mark) => {
      // Wrap text in <span data-thread="...">...</span>
    },
  },
}));

However, Milkdown's markdown↔ProseMirror round-trip for custom inline HTML marks is complex. The cleaner approach is:

Alternative (simpler): Don't persist comments as marks in the markdown at all. Instead:

Marks exist only in the live ProseMirror document during editing
Document.Comments[] stores threadId + a quotedText field (the exact text the comment was
attached to)
On editor load, re-anchor comments by finding quotedText in the document and applying marks
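On save, marks are stripped (this assumes the comment mark's toMarkdown runner emits no wrapper; Milkdown's serializer looks up a runner for every mark rather than silently skipping unknown ones, so this should be verified)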

This avoids the markdown serialization problem entirely, at the cost of fuzzy re-anchoring
(Google Docs does the same thing).

Recommended: Hybrid mark + quotedText approach

During editing: Comments are ProseMirror marks with threadId attrs. Marks move with the text
automatically.
On save: Before serializing to markdown, scan all comment marks, update Comments[].quotedText with the current marked text. Serialize to clean markdown (no spans).
On load: Read Comments[], for each comment search the document for quotedText, apply the
mark. If exact match fails, try fuzzy matching (substring, normalized whitespace).
Orphan cleanup: Comments whose quotedText can't be found in the document are flagged as
orphaned (shown in sidebar as "detached" with option to delete).
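
The load-time matching step might be sketched as follows, working on plain strings. It covers only two of the strategies mentioned above (exact match, then whitespace-normalized match). `locateQuote` and `normalizeWithMap` are hypothetical names, and taking the first occurrence when the quote appears more than once is an assumption, not something the plan specifies.

```javascript
// Collapses each whitespace run to a single space and records, for every
// character of the normalized string, its index in the original string.
function normalizeWithMap(text) {
  let normalized = '';
  const map = [];
  let lastWasSpace = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (/\s/.test(ch)) {
      if (!lastWasSpace) {
        normalized += ' ';
        map.push(i);
      }
      lastWasSpace = true;
    } else {
      normalized += ch;
      map.push(i);
      lastWasSpace = false;
    }
  }
  return { normalized, map };
}

// Returns { from, to, match } as offsets into docText, or null if not found.
function locateQuote(docText, quotedText) {
  if (!quotedText) return null;

  // 1. Exact match.
  const exact = docText.indexOf(quotedText);
  if (exact !== -1) {
    return { from: exact, to: exact + quotedText.length, match: 'exact' };
  }

  // 2. Whitespace-normalized match, mapped back to original offsets.
  const { normalized, map } = normalizeWithMap(docText);
  const needle = normalizeWithMap(quotedText).normalized.trim();
  if (!needle) return null;
  const at = normalized.indexOf(needle);
  if (at === -1) return null;
  return {
    from: map[at],
    to: map[at + needle.length - 1] + 1,
    match: 'normalized',
  };
}
```

A `null` result corresponds to the "detached" case. Storing a little surrounding context alongside quotedText would help disambiguate repeated phrases, but that goes beyond what this plan describes.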

Task 2: Update the data model

internal/model/models.go — change EmbeddedComment:

type EmbeddedComment struct {
    ID           string `json:"id"`
    ThreadID     string `json:"threadId"`   // links to ProseMirror mark
    QuotedText   string `json:"quotedText"` // text the comment was anchored to
    Text         string `json:"text"`
    Author       string `json:"author"`
    AuthorHandle string `json:"authorHandle"`
    CreatedAt    string `json:"createdAt"`
}

Task 3: Update CommentCreate handler

internal/handler/handler.go — accept threadId and quotedText instead of paragraphId:

var req struct {
    OwnerDID   string `json:"ownerDID"`
    ThreadID   string `json:"threadId"`
    QuotedText string `json:"quotedText"`
    Text       string `json:"text"`
}

Generate ThreadID server-side via randomID() if not provided. Return the threadId in the
response so the client can apply the mark.
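
The real handler is Go, but the thread ID rule is small enough to sketch language-neutrally. The JavaScript below is illustrative only. The lowercase-hex validation pattern, the status codes, and the function names are assumptions, and `generateId` stands in for the existing randomID() helper.

```javascript
// Assumed ID format, based on the plan's "8-char hex" description.
const THREAD_ID_PATTERN = /^[0-9a-f]{8}$/;

// Uses the client-supplied ID when present and well-formed, otherwise
// falls back to a freshly generated one.
function resolveThreadId(requestedId, generateId) {
  if (requestedId === undefined || requestedId === null || requestedId === '') {
    return { ok: true, threadId: generateId() };
  }
  if (!THREAD_ID_PATTERN.test(requestedId)) {
    return { ok: false, error: 'threadId must be 8 lowercase hex characters' };
  }
  return { ok: true, threadId: requestedId };
}

// Shapes the response so the client always learns which threadId to put
// on the mark.
function buildCreateResponse(req, generateId) {
  const resolved = resolveThreadId(req.threadId, generateId);
  if (!resolved.ok) {
    return { status: 400, body: { error: resolved.error } };
  }
  return { status: 200, body: { threadId: resolved.threadId } };
}
```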

Task 4: Update milkdown-entry.js and bundle

Add exports for $markSchema, $markAttr, $prose from @milkdown/kit/utils so the template JS can
define the mark inline. Update the Dockerfile esbuild step.

Task 5: Rewrite comment UI in document_edit.html

Replace the paragraph-click comment flow with text-selection comment flow:

Show comment button when user selects text (use ProseMirror update listener to detect
non-empty selection)
On "Comment" click: capture selection.from, selection.to, extract selected text as quotedText
POST to /api/docs/{rkey}/comments with { quotedText, text, ownerDID } — server returns {
threadId }
Apply mark: tr.addMark(from, to, schema.marks.comment.create({ threadId }))
Sidebar click → jump: find mark with matching threadId in the doc, scroll to it
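
The editor-side pieces of this flow could be sketched as below. The helper names are hypothetical; the calls they make (`selection.empty`, `doc.textBetween`, `tr.addMark`, `markType.create`, `view.dispatch`, `doc.descendants`) are standard ProseMirror APIs. The network request is left out so the sketch stays focused on the editor side.

```javascript
// Returns the selected range and its text, or null when nothing is selected.
function captureSelection(state) {
  const { from, to, empty } = state.selection;
  if (empty) return null;
  // The third argument is the separator inserted between blocks.
  return { from, to, quotedText: state.doc.textBetween(from, to, ' ') };
}

// Adds the comment mark over a captured range once a threadId is known.
function applyCommentMark(view, range, threadId) {
  const markType = view.state.schema.marks.comment;
  const tr = view.state.tr.addMark(
    range.from,
    range.to,
    markType.create({ threadId })
  );
  view.dispatch(tr);
}

// Finds the range covered by a thread's mark, for the sidebar "jump" action.
function findThreadRange(doc, threadId) {
  let from = null;
  let to = null;
  doc.descendants((node, pos) => {
    if (!node.isText) return;
    const hit = node.marks.some(
      (mark) => mark.type.name === 'comment' && mark.attrs.threadId === threadId
    );
    if (!hit) return;
    if (from === null) from = pos;
    to = pos + node.nodeSize;
  });
  return from === null ? null : { from, to };
}
```

Because the POST is asynchronous, the captured from/to can be stale if the document changes before the reply arrives. Since Task 3 lets the client supply a threadId, one option is to generate it client-side and apply the mark before sending the request.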

Task 6: Re-anchor comments on editor load

After Milkdown creates the editor, iterate Document.Comments[]:

For each comment, search ProseMirror doc for quotedText
If found, apply comment mark with threadId
If not found, show as "detached" in sidebar
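
One way the search step could map a quotedText hit to ProseMirror positions is sketched below. The function names are hypothetical. The sketch assumes each quote sits inside a single text block and that the block holds only text (no hard breaks or inline images), so that an offset into `textContent` lines up with document positions. Quotes that span blocks would need a different strategy.

```javascript
// Searches text blocks for an exact occurrence of quotedText and returns
// { from, to } document positions, or null if no block contains it.
function findQuoteRange(doc, quotedText) {
  if (!quotedText) return null;
  let range = null;
  doc.descendants((node, pos) => {
    if (range) return false;            // already found, skip the rest
    if (!node.isTextblock) return true; // keep descending into containers
    const index = node.textContent.indexOf(quotedText);
    if (index !== -1) {
      const from = pos + 1 + index;     // +1 steps over the block's opening token
      range = { from, to: from + quotedText.length };
    }
    return false;                       // no need to visit the inline children
  });
  return range;
}

// Splits stored comments into ones that can be re-marked and detached ones.
function reanchorComments(doc, comments) {
  const anchored = [];
  const detached = [];
  for (const comment of comments) {
    const range = findQuoteRange(doc, comment.quotedText);
    if (range) {
      anchored.push({ threadId: comment.threadId, from: range.from, to: range.to });
    } else {
      detached.push(comment);
    }
  }
  return { anchored, detached };
}
```

Each entry in `anchored` can then be turned into a `tr.addMark(from, to, ...)` call, ideally batched into a single transaction so the re-anchoring is one undo step at most.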

Task 7: Update quotedText on save

In saveDocument / autosave, before serializing markdown:

Walk the ProseMirror doc, find all comment marks
For each mark, extract the current text it wraps
POST updated quotedText values to the server (or include in the save payload)
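
The walk described in these steps might look like the sketch below. `collectQuotedText` and `refreshQuotedText` are hypothetical names; the node properties used (`isText`, `text`, `marks`, `mark.type.name`, `mark.attrs`) are standard ProseMirror.

```javascript
// Builds a Map of threadId -> text currently wrapped by that thread's mark.
// Adjacent text nodes carrying the same threadId (for example where bold
// starts partway through the comment) are concatenated in document order.
function collectQuotedText(doc) {
  const byThread = new Map();
  doc.descendants((node) => {
    if (!node.isText) return;
    for (const mark of node.marks) {
      if (mark.type.name !== 'comment') continue;
      const id = mark.attrs.threadId;
      byThread.set(id, (byThread.get(id) || '') + node.text);
    }
  });
  return byThread;
}

// Returns a copy of the comments with quotedText refreshed where a mark still
// exists; comments without a live mark keep their last known quotedText.
function refreshQuotedText(comments, byThread) {
  return comments.map((comment) =>
    byThread.has(comment.threadId)
      ? { ...comment, quotedText: byThread.get(comment.threadId) }
      : comment
  );
}
```

Keeping the old quotedText for comments without a live mark means a detached comment can still show what it used to refer to.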

Task 8: Update comment sidebar rendering

Group by threadId instead of paragraphId
Show quotedText snippet as the thread label (truncated)
Click scrolls to the mark in the editor (find mark by threadId, get its position, scroll)
Detached comments shown with warning icon
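
The grouping and labelling could be sketched as follows. The function names, the 40-character default label length, and the use of the first comment's quotedText as the thread label are assumptions made for illustration.

```javascript
// Shortens a label to at most `max` characters, ending with an ellipsis.
function truncate(text, max = 40) {
  if (text.length <= max) return text;
  return text.slice(0, max - 1).trimEnd() + '…';
}

// Groups comments into threads, preserving first-seen order, and flags
// threads whose mark is no longer present in the document as detached.
function groupByThread(comments, liveThreadIds) {
  const live = new Set(liveThreadIds);
  const threads = new Map();
  for (const comment of comments) {
    if (!threads.has(comment.threadId)) {
      threads.set(comment.threadId, {
        threadId: comment.threadId,
        label: truncate(comment.quotedText || ''),
        detached: !live.has(comment.threadId),
        comments: [],
      });
    }
    threads.get(comment.threadId).comments.push(comment);
  }
  return [...threads.values()];
}
```

The rendering code can then iterate over the returned array, drawing the warning icon wherever `detached` is true.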

Task 9: Style comment highlights

CSS for .comment-highlight in the editor:
.comment-highlight {
  background: rgba(234, 179, 8, 0.2);
  border-bottom: 2px solid rgba(234, 179, 8, 0.6);
  cursor: pointer;
}

Task 10: Cleanup

Remove old ParagraphID references
Remove setupParagraphCommentTrigger function
Remove old paragraph-index comment button positioning code

Verification

Create a document with several paragraphs
Select text in paragraph 3, add a comment → yellow highlight appears, sidebar shows comment
with quoted text
Delete paragraph 1 → comment stays on the correct text (highlight moves with it)
Save and reload → comment re-anchors to the correct text via quotedText matching
Edit the commented text slightly → mark stretches/shrinks with the edit
Delete all the commented text → mark disappears, comment becomes "detached" in sidebar
Collaborator view: both owner and collaborator can see and add comments