Automerge Migration
Date: 2025-01-05
Note: This is an exploration document. For the actual implementation plan, see 505-automerge-migration-plan.
This document explores migrating Minuta to Automerge, a mature CRDT library, instead of implementing custom CRDT primitives.
What is Automerge?
Automerge is a library of data structures for building collaborative applications. Key characteristics:
- Conflict-free Replicated Data Type (CRDT) - automatic merge without conflicts
- Cross-platform - Swift, JavaScript, Rust, WASM
- Network-agnostic - works with any transport (WebSocket, Bluetooth, file sync)
- Immutable state - each change produces a new document state
- Change history - full audit trail, branching, time travel
- Binary format - compact storage, efficient sync
Automerge-Swift
automerge-swift v0.6.1 provides:
Document- the core CRDT container- Built-in types: Text, List, Map, Counter
fork()/merge()for branching- Binary serialization (
document.save()/Document(data)) - Sync protocol for efficient delta exchange
Automerge-Repo-Swift
automerge-repo-swift extends the base library with:
- Multi-document management
- Pluggable storage providers
- Pluggable network providers
- Document discovery and sharing
Note: automerge-repo-swift is pre-release and API is not stable yet.
Custom CRDT vs Automerge
| Aspect | Custom CRDT (503) | Automerge |
|---|---|---|
| Implementation effort | ~2000 lines | ~200 lines (integration) |
| Battle-tested | No | Yes (years of production use) |
| Cross-platform sync | Manual | Built-in (JS, Rust, Swift) |
| Storage format | JSON (readable) | Binary (compact) |
| Debuggability | High (JSON files) | Lower (binary, need tools) |
| Text collaboration | LWW only | Character-level CRDT |
| Dependencies | None | automerge-swift (~2MB) |
| Flexibility | Full control | Automerge’s model |
| Sync protocol | Build yourself | Built-in, efficient |
| Performance | Unknown | Optimized (Rust core) |
Document Model Design
Option A: Single Document (All Data)
One Automerge document contains all tags and records:
// Document structure
{
"tags": {
"<uuid>": { "name": "Work", "color": "#4A90D9", ... },
"<uuid>": { "name": "Personal", "color": "#E57373", ... }
},
"records": {
"<uuid>": { "startTime": ..., "endTime": ..., "tagId": ..., ... },
"<uuid>": { ... }
}
} Pros:
- Simple to implement
- Atomic operations across tags and records
- Single sync unit
Cons:
- Entire document syncs on any change
- Poor scalability (1000+ records = large doc)
- No partial sync
Option B: Document Per Entity
Each tag and record is a separate Automerge document:
~/Documents/Minuta/
├── tags/
│ └── {uuid}.automerge # One doc per tag
└── records/
└── 2024/01/
└── {uuid}.automerge # One doc per record Pros:
- Granular sync (only changed docs transfer)
- Better scalability
- Parallel loading
Cons:
- More files to manage
- Cross-document references need care
- Requires automerge-repo for multi-doc management
Option C: Hybrid (Recommended)
- Tags: Single document (small, rarely changes)
- Records: Document per month or per record
~/Documents/Minuta/
├── tags.automerge # All tags in one doc
└── records/
└── 2024/
└── 01.automerge # All records for Jan 2024 Pros:
- Balanced granularity
- Monthly documents keep size manageable (~100 records/month)
- Simpler than per-record files
Data Structure Mapping
Current Tag → Automerge
// Current
public struct Tag {
public let id: UUID
public var name: String
public var color: String
public let createdAt: Date
public var isArchived: Bool
}
// Automerge document structure
// Root is a Map with tag IDs as keys
doc.putObject(obj: .ROOT, key: tagId.uuidString, ty: .Map)
doc.put(obj: tagObj, key: "name", value: .String(name))
doc.put(obj: tagObj, key: "color", value: .String(color))
doc.put(obj: tagObj, key: "createdAt", value: .Timestamp(createdAt))
doc.put(obj: tagObj, key: "isArchived", value: .Boolean(isArchived)) Current TimeRecord → Automerge
// Current
public struct TimeRecord {
public let id: UUID
public var startTime: Date
public var endTime: Date?
public var tagId: UUID?
public var comment: String?
public var images: [String]
}
// Automerge document structure
let recordObj = doc.putObject(obj: .ROOT, key: recordId.uuidString, ty: .Map)
doc.put(obj: recordObj, key: "startTime", value: .Timestamp(startTime))
if let endTime = endTime {
doc.put(obj: recordObj, key: "endTime", value: .Timestamp(endTime))
}
if let tagId = tagId {
doc.put(obj: recordObj, key: "tagId", value: .String(tagId.uuidString))
}
if let comment = comment {
// Use Automerge Text for collaborative editing
let textObj = doc.putObject(obj: recordObj, key: "comment", ty: .Text)
doc.spliceText(obj: textObj, start: 0, delete: 0, value: comment)
}
// Images as a List
let imagesObj = doc.putObject(obj: recordObj, key: "images", ty: .List)
for (i, image) in images.enumerated() {
doc.insert(obj: imagesObj, index: UInt64(i), value: .String(image))
} Wrapper Types
AutomergeTag
import Automerge
/// Wrapper to read/write Tag from Automerge document
struct AutomergeTag {
let document: Document
let objectId: ObjId
var id: UUID {
// ID is the key in parent map, stored externally
fatalError("ID should be tracked by caller")
}
var name: String {
get {
guard case .Scalar(.String(let value)) = try? document.get(obj: objectId, key: "name") else {
return ""
}
return value
}
set {
try? document.put(obj: objectId, key: "name", value: .String(newValue))
}
}
var color: String {
get {
guard case .Scalar(.String(let value)) = try? document.get(obj: objectId, key: "color") else {
return "#808080"
}
return value
}
set {
try? document.put(obj: objectId, key: "color", value: .String(newValue))
}
}
var createdAt: Date {
get {
guard case .Scalar(.Timestamp(let value)) = try? document.get(obj: objectId, key: "createdAt") else {
return Date()
}
// Automerge.Timestamp is milliseconds since epoch
return Date(timeIntervalSince1970: Double(value) / 1000.0)
}
set {
// Automerge.Timestamp is milliseconds since epoch
try? document.put(obj: objectId, key: "createdAt", value: .Timestamp(Int64(newValue.timeIntervalSince1970 * 1000)))
}
}
var isArchived: Bool {
get {
guard case .Scalar(.Boolean(let value)) = try? document.get(obj: objectId, key: "isArchived") else {
return false
}
return value
}
set {
try? document.put(obj: objectId, key: "isArchived", value: .Boolean(newValue))
}
}
/// Convert to plain Tag for UI
func asTag(id: UUID) -> Tag {
Tag(
id: id,
name: name,
color: color,
createdAt: createdAt,
isArchived: isArchived
)
}
} AutomergeTimeRecord
import Automerge
/// Wrapper to read/write TimeRecord from Automerge document
struct AutomergeTimeRecord {
let document: Document
let objectId: ObjId
var startTime: Date {
get {
guard case .Scalar(.Timestamp(let value)) = try? document.get(obj: objectId, key: "startTime") else {
return Date()
}
// Automerge.Timestamp is milliseconds since epoch
return Date(timeIntervalSince1970: Double(value) / 1000.0)
}
set {
// Automerge.Timestamp is milliseconds since epoch
try? document.put(obj: objectId, key: "startTime", value: .Timestamp(Int64(newValue.timeIntervalSince1970 * 1000)))
}
}
var endTime: Date? {
get {
guard case .Scalar(.Timestamp(let value)) = try? document.get(obj: objectId, key: "endTime") else {
return nil
}
// Automerge.Timestamp is milliseconds since epoch
return Date(timeIntervalSince1970: Double(value) / 1000.0)
}
set {
if let date = newValue {
// Automerge.Timestamp is milliseconds since epoch
try? document.put(obj: objectId, key: "endTime", value: .Timestamp(Int64(date.timeIntervalSince1970 * 1000)))
} else {
try? document.delete(obj: objectId, key: "endTime")
}
}
}
var tagId: UUID? {
get {
guard case .Scalar(.String(let value)) = try? document.get(obj: objectId, key: "tagId") else {
return nil
}
return UUID(uuidString: value)
}
set {
if let id = newValue {
try? document.put(obj: objectId, key: "tagId", value: .String(id.uuidString))
} else {
try? document.delete(obj: objectId, key: "tagId")
}
}
}
var comment: String? {
get {
guard case .Object(let textId, .Text) = try? document.get(obj: objectId, key: "comment") else {
return nil
}
return try? document.text(obj: textId)
}
set {
if let text = newValue {
// Get or create text object
if case .Object(let textId, .Text) = try? document.get(obj: objectId, key: "comment") {
// Replace existing text
let currentLen = (try? document.length(obj: textId)) ?? 0
try? document.spliceText(obj: textId, start: 0, delete: Int64(currentLen), value: text)
} else {
// Create new text object
let textId = try? document.putObject(obj: objectId, key: "comment", ty: .Text)
if let textId = textId {
try? document.spliceText(obj: textId, start: 0, delete: 0, value: text)
}
}
} else {
try? document.delete(obj: objectId, key: "comment")
}
}
}
var images: [String] {
get {
guard case .Object(let listId, .List) = try? document.get(obj: objectId, key: "images") else {
return []
}
let length = (try? document.length(obj: listId)) ?? 0
var result: [String] = []
for i in 0..<length {
if case .Scalar(.String(let value)) = try? document.get(obj: listId, index: i) {
result.append(value)
}
}
return result
}
set {
// Delete existing list and recreate
try? document.delete(obj: objectId, key: "images")
let listId = try? document.putObject(obj: objectId, key: "images", ty: .List)
if let listId = listId {
for (i, image) in newValue.enumerated() {
try? document.insert(obj: listId, index: UInt64(i), value: .String(image))
}
}
}
}
/// Convert to plain TimeRecord for UI
func asTimeRecord(id: UUID) -> TimeRecord {
TimeRecord(
id: id,
startTime: startTime,
endTime: endTime,
tagId: tagId,
comment: comment,
images: images
)
}
} Storage Service
AutomergeStorageService
import Automerge
import Foundation
/// Storage service using Automerge for CRDT sync
actor AutomergeStorageService: FileStorageServiceProtocol {
private let storageURL: URL
private var tagsDocument: Document
private var recordDocuments: [String: Document] = [:] // "YYYY-MM" -> Document
init(storageURL: URL) async throws {
self.storageURL = storageURL
// Load or create tags document
let tagsURL = storageURL.appendingPathComponent("tags.automerge")
if FileManager.default.fileExists(atPath: tagsURL.path) {
let data = try Data(contentsOf: tagsURL)
self.tagsDocument = try Document(data)
} else {
self.tagsDocument = Document()
}
}
// MARK: - Tags
func loadTags() async throws -> [Tag] {
var tags: [Tag] = []
let keys = try tagsDocument.keys(obj: .ROOT)
for key in keys {
guard let id = UUID(uuidString: key),
case .Object(let objId, .Map) = try tagsDocument.get(obj: .ROOT, key: key) else {
continue
}
let wrapper = AutomergeTag(document: tagsDocument, objectId: objId)
tags.append(wrapper.asTag(id: id))
}
return tags
}
func saveTag(_ tag: Tag) async throws {
let key = tag.id.uuidString
// Create or get existing object
let objId: ObjId
if case .Object(let existingId, .Map) = try? tagsDocument.get(obj: .ROOT, key: key) {
objId = existingId
} else {
objId = try tagsDocument.putObject(obj: .ROOT, key: key, ty: .Map)
}
// Update fields
var wrapper = AutomergeTag(document: tagsDocument, objectId: objId)
wrapper.name = tag.name
wrapper.color = tag.color
wrapper.createdAt = tag.createdAt
wrapper.isArchived = tag.isArchived
// Persist
try await persistTagsDocument()
}
func deleteTag(id: UUID) async throws {
try tagsDocument.delete(obj: .ROOT, key: id.uuidString)
try await persistTagsDocument()
}
private func persistTagsDocument() async throws {
let data = tagsDocument.save()
let url = storageURL.appendingPathComponent("tags.automerge")
try data.write(to: url)
}
// MARK: - Records
func loadRecords(from startDate: Date, to endDate: Date) async throws -> [TimeRecord] {
var records: [TimeRecord] = []
// Determine which monthly documents to load
let months = monthsBetween(start: startDate, end: endDate)
for month in months {
let doc = try await loadOrCreateRecordDocument(for: month)
let keys = try doc.keys(obj: .ROOT)
for key in keys {
guard let id = UUID(uuidString: key),
case .Object(let objId, .Map) = try doc.get(obj: .ROOT, key: key) else {
continue
}
let wrapper = AutomergeTimeRecord(document: doc, objectId: objId)
let record = wrapper.asTimeRecord(id: id)
// Filter by date range
if record.startTime >= startDate && record.startTime <= endDate {
records.append(record)
}
}
}
return records.sorted { $0.startTime > $1.startTime }
}
func saveRecord(_ record: TimeRecord) async throws {
let month = monthKey(for: record.startTime)
let doc = try await loadOrCreateRecordDocument(for: month)
let key = record.id.uuidString
// Create or get existing object
let objId: ObjId
if case .Object(let existingId, .Map) = try? doc.get(obj: .ROOT, key: key) {
objId = existingId
} else {
objId = try doc.putObject(obj: .ROOT, key: key, ty: .Map)
}
// Update fields
var wrapper = AutomergeTimeRecord(document: doc, objectId: objId)
wrapper.startTime = record.startTime
wrapper.endTime = record.endTime
wrapper.tagId = record.tagId
wrapper.comment = record.comment
wrapper.images = record.images
// Persist
try await persistRecordDocument(doc, for: month)
}
func deleteRecord(_ record: TimeRecord) async throws {
let month = monthKey(for: record.startTime)
let doc = try await loadOrCreateRecordDocument(for: month)
try doc.delete(obj: .ROOT, key: record.id.uuidString)
try await persistRecordDocument(doc, for: month)
}
// MARK: - Helpers
private func monthKey(for date: Date) -> String {
let calendar = Calendar.current
let year = calendar.component(.year, from: date)
let month = calendar.component(.month, from: date)
return String(format: "%04d-%02d", year, month)
}
private func monthsBetween(start: Date, end: Date) -> [String] {
var months: [String] = []
var current = start
let calendar = Calendar.current
while current <= end {
months.append(monthKey(for: current))
guard let next = calendar.date(byAdding: .month, value: 1, to: current) else { break }
current = next
}
return months
}
private func loadOrCreateRecordDocument(for month: String) async throws -> Document {
if let cached = recordDocuments[month] {
return cached
}
let components = month.split(separator: "-")
let year = String(components[0])
let monthNum = String(components[1])
let dirURL = storageURL.appendingPathComponent("records/\(year)")
try FileManager.default.createDirectory(at: dirURL, withIntermediateDirectories: true)
let url = dirURL.appendingPathComponent("\(monthNum).automerge")
let doc: Document
if FileManager.default.fileExists(atPath: url.path) {
let data = try Data(contentsOf: url)
doc = try Document(data)
} else {
doc = Document()
}
recordDocuments[month] = doc
return doc
}
private func persistRecordDocument(_ doc: Document, for month: String) async throws {
let components = month.split(separator: "-")
let year = String(components[0])
let monthNum = String(components[1])
let dirURL = storageURL.appendingPathComponent("records/\(year)")
try FileManager.default.createDirectory(at: dirURL, withIntermediateDirectories: true)
let url = dirURL.appendingPathComponent("\(monthNum).automerge")
let data = doc.save()
try data.write(to: url)
}
} Folder Sync Integration
Automerge works exceptionally well with folder-based cloud sync (Dropbox, iCloud Documents, Google Drive, WebDAV). The key insight: folder sync services create conflict files, Automerge merges them automatically.
The Problem
Folder sync services don’t understand file contents - they sync bytes. When two devices edit the same file offline:
Device A (offline): Edits 01.automerge
Device B (offline): Edits 01.automerge
Both come online...
Dropbox creates:
├── 01.automerge # "Winner" (last upload)
└── 01 (conflicted copy 2025-01-05).automerge # "Loser" The Solution: Auto-Merge Conflict Files
actor FolderSyncConflictResolver {
private let storageURL: URL
/// Scan for and resolve conflict files
func resolveConflicts() async throws {
let files = try FileManager.default.contentsOfDirectory(
at: storageURL,
includingPropertiesForKeys: nil
)
// Group by base name
let groups = groupConflictFiles(files)
for (baseName, conflictFiles) in groups where conflictFiles.count > 1 {
try await mergeConflictFiles(baseName: baseName, files: conflictFiles)
}
}
private func mergeConflictFiles(baseName: String, files: [URL]) async throws {
// Load primary document
let primaryURL = files.first { !$0.lastPathComponent.contains("conflicted") }!
var primaryDoc = try Document(Data(contentsOf: primaryURL))
// Merge all conflict copies into primary
for conflictURL in files where conflictURL != primaryURL {
let conflictDoc = try Document(Data(contentsOf: conflictURL))
try primaryDoc.merge(other: conflictDoc) // <-- Automerge magic!
// Delete conflict file after successful merge
try FileManager.default.removeItem(at: conflictURL)
}
// Save merged document
try primaryDoc.save().write(to: primaryURL)
AppLogger.sync.info("Merged \(files.count) versions of \(baseName)")
}
private func groupConflictFiles(_ files: [URL]) -> [String: [URL]] {
var groups: [String: [URL]] = [:]
for file in files where file.pathExtension == "automerge" {
let name = file.lastPathComponent
let baseName = extractBaseName(from: name)
groups[baseName, default: []].append(file)
}
return groups
}
private func extractBaseName(from filename: String) -> String {
// Handle various conflict patterns:
// Dropbox: "file (conflicted copy 2025-01-05).ext"
// iCloud: "file 2.ext" or "file (conflict).ext"
// Google Drive: "file (1).ext"
let patterns = [
#"\s*\(conflicted copy[^)]*\)"#, // Dropbox
#"\s*\(\d+\)"#, // Google Drive
#"\s*\d+(?=\.)"#, // iCloud numeric suffix
#"\s*\(conflict[^)]*\)"# // Generic
]
var result = filename.replacingOccurrences(of: ".automerge", with: "")
for pattern in patterns {
result = result.replacingOccurrences(
of: pattern,
with: "",
options: .regularExpression
)
}
return result.trimmingCharacters(in: .whitespaces)
}
} Sync Flow Diagram
┌─────────────────────────────────────────────────────────────────────────┐
│ Folder Sync + Automerge │
│ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ Device A │ │ Device B │ │
│ │ │ │ │ │
│ │ 01.automerge │ │ 01.automerge │ │
│ │ (edit locally)│ │ (edit locally)│ │
│ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │
│ │ sync sync │ │
│ v v │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ Cloud Folder │ │
│ │ ┌─────────────────┐ ┌────────────────────────────────────┐ │ │
│ │ │ 01.automerge │ │ 01 (conflicted copy).automerge │ │ │
│ │ │ (Device B wins) │ │ (Device A's version) │ │ │
│ │ └─────────────────┘ └────────────────────────────────────┘ │ │
│ └──────────────────────────────────────────────────────────────┘ │
│ │ │
│ │ sync down to both devices │
│ v │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ App Conflict Resolution │ │
│ │ │ │
│ │ 1. Detect conflict files (pattern matching) │ │
│ │ 2. Load both Automerge documents │ │
│ │ 3. primaryDoc.merge(other: conflictDoc) <-- AUTOMATIC! │ │
│ │ 4. Save merged document │ │
│ │ 5. Delete conflict file │ │
│ │ │ │
│ │ Result: Single 01.automerge with ALL changes from A and B │ │
│ └──────────────────────────────────────────────────────────────┘ │
│ │ │
│ │ syncs back to cloud │
│ v │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ Cloud Folder │ │
│ │ ┌─────────────────┐ │ │
│ │ │ 01.automerge │ <-- Merged, no conflicts │ │
│ │ └─────────────────┘ │ │
│ └──────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘ Integration with Cloud Adapters
/// Extended storage service with conflict resolution
actor AutomergeFolderSyncService {
private let localStorage: AutomergeStorageService
private let cloudAdapter: CloudAdapterProtocol
private let conflictResolver: FolderSyncConflictResolver
/// Full sync cycle
func sync() async throws {
// 1. Download all remote files
let remoteFiles = try await cloudAdapter.listFiles()
for file in remoteFiles {
let localPath = localURL(for: file.path)
if shouldDownload(remote: file, local: localPath) {
let data = try await cloudAdapter.download(path: file.path)
try data.write(to: localPath)
}
}
// 2. Resolve any conflict files created by folder sync
try await conflictResolver.resolveConflicts()
// 3. Upload local changes
let localFiles = try listLocalAutomergeFiles()
for localFile in localFiles {
let remotePath = remotePath(for: localFile)
if shouldUpload(local: localFile, remotePath: remotePath) {
let data = try Data(contentsOf: localFile)
_ = try await cloudAdapter.upload(path: remotePath, data: data)
}
}
// 4. Clean up remote conflict files (already merged locally)
for file in remoteFiles where isConflictFile(file.path) {
try await cloudAdapter.delete(path: file.path)
}
}
} App Integration
/// On app launch or periodic sync
func onSyncTrigger() async {
// 1. Let folder sync do its thing (Dropbox/iCloud/etc syncs files)
await waitForFolderSyncToSettle()
// 2. Resolve any conflicts Automerge-style
try await conflictResolver.resolveConflicts()
// 3. Reload data from merged documents
await appState.reloadData()
}
/// On file change (FSEvents / DispatchSource)
func onFileChange(url: URL) async {
if isConflictFile(url) {
// New conflict detected - merge immediately
try await conflictResolver.resolveConflicts()
await appState.reloadData()
} else if url.pathExtension == "automerge" {
// Regular update from another device
await appState.reloadData()
}
} Comparison: JSON CRDT vs Automerge for Folder Sync
| Aspect | Custom CRDT (JSON) | Automerge (Binary) |
|---|---|---|
| Conflict detection | Same (file patterns) | Same (file patterns) |
| Merge complexity | Manual field-by-field | One line: doc.merge(other) |
| Merge correctness | Must implement correctly | Battle-tested |
| Text conflicts | LWW (loses edits) | Character-level merge |
| History after merge | Lost | Preserved |
| Debug conflicts | Easy (read JSON) | Hard (binary) |
Why Automerge Excels at Folder Sync
- Conflict files are input to
merge()- designed for this - Merging is automatic and correct - no custom merge logic
- No data loss - concurrent edits are merged, not overwritten
- Simple integration - detect conflict files, call
merge(), delete conflict
Sync Implementation
Sync Protocol
Automerge provides a built-in sync protocol for efficient delta exchange:
import Automerge
class AutomergeSyncManager {
private var syncStates: [String: SyncState] = [:] // peerId -> state
/// Generate sync message to send to peer
func generateSyncMessage(for document: Document, peerId: String) -> Data? {
let state = syncStates[peerId] ?? SyncState()
guard let message = document.generateSyncMessage(state: state) else {
return nil // Already in sync
}
return message
}
/// Receive sync message from peer
func receiveSyncMessage(_ message: Data, for document: Document, peerId: String) throws {
var state = syncStates[peerId] ?? SyncState()
try document.receiveSyncMessage(state: &state, message: message)
syncStates[peerId] = state
}
/// Check if in sync with peer
func isInSync(for document: Document, peerId: String) -> Bool {
let state = syncStates[peerId] ?? SyncState()
return document.generateSyncMessage(state: state) == nil
}
} Cloud Sync with Automerge
/// Sync Automerge documents to cloud storage
class CloudAutomergeSyncService {
private let cloudAdapter: CloudAdapterProtocol
private let syncManager: AutomergeSyncManager
func syncDocument(_ document: Document, path: String) async throws {
// 1. Download remote document if exists
if let remoteData = try? await cloudAdapter.download(path: path) {
let remoteDoc = try Document(remoteData)
// 2. Merge remote into local
try document.merge(other: remoteDoc)
}
// 3. Upload merged document
let data = document.save()
_ = try await cloudAdapter.upload(path: path, data: data)
}
/// Efficient sync using sync protocol (for real-time)
func syncWithPeer(
_ document: Document,
peerId: String,
send: (Data) async throws -> Void,
receive: () async throws -> Data?
) async throws {
// Exchange sync messages until converged
while true {
// Send our changes
if let outgoing = syncManager.generateSyncMessage(for: document, peerId: peerId) {
try await send(outgoing)
}
// Receive their changes
if let incoming = try await receive() {
try syncManager.receiveSyncMessage(incoming, for: document, peerId: peerId)
}
// Check if converged
if syncManager.isInSync(for: document, peerId: peerId) {
break
}
}
}
} Migration Strategy
Phase 1: Add Automerge Dependency
// Package.swift
dependencies: [
.package(url: "https://github.com/automerge/automerge-swift.git", from: "0.6.1")
] Phase 2: Create Migration Service
/// Migrates JSON data to Automerge format
class AutomergeMigrationService {
private let legacyStorage: LocalFileStorageService
private let automergeStorage: AutomergeStorageService
func migrate() async throws {
// 1. Migrate tags
let tags = try await legacyStorage.loadTags()
for tag in tags {
try await automergeStorage.saveTag(tag)
}
// 2. Migrate records (in batches by month)
let allRecords = try await legacyStorage.loadAllRecords()
for record in allRecords {
try await automergeStorage.saveRecord(record)
}
// 3. Mark migration complete
UserDefaults.standard.set(true, forKey: "automerge_migration_complete")
// 4. Optionally backup and remove legacy files
}
var needsMigration: Bool {
!UserDefaults.standard.bool(forKey: "automerge_migration_complete")
}
} Phase 3: Dual-Read Support
/// Storage service that reads from both formats during transition
actor HybridStorageService: FileStorageServiceProtocol {
private let legacy: LocalFileStorageService
private let automerge: AutomergeStorageService
private let migrationComplete: Bool
func loadRecords(from: Date, to: Date) async throws -> [TimeRecord] {
if migrationComplete {
return try await automerge.loadRecords(from: from, to: to)
} else {
// Read from both, prefer Automerge if exists
let automergeRecords = try await automerge.loadRecords(from: from, to: to)
let legacyRecords = try await legacy.loadRecords(from: from, to: to)
// Merge, preferring Automerge versions
let automergeIds = Set(automergeRecords.map(\.id))
let uniqueLegacy = legacyRecords.filter { !automergeIds.contains($0.id) }
return automergeRecords + uniqueLegacy
}
}
func saveRecord(_ record: TimeRecord) async throws {
// Always write to Automerge
try await automerge.saveRecord(record)
}
} Storage Format Comparison
JSON (Current)
records/2024/01/2024-01-15_093000.json (250 bytes)
{
"id": "123e4567-e89b-12d3-a456-426614174000",
"startTime": "2024-01-15T09:30:00Z",
"endTime": "2024-01-15T11:45:00Z",
"tagId": "550e8400-e29b-41d4-a716-446655440000",
"comment": "Working on feature X",
"images": ["123e4567_0.jpg"]
} Automerge Binary
records/2024/01.automerge (~150 bytes per record + overhead)
Binary format containing:
- Document metadata (actor ID, sequence numbers)
- Operations log (compressed)
- Current state snapshot
- Change history Size comparison (100 records):
- JSON: ~25KB (100 files)
- Automerge: ~20KB (1 file with history)
Image Storage
Never store images inside Automerge documents. Store only filename references.
Why Not Store Images in Automerge
- History bloat - Every image modification is stored forever (tombstones)
- Sync inefficiency - Entire document re-syncs when any image changes
- Memory pressure - Document must load all images into memory
- No partial sync - Can’t download just the record without its images
- No true deletion - Removed images stay in history
Recommended Structure
~/Documents/Minuta/
├── tags.automerge
├── records/
│ └── 2024/
│ └── 01.automerge # Only references image filenames
└── images/
└── 2024/
└── 01/
├── abc123_0.jpg # Actual image files
├── abc123_1.jpg
└── def456_0.jpg Record Stores Only References
// In TimeRecord (inside Automerge document)
{
"id": "abc123",
"startTime": ...,
"images": ["abc123_0.jpg", "abc123_1.jpg"] // Just filenames, not data
} Image Storage Service
/// Image storage - separate from Automerge
actor ImageStorageService {
private let imagesURL: URL
func saveImage(_ data: Data, filename: String) async throws {
let url = imagesURL.appendingPathComponent(filename)
try data.write(to: url)
}
func loadImage(filename: String) async throws -> Data {
let url = imagesURL.appendingPathComponent(filename)
return try Data(contentsOf: url)
}
func deleteImage(filename: String) async throws {
let url = imagesURL.appendingPathComponent(filename)
try FileManager.default.removeItem(at: url) // Actually deleted!
}
} Image Conflict Handling
Images are immutable by filename - no conflicts possible:
// Filename includes record ID + index - unique per image
func imageFilename(recordId: UUID, index: Int, ext: String) -> String {
"\(recordId.uuidString)_\(index).\(ext)"
}
// Same filename = same content = no conflict
// New image = new filename = no conflict For content-addressable storage (optional):
// Hash-based filename (like Git)
func imageFilename(data: Data, ext: String) -> String {
let hash = SHA256.hash(data: data).prefix(16).hexString
return "\(hash).\(ext)"
} Comparison
| Storage | In Automerge | As Separate Files |
|---|---|---|
| Size impact | Permanent bloat | Deletable |
| Sync | Full doc re-sync | Only changed files |
| Memory | All in RAM | Load on demand |
| Conflicts | CRDT overhead | None (immutable) |
| True deletion | No (tombstone) | Yes |
Current Minuta approach is correct - images: [String] stores filenames, actual images are separate files. Keep this pattern with Automerge.
Data Deletion and Compaction
Automerge uses tombstones - deleted items are marked as deleted but remain in document history. This is necessary for CRDT correctness.
Options for Permanent Deletion
Option 1: Tombstones (Default)
try doc.delete(obj: .ROOT, key: recordId) // Marked deleted, stays in history Option 2: Rebuild Document (Breaks Sync)
/// Permanently remove deleted items by rebuilding
func compactDocument(_ doc: Document) -> Document {
let newDoc = Document()
// Copy only non-deleted items to new document
// WARNING: Breaks sync with other devices!
return newDoc
} Option 3: Monthly Documents (Recommended)
With monthly document structure, old months can be deleted entirely:
func purgeOldData(olderThan years: Int) async throws {
let cutoff = Calendar.current.date(byAdding: .year, value: -years, to: Date())!
for monthFile in try listMonthlyFiles() {
if monthFile.date < cutoff {
// Delete entire file - truly gone
try FileManager.default.removeItem(at: monthFile.url)
}
}
} This is why monthly documents work well - delete entire .automerge files for old months without breaking sync for current data.
Deletion Summary
| Method | Truly Deletes | Sync Safe | Use Case |
|---|---|---|---|
doc.delete() | No | Yes | Normal deletion |
| Rebuild document | Yes | No | Never (breaks sync) |
| Delete old monthly files | Yes | Yes | Data retention policy |
Trade-offs Summary
Advantages of Automerge
- Proven implementation - Years of production use, edge cases handled
- Text CRDT - Character-level merging for comments (better than LWW)
- Built-in sync protocol - Efficient delta exchange
- Change history - Time travel, branching, undo built-in
- Cross-platform - Same document format in Swift, JS, Rust
- Active development - Regular updates, growing ecosystem
Disadvantages of Automerge
- Binary format - Can’t inspect/edit files manually
- Dependency - ~2MB added to app bundle
- Learning curve - Different mental model than JSON
- Pre-release repo - automerge-repo-swift not stable yet
- Debugging - Harder to diagnose sync issues
- Overkill - Full history may be unnecessary
When to Choose Automerge
- Need real-time collaboration
- Cross-platform sync (iOS + web)
- Text editing with concurrent users
- Want proven CRDT implementation
When to Choose Custom CRDT
- JSON debuggability is critical
- Simple sync (file-based, not real-time)
- No cross-platform requirements
- Want full control over format
Recommendation
For Minuta’s use case (single user, multiple devices, offline-first, multi-cloud folder sync):
Automerge is the better choice for folder-based sync because:
- Conflict resolution is trivial -
doc.merge(other)handles everything - No custom merge logic - battle-tested, correct by construction
- Text merging - character-level merge for comments (vs LWW losing edits)
- Works with any folder sync - Dropbox, iCloud, Google Drive, WebDAV
Trade-off accepted:
- Binary format (can’t inspect files manually)
- ~2MB dependency added to bundle
Custom CRDT (document 503) is better if:
- JSON debuggability is critical for development
- Want zero dependencies
- Don’t need text collaboration
Recommendation: Start with Automerge for production sync. The folder sync + conflict merge pattern is simpler than implementing custom CRDT merge logic correctly.
Related
- 502-multi-cloud-sync - Multi-cloud sync architecture
- 503-crdt-data-structures - Custom CRDT implementation
- 101-overview - System architecture
- 301-file-storage - File storage service