Web page change tracking with structured diffs. Integrates with markgrab and snapgrab; MCP native.

    from diffgrab import DiffTracker

    # Run inside an async function (or an asyncio REPL)
    tracker = DiffTracker()
    await tracker.track("https://example.com")
    changes = await tracker.check()
    for c in changes:
        if c.changed:
            print(c.summary)       # "3 lines added, 1 lines removed in sections: Introduction."
            print(c.unified_diff)  # Standard unified diff output
    await tracker.close()

- Change detection: track any URL, detect content changes via content hashing
- Structured diffs: unified diff + section-level analysis (which headings changed)
- Human-readable summaries: "5 lines added, 2 removed in sections: Intro, Methods"
- Snapshot history: SQLite storage, browse past versions of any page
- markgrab powered: HTML/YouTube/PDF/DOCX extraction via markgrab
- Visual diff: optional screenshot comparison via snapgrab
- MCP server: 5 tools for Claude Code / MCP clients
- CLI included: diffgrab track, check, diff, history, untrack
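
To make the first three features concrete, here is a minimal sketch of hash-based change detection, a unified diff, and section-level analysis using only the Python standard library. It is not diffgrab's actual implementation: the function names (`content_hash`, `heading_for`, `compare`), the choice of SHA-256, and the returned dictionary are assumptions made for illustration.

```python
import difflib
import hashlib


def content_hash(text: str) -> str:
    """Hash the page text (SHA-256 is an assumption, any stable hash works)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def heading_for(lines: list[str], index: int) -> str:
    """Return the nearest markdown heading at or above lines[index].

    Simplification: any line starting with '#' counts as a heading.
    """
    for i in range(min(index, len(lines) - 1), -1, -1):
        if lines[i].startswith("#"):
            return lines[i].lstrip("#").strip()
    return "(before first heading)"


def compare(before: str, after: str) -> dict:
    """Compare two markdown snapshots (hypothetical helper, not diffgrab's API)."""
    # Cheap check first: identical hashes mean nothing changed.
    if content_hash(before) == content_hash(after):
        return {"changed": False, "added": 0, "removed": 0,
                "sections": [], "diff": "", "summary": "No changes"}

    old, new = before.splitlines(), after.splitlines()
    diff_lines = list(difflib.unified_diff(
        old, new, fromfile="before", tofile="after", lineterm=""))

    # Skip the two file header lines ('--- before', '+++ after') before counting.
    body = diff_lines[2:]
    added = sum(1 for line in body if line.startswith("+"))
    removed = sum(1 for line in body if line.startswith("-"))

    # Map every changed line back to the heading it sits under.
    sections: list[str] = []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        touched = [heading_for(new, j) for j in range(j1, j2)]
        touched += [heading_for(old, i) for i in range(i1, i2)]
        for name in touched:
            if name not in sections:
                sections.append(name)

    summary = (f"{added} lines added, {removed} removed "
               f"in sections: {', '.join(sections) or 'none'}")
    return {"changed": True, "added": added, "removed": removed,
            "sections": sections, "diff": "\n".join(diff_lines),
            "summary": summary}


if __name__ == "__main__":
    v1 = "# Intro\nHello\n\n# Methods\nStep one\n"
    v2 = "# Intro\nHello world\n\n# Methods\nStep one\nStep two\n"
    result = compare(v1, v2)
    print(result["summary"])
    print(result["diff"])
```

Hashing first keeps the common "nothing changed" case cheap; the line-level diff only runs when the hashes differ.
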
Install:

    pip install diffgrab

Optional extras:

    pip install 'diffgrab[cli]'     # CLI with click + rich
    pip install 'diffgrab[visual]'  # Visual diff with snapgrab
    pip install 'diffgrab[mcp]'     # MCP server with fastmcp
    pip install 'diffgrab[all]'     # Everything

Python API:

    import asyncio

    from diffgrab import DiffTracker


    async def main():
        tracker = DiffTracker()

        # Track a URL (takes initial snapshot)
        await tracker.track("https://example.com", interval_hours=12)

        # Check for changes
        changes = await tracker.check()
        for change in changes:
            if change.changed:
                print(change.summary)
                print(change.unified_diff)

        # Get diff between specific snapshots
        result = await tracker.diff("https://example.com", before_id=1, after_id=2)

        # Browse snapshot history
        history = await tracker.history("https://example.com", count=20)

        # Stop tracking
        await tracker.untrack("https://example.com")
        await tracker.close()


    asyncio.run(main())

Convenience functions:

    from diffgrab import track, check, diff, history, untrack

    # Run inside an async function (or an asyncio REPL)
    await track("https://example.com")
    changes = await check()
    result = await diff("https://example.com")
    snaps = await history("https://example.com")
    await untrack("https://example.com")

CLI:

    # Track a URL
    diffgrab track https://example.com --interval 12

    # Check all tracked URLs for changes
    diffgrab check

    # Check a specific URL
    diffgrab check https://example.com

    # Show diff between snapshots
    diffgrab diff https://example.com
    diffgrab diff https://example.com --before 1 --after 3

    # View snapshot history
    diffgrab history https://example.com --count 20

    # Stop tracking
    diffgrab untrack https://example.com

Add to your Claude Code MCP config:

    {
      "mcpServers": {
        "diffgrab": {
          "command": "diffgrab-mcp",
          "args": []
        }
      }
    }

Or with uvx:

    {
      "mcpServers": {
        "diffgrab": {
          "command": "uvx",
          "args": ["--from", "diffgrab[mcp]", "diffgrab-mcp"]
        }
      }
    }

MCP Tools:

| Tool | Description |
|---|---|
| track_url | Register a URL for change tracking |
| check_changes | Check tracked URLs for changes |
| get_diff | Get structured diff between snapshots |
| get_history | Browse snapshot history |
| untrack_url | Stop tracking a URL |

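
For readers curious what an MCP client sends when it invokes one of these tools, the sketch below builds a JSON-RPC `tools/call` request as defined by the Model Context Protocol. The helper name `tool_call_request` is hypothetical, and the `url` argument name is an assumption: the tool table above does not list parameter names. In practice an MCP client library builds and sends this message for you.

```python
import json


def tool_call_request(request_id: int, tool: str, arguments: dict) -> dict:
    """Build an MCP 'tools/call' JSON-RPC request (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# The 'url' argument name is assumed, not taken from diffgrab's tool schema.
payload = tool_call_request(1, "track_url", {"url": "https://example.com"})
print(json.dumps(payload, indent=2))
```
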
Every diff operation returns a DiffResult:

    from dataclasses import dataclass


    @dataclass
    class DiffResult:
        url: str                         # The tracked URL
        changed: bool                    # Whether content changed
        added_lines: int                 # Lines added
        removed_lines: int               # Lines removed
        changed_sections: list[str]      # Markdown headings with changes
        unified_diff: str                # Standard unified diff
        summary: str                     # Human-readable summary
        before_snapshot_id: int | None   # DB ID of older snapshot
        after_snapshot_id: int | None    # DB ID of newer snapshot
        before_timestamp: str            # When older snapshot was taken
        after_timestamp: str             # When newer snapshot was taken

Snapshots are stored in SQLite at ~/.local/share/diffgrab/diffgrab.db (created automatically). To use a custom path:

    tracker = DiffTracker(db_path="/path/to/custom.db")

| Package | Role | PyPI |
|---|---|---|
| markgrab | HTML/YouTube/PDF/DOCX to markdown | pip install markgrab |
| snapgrab | URL to screenshot + metadata | pip install snapgrab |
| docpick | OCR + LLM document extraction | pip install docpick |
| feedkit | RSS feed collection | pip install feedkit |
| diffgrab | Web page change tracking | pip install diffgrab |
| browsegrab | Browser agent for LLMs | Coming soon |
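
Returning to the `DiffResult` fields listed earlier, the sketch below shows one way to turn a batch of results into a short report. To stay runnable without diffgrab installed it defines a local stand-in dataclass with the same field names; the default values and the `format_report` helper are assumptions for illustration, not part of the library.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DiffResult:
    """Local stand-in mirroring the documented fields (defaults are assumed)."""
    url: str
    changed: bool
    added_lines: int = 0
    removed_lines: int = 0
    changed_sections: list = field(default_factory=list)
    unified_diff: str = ""
    summary: str = ""
    before_snapshot_id: Optional[int] = None
    after_snapshot_id: Optional[int] = None
    before_timestamp: str = ""
    after_timestamp: str = ""


def format_report(results) -> str:
    """Return one line per changed URL (hypothetical helper)."""
    lines = []
    for r in results:
        if not r.changed:
            continue  # unchanged pages are left out of the report
        sections = ", ".join(r.changed_sections) or "(none detected)"
        lines.append(f"{r.url}: +{r.added_lines} / -{r.removed_lines} [{sections}]")
    return "\n".join(lines)


results = [
    DiffResult(url="https://example.com", changed=True, added_lines=3,
               removed_lines=1, changed_sections=["Introduction"]),
    DiffResult(url="https://example.org", changed=False),
]
print(format_report(results))
```

The same loop would work on the list returned by `tracker.check()`, assuming each item exposes the fields documented above.
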