from dataclasses import dataclass

@dataclass
class MarkdownDocument:
    # Tier 1-2: Syntactic + Semantic
    frontmatter: dict          # YAML metadata
    structure: "DocumentTree"  # parsed AST (DocumentTree is defined elsewhere)
    # Tier 3: Pragmatic
    document_type: str         # 'article'|'tutorial'|'reference'|'proof'|'mfl-entry'
    # Tier 4: Embedded semantics
    math_expressions: list     # LaTeX/KaTeX
    diagrams: list             # Mermaid/PlantUML
    code_blocks: list          # executable or reference
    citations: list            # BibTeX/Pandoc
    # Tier 5: Knowledge graph
    wikilinks: list            # [[Fleet Page]] links
    tags: list                 # #mfl #fleet #sostle
    # γ₁ anchor
    gamma1_stamp: float        # = 14.134725141734693
    corpus_version: str        # v13, v14...
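As a rough illustration of how the Tier 5 fields (`wikilinks`, `tags`) might be filled from raw Markdown text, here is a minimal sketch. The regular expressions and the helper names `extract_wikilinks` and `extract_tags` are assumptions made for this example. The source does not specify the exact link or tag grammar, so treat the patterns as one plausible reading of `[[Fleet Page]]` and `#mfl`.

```python
import re

# Hypothetical patterns: the source does not define the link or tag grammar.
# A wikilink is [[Target]] or [[Target|alias]]; only the target is kept.
WIKILINK_RE = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")
# A tag is '#' followed by a letter, not preceded by a word character or '#',
# so Markdown headings such as '# Title' or '## Title' are not matched.
TAG_RE = re.compile(r"(?<![\w#])#([A-Za-z][\w-]*)")


def extract_wikilinks(text: str) -> list[str]:
    """Return wikilink targets in order of appearance, without aliases."""
    return [m.group(1).strip() for m in WIKILINK_RE.finditer(text)]


def extract_tags(text: str) -> list[str]:
    """Return tag names (without the leading '#') in order of appearance."""
    return [m.group(1) for m in TAG_RE.finditer(text)]


sample = "See [[Fleet Page]] and [[Other|alias]]. #mfl #fleet #sostle"
wikilinks = extract_wikilinks(sample)
tags = extract_tags(sample)
```

The two resulting lists can be passed directly as the `wikilinks` and `tags` arguments when constructing a `MarkdownDocument`.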
markdown doc → parse AST → extract triples → Neo4j / Qdrant
xml doc → parse DOM → extract triples → Neo4j / Qdrant
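The two pipelines differ only in the parsing step, so both can emit triples of the same shape before anything is written to a store. The sketch below illustrates that idea under several assumptions: the `links_to` predicate, the `<link target="...">` element, and the function names are hypothetical. A regular expression stands in for a real Markdown AST parser, and `xml.etree.ElementTree` stands in for a DOM parser. Writing to Neo4j or Qdrant is deliberately left out.

```python
import re
import xml.etree.ElementTree as ET

# A triple is (subject, predicate, object); this shape is an assumption.
Triple = tuple[str, str, str]

# Hypothetical wikilink pattern: [[Target]] or [[Target|alias]].
WIKILINK_RE = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def triples_from_markdown(doc_id: str, text: str) -> list[Triple]:
    """Stand-in for 'parse AST, extract triples': one triple per wikilink."""
    return [
        (doc_id, "links_to", m.group(1).strip())
        for m in WIKILINK_RE.finditer(text)
    ]


def triples_from_xml(doc_id: str, xml_text: str) -> list[Triple]:
    """Parse XML into an element tree; one triple per <link target="...">."""
    root = ET.fromstring(xml_text)
    return [
        (doc_id, "links_to", el.attrib["target"])
        for el in root.iter("link")
        if "target" in el.attrib
    ]


md_triples = triples_from_markdown("doc-md", "See [[Fleet Page]].")
xml_triples = triples_from_xml("doc-xml", '<doc><link target="Fleet Page"/></doc>')
```

Because both functions return the same `Triple` shape, a single downstream writer could serve either source format.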