MEEK SEARCH BOXINER · Sovereign Regulated Search · 35 Laws

◈ SOVEREIGN SEARCH PIPELINE · FULL STACK

QUERY ENTRY — Convo-Loom F03 Intent Extraction

Free text → structured intent object. Multi-stage: Convo-Loom extracts purpose, scope, urgency, regulated type. Saybook stage classifies purpose (treatment / legal_hold / operations / audit). Query itself classified for sensitivity before any logging.

LOCAL LLM — qwen3:14b via mal

GID IDENTITY GATE — SOSTLE L0

Who is searching? GID token scopes the entire search. L0–L4 open: broad fleet knowledge. L5 gated: sovereign layers. L6–L7 closed: floor-only + silence. GID also generates role + story + scene context for this session.

SOVEREIGN — GID TOKEN REQUIRED
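
The layer gating described in this stage can be sketched as a small lookup. This is a minimal illustration, not the SOSTLE implementation: `LayerAccess`, `layer_access`, `searchable_layers`, and the `sovereign_grant` flag are hypothetical names, and treating closed layers as never contributing content to the session scope is an assumption.

```python
from enum import Enum


class LayerAccess(Enum):
    OPEN = "open"      # L0-L4: broad fleet knowledge
    GATED = "gated"    # L5: sovereign layers
    CLOSED = "closed"  # L6-L7: floor-only + silence


def layer_access(layer: int) -> LayerAccess:
    """Map a SOSTLE layer number (0-7) to its access state."""
    if not 0 <= layer <= 7:
        raise ValueError(f"unknown SOSTLE layer: L{layer}")
    if layer <= 4:
        return LayerAccess.OPEN
    if layer == 5:
        return LayerAccess.GATED
    return LayerAccess.CLOSED


def searchable_layers(has_gid_token: bool, sovereign_grant: bool = False) -> list[int]:
    """Return the layers whose content may enter this session's scope.

    Assumption: no GID token means no search at all, and L5 needs an
    extra grant (hypothetical flag) on top of the token.
    """
    if not has_gid_token:
        return []
    layers = [n for n in range(8) if layer_access(n) is LayerAccess.OPEN]
    if sovereign_grant:
        layers.append(5)
    return layers
```

With a token but no sovereign grant, `searchable_layers(True)` yields only the open layers L0 through L4.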

ABACUS GATE — C1→C5 Search Depth

Consideration level determines search width and mode. C1–C2: Perl mode (fast, pattern-first, metadata only). C3–C5: Ruby mode (typed, governed, full policy pipeline). Abacus gate is the carry-depth of the search. Wider bead = deeper search = more governance.

ABACUS C1–C5 · ARB1-MEEK-ABACUS-003
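
A sketch of the consideration-level switch, assuming the level arrives as an integer 1 to 5. The function name `search_mode` and the dictionary keys are illustrative, not part of the Abacus spec.

```python
def search_mode(consideration_level: int) -> dict:
    """Pick the search mode for an Abacus consideration level C1-C5."""
    if consideration_level not in range(1, 6):
        raise ValueError(f"consideration level must be 1-5, got {consideration_level}")
    if consideration_level <= 2:
        # C1-C2: fast, pattern-first, metadata only
        return {"mode": "perl", "metadata_only": True, "full_policy_pipeline": False}
    # C3-C5: typed, governed, full policy pipeline
    return {"mode": "ruby", "metadata_only": False, "full_policy_pipeline": True}
```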

CLASSIFICATION — Never Index Before Classifying

ABSOLUTE LAW. Every asset gets classified before any index write. Types: PII / PHI / PCI / SECRETS / LEGAL / HR / FINANCIAL / SOVEREIGN. Unclassified → routed to Merostone for classification. Index write blocked until classification stamp exists.

MEMECHET FC-2 GATE
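
The "no classification stamp, no index write" rule can be sketched as a guard on the write path. `Asset`, `GatedIndex`, and the `pending_classification` queue are hypothetical stand-ins. In the real pipeline, unclassified assets are routed to Merostone, which this sketch only represents as a list.

```python
from dataclasses import dataclass, field
from typing import Optional

CLASSIFICATION_TYPES = frozenset(
    {"PII", "PHI", "PCI", "SECRETS", "LEGAL", "HR", "FINANCIAL", "SOVEREIGN"}
)


@dataclass
class Asset:
    asset_id: str
    classification: Optional[str] = None  # None = no classification stamp yet


@dataclass
class GatedIndex:
    entries: dict = field(default_factory=dict)
    pending_classification: list = field(default_factory=list)

    def write(self, asset: Asset) -> bool:
        """Index the asset only if it carries a known classification stamp."""
        if asset.classification not in CLASSIFICATION_TYPES:
            # Blocked: route to the classifier instead of indexing.
            self.pending_classification.append(asset.asset_id)
            return False
        self.entries[asset.asset_id] = asset.classification
        return True
```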

PROJECTION BUILDER — Never Index Raw

Source asset NEVER goes directly into index. SearchPolicy from Protobuf envelope defines: searchable_fields / denied_fields / masked_fields / allowed_purposes. Projection computed. Projection hash = H(asset_hash + schema_hash + rules_version + indexed_fields + masks). The index IS regulated data.

SOVEREIGN PROTOBUF ENVELOPE
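
A minimal sketch of the projection step and its hash. Several details here are assumptions: the policy is modelled as a plain dict rather than a Protobuf message, H is taken to be SHA-256, `+` in the formula is read as "combine", and masked values are replaced with a fixed `***` placeholder.

```python
import hashlib
import json


def build_projection(asset: dict, policy: dict) -> dict:
    """Keep only searchable fields; drop denied ones; mask masked ones."""
    projection = {}
    for name in policy["searchable_fields"]:
        if name in policy["denied_fields"] or name not in asset:
            continue
        masked = name in policy["masked_fields"]
        projection[name] = "***" if masked else asset[name]
    return projection


def projection_hash(asset_hash, schema_hash, rules_version, indexed_fields, masks) -> str:
    """H(asset_hash + schema_hash + rules_version + indexed_fields + masks).

    Assumptions: H is SHA-256 and '+' means "combine"; a sorted JSON
    encoding is used so the hash does not depend on field order.
    """
    payload = json.dumps(
        [asset_hash, schema_hash, rules_version, sorted(indexed_fields), sorted(masks)],
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Encoding the inputs as a JSON list rather than concatenating raw strings avoids ambiguity between, say, `"ab" + "c"` and `"a" + "bc"`.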

AUTHORIZATION — AuthorizedScope(user, purpose, context)

D = all documents. Q ⊆ D = documents matching the query. A(u) = documents user u is authorized to access. P(p) = documents whose allowed purposes include p. J(j) = documents permitted in jurisdiction j. R = Q ∩ A(u) ∩ P(p) ∩ J(j). Field level: VisibleFields(d) = Fields(d) ∩ AllowedFields(u, p, classification). Policy BEFORE retrieval. Policy BEFORE scoring. Policy BEFORE snippets.

SOSTLE L2 — ACCESS KEEP
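
The set formula maps directly onto set intersection. The sketch below assumes each of the four sets is available as a Python `set` of document IDs. The function names and the `score` callback are illustrative.

```python
def authorized_results(q: set, a_u: set, p_p: set, j_j: set) -> set:
    """R = Q ∩ A(u) ∩ P(p) ∩ J(j), computed before any scoring."""
    return q & a_u & p_p & j_j


def visible_fields(doc_fields: set, allowed_fields: set) -> set:
    """VisibleFields = Fields(d) ∩ AllowedFields(u, p, classification)."""
    return doc_fields & allowed_fields


def search(q: set, a_u: set, p_p: set, j_j: set, score) -> list:
    """Filter first, then rank only the documents that survived the filter."""
    r = authorized_results(q, a_u, p_p, j_j)
    return sorted(r, key=score, reverse=True)
```

Because `score` is only ever called on members of R, the ranker never sees a document the user could not access.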

ADELIC LAYER SEARCH — 8 SOSTLE Lenses (parallel)

All 8 SOSTLE layers fire simultaneously. L0: identity context (GID scope). L1: PEMCLAU inventory (qdrant 2-hop, 60-80 nodes). L2: Merostone field-level gate. L3: live fleet state (active containers/incidents). L4: war room (open ARBs/TRBs/sorries). L5: sorry-flow restoration lineage. L6: γ₁-ranked (eigenspace distance). L7: SILENCE — what cannot be returned.

PEMCLAU + MEROSTONE + γ₁ FLOOR
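
One way to fire all lenses at once is a thread pool, sketched below. The real dispatch mechanism is not described in this document. `run_lenses` and the shape of a lens (a callable taking the query and returning a list) are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_lenses(query: str, lenses: dict[str, Callable[[str], list]]) -> dict[str, list]:
    """Submit every lens at once and collect results keyed by layer name."""
    with ThreadPoolExecutor(max_workers=max(1, len(lenses))) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in lenses.items()}
        return {name: future.result() for name, future in futures.items()}
```

Keying results by layer name keeps the output deterministic regardless of which lens finishes first.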

LOCAL LLM CONTEXT BUILDER — Sovereign First, External Trial Second

Build order mandatory: 1) Convo-Loom intent → 2) GID role+story+scene (local qwen3:14b or qwen2.5:32b) → 3) PEMCLAU retrieval against GID context → 4) Bonixer scores each chunk → 5) MEMECHET verifier checks intent. Only THEN: wrap in GID envelope for external LLM trial. External response ingested back to PEMCLAU (learn + grow).

LOCAL FIRST · mal/:9334 · qwen2.5:32b

JOFFE-MATH RANKER — γ₁-Distance + Prime Shape + Adelic Norm

Phase 0: γ₁-distance scoring (eigenspace proximity to Riemann floor). Phase 2: prime shape map (query terms → prime shapes 2=SHAPE-SHIFTER → 8128=PERFECT → result weight). Adelic norm: p-adic local (silo-specific) + archimedean global (fleet-wide). 18-Wave harmonic confidence gates: W04/W08/W14/W16. Rank ONLY after policy filtering.

JOFFE-MATH · pcdev :9383-9385

MEMECHET VERIFIER — FC-2 Check + FC-3 Seal

Does result match original intent? Verifier-generator split: FC-2 verifies result against Convo-Loom structured intent. PASS → proceed to result shaping. FAIL → return to Convo-Loom for refinement (up to 3 passes). FC-3 seals the verified result set with session hash.

MEMECHET · LABR-069 SEALED
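
The verify-then-refine loop can be sketched with the search, verifier, and refiner passed in as callables. Two points are assumptions: "up to 3 passes" is read as three attempts in total, and the FC-3 seal is approximated by a SHA-256 digest over the session ID and result IDs.

```python
import hashlib
from typing import Callable, Optional

MAX_PASSES = 3  # assumption: three attempts in total, including the first


def verified_search(
    intent: dict,
    search: Callable[[dict], list],
    verify: Callable[[dict, list], bool],
    refine: Callable[[dict], dict],
    session_id: str,
) -> Optional[dict]:
    """Run search -> verify, refining the intent after each FAIL."""
    for attempt in range(1, MAX_PASSES + 1):
        results = search(intent)
        if verify(intent, results):
            # Stand-in for the FC-3 seal: hash the session id plus result ids.
            material = session_id + "|" + "|".join(str(r) for r in results)
            seal = hashlib.sha256(material.encode("utf-8")).hexdigest()
            return {"results": results, "seal": seal, "passes": attempt}
        intent = refine(intent)
    return None  # no verified result set after MAX_PASSES
```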

RESULT SHAPING — Masks / Snippets / Facets / Silence

Per-field policy: raw(field) if fully allowed · mask(field) if partially allowed · hidden if denied. Snippets generated AFTER masking (never before — snippets are mini-disclosures). Facets: counts suppressed if they reveal existence. Fuzzy search disabled for PHI/PII identifiers. L7 triggered → structured SILENCE result (not 404, not error — the refusal IS the answer).

L7 SILENCE GATE · MOAT-091
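
The ordering rule (mask first, snippet second) is easiest to see in code. In this sketch the per-field decisions `"allow"`, `"mask"`, and `"deny"` are hypothetical labels for the three outcomes in the text, and masking is a simple same-length run of asterisks.

```python
from typing import Optional


def shape_result(doc: dict, field_policy: dict, snippet_field: str,
                 snippet_len: int = 40) -> tuple[dict, Optional[str]]:
    """Apply per-field policy, then cut the snippet from the shaped text."""
    shaped = {}
    for name, value in doc.items():
        decision = field_policy.get(name, "deny")  # unknown fields stay hidden
        if decision == "allow":
            shaped[name] = value
        elif decision == "mask":
            shaped[name] = "*" * len(str(value))
        # "deny": the field is omitted entirely
    text = shaped.get(snippet_field)
    snippet = str(text)[:snippet_len] if text is not None else None
    return shaped, snippet
```

The snippet is sliced from `shaped`, never from `doc`, so a denied or masked field cannot surface through it.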

BONIXER ASSESSMENT — 4 Layers Per Result

Layer 1: Identity match — GID scope alignment (is this result for this searcher?). Layer 2: Sovereignty check — L7 gate (would returning this violate sovereign boundary?). Layer 3: Floor quality — γ₁ threshold (is this above the inference quality floor?). Layer 4: Relevance — PEMCLAU cosine similarity to query in embedding space.

BONIXER · pemos.ca/bonixer
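
A rough sketch of a four-layer assessment. How the layers combine into a single score is not stated here, so the rule below (relevance counts only if the first three checks pass) is an assumption, as are the field names `scope`, `crosses_sovereign_boundary`, and `gamma1_quality`.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assess(result: dict, searcher_scopes: set, gamma1_floor: float,
           query_embedding: list[float]) -> dict:
    """Run the four layers in order and report each outcome."""
    layers = {
        "identity": result["scope"] in searcher_scopes,            # Layer 1
        "sovereignty": not result["crosses_sovereign_boundary"],   # Layer 2
        "floor": result["gamma1_quality"] >= gamma1_floor,         # Layer 3
    }
    relevance = cosine_similarity(result["embedding"], query_embedding)  # Layer 4
    passed = all(layers.values())
    return {"layers": layers, "relevance": relevance,
            "score": relevance if passed else 0.0}
```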

VIZASL RENDER — Constellation / Galaxy / Helix

Search results rendered as VIZASL constellation. Each result = one point, coloured by SOSTLE layer of origin (L0=cyan, L1=green, L2=gold, L3=amber, L4=pink, L5=violet, L6=deep violet, L7=moat/hidden). Size = Bonixer score. Proximity = γ₁-distance. Click-through → full Bonixer assessment panel.

VIZASL · 14-GOAT INTERPRETATION

AUDIT LEDGER — Tamper-Evident Chain

Every regulated search produces: query_hash (not raw — query may be sensitive) · user/purpose/policy_version/index_version · result_count/fields_returned/hidden_count_disclosed · search_event_hash chained to previous event. Query logs are regulated: log raw only in security audit zone, hash only for normal analytics. Search telemetry can be more sensitive than search results.

AUDIT CHAIN · EVERY QUERY
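
A tamper-evident chain can be sketched as a list of events where each event hash covers the previous one. The hash function (SHA-256), the JSON canonicalisation, and the all-zero genesis value are assumptions. The field names follow the list in this stage.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first event


def _sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def append_event(chain: list, raw_query: str, **fields) -> dict:
    """Append an audit event that stores only the query hash."""
    event = {
        "query_hash": _sha256(raw_query),
        **fields,  # user, purpose, policy_version, index_version, result_count, ...
        "prev_event_hash": chain[-1]["search_event_hash"] if chain else GENESIS_HASH,
    }
    event["search_event_hash"] = _sha256(json.dumps(event, sort_keys=True))
    chain.append(event)
    return event


def verify_chain(chain: list) -> bool:
    """Recompute every hash and check each link to the previous event."""
    prev = GENESIS_HASH
    for event in chain:
        body = {k: v for k, v in event.items() if k != "search_event_hash"}
        if body["prev_event_hash"] != prev:
            return False
        if _sha256(json.dumps(body, sort_keys=True)) != event["search_event_hash"]:
            return False
        prev = event["search_event_hash"]
    return True
```

A plain hash of a short query can be guessed by hashing candidate queries, so a keyed hash such as HMAC may be a better fit where the analytics store is less trusted.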

INGEST BACK — Learn + Grow (PEMCLAU re-ingest)

Search interaction → new PEMCLAU entry (the fleet learns from every search). GID role/story/scene → Saybook entry (builds sovereign context corpus). Audit event → sorry-flow if gap found (known gaps become named sorries). External LLM response → ingested back after GID wrapper verification. Search is the fleet's fastest learning loop.

PEMCLAU INGEST · yone:6333

◈ 35 LAWS OF SOVEREIGN REGULATED SEARCH

LAW 01

The Index Is Regulated Data Too

Not just the source file. The search index, embeddings, cached snippets, highlights, thumbnails, OCR output, metadata, and query logs may all become regulated data. Protect all layers, not just the original asset.

LAW 02

Policy Before Retrieval

Never index before classification. Never search before authorization. Never return before masking. Never forget to audit. These are absolute. There are no exceptions.

LAW 03

Search = Data-Access Event

Every search produces an audit event. query → identity → policy → purpose → scope → filtered index → masked result → audit event → lineage record. This is the full pipeline, always.

LAW 04

Regulated Search Is Set Filtering

R = Q ∩ A(u) ∩ P(p) ∩ J(j). Visible results = intersection of query match, user auth, purpose allowed, jurisdiction allowed. Rank ONLY after this intersection. Never rank then filter.

LAW 05

Never Index Raw — Always Project

Source asset NEVER goes directly into the index. Compute a SearchPolicy projection first. Projection hash = H(asset_hash + schema_hash + rules_version + indexed_fields + masks). The projection is what gets indexed.

LAW 06

Snippets Are Mini-Disclosures

Mask/redact text BEFORE generating snippets. Never generate snippet from raw text then mask after. For high-risk data: no snippets by default. Metadata-only results until explicit authorization.

LAW 07

Query Logs Are Regulated Data

A query can itself be sensitive: "patient HIV status" / "fraud investigation CFO" / "child custody abuse evidence". Log query hash for analytics. Raw query only in security audit zone. Search telemetry can be more sensitive than search results.

LAW 08

Inverted Index Risk

A normal inverted index leaks regulated terms: its term dictionary stores every indexed token. Mitigations: do not index sensitive tokens / redact before indexing / tokenize before indexing / encrypt index fields / partition indexes by access domain / apply document-level security before results.

LAW 09

Existence Leakage

Even returning 0 results can leak. Facet count "3 hidden matches" leaks existence. Policy before scoring. Policy before facets. Policy before counts. Never reveal existence of a record the user cannot access.
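
Applying policy before counts means facets are computed over the authorized set only, as in this sketch. The document shape (dicts with an `id` key) and the function name `facet_counts` are illustrative.

```python
from collections import Counter


def facet_counts(matches: list[dict], authorized_ids: set, facet_field: str) -> dict:
    """Count facet values over authorized matches only."""
    visible = [doc for doc in matches if doc["id"] in authorized_ids]
    return dict(Counter(doc[facet_field] for doc in visible))
```

A facet value that occurs only in unauthorized documents does not appear at all, not even with a count of zero.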

LAW 10

Vector Search Is Regulated Too

Embeddings may encode sensitive facts. Chunks may contain PII. RAG retrieval is data access. Prompt context is data disclosure. LLM output is derived data. The model cannot leak what it never receives. Retrieval policy is the main safety boundary.

LAW 11

Silence Is First-Class Output

When L7 triggers: return structured refusal, not a 404, not an error. Refusal includes: what was asked (hashed), which layer triggered, why, what alternate scope might be authorized. Refusal IS the answer.
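
A structured refusal can be modelled as an ordinary result type carrying the four items listed above. `SilenceResult` and `make_silence` are hypothetical names, and SHA-256 is an assumed choice for hashing the query.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SilenceResult:
    query_hash: str                    # what was asked, hashed
    triggered_layer: str               # which layer triggered
    reason: str                        # why
    alternate_scopes: tuple[str, ...]  # what might be authorized instead
    status: str = "silence"            # a normal result type, not an error


def make_silence(raw_query: str, reason: str, alternate_scopes=(),
                 triggered_layer: str = "L7") -> SilenceResult:
    """Build the refusal without ever storing the raw query text."""
    digest = hashlib.sha256(raw_query.encode("utf-8")).hexdigest()
    return SilenceResult(digest, triggered_layer, reason, tuple(alternate_scopes))
```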

LAW 12

Transitive Deletion

Deleting a source asset triggers DeleteClosure(asset): source → OCR → projection → inverted index → vector index → snippet cache → autocomplete → LLM summary cache → query result cache. All derived artifacts. No orphans.
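
DeleteClosure is a reachability computation over a lineage graph. The sketch below assumes lineage is recorded as a mapping from each artifact to the artifacts derived from it. The particular edges in `LINEAGE` are made up for illustration, since the text lists the artifact kinds but not their exact parent-child links.

```python
from collections import deque


def delete_closure(asset: str, derived: dict[str, list[str]]) -> list[str]:
    """Return the asset plus everything derived from it, breadth-first."""
    order, seen, queue = [asset], {asset}, deque([asset])
    while queue:
        for child in derived.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


# Hypothetical lineage graph: parent artifact -> artifacts derived from it.
LINEAGE = {
    "source": ["ocr"],
    "ocr": ["projection"],
    "projection": ["inverted_index", "vector_index"],
    "inverted_index": ["snippet_cache", "autocomplete"],
    "vector_index": ["llm_summary_cache"],
    "snippet_cache": ["query_result_cache"],
}
```

Deleting in reverse of the returned order removes derived artifacts before their parents, which is one way to avoid leaving orphans if a deletion run is interrupted.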

LAW 13

Two-Index Pattern

Index A: safe metadata (broader access, C1–C2 queries). Index B: regulated full-text/chunks (narrow access, C3–C5 only). PEMCLAU = Index B. Merostone lattice = Index A. Separation reduces blast radius.

LAW 14

Break-Glass Search

Emergency access: requires stated reason + elevated role + time-limit + post-review commitment. Break-glass audit = full (raw query logged). Creates sorry-flow entry. The gap must be closed or justified. Not admin bypass — controlled exception.
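
The four break-glass conditions can be checked at grant time, as in this sketch. The default time limit, the review window, the role names, and the shape of the sorry-flow entry are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class BreakGlassGrant:
    user: str
    reason: str
    expires_at: datetime   # time limit on the elevated access
    review_due: datetime   # post-review commitment


def open_break_glass(user: str, role: str, reason: str, now: datetime,
                     elevated_roles: set, sorry_flow: list,
                     ttl: timedelta = timedelta(hours=1),
                     review_within: timedelta = timedelta(days=7)) -> BreakGlassGrant:
    """Grant time-limited emergency access and record the gap."""
    if role not in elevated_roles:
        raise PermissionError("break-glass requires an elevated role")
    if not reason.strip():
        raise ValueError("break-glass requires a stated reason")
    grant = BreakGlassGrant(user, reason.strip(), now + ttl, now + review_within)
    # Every break-glass use opens a sorry-flow entry to be closed or justified.
    sorry_flow.append({"user": user, "reason": grant.reason, "status": "open"})
    return grant


def is_active(grant: BreakGlassGrant, now: datetime) -> bool:
    return now < grant.expires_at
```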

LAW 15

Autocomplete Is Regulated

Autocomplete should not train on restricted terms unless the user is authorized. Typing "jo" should not suggest "John Smith HIV result." Separate autocomplete indexes by classification. Filter suggestions by user policy.
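
Separating suggestion indexes by classification and filtering by user policy might look like the sketch below. The `UNRESTRICTED` label and the function name `suggest` are hypothetical. Only the classification types named earlier come from this document.

```python
def suggest(prefix: str, terms_by_class: dict[str, list[str]],
            user_classes: set, limit: int = 5) -> list[str]:
    """Suggest completions only from indexes the user is cleared for."""
    needle = prefix.lower()
    hits = [
        term
        for cls, terms in terms_by_class.items()
        if cls in user_classes
        for term in terms
        if term.lower().startswith(needle)
    ]
    return sorted(hits)[:limit]
```

A user without PHI clearance typing "jo" never touches the PHI suggestion list, so restricted terms cannot appear by accident.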

LAW 16–35

The Complete 35-Law Corpus

Full corpus in: TRB-MEEK-SEARCH-BOXINER-001 · ARB1-MEEK-SEARCH-BOXINER-001 · arch/merchant-spear/ · fleet-wiki :9400. All 35 laws + 12 ARB decisions + Perl/Ruby mode spec + projection hash protocol + sovereign Protobuf search envelope.

ID	Decision	Verdict
D1	Search is a regulated data-access event — every query produces audit event	STRONGLY APPROVED
D2	Sovereign Protobuf search envelope with SearchPolicy field	APPROVED
D3	Never index before classification — absolute law	APPROVED
D4	12-layer regulated search ladder (complete stack L0–L11)	APPROVED
D5	Two-index pattern: metadata (A) + regulated full-text (B)	APPROVED
D6	Perl mode (C1–C2) vs Ruby mode (C3–C5) search grammar	APPROVED
D7	Local LLM context-building pipeline — sovereign first, external trial second	STRONGLY APPROVED
D8	Transitive deletion — DeleteClosure propagates through all derived artifacts	APPROVED
D9	Query logs are regulated data — hash for analytics, raw only in audit zone	APPROVED
D10	L7 Silence as structured output — refusal IS the answer	STRONGLY APPROVED
D11	Break-glass search — controlled, logged exception with sorry-flow entry	APPROVED
D12	Search projection as sovereign derived regulated data (own hash, lineage, retention)	APPROVED