<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
  <title>Juthoor · Audits &amp; Research Notes</title>
  <subtitle>Lab notebook for The Arabic Tongue: data audits, anomaly investigations, structural deltas.</subtitle>
  <link href="https://arabicjuthoor.com/" />
  <link rel="self" type="application/atom+xml" href="https://arabicjuthoor.com/feed.xml" />
  <id>https://arabicjuthoor.com/feed.xml</id>
  <updated>2026-05-24T00:00:00Z</updated>
  <author><name>Yassine Temessek</name></author>
  <rights>© Temessek for Research, Publishing &amp; Training</rights>

  <entry>
    <title>LV1, The Arabic Linguistic Genome: Complete Overview</title>
    <link href="https://arabicjuthoor.com/05-audits/lv1-overview-archived.md" />
    <id>https://arabicjuthoor.com/05-audits/lv1-overview-archived.md</id>
    <updated>2026-05-19T13:25:42Z</updated>
    <published>2026-05-19T13:25:42Z</published>
    <summary>**Project:** Juthoor Linguistic Genealogy  ·  **Module:** Juthoor-ArabicGenome-LV1  ·  **Author:** Yassine Temessek</summary>
  </entry>
  <entry>
    <title>Second attack on the "didn't work" cases — Arabic ↔ Ancient Egyptian</title>
    <link href="https://arabicjuthoor.com/05-audits/2026-05-24-aa-second-attack.md" />
    <id>https://arabicjuthoor.com/05-audits/2026-05-24-aa-second-attack.md</id>
    <updated>2026-05-24T00:00:00Z</updated>
    <published>2026-05-24T00:00:00Z</published>
    <summary>**Date:** 2026-05-24 **Pool:** the 79 entries in the Afro-Asiatic roster marked PROVISIONAL or PARALLEL-STEMS (i.e. the ones the first pass filed as "different roots, can't claim a link") **Method:** re-attack each one with the full framework — *don't accept "different roots."* For each, ask: **is there a better Arabic word than the obvious one?** (the زنجبيل → زن+جبيل move, the مَجوس → فَعول move</summary>
  </entry>
  <entry>
    <title>Rounds 3–4 · confirmed cracks + new cognates · Arabic ↔ Ancient Egyptian</title>
    <link href="https://arabicjuthoor.com/05-audits/2026-05-24-aa-round3-4-new-cognates.md" />
    <id>https://arabicjuthoor.com/05-audits/2026-05-24-aa-round3-4-new-cognates.md</id>
    <updated>2026-05-24T00:00:00Z</updated>
    <published>2026-05-24T00:00:00Z</published>
    <summary>**Date:** 2026-05-24 (same-day continuation of the [second-attack audit](2026-05-24-aa-second-attack.md)) **What this round did:** (1) confirmed and promoted the tentative cracks from the second attack, (2) re-attacked more of the hard remainder, and (3) **mined the Egyptian lexicon for entirely new cognate pairs** the original 200-entry roster never contained. **Result:** Egyptian top-three-tier </summary>
  </entry>
  <entry>
    <title>Pass-2 calibrated re-review · the 84 PROVISIONAL entries in the Afro-Asiatic roster</title>
    <link href="https://arabicjuthoor.com/05-audits/2026-05-24-aa-pass2-audit.md" />
    <id>https://arabicjuthoor.com/05-audits/2026-05-24-aa-pass2-audit.md</id>
    <updated>2026-05-24T00:00:00Z</updated>
    <published>2026-05-24T00:00:00Z</published>
    <summary>**Date:** 2026-05-24 **Pool:** the 84 PROVISIONAL entries in [`tier-a-afro-asiatic-cognates.md`](../04-cross-linguistic/tier-a-afro-asiatic-cognates.md) **Method:** the same blind-rescore methodology that took the Indo-European Pass-1 ≥0.65 pool of 770 candidates down to a calibrated 189 (see [`2026-05-21-opus-calibrated-770.md`](2026-05-21-opus-calibrated-770.md)), now applied to the AA Tier-A ro</summary>
  </entry>
  <entry>
    <title>Per-Language z-Score · Arabic ↔ IE Cognate Signal</title>
    <link href="https://arabicjuthoor.com/05-audits/2026-05-21-per-language-zscore.md" />
    <id>https://arabicjuthoor.com/05-audits/2026-05-21-per-language-zscore.md</id>
    <updated>2026-05-21T00:00:00Z</updated>
    <published>2026-05-21T00:00:00Z</published>
    <summary>**Date:** 2026-05-21  ·  **Seed:** 20260521  ·  **Permutations:** 1000</summary>
  </entry>
  <entry>
    <title>Pass 3 · can the framework predict unused four-letter words too?</title>
    <link href="https://arabicjuthoor.com/05-audits/2026-05-21-pass3-evaluation.md" />
    <id>https://arabicjuthoor.com/05-audits/2026-05-21-pass3-evaluation.md</id>
    <updated>2026-05-21T00:00:00Z</updated>
    <published>2026-05-21T00:00:00Z</published>
    <summary>The earlier sixteen-pair test confirmed the operative grammar is genuinely **generative at the binary level**: when given an unused two-letter combination, it can predict what Arabic *would* have meant there, and is right about half the time.</summary>
  </entry>
  <entry>
    <title>Re-scoring all 770 cognates · the calibrated second pass</title>
    <link href="https://arabicjuthoor.com/05-audits/2026-05-21-opus-calibrated-770.md" />
    <id>https://arabicjuthoor.com/05-audits/2026-05-21-opus-calibrated-770.md</id>
    <updated>2026-05-21T00:00:00Z</updated>
    <published>2026-05-21T00:00:00Z</published>
    <summary>Our first sweep produced 770 Arabic ↔ European cognates at score ≥ 0.65. A second look at a 150-pair sample showed the first pass was a little too generous near the borderline. Rather than adding a footnote, we **re-scored every one of the 770 pairs** under a stricter rubric — without showing the second rater the first score. The result: **189** still come out as likely (was 770), **120** as stron</summary>
  </entry>
  <entry>
    <title>Khshim's sound-substitution laws · tested on real cognate data</title>
    <link href="https://arabicjuthoor.com/05-audits/2026-05-21-khshim-laws-audit.md" />
    <id>https://arabicjuthoor.com/05-audits/2026-05-21-khshim-laws-audit.md</id>
    <updated>2026-05-21T00:00:00Z</updated>
    <published>2026-05-21T00:00:00Z</published>
    <summary>When Arabic and a European word share a meaning, their consonants rarely match letter-for-letter — they shift in predictable ways: ك becomes Q, ف becomes P, the throaty Arabic ع matches an H or a silent gap, and so on. Dr Ali Fahmi Khshim's *رحلة الكلمات* listed **nine such laws**. We took our 770 confirmed cognate pairs and counted how often each law actually fires in real data. The answer: all n</summary>
  </entry>
  <entry>
    <title>Wrong-Test Transparency Doc · Alternate-model IRR · un-calibrated</title>
    <link href="https://arabicjuthoor.com/05-audits/2026-05-21-irr-gemini-uncalibrated.md" />
    <id>https://arabicjuthoor.com/05-audits/2026-05-21-irr-gemini-uncalibrated.md</id>
    <updated>2026-05-21T00:00:00Z</updated>
    <published>2026-05-21T00:00:00Z</published>
    <summary>**Date:** 2026-05-21  ·  **Sample size:** 150 pairs (stratified random from the 770-cognate pool at score ≥ 0.65)</summary>
  </entry>
  <entry>
    <title>Inter-Rater Agreement (IRR) · Pass 1 pipeline vs Pass 2 calibrated</title>
    <link href="https://arabicjuthoor.com/05-audits/2026-05-21-inter-rater-agreement.md" />
    <id>https://arabicjuthoor.com/05-audits/2026-05-21-inter-rater-agreement.md</id>
    <updated>2026-05-21T00:00:00Z</updated>
    <published>2026-05-21T00:00:00Z</published>
    <summary>**Date:** 2026-05-21  ·  **Sample size:** 150 pairs (stratified random from the 770-cognate pool at score ≥ 0.65)</summary>
  </entry>
  <entry>
    <title>Tier-A Cross-Linguistic Skeleton-Match Audit</title>
    <link href="https://arabicjuthoor.com/05-audits/2026-05-20-tier-a-cross-language-audit.md" />
    <id>https://arabicjuthoor.com/05-audits/2026-05-20-tier-a-cross-language-audit.md</id>
    <updated>2026-05-20T00:00:00Z</updated>
    <published>2026-05-20T00:00:00Z</published>
    <summary>**Date:** 2026-05-20 **Source:** `Juthoor-CognateDiscovery-LV2/outputs/eye1_full_scale_*.jsonl` **Method:** for each of the 19 Tier-A Quranic-anchored Arabic roots, scan the per-language Eye-1 skeleton-match output for the highest-scoring target-lemma candidate. The skeleton match drops vowels and reduces both Arabic and target consonants to a comparable skeleton, then scores by Jaccard overlap an</summary>
  </entry>
  <entry>
    <title>Layer-2 v2 · Statistical Predictive Test — Robustness Audit</title>
    <link href="https://arabicjuthoor.com/05-audits/2026-05-20-stat-test-robustness.md" />
    <id>https://arabicjuthoor.com/05-audits/2026-05-20-stat-test-robustness.md</id>
    <updated>2026-05-20T00:00:00Z</updated>
    <published>2026-05-20T00:00:00Z</published>
    <summary>**Date:** 2026-05-20  ·  **Seed:** 20260520  ·  **Permutations:** 1000  ·  **Corpus:** 2285 records  ·  **Split:** A=1142, B=1143</summary>
  </entry>
  <entry>
    <title>Pass 2 Evaluation · Quadriliteral catalogue 50 → 150</title>
    <link href="https://arabicjuthoor.com/05-audits/2026-05-20-pass2-evaluation.md" />
    <id>https://arabicjuthoor.com/05-audits/2026-05-20-pass2-evaluation.md</id>
    <updated>2026-05-20T00:00:00Z</updated>
    <published>2026-05-20T00:00:00Z</published>
    <summary>P1 · Prediction: ≥ 92 of 100 readable · Pass condition: ≥ 92 · Actual: 95 of 96 attested (96/100 if you count the 2 dropped-as-unattested and 2 duplicates as un-readable; 95/96 = 99% of net new attested entries) · Verdict: 🟢 PASS. P2 · Prediction: path-mix within bands: Reduplication 15–45%, B-on-B 20–45%, T+A/stacked 20–50% · Pass condition: All thre</summary>
  </entry>
  <entry>
    <title>Cross-Branch Eye-2 Audit · Seven Indo-European Branches</title>
    <link href="https://arabicjuthoor.com/05-audits/2026-05-20-cross-branch-eye2-audit.md" />
    <id>https://arabicjuthoor.com/05-audits/2026-05-20-cross-branch-eye2-audit.md</id>
    <updated>2026-05-20T00:00:00Z</updated>
    <published>2026-05-20T00:00:00Z</published>
    <summary>Earlier Tier-A work centred on Greek and Latin. The Eye-2 pipeline has since been run across **seven Indo-European branches** — covering Hellenic, Italic, two Celtic sub-branches, and all three Germanic sub-branches. This audit consolidates the results and asks: does the Arabic ↔ IE cognate signal survive outside Greek and Latin?</summary>
  </entry>
  <entry>
    <title>Sixteen unused letter-pairs · what the framework predicts they would mean</title>
    <link href="https://arabicjuthoor.com/05-audits/2026-05-20-coherent-gaps-held-out.md" />
    <id>https://arabicjuthoor.com/05-audits/2026-05-20-coherent-gaps-held-out.md</id>
    <updated>2026-05-20T00:00:00Z</updated>
    <published>2026-05-20T00:00:00Z</published>
    <summary>Arabic could in principle make a root from any two letters out of 28. But ~140 two-letter pairs sit unused — no root in the standard lexicon starts with them. We picked **the 16 pairs the framework reads cleanly** and wrote down, in advance, what each one *should* mean if a root were ever built on it. Then we searched the classical dictionaries (Lisān al-ʿArab, Tāj al-ʿArūs) to see whether some fo</summary>
  </entry>
  <entry>
    <title>Statistical predictive test · letter-charges predict mode clustering</title>
    <link href="https://arabicjuthoor.com/05-audits/2026-05-19-statistical-predictive-test.md" />
    <id>https://arabicjuthoor.com/05-audits/2026-05-19-statistical-predictive-test.md</id>
    <updated>2026-05-19T00:00:00Z</updated>
    <published>2026-05-19T00:00:00Z</published>
    <summary>Both were registered in [`docs/PROJECT_METRICS.md`](https://github.com/ustaz-dev/Juthoor-Linguistic-Genealogy/blob/main/docs/PROJECT_METRICS.md) **before** any statistical test was run — they are pre-registered, not post-hoc:</summary>
  </entry>
  <entry>
    <title>Non-Quranic corpora extension test (2026-05-19)</title>
    <link href="https://arabicjuthoor.com/05-audits/2026-05-19-non-quranic-extension-test.md" />
    <id>https://arabicjuthoor.com/05-audits/2026-05-19-non-quranic-extension-test.md</id>
    <updated>2026-05-19T00:00:00Z</updated>
    <published>2026-05-19T00:00:00Z</published>
    <summary>The eleven-mode operative grammar was developed and tested against Quran-anchored vocabulary. Does it also read **pre-Quranic** Arabic (Jahili poetry) and **post-Quranic** Arabic (Modern Standard Arabic, including academy-coined neologisms)?</summary>
  </entry>
  <entry>
    <title>Jabal data anomalies — 4 non-nucleus entries</title>
    <link href="https://arabicjuthoor.com/05-audits/2026-05-11-jabal-data-anomalies.md" />
    <id>https://arabicjuthoor.com/05-audits/2026-05-11-jabal-data-anomalies.md</id>
    <updated>2026-05-11T00:00:00Z</updated>
    <published>2026-05-11T00:00:00Z</published>
    <summary>The Jabal lexicon contains 4 entries in the `binary_root` column that are not true binary nuclei. They are data-quality artefacts and should be filtered when computing nucleus statistics.</summary>
  </entry>
  <entry>
    <title>Deep investigation — the 4 not-earned nuclei (manual analysis)</title>
    <link href="https://arabicjuthoor.com/05-audits/2026-05-10-not-earned-nuclei-deep-investigation.md" />
    <id>https://arabicjuthoor.com/05-audits/2026-05-10-not-earned-nuclei-deep-investigation.md</id>
    <updated>2026-05-10T00:00:00Z</updated>
    <published>2026-05-10T00:00:00Z</published>
    <summary>**Date:** 2026-05-10 **Audit item:** §7.1.1 follow-up — the 4 nuclei (`ثن`, `ضه`, `ظف`, `ون`) whose composed reading from consensus charges did not earn corroboration from Jabal's triliteral family. **Method:** manual reasoning (Claude Pass 2), grounded in the vault's consensus letter charges and Jabal's recorded family for each nucleus. The earlier codex pass diagnosed three of these as "low data</summary>
  </entry>
  <entry>
    <title>Audit closure — §7.2.5: the «14-root delta» between Jabal xlsx and Juthoor's roots.jsonl</title>
    <link href="https://arabicjuthoor.com/05-audits/2026-05-09-7.2.5-fourteen-root-delta-resolved.md" />
    <id>https://arabicjuthoor.com/05-audits/2026-05-09-7.2.5-fourteen-root-delta-resolved.md</id>
    <updated>2026-05-09T00:00:00Z</updated>
    <published>2026-05-09T00:00:00Z</published>
    <summary>**Date:** 2026-05-09 **Item closed:** [`our-contributions-and-roadmap.md`](../02-architecture/our-contributions-and-roadmap.md) §7.2.5 **Original framing:** «xlsx جبل (1,924 جذرًا) ومُخرَجات Codex (1,938 جذرًا = +14)» **Verdict:** original framing was **misleading**; both numbers were row counts, not unique-root counts. The real discrepancy is much larger and traces to two distinct bugs in Juthoor</summary>
  </entry>
  <entry>
    <title>The reversibility test — structural pass</title>
    <link href="https://arabicjuthoor.com/05-audits/2026-05-09-7.1.4-anbar-golden-rule-structural.md" />
    <id>https://arabicjuthoor.com/05-audits/2026-05-09-7.1.4-anbar-golden-rule-structural.md</id>
    <updated>2026-05-09T00:00:00Z</updated>
    <published>2026-05-09T00:00:00Z</published>
    <summary>**Date:** 2026-05-09 **Hypothesis tested:** if R = (a, b, c) is a triliteral root, the reverse R' = (c, b, a) carries an inverted or related meaning to R.</summary>
  </entry>
  <entry>
    <title>The reversibility test — full graded report</title>
    <link href="https://arabicjuthoor.com/05-audits/2026-05-09-7.1.4-anbar-golden-rule-graded.md" />
    <id>https://arabicjuthoor.com/05-audits/2026-05-09-7.1.4-anbar-golden-rule-graded.md</id>
    <updated>2026-05-09T00:00:00Z</updated>
    <published>2026-05-09T00:00:00Z</published>
    <summary>**Date:** 2026-05-09 **Hypothesis tested:** if R = (a, b, c) is a triliteral root, the reverse R' = (c, b, a) carries an inverted or related meaning to R. **Method:** structural pass over 1,955 triliteral roots in Jabal's lexicon → 155 distinct reverse-permutation pairs. Each pair graded against a 4-level scale: inverse / related / adjacent / unrelated.</summary>
  </entry>
  <entry>
    <title>LV1 — Verified Data Audit &amp; Cross-Platform Reconciliation</title>
    <link href="https://arabicjuthoor.com/05-audits/2026-03-24-lv1-data-audit.md" />
    <id>https://arabicjuthoor.com/05-audits/2026-03-24-lv1-data-audit.md</id>
    <updated>2026-03-24T00:00:00Z</updated>
    <published>2026-03-24T00:00:00Z</published>
    <summary>**Date:** 2026-03-24 **Purpose:** Fact-check all claims about scholar data, letter coverage, and dataset counts against actual source files, NotebookLM extractions, and Jabal's xlsx. Reconcile discrepancies between this platform (Cowork/Claude) and the Codex platform.</summary>
  </entry>
</feed>