Menba (مَنْبَع)
AI-Powered Academic Research

The Research Platform for Classical Scholarship

Import classical manuscripts, historical texts, and academic PDFs. AI-powered OCR, semantic search, and structured note-taking — built for serious scholars.

Desktop App · Mobile App · Offline First
27K+ annotated text blocks in database
21 languages supported for OCR
96% Arabic HTR model accuracy
Features

Everything a scholar needs

Built from the ground up for classical manuscript studies and academic research workflows.

AI-Powered PDF Import

Import any Arabic or Ottoman PDF. Our AI automatically detects headings, footnotes, structured text blocks, and commentary sections with high precision.

Manuscript OCR

Process scanned manuscripts with our custom Arabic/Ottoman HTR model trained on H100 GPUs. Supports Naskh, Talik, Riqa, Sülüs, and Divani scripts.

Semantic Search

Search across all your sources simultaneously. Find related primary texts, scholarly commentary, and cross-references with context-aware results.

Rich Note-Taking

Write structured notes with rich text, link directly to source passages, and organize by topic. Full TipTap editor with Arabic RTL support.

Source Library

Pre-loaded with classical academic texts. Add your own books, manuscripts, or academic PDFs from any source.

Fully Offline

Your data stays on your device: no cloud dependency, no privacy concerns. After initial setup, everything works completely offline, backed by a local SQLite database.
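
As a rough illustration of what local-only storage can look like, here is a minimal sketch using Python's built-in `sqlite3` module. The file name `menba_library.db`, the `blocks` table, and its columns are assumptions made for this example and are not Menba's actual schema.

```python
import sqlite3


def open_library(path: str = "menba_library.db") -> sqlite3.Connection:
    """Open (or create) a database file on the local disk; no network is involved."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS blocks (
            id     INTEGER PRIMARY KEY,
            source TEXT NOT NULL,  -- hypothetical: book or manuscript title
            kind   TEXT NOT NULL,  -- hypothetical: 'heading', 'body', 'footnote', ...
            text   TEXT NOT NULL
        )
        """
    )
    return conn


def add_block(conn: sqlite3.Connection, source: str, kind: str, text: str) -> int:
    """Store one text block and return its row id."""
    cur = conn.execute(
        "INSERT INTO blocks (source, kind, text) VALUES (?, ?, ?)",
        (source, kind, text),
    )
    conn.commit()
    return cur.lastrowid


def blocks_for(conn: sqlite3.Connection, source: str) -> list:
    """Return (kind, text) pairs for one source, in insertion order."""
    rows = conn.execute(
        "SELECT kind, text FROM blocks WHERE source = ? ORDER BY id", (source,)
    )
    return rows.fetchall()
```

Because the whole library is a single file on the device, backing it up or moving it to another machine is a plain file copy.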

How it works

From PDF to structured knowledge

Import any book, manuscript, or document and start researching in minutes.

1. Import your source

Add a PDF or image of any Arabic/Ottoman manuscript. Our AI pipeline automatically detects the document type — digital PDF, scanned book, or handwritten manuscript.

2. AI processes the structure

Our parser identifies headings, body text, footnotes, structured text blocks, and commentary sections. Scanned documents are processed by our custom Arabic/Ottoman OCR model.
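
The routing this step describes can be sketched as follows: digital PDFs go straight to the structure parser, while scanned or handwritten pages pass through OCR first. All names here (`DocumentType`, `BlockKind`, `Block`, `process`, and the `ocr` and `parse` callables) are hypothetical stand-ins for illustration, not Menba's actual pipeline.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class DocumentType(Enum):
    DIGITAL_PDF = "digital_pdf"
    SCANNED_BOOK = "scanned_book"
    HANDWRITTEN = "handwritten_manuscript"


class BlockKind(Enum):
    HEADING = "heading"
    BODY = "body"
    FOOTNOTE = "footnote"
    STRUCTURED = "structured"
    COMMENTARY = "commentary"


@dataclass
class Block:
    kind: BlockKind
    text: str


def needs_ocr(doc_type: DocumentType) -> bool:
    """Only documents without a usable text layer are sent through OCR."""
    return doc_type is not DocumentType.DIGITAL_PDF


def process(
    doc_type: DocumentType,
    pages: List[str],
    ocr: Callable[[str], str],
    parse: Callable[[str], List[Block]],
) -> List[Block]:
    """Turn pages into typed blocks, running OCR first when required.

    For a digital PDF each page is assumed to already be extracted text;
    otherwise each page is an opaque reference that `ocr` can read.
    """
    blocks: List[Block] = []
    for page in pages:
        text = ocr(page) if needs_ocr(doc_type) else page
        blocks.extend(parse(text))
    return blocks
```

Passing `ocr` and `parse` in as plain callables keeps the routing logic independent of whichever recognition model or parser sits behind them.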

3. Search and discover

Instantly search across all your imported sources. Results are ranked by relevance and grouped by source, making cross-referencing effortless.
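
One simple way to get "ranked by relevance and grouped by source" is to sort all hits by score and then bucket them per source, so the source with the best hit comes first. The sketch below assumes a hypothetical `Hit` record with a numeric `score`; how Menba actually computes relevance is not specified here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hit:
    source: str    # hypothetical: title of the book or manuscript
    passage: str   # matched text
    score: float   # higher means more relevant


def group_by_source(hits: List[Hit]) -> Dict[str, List[Hit]]:
    """Group hits per source, best-scoring first within and across groups."""
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    grouped: Dict[str, List[Hit]] = {}
    for hit in ranked:
        # dicts keep insertion order, so sources appear in order of their best hit
        grouped.setdefault(hit.source, []).append(hit)
    return grouped
```

A result list built this way lets a reader scan the strongest source first while still seeing every matching passage from it together.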

4. Take linked notes

Write notes that link directly to source passages. Build a personal knowledge base with full context preserved — ready for research papers or teaching.
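
A note that "links directly to a source passage" can be modeled as a note body plus a list of anchors, each pointing at a block and a character range inside it. This is a minimal sketch under that assumption; `PassageLink`, `Note`, and `quoted_passages` are illustrative names, not Menba's data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class PassageLink:
    block_id: int  # hypothetical id of the source text block
    start: int     # first character of the linked passage (inclusive)
    end: int       # end of the linked passage (exclusive)


@dataclass
class Note:
    body: str
    links: List[PassageLink] = field(default_factory=list)


def quoted_passages(note: Note, blocks: Dict[int, str]) -> List[str]:
    """Resolve each link to the exact text it points at, skipping missing blocks."""
    return [
        blocks[link.block_id][link.start:link.end]
        for link in note.links
        if link.block_id in blocks
    ]
```

Storing offsets rather than copied quotations keeps the note tied to its source, so the surrounding context can always be shown alongside it.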

Training Corpora

Our OCR and HTR (handwritten text recognition) models are trained on the following corpora. Some dataset names are available on request at info@menba.app.

  1. Arabic manuscript corpus, 9th–19th c. (80K+ lines)
  2. Arabic online handwriting strokes (15K+ writers)
  3. Ottoman administrative document collection (35K+ lines)
  4. Arabic word-level recognition corpus, isolated & connected
  5. Arabic calligraphy corpus, 9 scripts (50K+ lines)
  6. Farsi/Persian online handwriting collection
  7. Levantine & Anatolian manuscript archive (28K+ lines)
  8. Classical text handwriting dataset, 200 writers, word-level
  9. Open-access Arabic text ground-truth collection (40K+ lines)
  10. Historical Ottoman calligraphy stroke collection
  11. Arabic printed text corpus, classical & modern editions

In the name of God, the Most Gracious, the Most Merciful

Join the waitlist

Menba is currently in development. Be the first to know when we launch.

Request Access

Or contact us directly: info@menba.app