In Development

ScholarOS — Structured Research Execution Platform

Personal

⚡Generic AI tools applied to academic research produce fluent text with no evidence traceability, so hallucinated synthesis is structurally indistinguishable from grounded synthesis. Literature review, contradiction detection, hypothesis validation, evidence extraction, and proposal drafting each require a separate manual workflow with no shared execution model. Hypothesis stress-testing relies on the same model that generated the hypothesis: no adversarial challenge, no convergence gate, and no provenance on the resulting claim.

⚠️ScholarOS is a structured research execution platform: five capabilities delivered as a DAG-executed MCP workflow, not a chat interface. Every output is bound to source evidence. Contradiction detection runs across the full corpus, not per query. Hypothesis critique uses a bounded Hypothesis/Critic agent loop with convergence detection rather than unconstrained generation. Nine services process research artifacts with rule-based, schema-defined, reproducible logic. No service imports another service; all data flows through the orchestrator via MCP tool invocations.
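The orchestrator-mediated design can be illustrated with a minimal in-process sketch. It is not the ScholarOS implementation and does not speak the MCP protocol. It only shows the shape of the idea: tools are registered with the orchestrator, run in dependency order, and receive upstream results from the orchestrator rather than by calling each other. The `Orchestrator` class, the tool names, and the lambda bodies are all hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, List, Sequence

# Hypothetical tool signature: a function that receives the results of its
# declared dependencies and returns its own result. Tools never call each other.
Tool = Callable[[Dict[str, Any]], Any]


class Orchestrator:
    """Runs registered tools in dependency order and records an execution trace."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}
        self._deps: Dict[str, List[str]] = {}
        self.trace: List[str] = []

    def register(self, name: str, tool: Tool, deps: Sequence[str] = ()) -> None:
        self._tools[name] = tool
        self._deps[name] = list(deps)

    def run(self) -> Dict[str, Any]:
        results: Dict[str, Any] = {}
        # static_order() yields every node after all of its predecessors.
        for name in TopologicalSorter(self._deps).static_order():
            inputs = {dep: results[dep] for dep in self._deps[name]}
            results[name] = self._tools[name](inputs)
            self.trace.append(name)
        return results


# Illustrative three-step chain (step names and logic are placeholders).
orch = Orchestrator()
orch.register("extract_claims", lambda _: ["claim a", "claim b"])
orch.register(
    "normalize_metrics",
    lambda r: [c.upper() for c in r["extract_claims"]],
    deps=["extract_claims"],
)
orch.register(
    "detect_divergence",
    lambda r: len(r["normalize_metrics"]),
    deps=["normalize_metrics"],
)
results = orch.run()
```

Because each tool sees only the inputs the orchestrator hands it, any tool can be tested in isolation by passing it a plain dictionary.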

⚙️MCP Orchestrator executes workflows as DAGs with pause/resume, session management, and full trace logging. Five capabilities: Literature Mapping (HDBSCAN clustering + LLM cluster labeling + paper ranking), Contradiction & Consensus (claim extraction → metric normalization → polarity/value divergence detection → Belief Engine confidence assignment), Hypothesis & Critique (bounded Hypothesis/Critic loop, max 5 iterations, grounded to source claim identifiers), Multimodal Evidence Extraction (tables, figures, metrics from PDFs → structured output), Proposal Assistant (validated hypotheses → Markdown/LaTeX with citation assembly). Data layer: Chroma (vector), SQLite (metadata), Redis (session). Local inference: Ollama qwen2.5:32b, sentence-transformers all-MiniLM-L6-v2.
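The bounded Hypothesis/Critic loop described above could look roughly like the following sketch. The 5-iteration cap and the grounding requirement come from the text. Everything else is an assumption made for illustration: the `Hypothesis` dataclass, the `propose` and `critique` callables standing in for the two agents, and the convergence criterion (the critic returns no objections).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set, Tuple

MAX_ITERATIONS = 5  # bound stated in the text


@dataclass
class Hypothesis:
    text: str
    claim_ids: List[str]  # source claim identifiers the hypothesis cites


def is_grounded(h: Hypothesis, known_claim_ids: Set[str]) -> bool:
    """True if the hypothesis cites at least one claim and every cited claim exists."""
    return bool(h.claim_ids) and set(h.claim_ids) <= known_claim_ids


def run_loop(
    propose: Callable[[List[str]], Hypothesis],
    critique: Callable[[Hypothesis], List[str]],
    known_claim_ids: Set[str],
) -> Tuple[Optional[Hypothesis], int, bool]:
    """Alternate propose/critique until the critic is satisfied or the bound is hit.

    Returns (last hypothesis, iterations used, converged).
    """
    objections: List[str] = []
    hypothesis: Optional[Hypothesis] = None
    for iteration in range(1, MAX_ITERATIONS + 1):
        hypothesis = propose(objections)
        if not is_grounded(hypothesis, known_claim_ids):
            # Reject before critique: ungrounded output never reaches the critic.
            objections = ["ungrounded: cite only known claim identifiers"]
            continue
        objections = critique(hypothesis)
        if not objections:  # assumed convergence criterion for this sketch
            return hypothesis, iteration, True
    return hypothesis, MAX_ITERATIONS, False


# Stub agents (hypothetical) that converge on the third iteration.
def propose(objections: List[str]) -> Hypothesis:
    if not objections:
        return Hypothesis("X improves Y", ["c9"])  # c9 is not a known claim
    if "ungrounded" in objections[0]:
        return Hypothesis("X improves Y", ["c1"])
    return Hypothesis("X improves Y under Z", ["c1", "c2"])


def critique(h: Hypothesis) -> List[str]:
    return [] if "under Z" in h.text else ["missing boundary condition"]


final, used, converged = run_loop(propose, critique, {"c1", "c2"})
```

Returning the `converged` flag alongside the hypothesis lets a caller distinguish a validated hypothesis from one that merely ran out of iterations.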

🛡️100% determinism rate: identical inputs produce identical outputs, with no stochastic steps in the nine rule-based services. The services are independently testable, hold no global state, and do not import one another; all data flows through the orchestrator, which eliminates hidden state. Agent reasoning is explicitly bounded: at most 5 iterations per hypothesis loop, with required grounding to source claim identifiers. March 2026 validation: 5,479 chunks processed, 180 claims extracted, 76 contradictions detected.
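One way a determinism claim like this can be checked is to run a stage several times on the same input and compare fingerprints of the output. The sketch below assumes stage outputs are JSON-serializable. The helper names and both example stages are hypothetical, and this is not necessarily how the reported rate was measured.

```python
import hashlib
import itertools
import json
from typing import Any, Callable


def fingerprint(artifact: Any) -> str:
    """SHA-256 of a canonical JSON encoding (sorted keys, fixed separators)."""
    canonical = json.dumps(artifact, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_deterministic(stage: Callable[[Any], Any], payload: Any, runs: int = 3) -> bool:
    """Run the stage repeatedly on the same payload and compare output digests."""
    digests = {fingerprint(stage(payload)) for _ in range(runs)}
    return len(digests) == 1


# A pure stage: the output depends only on the input.
def normalize_stage(claims):
    return {"claims": sorted(c.strip().lower() for c in claims)}


# A stage with hidden state: a module-level counter leaks into the output.
_counter = itertools.count()


def leaky_stage(claims):
    return {"claims": list(claims), "call_number": next(_counter)}


pure_ok = is_deterministic(normalize_stage, [" B ", "a"])
leaky_ok = is_deterministic(leaky_stage, ["a"])
```

Canonical encoding matters here: without `sort_keys`, two semantically equal dictionaries built in different key order would produce different digests.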

🚀Five research output artifacts — ClusterMap (JSON), Contradiction Report (JSON), Validated Hypotheses (JSON), Research Proposals (Markdown · LaTeX), Extracted Evidence (CSV · JSON). Fully local and self-hostable — no external API dependency for any deterministic pipeline stage.

↳ 5,479 chunks · 180 claims · 76 contradictions detected · 100% determinism rate · fully local execution

MCP Orchestrator · DAG Execution | 9 Deterministic Services | Hypothesis · Critic Agent Loop | Evidence-Bound Outputs | Chroma · SQLite · Redis | Local-First · Self-Hostable
