audit-case-rag
v0.1.0Local-first, event-driven RAG for commercial real estate audit & investigation case folders. Index a case directory named like "项目问题编号标题" (with stage subfolders such as 01_policy_basis/02_process/04_settlement_payment) and query it with citations (file:// links + PDF #page). Use when you need to org...
Installation
audit-case-rag
This skill packages a local-only workflow to build a searchable evidence index for a single audit/investigation case and query it with page-level citations.
Workflow
0) Prepare a case folder (事件驱动)
Create a case directory named:
- <项目问题编号>__<标题>
Inside, use stage folders (stage is inferred from folder name):
- 01_policy_basis/ (basis) — 制度/流程/授权
- 02_process/ (process) — 招采/定标/过程证据
- 03_contract/ (contract) — 合同/补充协议
- 04_settlement_payment/ (payment) — 结算/付款/发票/验收
- 05_comm/ (comm) — 邮件/会议纪要/IM
- 06_interviews/ (interview) — 访谈/笔录/询证
- 07_workpapers/ (workpaper) — 底稿/抽样/复核表
- 09_rectification/ (rectification) — 整改/闭环
Full template: references/case-folder-template.md
1) Install dependencies (local)
From the skill folder (or copy the script into your repo):
python3 -m venv .venv
source .venv/bin/activate
pip install -r scripts/requirements.txt
LibreOffice is recommended for Office→PDF page citations:
- soffice must be available (PATH) or pass --soffice /path/to/soffice.
2) Index the case
./scripts/audit_case_rag.py index
--case-dir "/path/to/<项目问题编号>__<标题>"
--out-dir "/path/to/audit_rag_db"
Outputs:
- manifest.jsonl written into the case directory
- audit_rag_db/<case_id>.joblib (persistent local index)
3) Query with event filters
./scripts/audit_case_rag.py query
--case "<项目问题编号>"
--stage payment
"付款节点是否倒挂?请给出处页码"
Notes:
- Evidence lines include clickable file://...#page=N citations when possible.
- Retrieval is hybrid: embedding recall + TF‑IDF rerank (alpha configurable).
Safety/Privacy
- No cloud APIs. Everything runs locally.
- Do not commit outputs (indices, converted PDFs) to git.