jama×faiss×azure
← 海报Poster 安装Install →
REV2026.05.29 LIVE
① OFFLINE · BUILD PIPELINE ② RUNTIME · RESPONSE PATH ③ ONLINE · USER ORIGIN Jama API SOURCE · 增量拉取 SOURCE · INCREMENTAL REQUIREMENTS · TESTS Azure Pipeline CRON · 每小时 CRON · HOURLY FETCH → MERGE → EMBED auto_ansible GIT · 原始 CSV GIT · RAW CSV VERSIONED · DIFF-MERGED 向量化引擎 Vectorize all-MiniLM-L6-v2 · 384d FAISS · 增量 · CPU FAISS · INCREMENTAL · CPU Azure Artifacts SHARED REGISTRY VERSIONED · IMMUTABLE Vector REST API FAISS · Cohere rerank FAISS · COHERE RERANK /search_multi · /jama/* 公网可达 · IN-MEMORY PUBLIC NET · IN-MEMORY Jama Native API DETAILS · 直连 Jama DETAILS · DIRECT TO JAMA ID → FULL ITEM 前端调用 · 不经检索 FRONTEND · BYPASSES RETRIEVAL Azure OpenAI · gpt-5 终答 · 流式 FINAL ANSWER · STREAMING finish_answer → REPLY 企业内部 · in-tenant IN-TENANT 插件展示 Render in Ext RENDER · IN-PAGE 逐字流式渲染 TOKEN-BY-TOKEN STREAM 在 Jama 对话窗口 IN JAMA CHAT PANEL Edge Ext · 提问 Edge Ext · Query FRONTEND · 用户输入 FRONTEND · USER INPUT MANIFEST V3 · IN-PAGE Azure OpenAI · gpt-5 ReAct · 多轮决策 ≤3 ReAct · ≤3 ROUNDS PICK TOOL(S) ↻ OBSERVE fetch commit embed publish ① 提问 ① ASK ② 检索请求 ② RETRIEVE ③ Item IDs ④ 取详情 ④ FETCH ⑤ 应答 ⑤ REPLY on-demand pull · 版本自检 on-demand pull · self-check
ZONE

Title

Description.

jama×faiss
×azure

Jama 需求库 转成 AI 能直接查的语义索引—— 离线侧每小时跑一次:Azure Pipeline 拉差量数据 → 用 sentence-transformers 的 all-MiniLM-L6-v2 模型(384 维、CPU 推理) 把每条 Item 转成向量 → 写进项目对应的 .faiss 文件 → 推到 Azure Artifacts。 在线侧用户在 Jama 页面里问一句,Azure OpenAI gpt-5 走 ReAct 循环(最多 3 轮)自己挑工具——做 FAISS 关键词检索、或顺着 Jama 关系图 抓子项 / 上下游 / 测试记录 / 评论——直到觉得够了再生成最终回答,逐字流回对话面板。 Turn the Jama requirements library into a semantic index AI can query directly. Offline, every hour: Azure Pipeline pulls a diff → the sentence-transformers all-MiniLM-L6-v2 model (384-dim, CPU) turns each Item into a vector → written into that project's .faiss → pushed to Azure Artifacts. Online, the user asks a question inside Jama; Azure OpenAI gpt-5 drives a ReAct loop (≤3 rounds) where the model itself picks tools — FAISS keyword search, or walking the Jama relationship graph for children / upstream-downstream / test runs / comments — until it has enough, then streams the final answer back token-by-token into the chat panel.

8strata architectural strata
1h incremental cadence
2flows offline · online
3fallbacks graceful degradation
点击图中任意节点查看其职责与数据形态;切换 §02 / §03 / §04 标签深入八层架构、流程拆解与关键机制。 Click any node in the diagram to see its role and data shape; switch §02 / §03 / §04 to dive into the eight strata, the walkthroughs, and the design mechanisms.
§ 02 · Strata

八层 体系架构 Eight Strata

L·01
Data Source数据采集层acquisitionSOURCE

全量 / 增量拉取 Jama 中的需求、用例、任务等 Item 元数据;首次全量快照,之后基于更新时间戳与版本号做差量。Full and incremental pull of Jama Items — requirements, test cases, tasks. First a full snapshot, then differential by update timestamp and version.

Jama API同步脚本sync scriptOAuth Token
L·02
Code & Data Repository代码仓储层repositorySTORE

原始 Jama CSV、同步脚本与向量化脚本均托管于 automation 组的 Git 仓库——版本化数据归档与流水线代码源合一。Raw Jama CSV, sync and vectorization scripts all live in the automation team's Git repo — versioned data archive and pipeline source-of-truth in one.

auto_ansible GitCSVscripts
L·03
Orchestration调度编排层orchestrationPIPE

每小时定时触发,串行执行 fetch → merge → embed → publish 全链路;任一环节失败保留上一版可用产物。Triggered every hour, runs the full fetch → merge → embed → publish chain serially; on failure, the previous good artifact is preserved.

Azure Pipelinecron · 1h
L·04
Vectorization向量处理层embeddingCOMPUTE

从每条 Item 抽出标题、描述、备注等核心文本,拼成一段字符串,喂给 sentence-transformers/all-MiniLM-L6-v2 模型,得到一串 384 维浮点向量——这串数字就是"这条 Item 的语义指纹",意思越接近的两条 Item,向量越像。模型是开源轻量级(~80MB),跑在 Azure VM 的 CPU 上,不调任何外部 API、不打外网。向量按 item_id 追加进项目对应的 .faiss 文件——只动新增 / 修改过的条目,不重做全量。From each Item, the title, description, and notes are concatenated into one string and fed to sentence-transformers/all-MiniLM-L6-v2, producing a 384-dim float vector — the "semantic fingerprint" of that Item. Items with similar meaning have similar vectors. The model is open-source and lightweight (~80MB), runs on the Azure VM's CPU, makes no external API calls. Vectors are appended by item_id into the project's .faiss file — only added or modified Items are touched; nothing is recomputed.

all-MiniLM-L6-v2384 dimFAISSCPU · no GPUid ↔ vector
L·05
Artifacts Repository制品仓库层artifactsSTORE

FAISS 索引 + ID 映射元数据按时间 / 版本号打标,统一归档于 Azure Artifacts;运行端按需拉取,支持回滚。FAISS index + ID-mapping metadata are tagged by time and version, archived in Azure Artifacts; runtime pulls on demand, with rollback supported.

Azure ArtifactsSemVerimmutable
L·06
Retrieval Service检索服务层retrievalRUNTIME

常驻 FastAPI 服务(Azure VM, 端口 8080),启动时自动检查本地 FAISS 是否缺失/落后 → 缺则去 Azure Artifacts 拉最新版 → 全部加载进内存。对外暴露两类端点:
检索类——/api/search(单 query,给 AI Search 模式用)、/api/search_multi(一次接 3-5 个关键短语,合并成一个候选池而不是各搜各的,再用 Cohere-rerank-v4.0-pro用户原问题整段打分——这是 ReAct 主路径)。
图导航类——/api/jama/{item, children, relationships, testruns, comments},代理 Jama 原生 API,统一软处理 404(让 LLM 看到 total:0 自然换工具,而不是报错)。
Long-running FastAPI service (Azure VM, port 8080). On start, self-checks local FAISS — pulls from Azure Artifacts if missing/stale, loads everything into memory. Exposes two endpoint families:
Retrieval/api/search (single query, used by AI Search mode); /api/search_multi takes 3-5 phrases at once and merges hits into one pool instead of querying each phrase separately, then Cohere-rerank-v4.0-pro reranks against the user's original question as a whole (this is the ReAct main path).
Graph navigation/api/jama/{item, children, relationships, testruns, comments}, proxies Jama's native API with uniform soft-404 handling (LLM sees total:0 and naturally tries another tool instead of getting an error).

FastAPI · :8080/search_multiCohere-rerank-v4.0-pro/jama/* graph nav公网可达 *public net *
L·07
Frontend Integration前端集成层frontendFRONT

Manifest V3 Edge 扩展,只注入到 jabra.jamacloud.com/*content.js 往 Jama 顶栏塞一个"Jama AI"按钮;点开后 chat.js 注入右侧栏(两种模式 tab:AI Agent 走 ReAct,AI Search 直接出 FAISS 命中);ReAct 主循环 runAgentLoop 写在 api.js 里,直接在 content script 里发 SSE 调 gpt-5——绕开 service worker,避免 MV3 那个 30 秒空闲超时把流式中断。同一条 Jama item 的详情拉取也在前端发起。Manifest V3 Edge extension, injected only into jabra.jamacloud.com/*. content.js drops a "Jama AI" button into Jama's top nav; clicking it has chat.js mount the right-side panel (two tabs: AI Agent uses ReAct, AI Search shows raw FAISS hits). The ReAct main loop runAgentLoop lives in api.js and streams SSE to gpt-5 directly from the content script — bypassing the service worker to dodge MV3's 30-second idle timeout that would otherwise kill the stream. Per-Item detail fetches are also issued from the frontend.

Manifest V3content.js / chat.js / api.jsrunAgentLoopSSE 绕开 SWSSE bypasses SW
L·08
AI ReasoningAI 推理层reasoningAI

Azure OpenAI 企业租户内的 gpt-5,承担一条用户问题的全部推理工作——既负责"想下一步该查什么",也负责"看完所有材料后怎么回答"。具体走 ReAct 循环(最多 3 轮):每轮 LLM 看到当前 messages + 7 个可用工具的说明 → 自己挑要调哪些工具(同一轮可并行调多个)→ 工具结果以 role:'tool' 灌回 → 进入下一轮。直到 LLM 自己调 finish_answer 工具,第二段 SSE 流出最终回答。跨轮自动去重已展示过的 item id(这样它可以放心地"同样的关键词、更大的 top_k"扩展搜索)。gpt-5, hosted in Azure OpenAI's enterprise tenant, does all the reasoning for one user question — both "what should I look up next" and "now that I've seen everything, how do I answer". It runs a ReAct loop (≤3 rounds): each round the LLM sees the current messages + 7 tool definitions → picks which tool(s) to call (parallel calls allowed within one round) → tool results come back as role:'tool' messages → next round begins. When the LLM calls finish_answer, a second SSE stream produces the final answer. Already-emitted item IDs are auto-deduped across rounds, so the model can safely "re-search the same phrase with a larger top_k" to expand without seeing duplicates.

Azure OpenAI gpt-5单模型 · 决策+终答one model · decisions+answer7 tools≤3 rounds跨轮 item 去重cross-round dedup
§ 03 · Walkthrough

逐步 展开 流程 Step through each flow

ModeOFFLINE · 每小时增量hourly incremental 5 stages
01
触发调度Trigger

Azure Pipeline 定时触发器拉起任务。Azure Pipeline's scheduled trigger fires the job.

每小时整点流水线被自动唤起,运行环境与凭据由 Azure 托管,无需人工介入。The pipeline wakes itself on the hour; the runtime environment and credentials are Azure-managed — no human intervention needed.

02
增量数据拉取Incremental fetch

基于更新时间戳 / 版本号差量同步。Diff-sync by update timestamp / version.

调用 Jama API,仅拉取上次成功同步以来发生新增、修改、删除的 Item,保留变更类型标记。Calls the Jama API, fetches only Items added / modified / deleted since the last successful sync, with the change-type flag preserved.

03
CSV 数据更新CSV update

合并 → 去重 → 提交至 auto_ansibleMerge → dedupe → commit to auto_ansible.

增量数据与历史 CSV 合并、去重、变更标记,自动提交进入 Git 版本流——CSV 既是数据源也是审计追踪。Incremental rows are merged with the historical CSV, deduped, and tagged, then auto-committed into the Git stream — CSV is both data source and audit trail.

04
批量文本向量化Batch vectorization

all-MiniLM-L6-v2(384 维、CPU)增量写入 FAISS。all-MiniLM-L6-v2 (384-dim, CPU) writes incrementally into FAISS.

抽取标题 / 描述 / 备注等核心文本,拼成字符串后过 sentence-transformers/all-MiniLM-L6-v2 模型得到 384 维向量,按 item_id 写回项目对应的 .faiss——只动变更条目。模型本身~80MB、跑在 Azure VM 的 CPU 上,不打外网。Title / description / notes are concatenated and fed to sentence-transformers/all-MiniLM-L6-v2 to produce a 384-dim vector, written back into the project's .faiss by item_id — only changed Items are touched. The model itself is ~80MB and runs on the Azure VM's CPU; no external calls.

05
向量制品打包上传Publish artifact

FAISS + ID 映射推至 Azure Artifacts。FAISS + ID map pushed to Azure Artifacts.

打版本标 → 推送至 Artifacts → 保留历史版本以支持回滚;记录条数日志,异常自动告警,下游零感知。Version-tag → push to Artifacts → keep history for rollback; row counts logged, exceptions alert automatically, downstream stays unaware.

§ 04 · Design Mechanisms

四组 关键机制 Four design mechanisms

M·01

FAISS 文件 版本化托管 Version-pinned FAISS storage

向量文件统一托管 Azure Artifacts,做版本化管理 + 回滚;VM 端检索服务开机自启 + 自动校验,缺失 / 落后则按需拉取。增量向量化只更新变更 Item。 Vector files are centrally stored in Azure Artifacts with versioning and rollback; the VM-side retrieval service starts on boot and self-validates — pulling on demand when missing or stale. Incremental vectorization only touches changed Items.

versionedself-healingincremental
M·02

数据链路 双向隔离 Bidirectional path isolation

离线层:Git 存原始 CSV,Artifacts 存向量文件,业务原始数据与 AI 向量数据存储分离。在线层:检索只返回 Item ID,详情由前端直连 Jama API,向量服务零详情压力。 Offline: Git holds raw CSV, Artifacts holds vector files — raw business data and AI vectors are stored separately. Online: retrieval returns only Item IDs, details flow from the frontend straight to Jama's API — the vector service carries zero detail-fetch load.

storage isolationthin retrievalfrontend fan-out
M·03 · 校正M·03 · CORRECTION

关于权限与安全边界—— Azure 虚拟机走内网 Azure 虚拟机走公网,向量检索 API 由公网可达;Azure Pipeline 与 Azure Artifacts 仍走内网私有链路,Azure OpenAI gpt-5 部署在企业自己的 tenant 里,不对外暴露——所有 LLM 流量都从公网 VM 走 HTTPS 进企业 endpoint,问题和 Jama 数据不出公司账户范围。 On security boundaries — the Azure VM uses a private network the Azure VM is on the public internet, the vector retrieval API is publicly reachable; Azure Pipeline and Azure Artifacts remain on private internal links; Azure OpenAI gpt-5 is deployed inside the company's own tenant and not exposed externally — all LLM traffic goes from the public VM over HTTPS to the enterprise endpoint, so questions and Jama data never leave the corporate account scope.

M·04

多级 容错兜底 Multi-tier fallback

① 流水线增量拉取失败 → 保留上一版 CSV / 向量文件,在线检索不受影响。② 向量服务拉 Artifacts 失败 → 降级用本地旧版 FAISS。③ Cohere rerank 调用失败 → 自动退化成"按 FAISS 向量分数排序"返回(在 cohere_rerank() 里有兜底)。④ Jama 图导航子资源 404(比如这条 itemType 没有 testruns)→ 后端软处理为 {total:0, items:[]},LLM 自然换工具,不报错也不中断流程。⑤ ReAct 第 3 轮还没收尾 → 第 4 轮强制 tool_choice:'none',让 LLM 拿现有材料生成回答,避免无限循环。 ① Pipeline incremental fetch fails → previous CSV / vector files are kept; online retrieval is unaffected. ② Vector service can't pull Artifacts → falls back to the local old FAISS. ③ Cohere rerank call fails → auto-degrades to "rank by raw FAISS similarity score" (built into cohere_rerank()). ④ Jama graph-nav sub-resource returns 404 (e.g. this itemType has no testruns) → backend soft-normalizes to {total:0, items:[]}; the LLM naturally tries another tool, no error, no flow interruption. ⑤ ReAct still hasn't finished by round 3 → round 4 forces tool_choice:'none', the LLM must produce an answer from what it has — preventing infinite loops.

previous-goodlocal-fallbackrerank-bypasssoft-404round-cap