youlbot

shinalok/youlbot

Commit Graph

Author

SHA1

Message

Date

shinalok

bdb6fd83c4

Fix RAGAS eval: increase timeout for local LLM, safe score extraction

- RunConfig(timeout=600, max_workers=1): local Qwen3 needs more than 60s/call
- Extract scores from df.mean() instead of result[key] to handle NaN safely

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-06-01 19:41:32 +09:00
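
The NaN-safe score extraction described in commit bdb6fd83c4 can be illustrated with a minimal sketch. This is not the code from eval/run_ragas.py: it uses plain Python instead of the pandas df.mean() named in the commit (which skips NaN values by default), and the names mean_scores and rows, along with the sample values, are hypothetical.

```python
import math


def mean_scores(rows, metrics):
    """Average each metric over the rows, skipping missing or NaN values.

    Returns None for a metric that has no usable value at all, so the caller
    never hits a KeyError or propagates NaN into a report.
    """
    means = {}
    for name in metrics:
        values = [
            row[name]
            for row in rows
            if isinstance(row.get(name), (int, float)) and not math.isnan(row[name])
        ]
        means[name] = sum(values) / len(values) if values else None
    return means


# Hypothetical per-sample results: one faithfulness score failed and came back as NaN.
rows = [
    {"faithfulness": 0.8, "answer_relevancy": 0.9},
    {"faithfulness": float("nan"), "answer_relevancy": 0.7},
]
scores = mean_scores(rows, ["faithfulness", "answer_relevancy", "context_recall"])
```

Averaging over the per-sample rows means a single failed or timed-out call lowers the sample count for that metric instead of turning the whole aggregate into NaN.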

shinalok

a2dff825ad

Fix: use Container class (not container instance) in eval script

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-06-01 17:43:51 +09:00

shinalok

3faf8b09ce

Phase 20: RAGAS evaluation suite

- eval/run_ragas.py: collect contexts (RetrieverService) + answers (/chat API),
  evaluate with faithfulness / answer_relevancy / context_recall / context_precision
- eval/dataset.jsonl: 5 Korean Q&A pairs for initial evaluation
- eval/requirements.txt: ragas==0.2.9, datasets, langchain-google-vertexai
- Evaluator LLM priority: OpenAI > Anthropic > local Qwen3
- Runtime shim for ragas 0.2 / langchain-community 0.4+ vertexai incompatibility

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-06-01 17:11:00 +09:00
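
The evaluator priority listed in commit 3faf8b09ce (OpenAI > Anthropic > local Qwen3) could be expressed roughly as below. This is a sketch under assumptions: the commit does not say how availability is detected, so the environment variable names OPENAI_API_KEY and ANTHROPIC_API_KEY, the function name pick_evaluator, and the returned labels are all hypothetical.

```python
def pick_evaluator(env):
    """Choose an evaluator backend in the order OpenAI > Anthropic > local Qwen3.

    `env` is any mapping of settings, for example os.environ.
    The key names checked here are assumptions, not taken from the repository.
    """
    if env.get("OPENAI_API_KEY"):
        return "openai"
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    # No hosted API key configured: fall back to the local model.
    return "local-qwen3"


only_anthropic = pick_evaluator({"ANTHROPIC_API_KEY": "example-key"})
both_keys = pick_evaluator({"OPENAI_API_KEY": "a", "ANTHROPIC_API_KEY": "b"})
no_keys = pick_evaluator({})
```

Passing the settings in as a mapping, rather than reading os.environ inside the function, keeps the selection logic easy to test without touching the real environment.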

3 Commits