Kwangmin Kim - MINERVA Phase C-7: Skill Composition and Dynamic Selection (Composition and Router Patterns)

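1 Composition vs. Routing: Two Decisions

Every user query requires the following two decisions.

Decision	Question it answers	Unit
Composition	How should multiple skills be combined?	A skill flow (workflow)
Routing	Which skill should be called?	A single skill selection

The two decisions interleave on every request: the router picks the first skill, that skill's output feeds the next skill (composition), the router is consulted again, and so on.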

[User query]
    ↓
[Router]   "call summarize_doc"
    ↓
[Run summarize_doc skill]
    ↓                              ← composition (sequential)
[Router]   "call extract_facts"
    ↓
[Run extract_facts skill]
    ↓
[Merge responses]

This installment is a pattern catalog for both decisions.

2 Four Composition Patterns

2.1 Sequential: the most common

def workflow_sequential(query: str, document: str) -> str:
    summary = registry.get("summarize_doc").execute(document=document)
    facts = registry.get("extract_facts").execute(text=summary)
    answer = registry.get("qa_chatbot").execute(query=query, context=facts)
    return answer

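Good fit: tasks with clearly fixed stages (summarize → extract → answer). Weakness: the steps run serially, so total latency is the sum of every skill's latency, and long chains degrade the user experience.

2.2 Parallel: independent tasks at once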

import asyncio


async def workflow_parallel(query: str, document: str) -> dict:
    # Assumes execute() is a coroutine function here; a synchronous skill
    # would need wrapping, e.g. asyncio.to_thread(skill.execute, ...).
    summary, facts, citations = await asyncio.gather(
        registry.get("summarize_doc").execute(document=document),
        registry.get("extract_facts").execute(text=document),
        registry.get("citation_check").execute(text=document),
    )
    return {"summary": summary, "facts": facts, "citations": citations}

Good fit: tasks whose results do not depend on one another. Constraint: external API rate limits and memory pressure, so cap concurrency with asyncio.Semaphore.
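The concurrency cap mentioned above can be sketched as follows. This is a minimal illustration, not MINERVA's implementation: `run_limited` and `fake_skill` are hypothetical names, and `fake_skill` merely stands in for an async `execute()` call.

```python
import asyncio


async def run_limited(coros, max_concurrency: int = 2):
    """Run awaitables concurrently, never more than max_concurrency at once."""
    sem = asyncio.Semaphore(max_concurrency)

    async def guarded(coro):
        # Each task waits for a free slot before it starts running
        async with sem:
            return await coro

    # gather returns results in the same order as its inputs
    return await asyncio.gather(*(guarded(c) for c in coros))


async def fake_skill(name: str, delay: float = 0.01) -> str:
    # Hypothetical stand-in for an async skill.execute(...) call
    await asyncio.sleep(delay)
    return f"{name}:done"


def demo() -> list:
    async def main():
        return await run_limited(
            [
                fake_skill("summarize_doc"),
                fake_skill("extract_facts"),
                fake_skill("citation_check"),
            ],
            max_concurrency=2,
        )

    return asyncio.run(main())
```

With `max_concurrency=2`, the third skill starts only after one of the first two finishes, while the result order still matches the input order.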

2.3 Conditional: branching

def workflow_conditional(query: str, doc: str):
    if has_table(doc):
        return registry.get("table_summarize").execute(document=doc)
    if doc_lang(doc) == "en" and user_lang() == "ko":
        translated = registry.get("translate").execute(text=doc)
        return registry.get("summarize_doc").execute(document=translated)
    return registry.get("summarize_doc").execute(document=doc)

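Good fit: cases where the right skill depends on characteristics of the input. Risk: a deep if tree turns into code debt; past a certain depth, switching to an LLM router is recommended.

2.4 Recursive: a skill re-invokes itself or a variant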

async def workflow_recursive(query: str, plan: list[str], state: dict):
    """C13 Plan-and-Execute pattern."""
    for step in plan:
        skill = router.select_skill(step, state)
        result = await registry.get(skill).execute(**state)
        state[skill] = result

        # Mid-run check: is a new plan needed?
        if needs_replan(state):
            plan = await registry.get("planner").execute(query=query, state=state)
            return await workflow_recursive(query, plan, state)
    return state["answer"]

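C13 multi-step planning is exactly this pattern.

Good fit: task complexity is unknown up front and the plan may change. Risk: infinite loops and runaway budget, so the C24 step quota and the C25 timeout are mandatory.

3 Four Router Patterns

3.1 Static Lookup: the simplest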

INTENT_SKILL_MAP = {
    "knowledge_lookup": "qa_chatbot",
    "summarize_request": "summarize_doc",
    "translation": "translate",
    "code_review": "code_review",
}


def static_router(intent: str) -> str:
    return INTENT_SKILL_MAP.get(intent, "qa_chatbot")

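Strengths: executes instantly and is easy to debug. Weaknesses: every new skill requires a code and mapping update, and fine-grained branching is hard to express.

3.2 Semantic Search: embedding-based

This uses the C28 registry's semantic search directly: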

def semantic_router(query: str, top_k: int = 3) -> list[str]:
    candidates = registry.find_by_intent(query, top_k=top_k)
    return [s.skill_id for s, _ in candidates]


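# Single decision
best = semantic_router(query)[0]
# Or hand the top-k to an LLM for the final choice

Strengths: new skills are discovered automatically (from description and tags), so routing updates without code changes. Weaknesses: depends on embedding accuracy and misses subtle differences in intent.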

3.3 LLM-driven: the most flexible

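import json

ROUTER_PROMPT = """
User query: {query}

Available skills (top-{k}):
{skill_descriptions}

Select the skill ID that best handles this query. Return JSON with a one-line reason:
{{"skill_id": "...", "reason": "...", "confidence": <number between 0 and 1>}}
"""


def llm_router(query: str, k: int = 5) -> dict:
    candidates = registry.find_by_intent(query, top_k=k)
    descriptions = "\n".join(f"- {s.skill_id}: {s.description}" for s, _ in candidates)
    response = llm_call(ROUTER_PROMPT.format(query=query, k=k, skill_descriptions=descriptions))
    try:
        decision = json.loads(response)
        decision["skill_id"]  # must be present for callers
        confidence = float(decision["confidence"])
    except (KeyError, TypeError, ValueError):
        # Malformed model output is treated like low confidence
        return {"skill_id": "qa_chatbot", "reason": "fallback - unparseable response", "confidence": 0.0}
    if confidence < 0.5:
        return {"skill_id": "qa_chatbot", "reason": "fallback - low confidence", "confidence": confidence}
    return decision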

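Strengths: captures fine-grained intent and picks the best of the top-k candidates.
Weaknesses:
- Cost: one LLM call per query (caching is required)
- Latency: adds 100-500 ms
- Variable reliability: a confidence-based fallback is mandatory

3.4 Hybrid: Rule + LLM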

def hybrid_router(query: str, intent: str | None) -> str:
    # 1. Rules first: unambiguous intents
    if intent and intent in INTENT_SKILL_MAP:
        return INTENT_SKILL_MAP[intent]

    # 2. Semantic search for the top-k
    candidates = registry.find_by_intent(query, top_k=5)
    if not candidates:
        return "qa_chatbot"  # nothing matched: same default as static_router

    # 3. Return immediately if the top-1 score is high enough
    top, score = candidates[0]
    if score > 0.85:
        return top.skill_id

    # 4. Borderline → let the LLM decide
    return llm_router(query, k=5)["skill_id"]

Recommended for production: rules first, with the LLM reserved for borderline cases (keep the LLM call rate to 5-20% of queries).

4 Latency vs Flexibility Trade-off

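Routing	Latency	Flexibility	Best for
Static lookup	~0 ms	Low	Clear intents, operational stability
Semantic	10-50 ms	Medium	Frequent introduction of new skills
LLM	100-500 ms	High	Complex queries, fine-grained intent
Hybrid	30-100 ms on average	High	Recommended for production (uses both)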
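The hybrid average depends on what share of queries stops at each tier. The arithmetic can be sketched as below; the per-tier costs (30 ms for semantic search, 300 ms for the LLM) and the traffic shares are illustrative assumptions chosen from within the ranges in the table, not measurements.

```python
def expected_latency_ms(
    p_rule: float,
    p_semantic: float,
    p_llm: float,
    semantic_ms: float = 30.0,  # assumed cost of one semantic search
    llm_ms: float = 300.0,      # assumed cost of one LLM routing call
) -> float:
    """Average routing latency for a hybrid router, given tier shares."""
    if abs(p_rule + p_semantic + p_llm - 1.0) > 1e-9:
        raise ValueError("tier shares must sum to 1")
    rule_cost = 0.0                      # dict lookup, effectively free
    semantic_cost = semantic_ms          # rule missed, semantic top-1 accepted
    llm_cost = semantic_ms + llm_ms      # the LLM runs after semantic search
    return p_rule * rule_cost + p_semantic * semantic_cost + p_llm * llm_cost


# 50% rules, 40% semantic, 10% LLM: 0 + 12 + 33, about 45 ms
mostly_rules = expected_latency_ms(0.5, 0.4, 0.1)
# 50% rules, 30% semantic, 20% LLM: 0 + 9 + 66, about 75 ms
more_llm = expected_latency_ms(0.5, 0.3, 0.2)
```

Under these assumptions, moving the LLM share from 10% to 20% adds roughly 30 ms to the average, which is why the LLM call rate is worth monitoring.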

5 State Management: LangGraph Integration

Complex compositions are best expressed as a LangGraph state machine:

from langgraph.graph import StateGraph, END
from typing import TypedDict, Annotated


class WorkflowState(TypedDict):
    query: str
    document: str | None
    summary: str | None
    facts: list | None
    answer: str | None
    skills_called: Annotated[list, lambda a, b: a + b]
    cost_usd: Annotated[float, lambda a, b: a + b]


def summarize_node(state: WorkflowState) -> dict:
    skill = registry.get("summarize_doc")
    summary = skill.execute(document=state["document"])
    return {"summary": summary, "skills_called": ["summarize_doc"]}


def extract_node(state: WorkflowState) -> dict:
    skill = registry.get("extract_facts")
    facts = skill.execute(text=state["summary"])
    return {"facts": facts, "skills_called": ["extract_facts"]}


def qa_node(state: WorkflowState) -> dict:
    skill = registry.get("qa_chatbot")
    answer = skill.execute(query=state["query"], context=state["facts"])
    return {"answer": answer, "skills_called": ["qa_chatbot"]}


def route_after_summarize(state: WorkflowState):
    return "extract" if state.get("summary") else END


graph = StateGraph(WorkflowState)
graph.add_node("summarize", summarize_node)
graph.add_node("extract", extract_node)
graph.add_node("qa", qa_node)
graph.set_entry_point("summarize")
graph.add_conditional_edges("summarize", route_after_summarize, {"extract": "extract", END: END})
graph.add_edge("extract", "qa")
graph.add_edge("qa", END)

State accumulates skills_called and cost_usd through their reducers, so the whole composition can be audited in one place.
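To make the reducer idea concrete, here is a plain-Python sketch of how a per-field reducer declared with `Annotated` could merge each node's partial update. It does not use LangGraph and is not how LangGraph is implemented internally; `DemoState` and `apply_update` are hypothetical names for illustration.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class DemoState(TypedDict, total=False):
    summary: str
    skills_called: Annotated[list, operator.add]  # lists are concatenated
    cost_usd: Annotated[float, operator.add]      # costs are summed


def apply_update(state: dict, update: dict, schema=DemoState) -> dict:
    """Merge one node's partial update into state, honoring field reducers."""
    hints = get_type_hints(schema, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        metadata = getattr(hints.get(key), "__metadata__", ())
        reducer = metadata[0] if metadata and callable(metadata[0]) else None
        if reducer is not None and key in merged:
            merged[key] = reducer(merged[key], value)
        else:
            merged[key] = value  # no reducer: last write wins
    return merged


state = {"skills_called": [], "cost_usd": 0.0}
state = apply_update(
    state, {"summary": "s", "skills_called": ["summarize_doc"], "cost_usd": 0.002}
)
state = apply_update(state, {"skills_called": ["extract_facts"], "cost_usd": 0.001})
```

Note that a field only accumulates if nodes actually return it: in the graph above, a node would need to include a `cost_usd` value in its returned dict for the total to grow.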

6 Integration with C24 Harnessing

The C24 guards are applied automatically at every skill call:

# Decorator chain
@with_audit
@with_quota(budget=token_budget)
@with_breaker(skill_breaker)
@with_timeout(seconds=30)
def execute_skill_safely(skill_id, **kwargs):
    skill = registry.get(skill_id)
    return skill.execute(**kwargs)

A 5-step composition passes through the guards 5 times: every step must clear the input, output, and tool guards. If any single step is blocked, the whole workflow falls back.
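The guard decorators themselves are defined in C24 and are not shown here. As a rough sketch of how two of them might look, the following implements a thread-based `with_timeout` and an in-memory `with_audit`; both bodies and the `AUDIT_LOG` list are assumptions for illustration, not the C24 code.

```python
import functools
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

AUDIT_LOG: list = []  # hypothetical in-memory audit sink


def with_timeout(seconds: float):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            pool = ThreadPoolExecutor(max_workers=1)
            future = pool.submit(fn, *args, **kwargs)
            try:
                return future.result(timeout=seconds)
            except FutureTimeout:
                # The worker thread cannot be killed; we only stop waiting for it
                raise TimeoutError(f"{fn.__name__} exceeded {seconds}s") from None
            finally:
                pool.shutdown(wait=False)
        return wrapper
    return decorator


def with_audit(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        status = "error"
        try:
            result = fn(*args, **kwargs)
            status = "ok"
            return result
        finally:
            # Runs on success and on failure, so blocked calls are recorded too
            AUDIT_LOG.append(
                {"fn": fn.__name__, "status": status,
                 "elapsed_s": time.perf_counter() - start}
            )
    return wrapper


@with_audit                  # outermost: also sees failures raised by inner guards
@with_timeout(seconds=0.05)
def slow_skill(delay: float) -> str:
    time.sleep(delay)
    return "done"
```

Decorator order matters: because `with_audit` wraps everything else, a timeout raised by an inner guard still produces an audit record.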

7 Preventing Recursive Runaway

class RecursionLimit:
    def __init__(self, max_depth: int = 5, max_calls: int = 50):
        self.max_depth = max_depth
        self.max_calls = max_calls
        self.depth = 0
        self.call_count = 0
        self.call_history = []

    def enter(self, skill_id: str):
        self.depth += 1
        self.call_count += 1
        self.call_history.append(skill_id)

        if self.depth > self.max_depth:
            raise RecursionError(f"depth {self.depth} > {self.max_depth}")
        if self.call_count > self.max_calls:
            raise RecursionError(f"calls {self.call_count} > {self.max_calls}")

        # Same skill 5 times in a row → suspected infinite loop
        if self.call_history[-5:].count(skill_id) >= 5:
            raise RecursionError(f"loop suspected: {skill_id}")

    def exit(self):
        self.depth -= 1

LangGraph's recursion_limit option has a similar effect; adopt it as an additional layer.
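A guard like this only works if every `enter()` is matched by an `exit()`, including when a skill raises. One way to guarantee the pairing is a context manager, sketched below. `DepthGuard` is a deliberately tiny stand-in with the same `enter()`/`exit()` interface as the class above, and `guarded_call` and `nested` are hypothetical helpers.

```python
from contextlib import contextmanager


class DepthGuard:
    """Tiny stand-in exposing the same enter()/exit() interface."""

    def __init__(self, max_depth: int = 3):
        self.max_depth = max_depth
        self.depth = 0

    def enter(self, skill_id: str) -> None:
        # Check before incrementing so a rejected call leaves depth unchanged
        if self.depth + 1 > self.max_depth:
            raise RecursionError(
                f"depth {self.depth + 1} > {self.max_depth} at {skill_id}"
            )
        self.depth += 1

    def exit(self) -> None:
        self.depth -= 1


@contextmanager
def guarded_call(limit, skill_id: str):
    limit.enter(skill_id)  # may raise before the skill runs
    try:
        yield
    finally:
        limit.exit()       # always unwind, even if the skill raises


def nested(limit, levels: int) -> int:
    """Simulate a skill that re-invokes itself `levels` times."""
    if levels == 0:
        return 0
    with guarded_call(limit, "self_correct"):
        return 1 + nested(limit, levels - 1)
```

After a rejected call the depth counter returns to zero, so the same guard instance can be reused for the fallback path.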

8 Common Pitfalls

8.1 Chain Explosion

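When the LLM router decides the next skill at every step, a 5-step plan costs 5 LLM calls, so latency and cost balloon.

Fixes:
- Generate the plan up front (C13) and execute it in one pass, so routing costs a single LLM call
- Or use a router cache: the same (intent, state) maps to the same next skill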
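The router cache option might look like the sketch below. The key design is an assumption: it fingerprints only which state fields are filled, on the theory that this is what drives the next-skill choice. `RouterCache` and `fake_llm_router` are hypothetical names.

```python
import hashlib
import json


class RouterCache:
    """Memoize next-skill decisions keyed by (intent, routing-relevant state)."""

    def __init__(self, decide):
        self._decide = decide  # the expensive router, e.g. an LLM call
        self._store = {}
        self.misses = 0

    @staticmethod
    def _key(intent: str, state: dict) -> str:
        # Assumption: only *which* fields are filled matters, not their values
        filled = sorted(k for k, v in state.items() if v is not None)
        raw = json.dumps({"intent": intent, "filled": filled})
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def next_skill(self, intent: str, state: dict) -> str:
        key = self._key(intent, state)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._decide(intent, state)
        return self._store[key]


def fake_llm_router(intent: str, state: dict) -> str:
    # Hypothetical stand-in for an expensive LLM routing call
    return "extract_facts" if state.get("summary") is not None else "summarize_doc"
```

If routing also depends on field contents, for example document language, those values would need to be part of the key; too coarse a key silently returns stale decisions.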

8.2 Recursive Runaway

A → B → A → B → … loops forever (especially with self-correction skills).

Fixes:
- Recursion limit plus step quota
- Force termination when the same skill runs 5 times in a row
- LangGraph's recursion_limit option

8.3 Semantic Mismatch

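Semantic search picks the wrong skill because of wording differences in the descriptions. Example: "organize the PDF" routes to extract_facts instead of summarize_doc.

Fixes:
- A description-writing guide: state explicitly when the skill is meant to be used
- Learn from user feedback ("Was this skill a good fit?") to fine-tune the embeddings
- Let an LLM decide among the top-k (the hybrid pattern)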

8.4 State Explosion

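Storing every intermediate result in the LangGraph State inflates memory use and checkpoint cost.

Fixes:
- Keep only essential fields in State (summary, answer)
- Store large intermediate data in external storage such as S3 and keep only its ID in state
- TTL: delete automatically after a set time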
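The "ID in state, payload elsewhere" idea can be sketched with an in-memory store standing in for S3 or a similar service. `BlobStore` and the `document_ref` field name are assumptions for illustration.

```python
import hashlib
import time
from typing import Optional


class BlobStore:
    """In-memory stand-in for external storage with a per-entry TTL."""

    def __init__(self, ttl_seconds: float = 3600.0):
        self.ttl_seconds = ttl_seconds
        self._data = {}

    def put(self, payload: str) -> str:
        # A content hash as the ID means identical payloads share one entry
        blob_id = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]
        self._data[blob_id] = (time.monotonic() + self.ttl_seconds, payload)
        return blob_id

    def get(self, blob_id: str) -> Optional[str]:
        entry = self._data.get(blob_id)
        if entry is None:
            return None
        expires_at, payload = entry
        if time.monotonic() >= expires_at:
            del self._data[blob_id]  # lazily evict expired entries
            return None
        return payload


store = BlobStore(ttl_seconds=60)
big_document = "lorem ipsum " * 10_000
# State carries a 16-character reference instead of the full document
state = {"summary": "short summary", "document_ref": store.put(big_document)}
```

Checkpoints then serialize only the short reference, and a node that needs the full text calls `store.get(state["document_ref"])`, handling `None` if the entry has expired.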

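8.5 Unbounded Composition Branching

The if tree of a Conditional composition deepens until it has 50 branches; the code becomes unreadable and the test burden explodes.

Fixes:
- Switch to a router pattern once depth exceeds 3-4
- Express branches as data (a rule table) so the code only performs lookups
- Test the rules separately (per decision tree)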
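As a sketch of "branches as data", the conditional workflow from section 2.3 could be rewritten as an ordered rule table. The `Rule` dataclass, the feature-dict keys, and `select_pipeline` are hypothetical; the skill names come from the earlier example.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    matches: Callable[[dict], bool]  # predicate over extracted input features
    pipeline: Tuple[str, ...]        # skills to run, in order


# Ordered rule table: the first matching rule wins
RULES = (
    Rule("table", lambda f: f.get("has_table", False), ("table_summarize",)),
    Rule(
        "en_to_ko",
        lambda f: f.get("doc_lang") == "en" and f.get("user_lang") == "ko",
        ("translate", "summarize_doc"),
    ),
)
DEFAULT_PIPELINE = ("summarize_doc",)


def select_pipeline(features: dict) -> Tuple[str, ...]:
    """Pure lookup: adding a branch means adding a row, not an if statement."""
    for rule in RULES:
        if rule.matches(features):
            return rule.pipeline
    return DEFAULT_PIPELINE
```

Because `select_pipeline` is a pure function of the feature dict, each rule can be unit-tested in isolation without executing any skill.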

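8.6 Caching Is Hard

Caching a composition's result needs a key built from every input plus every skill version, so the key space explodes.

Fixes:
- Cache per step (each skill's result only)
- Determinism: fix the random seed and use the same model and prompt
- Cache only the steps where it is meaningful (summaries yes, final answers no)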
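A per-step cache could be keyed on the skill ID, its version, and its own inputs, as in this sketch. `step_cache_key`, `StepCache`, and the `CACHEABLE` set are hypothetical, and the version strings are made up for the example.

```python
import hashlib
import json


def step_cache_key(skill_id: str, skill_version: str, inputs: dict) -> str:
    """Stable key for one skill call; independent of dict insertion order."""
    payload = json.dumps(
        {"skill": skill_id, "version": skill_version, "inputs": inputs},
        sort_keys=True,
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class StepCache:
    # Assumption: intermediate steps are cached, final answers are not
    CACHEABLE = {"summarize_doc", "extract_facts"}

    def __init__(self):
        self._store = {}
        self.hits = 0

    def run(self, skill_id: str, version: str, inputs: dict, execute):
        if skill_id not in self.CACHEABLE:
            return execute(**inputs)
        key = step_cache_key(skill_id, version, inputs)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        result = execute(**inputs)
        self._store[key] = result
        return result
```

Each key covers only one skill's inputs and version, so bumping one skill's version invalidates that skill's entries without touching the others.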

8.7 Permission Creep via Composition

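Skill A allows read only; skill B allows write. In an A→B chain, B processes A's output uncritically, which bypasses the write restriction.

Fixes:
- Run the C24 guards at every step, checking the caller's (agent's) permissions, the skill's permissions, and the tool's permissions
- Declare permissions on the composition itself (allowed_compositions: [A→B])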
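A composition-level permission check could be sketched as below. The permission sets, the `update_record` skill, and the `check_chain` helper are all hypothetical; only the `allowed_compositions` idea comes from the text above.

```python
# Hypothetical permission data for illustration
SKILL_PERMISSIONS = {
    "summarize_doc": {"read"},
    "extract_facts": {"read"},
    "qa_chatbot": {"read"},
    "update_record": {"read", "write"},  # hypothetical write-capable skill
}

# Explicit allowlist of (from_skill, to_skill) edges
ALLOWED_COMPOSITIONS = {
    ("summarize_doc", "extract_facts"),
    ("extract_facts", "qa_chatbot"),
}


def check_chain(chain: list, caller_permissions: set) -> None:
    """Raise PermissionError unless every step and every edge is allowed."""
    for skill_id in chain:
        # The caller must hold every permission the skill itself needs
        missing = SKILL_PERMISSIONS[skill_id] - caller_permissions
        if missing:
            raise PermissionError(f"{skill_id} needs {sorted(missing)}")
    for edge in zip(chain, chain[1:]):
        # Even with sufficient permissions, the edge must be declared
        if edge not in ALLOWED_COMPOSITIONS:
            raise PermissionError(f"composition {edge[0]} -> {edge[1]} not allowed")
```

The second loop is what closes the gap described above: a caller with write permission still cannot feed a read-only skill's output into a write skill unless that edge was declared.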

9 Applying This to MINERVA

app/skills/composition/
├── sequential.py            # async chain
├── parallel.py               # asyncio.gather + Semaphore
├── conditional.py            # if/branch helpers
├── recursive.py              # depth and call limits
└── langgraph_builder.py      # auto-generates State + nodes

app/skills/router/
├── static.py                # INTENT_SKILL_MAP
├── semantic.py               # registry.find_by_intent wrapper
├── llm.py                    # LLM router + caching
├── hybrid.py                 # rule + LLM
└── recursion_limit.py        # safety guard

scripts/
├── workflow_test.py         # composition pattern golden eval
├── router_eval.py            # routing accuracy evaluation
└── chain_audit.py            # production chain distribution statistics

It all sits naturally on top of the C28 registry and the C24 guards.

10 Summary

Area	Key point
Four composition patterns	Sequential, Parallel, Conditional, Recursive
Four router patterns	Static, Semantic, LLM, Hybrid (Hybrid recommended for production)
Trade-off	Latency from ~0 ms (static) to 500 ms (LLM) vs. flexibility
State management	LangGraph State + reducers (accumulating skills_called and cost)
C24 integration	Audit, quota, breaker, and timeout applied automatically at every step
Recursion safety	Depth + call count + consecutive same-skill detection
Pitfalls	Chain explosion, runaway, semantic mismatch, state explosion, branching, caching, permission creep

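11 Use Cases

Scenario	Recommended pattern
Simple question answering	Static router → single skill
New skills introduced often	Semantic router (no code changes)
Complex queries, fine-grained intent	LLM router (the LLM stage of Hybrid)
Multi-stage work (summarize → extract → answer)	Sequential composition
Independent analyses (summary and citations at once)	Parallel composition
Varied inputs (PDF, HTML, images)	Conditional composition
Unknown task complexity	Recursive (Plan-and-Execute)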

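12 Related Topics

Prerequisites

C27 Skill Definition: the unit of composition and routing
C28 Skill Registry: its semantic search is used directly
Part 13 LangGraph Basics: the state-machine foundation
Part 20 Plan-and-Execute: an example of recursive composition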

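18-LangGraph series cross-reference

#18 Scalability and Limits of Skill Routing: trade-offs among the four router patterns
#25 Dynamic System Prompt Injection Architecture: prompt design for the LLM router
#10 Prompt Classification and Routing: theory of intent → skill mapping

Follow-up (Phase C-7)

C30 Skill Testing and Quality Gates: regression tests for composition and routers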

Cross-reference

C24 Harnessing: guards at every composition step
C25 Execution Control: recursion and timeout safety
C16 Bandit: a learning layer on top of the four router patterns