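1 What is Azure Container Apps?
Azure Container Apps is a serverless container platform that runs containers without requiring you to manage Kubernetes yourself.
Key features:
- Built on Kubernetes, with no cluster management required
- Automatic HTTPS ingress configuration
- Dapr integration
- KEDA-based autoscaling
- Revision management (blue-green deployment)
Comparison with Azure Functions: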
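| Item | Functions | Container Apps |
|---|---|---|
| Execution time | Up to 10 minutes (Consumption plan) | Unlimited |
| Cold start | Roughly 3-5 s | Roughly 1-2 s |
| Customization | Limited | Full control |
| WebSocket | Not supported | Supported |
| Minimum instances | 0 (Consumption) | 0 or more (configurable) |
| Cost | Based on execution time | Based on vCPU/memory time |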
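Recommended scenarios for Container Apps:
- Execution time longer than 10 minutes
- WebSocket or gRPC is required
- Complex dependencies
- Continuously running background work
- Multi-container deployments (sidecar pattern)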
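The criteria above can be condensed into a small decision helper. This is only a sketch of the checklist in this section: the `Workload` fields and the `recommend_platform` function are hypothetical names made up for illustration, and the 10-minute threshold is the Functions Consumption-plan limit from the comparison table.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a workload (field names are illustrative)."""
    max_runtime_minutes: float
    needs_websocket_or_grpc: bool = False
    has_complex_dependencies: bool = False
    runs_continuously: bool = False
    needs_sidecar: bool = False


def recommend_platform(w: Workload) -> str:
    """Return "Container Apps" if any criterion from the checklist applies."""
    reasons = [
        w.max_runtime_minutes > 10,  # exceeds the 10-minute limit in the table
        w.needs_websocket_or_grpc,
        w.has_complex_dependencies,
        w.runs_continuously,
        w.needs_sidecar,
    ]
    return "Container Apps" if any(reasons) else "Functions"


short_job = Workload(max_runtime_minutes=2)
streaming_api = Workload(max_runtime_minutes=1, needs_websocket_or_grpc=True)
print(recommend_platform(short_job))
print(recommend_platform(streaming_api))
```

A real decision would also weigh cost and operational preferences; the helper only mirrors the five bullet points.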
2 Environment setup
2.1 Installing Docker
Windows:
macOS:
Linux:
2.2 Azure CLI extension
3 Writing the Dockerfile
3.1 Project structure
rag-container/
├── app/
│   ├── __init__.py
│   ├── main.py
│   └── rag.py
├── Dockerfile
├── requirements.txt
└── .dockerignore
3.2 requirements.txt
3.3 app/main.py
import os

from fastapi import FastAPI, HTTPException
from fastapi.responses import JSONResponse, StreamingResponse
from pydantic import BaseModel

from app.rag import RAGSystem

app = FastAPI(title="RAG API", version="1.0.0")

# Initialize the RAG system
rag = RAGSystem()


class QueryRequest(BaseModel):
    question: str
    top_k: int = 3


class QueryResponse(BaseModel):
    question: str
    answer: str
    sources: list[str]


@app.get("/")
async def root():
    """Health check"""
    return {"status": "healthy", "service": "RAG API"}


@app.get("/health")
async def health():
    """Health probe endpoint"""
    try:
        # Check component status
        if rag.is_initialized():
            return {"status": "healthy"}
        # Return 503 so that probes (for example curl -f) see the app as not ready
        return JSONResponse(status_code=503, content={"status": "initializing"})
    except Exception as e:
        return JSONResponse(
            status_code=503, content={"status": "unhealthy", "error": str(e)}
        )


@app.post("/query", response_model=QueryResponse)
async def query(request: QueryRequest):
    """RAG query endpoint"""
    try:
        result = rag.query(
            question=request.question,
            top_k=request.top_k,
        )
        return QueryResponse(
            question=request.question,
            answer=result["answer"],
            sources=result["sources"],
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))


@app.post("/stream")
async def stream_query(request: QueryRequest):
    """Streaming response (server-sent events)"""

    async def generate():
        async for chunk in rag.stream_query(request.question):
            yield f"data: {chunk}\n\n"

    return StreamingResponse(generate(), media_type="text/event-stream")
3.4 app/rag.py
import os

from langchain_openai import AzureOpenAIEmbeddings, AzureChatOpenAI
from langchain_community.vectorstores.azuresearch import AzureSearch
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough


def format_docs(docs):
    """Join the retrieved documents into a single context string."""
    return "\n\n".join(doc.page_content for doc in docs)


class RAGSystem:
    def __init__(self):
        self._initialized = False
        self._init_components()

    def _init_components(self):
        """Initialize the RAG components"""
        # Embeddings
        self.embeddings = AzureOpenAIEmbeddings(
            azure_deployment="text-embedding-3-small",
            openai_api_version=os.getenv("AZURE_OPENAI_API_VERSION"),
            azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
            api_key=os.getenv("AZURE_OPENAI_API_KEY"),
        )
        # Vector store
        self.vector_store = AzureSearch(
            azure_search_endpoint=os.getenv("AZURE_SEARCH_ENDPOINT"),
            azure_search_key=os.getenv("AZURE_SEARCH_API_KEY"),
            index_name=os.getenv("AZURE_SEARCH_INDEX_NAME"),
            embedding_function=self.embeddings.embed_query,
        )
        # LLM
        self.llm = AzureChatOpenAI(
            azure_deployment=os.getenv("AZURE_OPENAI_DEPLOYMENT"),
            openai_api_version=os.getenv("AZURE_OPENAI_API_VERSION"),
            azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
            api_key=os.getenv("AZURE_OPENAI_API_KEY"),
            temperature=0,
        )
        # Retriever (fixed k=3, used by the streaming chain)
        self.retriever = self.vector_store.as_retriever(
            search_kwargs={"k": 3}
        )
        # Prompt
        prompt = ChatPromptTemplate.from_template(
            """Answer the question using the following context.
Context:
{context}
Question: {question}
Answer:"""
        )
        # Answer chain: takes {"context", "question"} and returns a string
        self.answer_chain = prompt | self.llm | StrOutputParser()
        # RAG chain: retrieves the context, then answers
        self.rag_chain = (
            {"context": self.retriever | format_docs, "question": RunnablePassthrough()}
            | self.answer_chain
        )
        self._initialized = True

    def is_initialized(self) -> bool:
        return self._initialized

    def query(self, question: str, top_k: int = 3) -> dict:
        """Run a RAG query"""
        # Retrieve documents once, honoring top_k
        docs = self.vector_store.similarity_search(question, k=top_k)
        # Generate the answer from the same documents that are reported as sources
        answer = self.answer_chain.invoke(
            {"context": format_docs(docs), "question": question}
        )
        # Extract sources
        sources = [doc.metadata.get("source", "Unknown") for doc in docs]
        return {
            "answer": answer,
            "sources": sources,
        }

    async def stream_query(self, question: str):
        """Streaming query"""
        async for chunk in self.rag_chain.astream(question):
            yield chunk
3.5 Dockerfile
FROM python:3.11-slim
# Working directory
WORKDIR /app
# Install system packages (curl is used by the HEALTHCHECK below)
RUN apt-get update && apt-get install -y \
    build-essential \
    curl \
    && rm -rf /var/lib/apt/lists/*
# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy the application code
COPY ./app ./app
# Expose the port
EXPOSE 8000
# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1
# Run the application
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
3.6 .dockerignore
__pycache__
*.pyc
*.pyo
*.pyd
.Python
env/
venv/
.venv/
.git
.gitignore
.dockerignore
Dockerfile
README.md
.env
4 Local testing
4.1 Docker build and run
# Build the image
docker build -t rag-api:latest .
# Run locally
docker run -p 8000:8000 \
    -e AZURE_OPENAI_ENDPOINT="https://openai-rag.openai.azure.com/" \
    -e AZURE_OPENAI_API_KEY="your-key" \
    -e AZURE_OPENAI_DEPLOYMENT="gpt-4o" \
    -e AZURE_OPENAI_API_VERSION="2024-02-01" \
    -e AZURE_SEARCH_ENDPOINT="https://search-rag.search.windows.net" \
    -e AZURE_SEARCH_API_KEY="your-key" \
    -e AZURE_SEARCH_INDEX_NAME="rag-documents" \
    rag-api:latest
# Health check
curl http://localhost:8000/health
# Query test
curl -X POST http://localhost:8000/query \
    -H "Content-Type: application/json" \
    -d '{"question": "What is Azure AI Search?"}'
4.2 docker-compose.yml (for development)
version: '3.8'
services:
  rag-api:
    build: .
    ports:
      - "8000:8000"
    environment:
      - AZURE_OPENAI_ENDPOINT=${AZURE_OPENAI_ENDPOINT}
      - AZURE_OPENAI_API_KEY=${AZURE_OPENAI_API_KEY}
      - AZURE_OPENAI_DEPLOYMENT=${AZURE_OPENAI_DEPLOYMENT}
      - AZURE_OPENAI_API_VERSION=2024-02-01
      - AZURE_SEARCH_ENDPOINT=${AZURE_SEARCH_ENDPOINT}
      - AZURE_SEARCH_API_KEY=${AZURE_SEARCH_API_KEY}
      - AZURE_SEARCH_INDEX_NAME=${AZURE_SEARCH_INDEX_NAME}
    volumes:
      - ./app:/app/app
    command: uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
Run:
5 Azure Container Registry
5.1 Creating the ACR
5.2 Pushing the image
# Tag the image
docker tag rag-api:latest acrragprod.azurecr.io/rag-api:latest
docker tag rag-api:latest acrragprod.azurecr.io/rag-api:v1.0.0
# Push the image
docker push acrragprod.azurecr.io/rag-api:latest
docker push acrragprod.azurecr.io/rag-api:v1.0.0
# Verify the image
az acr repository list --name acrragprod --output table
az acr repository show-tags --name acrragprod --repository rag-api --output table
6 Container Apps Environment
6.1 Creating the environment
# Create a Log Analytics workspace
az monitor log-analytics workspace create \
    --resource-group rg-rag-prod \
    --workspace-name law-rag-prod \
    --location koreacentral
# Look up the workspace ID and key
WORKSPACE_ID=$(az monitor log-analytics workspace show \
    --resource-group rg-rag-prod \
    --workspace-name law-rag-prod \
    --query customerId -o tsv)
WORKSPACE_KEY=$(az monitor log-analytics workspace get-shared-keys \
    --resource-group rg-rag-prod \
    --workspace-name law-rag-prod \
    --query primarySharedKey -o tsv)
# Create the Container Apps environment
az containerapp env create \
    --name cae-rag-prod \
    --resource-group rg-rag-prod \
    --location koreacentral \
    --logs-workspace-id "$WORKSPACE_ID" \
    --logs-workspace-key "$WORKSPACE_KEY"
7 Deploying the Container App
7.1 ACR integration
7.2 Creating the Container App
# Secrets are defined with --secrets and referenced from --env-vars with the secretref: prefix
az containerapp create \
    --name ca-rag-api \
    --resource-group rg-rag-prod \
    --environment cae-rag-prod \
    --image acrragprod.azurecr.io/rag-api:latest \
    --target-port 8000 \
    --ingress external \
    --registry-server acrragprod.azurecr.io \
    --registry-username "$ACR_USERNAME" \
    --registry-password "$ACR_PASSWORD" \
    --cpu 1.0 \
    --memory 2.0Gi \
    --min-replicas 0 \
    --max-replicas 10 \
    --secrets \
        azure-openai-key="your-openai-key" \
        azure-search-key="your-search-key" \
    --env-vars \
        AZURE_OPENAI_ENDPOINT="https://openai-rag.openai.azure.com/" \
        AZURE_OPENAI_DEPLOYMENT="gpt-4o" \
        AZURE_OPENAI_API_VERSION="2024-02-01" \
        AZURE_SEARCH_ENDPOINT="https://search-rag.search.windows.net" \
        AZURE_SEARCH_INDEX_NAME="rag-documents" \
        AZURE_OPENAI_API_KEY=secretref:azure-openai-key \
        AZURE_SEARCH_API_KEY=secretref:azure-search-key
7.3 Checking the FQDN
7.4 Testing the deployment
FQDN=$(az containerapp show \
    --name ca-rag-api \
    --resource-group rg-rag-prod \
    --query properties.configuration.ingress.fqdn -o tsv)
# Health check
curl "https://$FQDN/health"
# Query test
curl -X POST "https://$FQDN/query" \
    -H "Content-Type: application/json" \
    -d '{"question": "What is Azure Container Apps?"}'
8 Scaling strategy
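8.1 HTTP-based scaling
az containerapp update \
    --name ca-rag-api \
    --resource-group rg-rag-prod \
    --min-replicas 1 \
    --max-replicas 10 \
    --scale-rule-name http-rule \
    --scale-rule-type http \
    --scale-rule-http-concurrency 50
What the settings mean:
- min-replicas 1: always keeps one instance running (avoids cold starts)
- max-replicas 10: scales out to at most 10 instances
- http-concurrency 50: targets about 50 concurrent requests per instance, so roughly one instance is added for every additional 50 concurrent requests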
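The interplay of these three settings can be sketched in a few lines of Python. This is a simplified model and not the actual KEDA algorithm: it assumes the replica count is the number of concurrent requests divided by the concurrency target, rounded up, and clamped to the configured bounds. The function name `estimate_replicas` is hypothetical.

```python
import math


def estimate_replicas(
    concurrent_requests: int,
    concurrency_target: int = 50,
    min_replicas: int = 1,
    max_replicas: int = 10,
) -> int:
    """Simplified estimate of the replica count for an HTTP scale rule."""
    if concurrency_target <= 0:
        raise ValueError("concurrency_target must be positive")
    # One replica per `concurrency_target` concurrent requests, rounded up
    wanted = math.ceil(concurrent_requests / concurrency_target)
    # Never go below min_replicas or above max_replicas
    return max(min_replicas, min(max_replicas, wanted))


for load in (0, 50, 120, 2000):
    print(f"{load} concurrent requests -> {estimate_replicas(load)} replica(s)")
```

With the values from the command above, 120 concurrent requests map to 3 replicas, while 2000 are capped at the maximum of 10.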
8.2 CPU/memory-based scaling
8.3 Custom metric scaling (KEDA)
YAML definition:
9 Revision management
9.1 Blue-green deployment
# Deploy a new revision. It starts at 0% traffic only when the app runs in
# multiple-revision mode with traffic pinned to named revisions.
az containerapp update \
    --name ca-rag-api \
    --resource-group rg-rag-prod \
    --image acrragprod.azurecr.io/rag-api:v2.0.0 \
    --revision-suffix v2
# List revisions
az containerapp revision list \
    --name ca-rag-api \
    --resource-group rg-rag-prod \
    --output table
# Split traffic 50:50 (assumes the current revision is named ca-rag-api--v1)
az containerapp ingress traffic set \
    --name ca-rag-api \
    --resource-group rg-rag-prod \
    --revision-weight ca-rag-api--v1=50 ca-rag-api--v2=50
# Switch 100% of traffic to v2
az containerapp ingress traffic set \
    --name ca-rag-api \
    --resource-group rg-rag-prod \
    --revision-weight ca-rag-api--v2=100
# Deactivate the previous revision
az containerapp revision deactivate \
    --name ca-rag-api \
    --resource-group rg-rag-prod \
    --revision ca-rag-api--v1
9.2 Rollback
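A rollback reverses the traffic shift from 9.1: the earlier revision receives 100% of the weight again (a deactivated revision typically has to be reactivated first). The effect of a weight table can be illustrated in plain Python. The sketch below is only a simulation and not how the Container Apps ingress is implemented; `pick_revision` and `simulate` are hypothetical helpers, and the revision names are the ones assumed in 9.1.

```python
import random
from collections import Counter


def pick_revision(weights: dict[str, int], rng: random.Random) -> str:
    """Pick a revision with probability proportional to its traffic weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def simulate(weights: dict[str, int], requests: int = 10_000, seed: int = 0) -> Counter:
    """Route `requests` simulated requests and count the hits per revision."""
    rng = random.Random(seed)
    return Counter(pick_revision(weights, rng) for _ in range(requests))


# 50:50 split during the rollout
split = simulate({"ca-rag-api--v1": 50, "ca-rag-api--v2": 50})
# Rollback: all traffic returns to v1
rollback = simulate({"ca-rag-api--v1": 100, "ca-rag-api--v2": 0})
print(dict(split))
print(dict(rollback))
```

In the rollback case every simulated request lands on `ca-rag-api--v1`, which is the state a rollback aims to restore.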
10 Multi-container (sidecar)
10.1 Redis cache sidecar
containerapp.yaml:
properties:
  template:
    containers:
      - name: rag-api
        image: acrragprod.azurecr.io/rag-api:latest
        env:
          - name: REDIS_HOST
            value: localhost
          - name: REDIS_PORT
            value: "6379"
        resources:
          cpu: 1.0
          memory: 2.0Gi
      - name: redis
        image: redis:7-alpine
        resources:
          cpu: 0.5
          memory: 1.0Gi
Deploy:
10.2 Cache integration code
import json
import os

import redis

from app.rag import RAGSystem


class RAGSystemWithCache(RAGSystem):
    def __init__(self):
        super().__init__()
        self.redis_client = redis.Redis(
            host=os.getenv("REDIS_HOST", "localhost"),
            port=int(os.getenv("REDIS_PORT", "6379")),
            decode_responses=True,
        )

    def query(self, question: str, top_k: int = 3) -> dict:
        # Check the cache (the key includes top_k so different requests do not collide)
        cache_key = f"rag:{top_k}:{question}"
        cached = self.redis_client.get(cache_key)
        if cached:
            return json.loads(cached)
        # Run RAG
        result = super().query(question, top_k)
        # Store in the cache for 10 minutes
        self.redis_client.setex(cache_key, 600, json.dumps(result))
        return result
11 Monitoring
11.1 Application Insights integration
# Create Application Insights
az monitor app-insights component create \
    --app rag-api-insights \
    --location koreacentral \
    --resource-group rg-rag-prod \
    --workspace law-rag-prod
# Look up the instrumentation key
INSTRUMENTATION_KEY=$(az monitor app-insights component show \
    --app rag-api-insights \
    --resource-group rg-rag-prod \
    --query instrumentationKey -o tsv)
# Update the Container App
az containerapp update \
    --name ca-rag-api \
    --resource-group rg-rag-prod \
    --set-env-vars APPLICATIONINSIGHTS_CONNECTION_STRING="InstrumentationKey=$INSTRUMENTATION_KEY"
11.2 OpenTelemetry instrumentation
Add to requirements.txt:
opentelemetry-api==1.21.0
opentelemetry-sdk==1.21.0
opentelemetry-instrumentation-fastapi==0.42b0
azure-monitor-opentelemetry-exporter==1.0.0b21
Changes to main.py:
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from azure.monitor.opentelemetry.exporter import AzureMonitorTraceExporter
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor

# Configure the tracer
trace.set_tracer_provider(TracerProvider())
tracer = trace.get_tracer(__name__)
exporter = AzureMonitorTraceExporter.from_connection_string(
    os.getenv("APPLICATIONINSIGHTS_CONNECTION_STRING")
)
trace.get_tracer_provider().add_span_processor(
    BatchSpanProcessor(exporter)
)

# Automatic FastAPI instrumentation
app = FastAPI()
FastAPIInstrumentor.instrument_app(app)


# os, FastAPI, QueryRequest, and rag come from the existing main.py
@app.post("/query")
async def query(request: QueryRequest):
    with tracer.start_as_current_span("rag_query") as span:
        span.set_attribute("question.length", len(request.question))
        result = rag.query(request.question)
        span.set_attribute("answer.length", len(result["answer"]))
        return result
11.3 Log queries
// Container App logs
ContainerAppConsoleLogs_CL
| where ContainerAppName_s == "ca-rag-api"
| where TimeGenerated > ago(1h)
| order by TimeGenerated desc
| take 100
// System logs
ContainerAppSystemLogs_CL
| where ContainerAppName_s == "ca-rag-api"
| where TimeGenerated > ago(1h)
| project TimeGenerated, Log_s
// HTTP request analysis
requests
| where cloud_RoleName == "ca-rag-api"
| summarize
    Count=count(),
    AvgDuration=avg(duration),
    P95Duration=percentile(duration, 95)
    by bin(timestamp, 5m)
| render timechart
12 CI/CD pipeline
12.1 GitHub Actions
.github/workflows/deploy.yml:
name: Build and Deploy to Azure Container Apps
on:
  push:
    branches:
      - main
env:
  AZURE_CONTAINER_REGISTRY: acrragprod
  CONTAINER_APP_NAME: ca-rag-api
  RESOURCE_GROUP: rg-rag-prod
  IMAGE_NAME: rag-api
jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Login to Azure
        uses: azure/login@v1
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}
      - name: Login to ACR
        run: |
          az acr login --name ${{ env.AZURE_CONTAINER_REGISTRY }}
      - name: Build and push image
        run: |
          IMAGE_TAG=${{ env.AZURE_CONTAINER_REGISTRY }}.azurecr.io/${{ env.IMAGE_NAME }}:${{ github.sha }}
          docker build -t $IMAGE_TAG .
          docker push $IMAGE_TAG
      - name: Deploy to Container Apps
        run: |
          az containerapp update \
            --name ${{ env.CONTAINER_APP_NAME }} \
            --resource-group ${{ env.RESOURCE_GROUP }} \
            --image ${{ env.AZURE_CONTAINER_REGISTRY }}.azurecr.io/${{ env.IMAGE_NAME }}:${{ github.sha }}
13 Security
13.1 Managed Identity
# Enable the system-assigned identity
az containerapp identity assign \
    --name ca-rag-api \
    --resource-group rg-rag-prod \
    --system-assigned
# Look up the principal ID
PRINCIPAL_ID=$(az containerapp identity show \
    --name ca-rag-api \
    --resource-group rg-rag-prod \
    --query principalId -o tsv)
# Grant access to Azure OpenAI
az role assignment create \
    --assignee "$PRINCIPAL_ID" \
    --role "Cognitive Services OpenAI User" \
    --scope /subscriptions/{subscription-id}/resourceGroups/rg-rag-prod/providers/Microsoft.CognitiveServices/accounts/openai-rag-prod
# Grant access to Azure AI Search
az role assignment create \
    --assignee "$PRINCIPAL_ID" \
    --role "Search Index Data Reader" \
    --scope /subscriptions/{subscription-id}/resourceGroups/rg-rag-prod/providers/Microsoft.Search/searchServices/search-rag-prod
13.2 Using DefaultAzureCredential
import os

from azure.identity import DefaultAzureCredential
from azure.search.documents import SearchClient
from langchain_openai import AzureChatOpenAI

credential = DefaultAzureCredential()

# OpenAI (Managed Identity): a token provider replaces the API key
llm = AzureChatOpenAI(
    azure_deployment="gpt-4o",
    openai_api_version=os.getenv("AZURE_OPENAI_API_VERSION"),
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    azure_ad_token_provider=lambda: credential.get_token(
        "https://cognitiveservices.azure.com/.default"
    ).token,
)

# Azure AI Search (Managed Identity)
search_client = SearchClient(
    endpoint=os.getenv("AZURE_SEARCH_ENDPOINT"),
    index_name=os.getenv("AZURE_SEARCH_INDEX_NAME"),
    credential=credential,
)
13.3 Network security
# VNet integration
az containerapp env create \
    --name cae-rag-secure \
    --resource-group rg-rag-prod \
    --location koreacentral \
    --infrastructure-subnet-resource-id /subscriptions/{sub-id}/resourceGroups/rg-rag-prod/providers/Microsoft.Network/virtualNetworks/vnet-rag/subnets/subnet-containerapp
# Internal ingress (private)
az containerapp create \
    --name ca-rag-internal \
    --resource-group rg-rag-prod \
    --environment cae-rag-secure \
    --image acrragprod.azurecr.io/rag-api:latest \
    --ingress internal \
    --target-port 8000
14 Cost optimization
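14.1 Pricing structure
Container Apps costs (rates used in this example; actual prices vary by region and change over time):
- vCPU: $0.000024/second ($0.0864/vCPU-hour)
- Memory: $0.0000027/GB-second ($0.00972/GB-hour)
- Requests: first 2 million free, then $0.40 per million
Example calculation (1 vCPU, 2 GB, min=1, max=10; request charges excluded):
Scenario 1: low traffic
- Average instances: 1
- Hours per month: 730
- vCPU cost: 1 × 730 × $0.0864 = $63.07
- Memory cost: 2 GB × 730 × $0.00972 = $14.19
- Total: ~$77/month
Scenario 2: medium traffic
- Average instances: 3
- Hours per month: 730
- vCPU cost: 3 × 730 × $0.0864 = $189.22
- Memory cost: 3 × 2 GB × 730 × $0.00972 = $42.57
- Total: ~$232/month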
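The two scenarios follow the same formula, which can be written as a small Python helper for trying other instance counts. This is a sketch that only reuses the hourly rates quoted in this section and ignores request charges; the function name `monthly_compute_cost` is hypothetical, and the rates should be replaced with current prices for the target region.

```python
HOURS_PER_MONTH = 730
VCPU_PER_HOUR = 0.0864         # USD per vCPU-hour, rate quoted in this section
MEMORY_PER_GB_HOUR = 0.00972   # USD per GB-hour, rate quoted in this section


def monthly_compute_cost(avg_instances: float, vcpu: float, memory_gb: float) -> float:
    """Estimated monthly compute cost in USD (request charges not included)."""
    vcpu_cost = avg_instances * vcpu * HOURS_PER_MONTH * VCPU_PER_HOUR
    memory_cost = avg_instances * memory_gb * HOURS_PER_MONTH * MEMORY_PER_GB_HOUR
    return vcpu_cost + memory_cost


low = monthly_compute_cost(avg_instances=1, vcpu=1, memory_gb=2)
medium = monthly_compute_cost(avg_instances=3, vcpu=1, memory_gb=2)
print(f"low traffic:    ${low:.2f}/month")
print(f"medium traffic: ${medium:.2f}/month")
```

The results reproduce the roughly $77 and $232 monthly totals of the two scenarios.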
14.2 Cost-saving tips
- Adjust min-replicas
# Set to 0 during off-peak hours (accepts cold starts)
az containerapp update \
    --name ca-rag-api \
    --resource-group rg-rag-prod \
    --min-replicas 0
- Optimize resources
# Allocate only the minimum resources required
az containerapp update \
    --name ca-rag-api \
    --resource-group rg-rag-prod \
    --cpu 0.5 \
    --memory 1.0Gi
- Use the Consumption plan
15 References
15.1 Official documentation
16 Next steps
With the container deployment complete, the next step is to integrate the whole system:
👉 09-End-to-End-Azure-RAG.qmd - Building an End-to-End Azure RAG System