Llm on CharmingGroot

https://charminggroot.github.io/tags/llm/
Recent content in Llm on CharmingGroot
Hugo, ko-kr
Sun, 14 Jun 2026 00:00:00 +0000

080. RAG: Retrieval-Augmented Generation Pipeline
https://charminggroot.github.io/posts/080-rag/
Sun, 14 Jun 2026 00:00:00 +0000
RAG (Retrieval-Augmented Generation) is a pattern in which an LLM retrieves external knowledge and injects it into the context when answering. It lets the model draw on up-to-date information and domain-specific knowledge that is absent from its weights, and it reduces hallucination. The post covers the three stages (indexing, retrieval, generation) and techniques for improving each one.

090. GPTQ: Post-Training Quantization
https://charminggroot.github.io/posts/090-gptq/
Sun, 14 Jun 2026 00:00:00 +0000
GPTQ (2022) is a post-training quantization method that compresses LLM weights to 4 bits. Without retraining, and using only calibration data, it produces a model about 4x smaller than its FP16 counterpart while keeping the accuracy loss minimal. It is a practical way to run large models on consumer GPUs.
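
The "4x smaller than FP16" figure in the GPTQ summary follows from the bit widths alone (16 bits versus 4 bits per weight). A minimal sketch of that arithmetic is below; the function name `weight_memory_gib` and the 7-billion-parameter model size are hypothetical, chosen only for illustration, and the calculation deliberately ignores quantization metadata and runtime memory.

```python
def weight_memory_gib(num_params: int, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB.

    Counts only the weights themselves; scales, zero-points,
    activations, and the KV cache are ignored.
    """
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


# Hypothetical model size, for illustration only.
num_params = 7_000_000_000

fp16_gib = weight_memory_gib(num_params, 16)  # 16 bits per weight
int4_gib = weight_memory_gib(num_params, 4)   # 4 bits per weight

print(f"FP16:  {fp16_gib:.2f} GiB")
print(f"4-bit: {int4_gib:.2f} GiB")
print(f"ratio: {fp16_gib / int4_gib:.1f}x")
```

In practice a 4-bit quantized checkpoint also stores per-group scales and zero-points, so the real ratio tends to land slightly below 4x.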