3.4 RAG, Vector Databases, and Agents

by Deepak Prasad
November 18, 2024

1. RAG (Retrieval Augmented Generation):

Retriever Component: Searches through a knowledge base.
Generator Component: Produces outputs based on the retrieved data.
Purpose: Helps models access up-to-date, domain-specific knowledge beyond their training data.

2. Using Vector Databases in the Real World:

Query Process:
A prompt is encoded and embedded.
The embedding is sent to the vector database to find similar data.
The retriever pulls relevant data.
The prompt is augmented with the retrieved data, and the model generates a response from the augmented prompt.
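The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production setup: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the `DOCUMENTS` list is a hypothetical in-memory substitute for a vector database, and the generation step is left out so the example shows only how the augmented prompt is built.

```python
import math
from collections import Counter

# Hypothetical in-memory "knowledge base"; a real system would keep
# embeddings in a vector database instead of a Python list.
DOCUMENTS = [
    "Amazon Bedrock agents can call APIs to complete multi-step tasks.",
    "A vector database stores embeddings and supports similarity search.",
    "Hallucination means a model gives a believable but incorrect answer.",
]

def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    cleaned = text.lower().replace(".", "").replace("?", "")
    return Counter(cleaned.split())

def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query, documents, k=1):
    """Steps 1-3: embed the query, score every document, return the top k."""
    query_vector = embed(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vector, embed(doc)),
        reverse=True,
    )
    return ranked[:k]

def augment_prompt(query, passages):
    """Step 4 (first half): place the retrieved passages ahead of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
        "Answer using only the context above."
    )

query = "What does a vector database store?"
passages = retrieve(query, DOCUMENTS, k=1)
prompt = augment_prompt(query, passages)
print(prompt)
```

In a real pipeline the string in `prompt` would be sent to a foundation model, and the similarity search would run inside the vector database rather than over a list.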

3. Hallucinations in LLMs:

Hallucination: When the model generates a believable but incorrect response.
RAG reduces hallucinations by grounding responses in an external knowledge base (typically a vector database) that supplies accurate and relevant data.

4. Amazon Bedrock and RAG Models:

Amazon Bedrock supports RAG models that integrate with custom knowledge bases.
Applications: Used in question-answering, dialogue systems, and content generation.

5. AWS Services for Vector Databases:

Amazon OpenSearch Service: Stores embeddings and provides capabilities like semantic search, RAG, and recommendation engines.
Other Services: Amazon Aurora, Amazon MemoryDB (Redis-compatible), Amazon Neptune, Amazon DocumentDB, and Amazon RDS for PostgreSQL.

6. Amazon OpenSearch Service:

Supports low-latency search, vector storage, and semantic search.
Embedding models such as BERT can enhance search relevance by generating language-based embeddings.
The vector engine in Amazon OpenSearch Serverless helps with vector storage and search without managing the infrastructure.
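As a rough sketch of what vector storage and search look like in practice, the Python below builds the two JSON request bodies used by OpenSearch's k-NN feature: an index definition with a `knn_vector` field, and a `knn` query. The field name `embedding`, the dimension of 3, and the helper function names are assumptions chosen for illustration; the code only constructs the request bodies and does not connect to a cluster, so check the shapes against the current OpenSearch documentation before relying on them.

```python
import json

def knn_index_body(field, dimension):
    """Index definition with k-NN enabled and one vector field.

    `field` and `dimension` are illustrative; the dimension must match
    the output size of whichever embedding model is used.
    """
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                field: {"type": "knn_vector", "dimension": dimension},
                "text": {"type": "text"},
            }
        },
    }

def knn_query_body(field, query_vector, k=3):
    """Search body asking for the k nearest neighbours of query_vector."""
    return {
        "size": k,
        "query": {"knn": {field: {"vector": list(query_vector), "k": k}}},
    }

index_body = knn_index_body("embedding", 3)
query_body = knn_query_body("embedding", [0.1, 0.2, 0.3], k=2)
print(json.dumps(query_body, indent=2))
```

These dictionaries would normally be passed to an OpenSearch client when creating the index and running a search; the client setup and authentication are omitted here.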

7. Agents in Amazon Bedrock:

Agents help automate multi-step tasks.
Agents break down tasks, generate orchestration logic, and call APIs to connect to databases and perform actions.
Example: An agent could help process reservations for a vacation by managing the necessary steps and interacting with external systems.
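The reservation example can be pictured as a small orchestration loop. The sketch below is plain Python and does not use the Amazon Bedrock API: the three step functions are hypothetical stand-ins for calls to external systems, and the plan is a fixed list, whereas a Bedrock agent would have a foundation model decide the steps. It is only meant to show the idea of breaking a task into steps and passing state between them.

```python
def check_availability(state):
    # Hypothetical stand-in for a call to an external booking API.
    state["available"] = state["nights"] <= 7
    return state

def reserve_room(state):
    # Hypothetical stand-in for writing a reservation to a database.
    if not state["available"]:
        raise RuntimeError("No availability for the requested stay")
    state["confirmation"] = f"RES-{state['guest'].upper()}-{state['nights']}"
    return state

def send_confirmation(state):
    # Hypothetical stand-in for a notification service.
    state["message"] = (
        f"Booked {state['nights']} nights, ref {state['confirmation']}"
    )
    return state

# The "orchestration logic": an ordered plan of steps. Here it is fixed
# by hand for illustration.
PLAN = [check_availability, reserve_room, send_confirmation]

def run_agent(request, plan=PLAN):
    """Run each step in order, carrying shared state from one to the next."""
    state = dict(request)
    trace = []
    for step in plan:
        state = step(state)
        trace.append(step.__name__)
    state["trace"] = trace
    return state

result = run_agent({"guest": "ana", "nights": 3})
print(result["message"])
```

Keeping each step as a separate function makes it easy to see which external system is touched at each point and where a failure would stop the task.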
