1. RAG (Retrieval-Augmented Generation):
- Retriever Component: Searches through a knowledge base.
- Generator Component: Produces outputs based on the retrieved data.
- Purpose: Helps models access up-to-date, domain-specific knowledge beyond their training data.
2. Using Vector Databases in the Real World:
- Query Process:
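  - An embedding model converts the prompt into an embedding vector.
  - The embedding is sent to the vector database, which finds the stored vectors most similar to it.
  - The retriever returns the data associated with those vectors.
  - The retrieved data is added to the original prompt, and the model generates a response from the augmented prompt.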
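The query process above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `embed` is a toy bag-of-words stand-in for a real embedding model, the `DOCUMENTS` list stands in for a vector database, and the names `retrieve` and `augment_prompt` are hypothetical. The final generation step is left out, since it depends on which model is called.

```python
import math
from collections import Counter

# Toy knowledge base; a real system would keep these in a vector database.
DOCUMENTS = [
    "Amazon OpenSearch Service stores embeddings and supports semantic search.",
    "Agents in Amazon Bedrock automate multi-step tasks by calling APIs.",
    "A hallucination is a believable but incorrect model response.",
]

def embed(text):
    """Toy embedding (assumption): word counts instead of a learned vector."""
    words = text.lower().replace(".", " ").replace("?", " ").split()
    return Counter(words)

def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        documents,
        key=lambda d: cosine_similarity(query_vec, embed(d)),
        reverse=True,
    )
    return ranked[:k]

def augment_prompt(query, passages):
    """Prepend the retrieved passages to the user's question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

question = "What is a hallucination?"
passages = retrieve(question, DOCUMENTS, k=1)
prompt = augment_prompt(question, passages)
print(prompt)  # this augmented prompt is what would be sent to the model
```

Swapping `embed` for a real embedding model and `retrieve` for a vector database query would leave the overall flow unchanged.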
3. Hallucinations in LLMs:
- Hallucination: When the model generates a believable but incorrect response.
- RAG reduces hallucinations by grounding the model's response in an external knowledge base (typically a vector database) that supplies accurate, relevant data.
4. Amazon Bedrock and RAG Models:
- Amazon Bedrock supports RAG through Knowledge Bases for Amazon Bedrock, which connect foundation models to your own data sources.
- Applications: Used in question answering, dialogue systems, and content generation.
5. AWS Services for Vector Databases:
- Amazon OpenSearch Service: Stores embeddings and supports use cases such as semantic search, RAG, and recommendation engines.
- Other Services: Amazon Aurora, Amazon MemoryDB (Redis-compatible), Amazon Neptune, Amazon DocumentDB, and Amazon RDS for PostgreSQL.
6. Amazon OpenSearch Service:
- Supports low-latency search, vector storage, and semantic search.
- Language models such as BERT can generate text embeddings, which improve search relevance by matching on meaning rather than exact keywords.
- The vector engine in Amazon OpenSearch Serverless provides vector storage and search without requiring you to manage the underlying infrastructure.
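To make vector storage and search more concrete, here is a minimal sketch of the request bodies a k-NN index and query might use in OpenSearch. It assumes the `knn_vector` field type and `knn` query clause of the OpenSearch k-NN plugin. The field names `embedding` and `text` and the helper names are hypothetical, and nothing is sent to a cluster. Check the current OpenSearch documentation before relying on the exact shape.

```python
import json

def build_index_body(dimension, field="embedding"):
    """Index definition with one vector field (field names are assumptions)."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                field: {"type": "knn_vector", "dimension": dimension},
                "text": {"type": "text"},
            }
        },
    }

def build_knn_query(query_vector, k=3, field="embedding"):
    """Search body asking for the k nearest neighbours of query_vector."""
    return {
        "size": k,
        "query": {"knn": {field: {"vector": list(query_vector), "k": k}}},
    }

index_body = build_index_body(dimension=3)
search_body = build_knn_query([0.1, 0.2, 0.3], k=2)
print(json.dumps(search_body, indent=2))
```

In practice the vector dimension must match the output size of whichever embedding model produced the stored vectors.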
7. Agents in Amazon Bedrock:
- Agents help automate multi-step tasks.
- Agents break down tasks, generate orchestration logic, and call APIs to connect to databases and perform actions.
- Example: An agent could help process reservations for a vacation by managing the necessary steps and interacting with external systems.
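The reservation example can be illustrated with a small orchestration loop. This is a conceptual sketch only, not the Amazon Bedrock Agents API: every function and name below is hypothetical, the "APIs" are local stand-ins, and the plan is hard-coded, whereas a real agent would have the foundation model derive the steps from the user's request.

```python
def check_availability(state):
    # Hypothetical stand-in for an external availability API.
    state["available"] = state["destination"] in {"Lisbon", "Kyoto"}
    return state

def reserve_room(state):
    # Hypothetical stand-in for a booking API.
    if not state["available"]:
        raise ValueError(f"No availability in {state['destination']}")
    state["confirmation"] = f"RES-{state['destination'].upper()}-{state['nights']}"
    return state

def send_confirmation(state):
    # Hypothetical stand-in for a notification API.
    state["message"] = (
        f"Booked {state['nights']} nights in {state['destination']} "
        f"({state['confirmation']})."
    )
    return state

# Registry of actions the agent is allowed to call.
TOOLS = {
    "check_availability": check_availability,
    "reserve_room": reserve_room,
    "send_confirmation": send_confirmation,
}

def plan(request):
    # Assumption: a fixed plan. A real agent would generate this ordering.
    return ["check_availability", "reserve_room", "send_confirmation"]

def run_agent(request):
    """Execute each planned step in order, passing shared state along."""
    state = dict(request)
    for step in plan(request):
        state = TOOLS[step](state)
    return state

result = run_agent({"destination": "Lisbon", "nights": 3})
print(result["message"])
```

Keeping the tools in a registry separates what the agent may call from the order in which it calls them, which is the part the orchestration logic decides.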