This engine leverages LlamaIndex's VectorStoreIndex to efficiently index and retrieve documents, and to generate answers to natural language queries. It can use any LlamaIndex vector store.
By default, the engine uses OpenAI's GPT-4o model (use the llm parameter to change that).
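For illustration, here is a minimal construction sketch. It assumes the chromadb and llama-index-vector-stores-chroma packages are installed; the Chroma storage path and collection name are placeholders, and any other LlamaIndex vector store would work the same way.

```python
import chromadb
from llama_index.llms.openai import OpenAI
from llama_index.vector_stores.chroma import ChromaVectorStore

from autogen.agentchat.contrib.rag.llamaindex_query_engine import LlamaIndexQueryEngine

# Placeholder Chroma setup; any LlamaIndex vector store can be used instead.
chroma_client = chromadb.PersistentClient(path="./chroma_db")
collection = chroma_client.get_or_create_collection("rag_docs")
vector_store = ChromaVectorStore(chroma_collection=collection)

query_engine = LlamaIndexQueryEngine(
    vector_store=vector_store,
    llm=OpenAI(model="gpt-4o", temperature=0.0),  # optional; this matches the default
)
```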
Initializes the LlamaIndexQueryEngine with the given vector store.

| PARAMETER | DESCRIPTION |
| --- | --- |
| vector_store | The vector store to use for indexing and querying documents. TYPE: BasePydanticVectorStore |
| llm | LLM model used by LlamaIndex for query processing. You can find more supported LLMs at [LLM](https://docs.llamaindex.ai/en/stable/module_guides/models/llms/). TYPE: Optional[LLM] DEFAULT: None |
| file_reader_class | The file reader class to use for loading documents. Only SimpleDirectoryReader is currently supported. TYPE: Optional[type[SimpleDirectoryReader]] DEFAULT: None |
Source code in autogen/agentchat/contrib/rag/llamaindex_query_engine.py
def __init__(  # type: ignore[no-any-unimported]
    self,
    vector_store: "BasePydanticVectorStore",
    llm: Optional["LLM"] = None,
    file_reader_class: Optional[type["SimpleDirectoryReader"]] = None,
) -> None:
    """
    Initializes the LlamaIndexQueryEngine with the given vector store.

    Args:
        vector_store: The vector store to use for indexing and querying documents.
        llm: LLM model used by LlamaIndex for query processing. You can find more supported LLMs at [LLM](https://docs.llamaindex.ai/en/stable/module_guides/models/llms/).
        file_reader_class: The file reader class to use for loading documents. Only SimpleDirectoryReader is currently supported.
    """
    self.llm: LLM = llm or OpenAI(model="gpt-4o", temperature=0.0)  # type: ignore[no-any-unimported]
    self.vector_store = vector_store
    self.file_reader_class = file_reader_class if file_reader_class else SimpleDirectoryReader
llm instance-attribute
llm = llm or OpenAI(model='gpt-4o', temperature=0.0)
vector_store instance-attribute
vector_store = vector_store
file_reader_class instance-attribute
file_reader_class = file_reader_class if file_reader_class else SimpleDirectoryReader
init_db
init_db(new_doc_dir=None, new_doc_paths_or_urls=None, *args, **kwargs)
Initialize the database with the input documents or records.
It takes the following steps: 1. Set up the LlamaIndex storage context. 2. Insert the documents and build an index over them. A usage sketch follows the source code below.

| PARAMETER | DESCRIPTION |
| --- | --- |
| new_doc_dir | A directory of input documents used to create the records in the database. TYPE: Optional[Union[Path, str]] DEFAULT: None |
| new_doc_paths_or_urls | A sequence of input documents used to create the records in the database. A document can be a Path to a file or a URL. TYPE: Optional[Sequence[Union[Path, str]]] DEFAULT: None |
| *args | Any additional arguments. TYPE: Any DEFAULT: () |
| **kwargs | Any additional keyword arguments. TYPE: Any DEFAULT: {} |

| RETURNS | DESCRIPTION |
| --- | --- |
| bool | True if initialization is successful. TYPE: bool |
Source code in autogen/agentchat/contrib/rag/llamaindex_query_engine.py
def init_db(
    self,
    new_doc_dir: Optional[Union[Path, str]] = None,
    new_doc_paths_or_urls: Optional[Sequence[Union[Path, str]]] = None,
    *args: Any,
    **kwargs: Any,
) -> bool:
    """Initialize the database with the input documents or records.

    It takes the following steps:
    1. Set up LlamaIndex storage context.
    2. insert documents and build an index upon them.

    Args:
        new_doc_dir: a dir of input documents that are used to create the records in database.
        new_doc_paths_or_urls: A sequence of input documents that are used to create the records in database. A document can be a Path to a file or a url.
        *args: Any additional arguments
        **kwargs: Any additional keyword arguments

    Returns:
        bool: True if initialization is successful
    """
    self.storage_context = StorageContext.from_defaults(vector_store=self.vector_store)
    documents = self._load_doc(input_dir=new_doc_dir, input_docs=new_doc_paths_or_urls)
    self.index = VectorStoreIndex.from_documents(documents=documents, storage_context=self.storage_context)
    return True
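A usage sketch, assuming a query_engine constructed as in the example above; the paths and URL are illustrative.

```python
# Build the index from a local directory of documents...
query_engine.init_db(new_doc_dir="./docs")

# ...or from an explicit list of file paths and/or URLs.
query_engine.init_db(new_doc_paths_or_urls=["./docs/manual.pdf", "https://example.com/faq.html"])
```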
connect_db
connect_db(*args, **kwargs)
Connect to the database. It sets up the LlamaIndex storage and creates an index from the existing vector store.

| PARAMETER | DESCRIPTION |
| --- | --- |
| *args | Any additional arguments. TYPE: Any DEFAULT: () |
| **kwargs | Any additional keyword arguments. TYPE: Any DEFAULT: {} |

| RETURNS | DESCRIPTION |
| --- | --- |
| bool | True if connection is successful. TYPE: bool |
Source code in autogen/agentchat/contrib/rag/llamaindex_query_engine.py
def connect_db(self, *args: Any, **kwargs: Any) -> bool:
    """Connect to the database.

    It sets up the LlamaIndex storage and create an index from the existing vector store.

    Args:
        *args: Any additional arguments
        **kwargs: Any additional keyword arguments

    Returns:
        bool: True if connection is successful
    """
    self.storage_context = StorageContext.from_defaults(vector_store=self.vector_store)
    self.index = VectorStoreIndex.from_vector_store(
        vector_store=self.vector_store, storage_context=self.storage_context
    )
    return True
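A sketch of reconnecting in a later session, assuming the vector store already holds an index built by an earlier init_db call; the vector_store setup mirrors the construction example above and is illustrative.

```python
# Reattach to the documents already stored in the vector store,
# without loading or re-indexing any files.
query_engine = LlamaIndexQueryEngine(vector_store=vector_store)
query_engine.connect_db()
```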
add_docs
add_docs(new_doc_dir=None, new_doc_paths_or_urls=None, *args, **kwargs)
Add new documents to the underlying database and add to the index.

| PARAMETER | DESCRIPTION |
| --- | --- |
| new_doc_dir | A directory of input documents used to create the records in the database. TYPE: Optional[Union[Path, str]] DEFAULT: None |
| new_doc_paths_or_urls | A sequence of input documents used to create the records in the database. A document can be a Path to a file or a URL. TYPE: Optional[Sequence[Union[Path, str]]] DEFAULT: None |
| *args | Any additional arguments. TYPE: Any DEFAULT: () |
| **kwargs | Any additional keyword arguments. TYPE: Any DEFAULT: {} |
Source code in autogen/agentchat/contrib/rag/llamaindex_query_engine.py
def add_docs(
    self,
    new_doc_dir: Optional[Union[Path, str]] = None,
    new_doc_paths_or_urls: Optional[Sequence[Union[Path, str]]] = None,
    *args: Any,
    **kwargs: Any,
) -> None:
    """Add new documents to the underlying database and add to the index.

    Args:
        new_doc_dir: A dir of input documents that are used to create the records in database.
        new_doc_paths_or_urls: A sequence of input documents that are used to create the records in database. A document can be a Path to a file or a url.
        *args: Any additional arguments
        **kwargs: Any additional keyword arguments
    """
    self._validate_query_index()
    documents = self._load_doc(input_dir=new_doc_dir, input_docs=new_doc_paths_or_urls)
    for doc in documents:
        self.index.insert(doc)
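A usage sketch; the file path is illustrative, and the engine must already have been initialized via init_db or connect_db.

```python
# Incrementally index an additional document without rebuilding the whole index.
query_engine.add_docs(new_doc_paths_or_urls=["./docs/changelog.md"])
```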
query
query(question)
Retrieve information from indexed documents by processing a query using the engine's LLM.

| PARAMETER | DESCRIPTION |
| --- | --- |
| question | A natural language query string used to search the indexed documents. TYPE: str |

| RETURNS | DESCRIPTION |
| --- | --- |
| str | A string containing the response generated by the LLM. |
Source code in autogen/agentchat/contrib/rag/llamaindex_query_engine.py
def query(self, question: str) -> str:
    """
    Retrieve information from indexed documents by processing a query using the engine's LLM.

    Args:
        question: A natural language query string used to search the indexed documents.

    Returns:
        A string containing the response generated by LLM.
    """
    self._validate_query_index()
    self.query_engine = self.index.as_query_engine(llm=self.llm)
    response = self.query_engine.query(question)
    if str(response) == EMPTY_RESPONSE_TEXT:
        return EMPTY_RESPONSE_REPLY
    return str(response)
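A usage sketch, assuming the database has been initialized or connected as above; the question text is illustrative.

```python
answer = query_engine.query("What does the manual say about configuring the vector store?")
print(answer)  # returns a standard fallback reply when nothing relevant is retrieved
```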