TalkToFiles is a web application built using Gradio and Haystack, designed to enable users to upload documents and interact with their content through a conversational AI interface. This chatbot leverages Retrieval-Augmented Generation (RAG) to provide accurate and context-aware answers by retrieving information from document content.
- Document Upload: Upload multiple files in PDF, Markdown, or Text format.
- Document Store Creation: Automatically preprocess and store document content for querying.
- Conversational Interface: Ask questions and get responses based on uploaded document content.
- Memory Integration: Maintains context throughout the conversation for more natural interactions.
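One way the memory feature above could work, as a minimal sketch: keep a bounded window of recent turns and prepend them to each new query so that a follow-up question can be rephrased with context. The names `ConversationMemory`, `add_turn`, `as_context`, and `build_rephrase_prompt` are hypothetical and not taken from the project's code, and the prompt wording is illustrative only. The application itself may rely on Haystack components for this.

```python
from collections import deque


class ConversationMemory:
    """Hypothetical bounded chat memory; not the project's actual implementation."""

    def __init__(self, max_turns: int = 3) -> None:
        # A deque with maxlen drops the oldest turn automatically when full.
        self._turns = deque(maxlen=max_turns)

    def add_turn(self, user: str, assistant: str) -> None:
        """Store one question/answer exchange."""
        self._turns.append((user, assistant))

    def as_context(self) -> str:
        """Render the stored turns as plain text for a rephrasing prompt."""
        return "\n".join(f"User: {u}\nAssistant: {a}" for u, a in self._turns)


def build_rephrase_prompt(memory: ConversationMemory, query: str) -> str:
    """Combine the chat history with the new query (wording is illustrative)."""
    history = memory.as_context()
    if not history:
        # First turn: nothing to rephrase against, so pass the query through.
        return query
    return f"{history}\nRewrite as a standalone question: {query}"
```

Bounding the window keeps the prompt short; a larger `max_turns` trades prompt length for longer-range context.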
- Gradio: Provides the web interface for document uploads and chat functionality.
- Haystack: Powers the RAG pipeline for document preprocessing and retrieval.
- Cohere: Used for query rephrasing and response generation.
- SentenceTransformers: For creating document embeddings.
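The retrieval step of a RAG pipeline can be illustrated without any of the libraries above. The sketch below is a toy stand-in, not the project's pipeline: it uses bag-of-words counts and cosine similarity where the application uses SentenceTransformers embeddings and Haystack retrieval, and every function name in it is hypothetical.

```python
import math
import re
from collections import Counter


def split_into_chunks(text: str, max_words: int = 50) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a real model returns dense vectors)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


chunks = [
    "Artificial Intelligence is the study of machines that perform tasks requiring intelligence.",
    "Gradio builds web interfaces for machine learning demos.",
]
best = retrieve("What is Artificial Intelligence?", chunks)
# In a full pipeline, the retrieved chunk would be passed to the generator as context.
print(best[0])
```

Dense embeddings match on meaning rather than shared words, which is why a real pipeline prefers them over the word counts used here.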
- Clone the repository:
  git clone <repository-url>
- Navigate to the project directory:
  cd TalkToFiles
- Install dependencies:
  pip install -r requirements.txt
- Set up environment variables:
  - Create a .env file in the root directory.
  - Add your API key for Cohere:
    COHERE_API_KEY=your_api_key_here
- Start the application:
  python main.py
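To illustrate how the `COHERE_API_KEY` value from the `.env` file might reach the application at startup, here is a minimal standard-library sketch. It is an assumption about the mechanism, not the project's code: projects like this often use the `python-dotenv` package instead, and the helper names `load_env_file` and `get_cohere_key` are hypothetical.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse KEY=VALUE lines from a .env file and export them to os.environ."""
    loaded: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return loaded
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip("'\"")
        # Variables already set in the real environment take precedence.
        os.environ.setdefault(key, value)
        loaded[key] = os.environ[key]
    return loaded


def get_cohere_key() -> str:
    """Fail early with a clear message if the key is missing."""
    key = os.environ.get("COHERE_API_KEY", "")
    if not key:
        raise RuntimeError("COHERE_API_KEY is not set; add it to .env")
    return key
```

Checking for the key at startup surfaces a missing or misnamed variable before the first chat request rather than during it.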
- Upload Documents:
  - Use the interface to upload PDF, Markdown, or Text files.
  - Click the "Create Document Store" button to preprocess and store the documents.
- Chat with Documents:
  - Type your query in the chat interface.
  - Receive contextually accurate responses based on the document content.
- main.py: Contains the Gradio interface and application logic.
- module.py: Defines the pipelines and components for preprocessing, retrieval, and query handling.
- Upload a PDF document containing information about Artificial Intelligence.
- Initialize the document store.
- Ask the chatbot: "What is Artificial Intelligence?"
- Receive an accurate, document-based response.
This project is licensed under the MIT License. See the LICENSE file for details.