An advanced memory-driven image & video frame retrieval system, enhancing CLIP with hashtag graphs and human feedback.
- (2025-05) Added Weighted Exploration Adjustment: Dynamically expands top-k based on score statistics to avoid local minima during refinement.
- Timeline: 25/07/2024 - 29/09/2024
- Tech Stack: Python, FastAPI, CLIP, FAISS, Docker, Graph Retrieval
- Dataset: AI Challenge 2024 (HCMC)
FrameFinderLE is designed to overcome CLIP's limitations in noisy environments by leveraging graph-based hashtag expansion and dynamic user feedback.
- Graph-based hashtag expansion (GRAFA)
- Dual Feedback: Real-time and Historical learning
- Frame-to-frame similarity adjustment
- Multi-modal search: Text, Hashtag, Image queries
- VideoID and Timestamp filtering for precise search
- ⚖️ Weighted Exploration Adjustment (New)
Dynamically adjusts the size of the top-k result set based on the distribution of similarity scores.
Prevents refinement from being trapped in suboptimal initial results by expanding the search scope only when statistically justified.
Enhances robustness without changing the user-visible result size.
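The idea behind Weighted Exploration Adjustment can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the function name `adjusted_top_k`, the parameters `base_k`, `max_k`, and `flatness_threshold`, and the use of the coefficient of variation as the "score statistic" are all assumptions made for this example.

```python
import numpy as np

def adjusted_top_k(scores, base_k=10, max_k=50, flatness_threshold=0.05):
    """Return how many candidates to keep for refinement (hypothetical rule).

    If the top `base_k` similarity scores are nearly flat, the ranking is
    considered ambiguous and the candidate pool is widened, up to `max_k`.
    """
    # Sort similarity scores in descending order.
    ranked = np.sort(np.asarray(scores, dtype=float))[::-1]
    if ranked.size <= base_k:
        return int(ranked.size)

    head = ranked[:base_k]
    mean = head.mean()
    # Coefficient of variation of the head: a small value means flat scores.
    spread = head.std() / abs(mean) if mean != 0 else 0.0

    if spread >= flatness_threshold:
        # Scores are well separated, so the default pool is kept.
        return base_k

    # The flatter the head, the wider the pool (between 1x and 2x base_k).
    widen = 1.0 + (flatness_threshold - spread) / flatness_threshold
    return int(min(max_k, ranked.size, round(base_k * widen)))


# The expanded pool is used only internally for refinement;
# the user still sees `base_k` results.
flat_scores = [0.3] * 40
pool_size = adjusted_top_k(flat_scores, base_k=10)
visible = sorted(flat_scores, reverse=True)[:10]
```

In this sketch the expansion happens only when the top scores are statistically hard to tell apart, which mirrors the "only when statistically justified" behaviour described above.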
Although the demo does not showcase feedback-based refinement, FrameFinderLE lets users rate each retrieved image with the Like and Dislike buttons shown in the results. This feedback is used to refine the search results, improving accuracy and aligning the ranking with user preferences. As feedback arrives, the system adjusts the results dynamically, so retrieval keeps improving over the course of a session.
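One common way to turn Like/Dislike signals into a refined query is a Rocchio-style update on the embedding vectors. The sketch below uses that technique purely as an illustration; it is not confirmed to be what FrameFinderLE does internally, and the function name `refine_query` and the weights `alpha`, `beta`, and `gamma` are assumptions.

```python
import numpy as np

def refine_query(query_vec, liked, disliked, alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style update (illustrative, not the project's confirmed method).

    Moves the query embedding toward the mean of liked frame embeddings
    and away from the mean of disliked ones, then L2-normalizes it.
    """
    q = alpha * np.asarray(query_vec, dtype=float)
    if len(liked):
        q = q + beta * np.mean(np.asarray(liked, dtype=float), axis=0)
    if len(disliked):
        q = q - gamma * np.mean(np.asarray(disliked, dtype=float), axis=0)
    norm = np.linalg.norm(q)
    # Keep the vector unit-length so cosine similarity stays comparable.
    return q / norm if norm > 0 else q


# Toy 2-D embeddings: the user likes a frame orthogonal to the original query.
query = np.array([1.0, 0.0])
liked_frame = np.array([0.0, 1.0])
refined = refine_query(query, liked=[liked_frame], disliked=[])

# Cosine similarity to the liked frame rises after the update.
before = float(query @ liked_frame)
after = float(refined @ liked_frame)
```

The refined vector can then be used for a new nearest-neighbour search in place of the original query embedding.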
- Operating System: Windows 10/11, macOS, or Linux
- CPU: Minimum 4 cores (8+ cores recommended for better performance)
- RAM: Minimum 8GB (16GB+ recommended)
- Storage: At least 10GB for database and installation
- Python: Python 3.9 or higher
- Internet Connection: Required to download database files
Note: FrameFinderLE is optimized to run on a regular CPU and does not require a GPU.
Note: You may need to install NLTK and download its Punkt tokenizer data (`punkt_tab` in recent NLTK releases) to avoid errors.
import nltk
nltk.download('punkt_tab')
docker pull thuyhale/frame_finder_le:latest
docker run -p 8000:8000 thuyhale/frame_finder_le:latest
git clone https://github.com/ThuyHaLE/FrameFinderLE.git
cd FrameFinderLE
pip install -r requirements.txt
python app.py