Commit fc964a8

badmonster0 authored and georgeh0 committed
Update ColPali image search example
Add a PREFER_GRPC config at the top of main.py for easy switching between gRPC (default, port 6334) and HTTP (port 6333) Qdrant connections via environment variable. Frontend (App.jsx): use window.location.hostname for API and image URLs, so devices on the same LAN can reach the backend and images when the frontend is served on 0.0.0.0. This enables seamless LAN access to search and image results.
1 parent 8bdef5e commit fc964a8
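In code terms, the switch described in the message is two environment-driven values feeding `QdrantClient`. A minimal sketch mirroring the config in the diff below:

```python
import os

from qdrant_client import QdrantClient

# gRPC by default (port 6334); set QDRANT_PREFER_GRPC=false and point
# QDRANT_URL at http://localhost:6333/ to switch to HTTP.
QDRANT_URL = os.getenv("QDRANT_URL", "localhost:6334")
PREFER_GRPC = os.getenv("QDRANT_PREFER_GRPC", "true").lower() == "true"

client = QdrantClient(url=QDRANT_URL, prefer_grpc=PREFER_GRPC)
```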

File tree

6 files changed: +111 -53 lines

examples/image_search/.env

Lines changed: 0 additions & 1 deletion
This file was deleted.

examples/image_search_colpali/README.md

Lines changed: 26 additions & 7 deletions
````diff
@@ -1,7 +1,7 @@
-# Image Search with CocoIndex
+# Image Search with CocoIndex (ColPali Edition)
 [![GitHub](https://img.shields.io/github/stars/cocoindex-io/cocoindex?color=5B5BD6)](https://github.com/cocoindex-io/cocoindex)
 
-We will build live image search and query it with natural language, using multimodal embedding model. We are going use CocoIndex to build real-time indexing flow. During running, you can add new files to the folder and it only process changed files and will be indexed within a minute.
+We will build live image search and query it with natural language, using a multimodal embedding model (ColPali). We use CocoIndex to build a real-time indexing flow. While it runs, you can add new files to the folder; only changed files are processed and indexed within a minute.
 
 We appreciate a star ⭐ at [CocoIndex Github](https://github.com/cocoindex-io/cocoindex) if this is helpful.
 
@@ -10,10 +10,10 @@ We appreciate a star ⭐ at [CocoIndex Github](https://github.com/cocoindex-io/c
 
 ## Technologies
 - CocoIndex for ETL and live update
-- CLIP ViT-L/14 - Embeddings Model for images and query
-- Qdrant for Vector Storage
-- FastApi for backend
-- Ollama (Optional) for generating image captions using `gemma3`.
+- **ColPali** - Multimodal Embeddings Model for images and queries
+- Qdrant for Vector Storage (supports both gRPC and HTTP)
+- FastAPI for backend
+- Ollama (Optional) for generating image captions using `gemma3` or other models
 
 ## Setup
 - Make sure Postgres and Qdrant are running
@@ -22,8 +22,27 @@ We appreciate a star ⭐ at [CocoIndex Github](https://github.com/cocoindex-io/c
 export COCOINDEX_DATABASE_URL="postgres://cocoindex:cocoindex@localhost/cocoindex"
 ```
 
-## (Optional) Run Ollama
+## Qdrant Protocol Configuration
+- By default, the app uses **gRPC** (port 6334) to connect to Qdrant for best performance.
+- To use HTTP (port 6333) instead, change the config at the top of `main.py`:
+```python
+# Use GRPC (default)
+QDRANT_URL = os.getenv("QDRANT_URL", "localhost:6334")
+PREFER_GRPC = os.getenv("QDRANT_PREFER_GRPC", "true").lower() == "true"
+# Use HTTP (uncomment below to use HTTP)
+# QDRANT_URL = os.getenv("QDRANT_URL", "http://localhost:6333/")
+# PREFER_GRPC = os.getenv("QDRANT_PREFER_GRPC", "false").lower() == "true"
+```
+- You can also override these with environment variables:
+```sh
+export QDRANT_URL="localhost:6334"   # for gRPC (default)
+export QDRANT_PREFER_GRPC=true       # for gRPC (default)
+# or for HTTP:
+# export QDRANT_URL="http://localhost:6333/"
+# export QDRANT_PREFER_GRPC=false
+```
 
+## (Optional) Run Ollama
 - This enables automatic image captioning
 ```
 ollama pull gemma3
````
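To sanity-check whichever protocol you configured, a minimal sketch (it reads the same env vars the example uses; `get_collections()` is a standard qdrant-client call that exercises the chosen transport end to end):

```python
import os

from qdrant_client import QdrantClient

url = os.getenv("QDRANT_URL", "localhost:6334")
prefer_grpc = os.getenv("QDRANT_PREFER_GRPC", "true").lower() == "true"

client = QdrantClient(url=url, prefer_grpc=prefer_grpc)
# One round trip over gRPC (6334) or HTTP (6333), depending on the config.
print(client.get_collections())
```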

examples/image_search_colpali/frontend/src/App.jsx

Lines changed: 2 additions & 2 deletions
```diff
@@ -1,6 +1,6 @@
 import React, { useState } from 'react';
 
-const API_URL = 'http://localhost:8000/search'; // Adjust this to your backend search endpoint
+const API_URL = `http://${window.location.hostname}:8000/search`; // Use same IP as frontend
 
 export default function App() {
   const [query, setQuery] = useState('');
@@ -42,7 +42,7 @@ export default function App() {
       {results.length === 0 && !loading && <div>No results</div>}
       {results.map((result, idx) => (
         <div key={idx} className="result-card">
-          <img src={`http://localhost:8000/img/${result.filename}`} alt={result.filename} className="result-img" />
+          <img src={`http://${window.location.hostname}:8000/img/${result.filename}`} alt={result.filename} className="result-img" />
           <div className="score">Score: {result.score?.toFixed(3)}</div>
         </div>
       ))}
```
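The hostname change only helps if the backend itself is reachable from other LAN hosts and accepts cross-origin requests. A minimal sketch of that backend side, assuming permissive CORS for illustration (the example's actual CORS settings are not shown in this diff):

```python
import uvicorn
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Let browsers served from other LAN hosts call the API; tighten for production.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_methods=["*"],
    allow_headers=["*"],
)

if __name__ == "__main__":
    # Bind to all interfaces so LAN devices can reach port 8000.
    uvicorn.run(app, host="0.0.0.0", port=8000)
```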

examples/image_search_colpali/main.py

Lines changed: 76 additions & 35 deletions
```diff
@@ -6,67 +6,110 @@
 from typing import Any, Literal
 
 import cocoindex
-import torch
+import numpy as np
 from dotenv import load_dotenv
-from fastapi import FastAPI, Query
+from fastapi import FastAPI, Query, HTTPException
 from fastapi.middleware.cors import CORSMiddleware
 from fastapi.staticfiles import StaticFiles
 from PIL import Image
 from qdrant_client import QdrantClient
-from transformers import CLIPModel, CLIPProcessor
+from colpali_engine.models import ColPali, ColPaliProcessor
+
+
+# --- Config ---
+
+# Use GRPC
+QDRANT_URL = os.getenv("QDRANT_URL", "localhost:6334")
+PREFER_GRPC = os.getenv("QDRANT_PREFER_GRPC", "true").lower() == "true"
+
+# Use HTTP
+# QDRANT_URL = os.getenv("QDRANT_URL", "localhost:6333")
+# PREFER_GRPC = os.getenv("QDRANT_PREFER_GRPC", "false").lower() == "true"
 
 OLLAMA_URL = os.getenv("OLLAMA_URL", "http://localhost:11434/")
-QDRANT_URL = os.getenv("QDRANT_URL", "http://localhost:6334/")
-QDRANT_COLLECTION = "ImageSearch"
-CLIP_MODEL_NAME = "openai/clip-vit-large-patch14"
-CLIP_MODEL_DIMENSION = 768
+QDRANT_COLLECTION = "ImageSearchColpali"
+COLPALI_MODEL_NAME = os.getenv("COLPALI_MODEL", "vidore/colpali-v1.2")
+COLPALI_MODEL_DIMENSION = 1031  # Set to match ColPali's output
+
+# --- ColPali model cache and embedding functions ---
+_colpali_model_cache = {}
+
+
+def get_colpali_model(model: str = COLPALI_MODEL_NAME):
+    global _colpali_model_cache
+    if model not in _colpali_model_cache:
+        print(f"Loading ColPali model: {model}")
+        _colpali_model_cache[model] = {
+            "model": ColPali.from_pretrained(model),
+            "processor": ColPaliProcessor.from_pretrained(model),
+        }
+    return _colpali_model_cache[model]["model"], _colpali_model_cache[model][
+        "processor"
+    ]
+
 
+def colpali_embed_image(
+    img_bytes: bytes, model: str = COLPALI_MODEL_NAME
+) -> list[float]:
+    from PIL import Image
+    import torch
+    import io
 
-@functools.cache
-def get_clip_model() -> tuple[CLIPModel, CLIPProcessor]:
-    model = CLIPModel.from_pretrained(CLIP_MODEL_NAME)
-    processor = CLIPProcessor.from_pretrained(CLIP_MODEL_NAME)
-    return model, processor
+    colpali_model, processor = get_colpali_model(model)
+    pil_image = Image.open(io.BytesIO(img_bytes)).convert("RGB")
+    inputs = processor.process_images([pil_image])
+    with torch.no_grad():
+        embeddings = colpali_model(**inputs)
+    pooled_embedding = embeddings.mean(dim=-1)
+    result = pooled_embedding[0].cpu().numpy()  # [1031]
+    return result.tolist()
+
+
+def colpali_embed_query(query: str, model: str = COLPALI_MODEL_NAME) -> list[float]:
+    import torch
+    import numpy as np
+
+    colpali_model, processor = get_colpali_model(model)
+    inputs = processor.process_queries([query])
+    with torch.no_grad():
+        embeddings = colpali_model(**inputs)
+    pooled_embedding = embeddings.mean(dim=-1)
+    query_tokens = pooled_embedding[0].cpu().numpy()  # [15]
+    target_length = COLPALI_MODEL_DIMENSION
+    result = np.zeros(target_length, dtype=np.float32)
+    result[: min(len(query_tokens), target_length)] = query_tokens[:target_length]
+    return result.tolist()
+
+
+# --- End ColPali embedding functions ---
 
 
 def embed_query(text: str) -> list[float]:
     """
-    Embed the caption using CLIP model.
+    Embed the caption using ColPali model.
     """
-    model, processor = get_clip_model()
-    inputs = processor(text=[text], return_tensors="pt", padding=True)
-    with torch.no_grad():
-        features = model.get_text_features(**inputs)
-    return features[0].tolist()
+    return colpali_embed_query(text, model=COLPALI_MODEL_NAME)
 
 
 @cocoindex.op.function(cache=True, behavior_version=1, gpu=True)
 def embed_image(
     img_bytes: bytes,
-) -> cocoindex.Vector[cocoindex.Float32, Literal[CLIP_MODEL_DIMENSION]]:
+) -> cocoindex.Vector[cocoindex.Float32, Literal[COLPALI_MODEL_DIMENSION]]:
     """
-    Convert image to embedding using CLIP model.
+    Convert image to embedding using ColPali model.
     """
-    model, processor = get_clip_model()
-    image = Image.open(io.BytesIO(img_bytes)).convert("RGB")
-    inputs = processor(images=image, return_tensors="pt")
-    with torch.no_grad():
-        features = model.get_image_features(**inputs)
-    return features[0].tolist()
+    return colpali_embed_image(img_bytes, model=COLPALI_MODEL_NAME)
 
 
-# CocoIndex flow: Ingest images, extract captions, embed, export to Qdrant
-@cocoindex.flow_def(name="ImageObjectEmbedding")
+@cocoindex.flow_def(name="ImageObjectEmbeddingColpali")
 def image_object_embedding_flow(
     flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope
 ) -> None:
     data_scope["images"] = flow_builder.add_source(
         cocoindex.sources.LocalFile(
             path="img", included_patterns=["*.jpg", "*.jpeg", "*.png"], binary=True
         ),
-        refresh_interval=datetime.timedelta(
-            minutes=1
-        ),  # Poll for changes every 1 minute
+        refresh_interval=datetime.timedelta(minutes=1),
    )
     img_embeddings = data_scope.add_collector()
     with data_scope["images"].row() as img:
@@ -117,7 +160,7 @@ async def lifespan(app: FastAPI) -> None:
     cocoindex.init()
     image_object_embedding_flow.setup(report_to_stdout=True)
 
-    app.state.qdrant_client = QdrantClient(url=QDRANT_URL, prefer_grpc=True)
+    app.state.qdrant_client = QdrantClient(url=QDRANT_URL, prefer_grpc=PREFER_GRPC)
 
     # Start updater
     app.state.live_updater = cocoindex.FlowLiveUpdater(image_object_embedding_flow)
@@ -162,9 +205,7 @@ def search(
         {
             "filename": result.payload["filename"],
             "score": result.score,
-            "caption": result.payload.get(
-                "caption"
-            ),  # Include caption if available
+            "caption": result.payload.get("caption"),
         }
         for result in search_results
     ]
```
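The query path above pools token embeddings and then zero-pads (or truncates) the result to the fixed `COLPALI_MODEL_DIMENSION`, so query vectors land in the same vector space as image vectors. A self-contained sketch of that padding step — the helper name `pad_to_dimension` is hypothetical, and sizes are illustrative:

```python
import numpy as np

COLPALI_MODEL_DIMENSION = 1031  # fixed image-vector length used above


def pad_to_dimension(vec: np.ndarray, target_length: int) -> np.ndarray:
    # Hypothetical helper: zero-pad (or truncate) to a fixed length,
    # matching the logic in colpali_embed_query.
    result = np.zeros(target_length, dtype=np.float32)
    result[: min(len(vec), target_length)] = vec[:target_length]
    return result


# A short query yields far fewer pooled values (e.g. ~15) than an image (1031).
query_tokens = np.random.rand(15).astype(np.float32)
padded = pad_to_dimension(query_tokens, COLPALI_MODEL_DIMENSION)
assert padded.shape == (COLPALI_MODEL_DIMENSION,)
```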

examples/image_search_colpali/pyproject.toml

Lines changed: 5 additions & 3 deletions
```diff
@@ -1,16 +1,18 @@
 [project]
-name = "image-search"
+name = "image-search-colpali"
 version = "0.1.0"
-description = "Simple example for cocoindex: build embedding index based on images."
+description = "ColPali-based image search example for cocoindex."
 requires-python = ">=3.11"
 dependencies = [
     "cocoindex>=0.1.67",
     "python-dotenv>=1.0.1",
     "fastapi>=0.100.0",
     "torch>=2.0.0",
-    "transformers>=4.29.0",
     "qdrant-client>=1.14.2",
     "uvicorn>=0.34.3",
+    "colpali-engine>=0.1.0",
+    "Pillow>=10.0.0",
+    "numpy>=1.24.0",
 ]
 
 [tool.setuptools]
```
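A quick way to confirm the updated dependency set resolves after installing from this file — a minimal import smoke test (module names here are the standard import names for these packages):

```python
# Smoke-test the imports the ColPali example relies on.
import numpy
import torch
from colpali_engine.models import ColPali, ColPaliProcessor  # noqa: F401
from PIL import Image  # Pillow installs under the "PIL" import name
from qdrant_client import QdrantClient  # noqa: F401

print("numpy", numpy.__version__, "| torch", torch.__version__)
```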

pyproject.toml

Lines changed: 2 additions & 5 deletions
```diff
@@ -31,11 +31,8 @@ features = ["pyo3/extension-module"]
 [project.optional-dependencies]
 dev = ["pytest", "pytest-asyncio", "ruff", "mypy", "pre-commit"]
 
-embeddings = ["sentence-transformers>=3.3.1"]
-
-# We need to repeat the dependency above to make it available for the `all` feature.
-# Indirect dependencies such as "cocoindex[embeddings]" will not work for local development.
-all = ["sentence-transformers>=3.3.1"]
+embeddings = ["sentence-transformers>=3.3.1", "colpali-engine"]
+all = ["cocoindex[embeddings]"]
 
 [tool.mypy]
 python_version = "3.11"
```
