-
Notifications
You must be signed in to change notification settings - Fork 2.4k
Description
Background
The internal class _SentenceTransformersEmbeddingBackendFactory
ensures that when both a text embedder and a document embedder use the same model, the model is loaded only once. (This also works for two text embedders, etc.)
This is done by computing an embedding_backend_id
. When a new embedder is warmed up, we check if the ID already exists → reuse the model; otherwise, load a new one and store it.
Issue
This the current implementation of embedding_backend_id
embedding_backend_id = f"{model}{device}{auth_token}{truncate_dim}{backend}" |
It does not account for all parameters that affect how the model is actually loaded in Sentence Transformers.
For example, model_kwargs
and tokenizer_kwargs
change the configuration of the underlying model, but are not reflected in the ID. This can cause different embedders to incorrectly share the same backend:
from haystack.components.embedders import SentenceTransformersDocumentEmbedder
emb1 = SentenceTransformersDocumentEmbedder(model="mymodel")
emb2 = SentenceTransformersDocumentEmbedder(model="mymodel", tokenizer_kwargs={"model_max_length": 512})
print(emb1.embedding_backend is emb2.embedding_backend)
# True !!!
This could be considered a bug, although in practice it may not affect many users.
Potential improvements
In the _SentenceTransformersSparseEmbeddingBackendFactory
, I adopted a different strategy: the ID includes all parameters accepted by Sentence Transformers at model loading.
haystack/haystack/components/embedders/backends/sentence_transformers_sparse_backend.py
Line 48 in 1fb76ec
embedding_backend_id = json.dumps(cache_params, sort_keys=True, default=str) |
I think this is a better approach and should also be adopted in _SentenceTransformersEmbeddingBackendFactory
.
Minor additional improvement
I suggest also modifying _SentenceTransformersSparseEmbeddingBackendFactory.get_embedding_backend
and _SentenceTransformersSparseEncoderEmbeddingBackend.__init__
to only accept kwargs. Since these are internal classes, we can safely make this change.
Metadata
Metadata
Assignees
Labels
Type
Projects
Status