Releases · samestrin/llm-services-api
v0.0.4
- Tokenization: Convert input text into a list of token IDs, allowing you to process and manipulate text at the token level (default model: `all-MiniLM-L6-v2`).
- Detokenization: Reconstruct original text from a list of token IDs, allowing you to reverse the tokenization process (default model: `all-MiniLM-L6-v2`).
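A minimal client sketch for the new endpoints, assuming the service runs locally on port 8000 and exposes `/tokenize` and `/detokenize` routes that accept JSON (the paths, port, and payload field names are assumptions, not confirmed by these notes):

```python
import requests

BASE_URL = "http://localhost:8000"  # assumed local address of the service

# Tokenize: send raw text, receive a list of token IDs
resp = requests.post(f"{BASE_URL}/tokenize", json={"text": "Hello, world!"})
resp.raise_for_status()
token_ids = resp.json().get("tokens", [])  # assumed response field name
print(token_ids)

# Detokenize: send the token IDs back, receive the reconstructed text
resp = requests.post(f"{BASE_URL}/detokenize", json={"tokens": token_ids})
resp.raise_for_status()
print(resp.json().get("text", ""))  # assumed response field name
```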
v0.0.3
- Adaptive Throttling: Implemented an adaptive throttling mechanism that delays requests via the `Retry-After` header when errors are encountered due to high request frequency or processing failures. The delay is dynamically adjusted based on the client's request rate and error occurrences.
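A minimal sketch of how a client might honor the `Retry-After` header this mechanism sets, assuming a hypothetical `/summarize` endpoint and that the service answers throttled requests with HTTP 429 (both assumptions for illustration):

```python
import time
import requests

def post_with_retry(url, payload, max_attempts=5):
    """POST a request, sleeping for the server-provided Retry-After delay when throttled."""
    for attempt in range(max_attempts):
        resp = requests.post(url, json=payload)
        if resp.status_code != 429:  # not throttled, return the result
            resp.raise_for_status()
            return resp.json()
        # Honor the adaptive delay advertised by the service
        delay = float(resp.headers.get("Retry-After", 1))
        time.sleep(delay)
    raise RuntimeError("Request kept being throttled after several attempts")

# Hypothetical usage against an assumed local endpoint
result = post_with_retry("http://localhost:8000/summarize", {"text": "Long article text..."})
```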
v0.0.2
- OpenAI-Compatible Embeddings: Provides an endpoint that mimics the OpenAI embedding API, allowing easy integration with existing systems expecting OpenAI-like responses (see the request sketch after this list).
- Configurable Model Loading: Customize which Hugging Face NLP models are loaded by providing command-line arguments or by editing the `models_config.json` file. This flexibility allows the application to adapt to different resource environments or use cases.
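Because the endpoint mimics the OpenAI embedding API, existing OpenAI-style clients should need little or no change. A minimal sketch, assuming the service listens locally on port 8000 and exposes a `/v1/embeddings` route (path, port, and model name are assumptions):

```python
import requests

resp = requests.post(
    "http://localhost:8000/v1/embeddings",  # assumed OpenAI-style route
    json={
        "input": "The quick brown fox jumps over the lazy dog.",
        "model": "all-MiniLM-L6-v2",  # assumed embedding model name
    },
)
resp.raise_for_status()
# OpenAI-style responses nest vectors under data[i].embedding
embedding = resp.json()["data"][0]["embedding"]
print(len(embedding))
```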
Initial Release v0.0.1
- Text Summarization: Generate concise summaries of long texts using BART.
- Sentiment Analysis: Determine the sentiment of text inputs using a fine-tuned DistilBERT model.
- Named Entity Recognition: Identify entities within text and sort them by frequency.
- Paraphrasing: Rephrase sentences to produce semantically similar outputs using a T5 model.
- Keyword Extraction: Extract important keywords from text, with customizable output count using KeyBERT.
- Embedding Generation: Create vector representations of text using SentenceTransformers.
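A minimal sketch of exercising a few of these features over HTTP, assuming the service runs locally on port 8000 and uses route names derived from the feature names (all paths and payload fields are assumptions, not documented here):

```python
import requests

BASE_URL = "http://localhost:8000"  # assumed local address

def call(endpoint, payload):
    """POST JSON to an assumed route and return the parsed response."""
    resp = requests.post(f"{BASE_URL}/{endpoint}", json=payload)
    resp.raise_for_status()
    return resp.json()

text = "The James Webb Space Telescope has captured new images of distant galaxies."

print(call("summarize", {"text": text}))             # BART summary
print(call("sentiment", {"text": text}))             # DistilBERT sentiment label
print(call("ner", {"text": text}))                   # entities sorted by frequency
print(call("keywords", {"text": text, "count": 5}))  # KeyBERT keywords
```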