A modular NLP pipeline and interactive web app for text sentiment analysis with Hugging Face Transformers. The project demonstrates scalable inference, support for both binary and multi-class classifiers, model evaluation, and clean engineering practices.
- Goal: Predict whether a movie review expresses positive or negative sentiment
- Input Options: Upload your own CSV or use built-in sample data
- Models:
  - distilbert-base-uncased-finetuned-sst-2-english (binary)
  - tabularisai/multilingual-sentiment-analysis (multiclass)
- Interface: Built using Streamlit for a deployable, interactive UI
- Features:
  - Batch inference with tokenizer-aware chunking
  - Model selection and label mapping
  - Per-label filtering, explanation outputs, and downloadable results
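The batch inference with tokenizer-aware chunking listed above could look roughly like the following. This is a minimal sketch, not the code in src/inference.py: the function name `chunk_by_token_budget`, its parameters, and the whitespace tokenizer are hypothetical stand-ins. In practice the `tokenize` callable would likely be replaced by the selected model's Hugging Face tokenizer.

```python
from typing import Callable, Iterator, List


def chunk_by_token_budget(
    texts: List[str],
    max_tokens_per_text: int = 512,
    max_batch_size: int = 16,
    tokenize: Callable[[str], List[str]] = str.split,  # assumption: whitespace stand-in
) -> Iterator[List[str]]:
    """Yield batches of texts, each text truncated to a token limit."""
    batch: List[str] = []
    for text in texts:
        # Truncate by token count, then rejoin the kept tokens with spaces.
        tokens = tokenize(text)[:max_tokens_per_text]
        batch.append(" ".join(tokens))
        # Emit the batch once it reaches the configured size.
        if len(batch) == max_batch_size:
            yield batch
            batch = []
    # Emit any leftover texts as a final, smaller batch.
    if batch:
        yield batch
```

Each yielded batch can then be passed to the model in one call, which keeps memory use bounded for large CSV uploads.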
imdb-sentiment-classifier/
├── src/
│   ├── preprocess.py # Tokenization and decoding
│   └── inference.py # Hugging Face model inference
├── app.py # Streamlit app
├── LICENSE # MIT License
├── README.md # You are here
└── requirements.txt # Project dependencies
You can interact with the web app to upload review data, select the model type, and view predictions directly. Results can be filtered by label and downloaded as CSV.
Link to live app: https://sentiment-classifier-owen-wienczkowski.streamlit.app/
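The per-label filtering and CSV download described above can be approximated with the standard library alone. This is a sketch under assumptions: the row layout (`text`, `label`, `score`) and the helper names `filter_by_label` and `to_csv_text` are hypothetical and may differ from what app.py actually does.

```python
import csv
import io
from typing import Dict, List, Sequence


def filter_by_label(rows: List[Dict], label: str) -> List[Dict]:
    """Keep only the prediction rows whose label matches exactly."""
    return [row for row in rows if row["label"] == label]


def to_csv_text(
    rows: List[Dict],
    fieldnames: Sequence[str] = ("text", "label", "score"),  # assumed columns
) -> str:
    """Serialize prediction rows to a CSV string with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(fieldnames), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

In a Streamlit app, a string like this could be handed to `st.download_button` as the `data` argument to offer the filtered results as a file.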
Make sure you're in the project root directory and have Python 3.8+.
- Install dependencies:

      pip install -r requirements.txt

- Run the interactive web app:

      streamlit run app.py
- Hugging Face Transformers and inference pipelines
- Streamlit interface design for real-time inference
- Modular, production-style pipeline design
- Tokenization, decoding, and label mapping
- Custom evaluation logic for multiclass-to-binary transitions
- Multilingual model handling and class consolidation
- Usable web interface with downloadable results for non-technical users
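One way to picture the multiclass-to-binary consolidation is a small lookup that folds the five-class model's labels into the two labels used by the binary model. This is a sketch, not the project's evaluation logic: the label names are assumed (check the model's `config.id2label`), and because the source does not say how neutral predictions are treated, the hypothetical `neutral_as` parameter leaves that choice to the caller.

```python
from typing import Optional

# Assumed label names for the five-class model; verify against config.id2label.
FIVE_TO_BINARY = {
    "very negative": "NEGATIVE",
    "negative": "NEGATIVE",
    "positive": "POSITIVE",
    "very positive": "POSITIVE",
}


def to_binary(label: str, neutral_as: Optional[str] = None) -> Optional[str]:
    """Map a five-class label to POSITIVE or NEGATIVE.

    Neutral predictions return `neutral_as` (None by default), so the
    caller can drop them or assign them to a class explicitly.
    """
    key = label.strip().lower()
    if key == "neutral":
        return neutral_as
    if key not in FIVE_TO_BINARY:
        raise ValueError(f"Unknown label: {label!r}")
    return FIVE_TO_BINARY[key]
```

Returning `None` for neutral keeps ambiguous reviews out of binary accuracy figures unless the caller opts in to a specific mapping.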