🛡️ Fraudulent Transaction Detection System

A full-fledged fraud detection pipeline using machine learning and AI interpretability tools. This project includes data exploration, feature engineering, model selection (with XGBoost), visual explanations (SHAP), a Flask API, and a Streamlit frontend. We also integrated Google’s Gemini AI for advanced prediction interpretation.

📊 1. Dataset & Exploratory Data Analysis

Source: synthetic_fraud_dataset.csv
We conducted Exploratory Data Analysis (EDA) using ydata-profiling and custom visualizations to understand:
- Feature distributions
- Correlation heatmaps
- Class imbalance
- Potential data quality issues

The EDA helped inform downstream preprocessing and feature engineering.

🧠 2. Feature Engineering

We selected and transformed features to improve model accuracy and reduce noise.

✅ Important Features:

Account_Balance, IP_Address_Flag, Previous_Fraudulent_Activity
Daily_Transaction_Count, Risk_Score, Is_Weekend, Hour, Amount_Deviation, etc.

❌ Dropped/Transformed Features:

Identifiers like Transaction_ID, User_ID
Raw Timestamp, Device_Type, Location (transformed into derived features)
Low-informative fields like Card_Type, Authentication_Method

We scaled and normalized the final dataset, saving it as preprocessed_fraud_dataset_numerical_only.csv.

🏗️ 3. Model Selection & Training

We tested multiple classifiers:

✅ Logistic Regression
✅ Random Forest
✅ Gradient Boosting
✅ XGBoost (Best Performance)

After selecting XGBoost, we fine-tuned hyperparameters and saved the trained pipeline to fraud_detection_model_numerical.pkl.

📈 4. Model Validation & Interpretation

We built utility functions to:

Predict fraud likelihood on new transactions
Show evaluation metrics:
- Confusion Matrix
- Precision-Recall Curve
- Classification Report
- ROC-AUC Score
Visualize feature importances and interpret predictions with SHAP:

=== TRANSACTION DETAILS ===
Transaction_Amount: 1500
...
Amount_Deviation: 1300

=== PREDICTION ===
Prediction: FRAUD
Fraud Probability: 0.82

Visuals include:

SHAP summary bar plot
Force plots for individual transaction explanations

🧪 5. Flask API

We exposed the model via a REST API using Flask.

🔌 API Endpoints:

Method	Endpoint	Description
POST	`/predict`	Predict using XGBoost model
POST	`/ai-prediction`	Get prediction + explanation via Gemini
GET	`/health`	API health check

Example AI Prompt (Gemini):

"You are an AI fraud detection assistant. Based on the following transaction data..."

Uses gemini-2.5-pro-exp-03-25 for rich explanations alongside predictions.

💻 6. Streamlit Frontend

We developed a simple, responsive Streamlit app to interact with the API:

Enter transaction details manually
Get instant predictions
View explanations (via AI or SHAP)
Toggle light/dark theme

🚀 Running the Project

▶️ Run Combined Flask + Streamlit

bash app/start.sh

📦 Manual Steps (Optional)

Start the Flask API:

python app/api.py

Launch the Streamlit UI:

streamlit run app/frontend.py

📌 Technologies Used

Python (Pandas, Scikit-learn, XGBoost, SHAP)
Flask for backend API
Streamlit for UI
Google Gemini AI for natural language explanations
Matplotlib & Seaborn for visualization
Joblib for model serialization

📍 Future Improvements

Add authentication & rate limiting to API
Integrate live transaction feeds for real-time fraud scoring
Add model monitoring dashboard (e.g., with Grafana + Prometheus)
Expand AI explanation options using multiple LLMs

🤝 Contribution

Feel free to open issues, fork the repo, or create pull requests. Your feedback and ideas are welcome!

Name		Name	Last commit message	Last commit date
Latest commit History 23 Commits
app		app
tests		tests
.gitignore		.gitignore
README.md		README.md
config.py		config.py
requirements.txt		requirements.txt
run.py		run.py
start.sh		start.sh
wsgi.py		wsgi.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

🛡️ Fraudulent Transaction Detection System

📊 1. Dataset & Exploratory Data Analysis

🧠 2. Feature Engineering

✅ Important Features:

❌ Dropped/Transformed Features:

🏗️ 3. Model Selection & Training

📈 4. Model Validation & Interpretation

🧪 5. Flask API

🔌 API Endpoints:

💻 6. Streamlit Frontend

🚀 Running the Project

▶️ Run Combined Flask + Streamlit

📦 Manual Steps (Optional)

📌 Technologies Used

📍 Future Improvements

🤝 Contribution

📜 License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 2

Uh oh!

Languages

FinTech-Software/ML-Model-For-Fraud-Detection-and-Risk-Analysis-API

Folders and files

Latest commit

History

Repository files navigation

🛡️ Fraudulent Transaction Detection System

📊 1. Dataset & Exploratory Data Analysis

🧠 2. Feature Engineering

✅ Important Features:

❌ Dropped/Transformed Features:

🏗️ 3. Model Selection & Training

📈 4. Model Validation & Interpretation

🧪 5. Flask API

🔌 API Endpoints:

💻 6. Streamlit Frontend

🚀 Running the Project

▶️ Run Combined Flask + Streamlit

📦 Manual Steps (Optional)

📌 Technologies Used

📍 Future Improvements

🤝 Contribution

📜 License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 2

Uh oh!

Languages

Packages