Oil and Machine Learning

Description

In this project, we aim to use machine learning models to help predict the price and price direction of oil.

Goals

Our goal is to compare two or more machine-learning models for predicting the price and price direction of oil. For our predictions, we will use natural language processing to draw insights from news articles covering the past 22 years. In addition, we will use oil closing prices and returns, gold prices, the S&P 500, and periods of unrest (the Iraq War, 2003-2011) as features. Machine learning typically requires extensive data preparation before a model can be trained. We will use Jupyter to prepare training and testing datasets, and to train and compare the machine-learning models.
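As a rough illustration of the data preparation described above, the sketch below adds a war indicator to dated rows and splits them chronologically into training and testing sets. It is only a sketch under stated assumptions: the helper names (`war_flag`, `chronological_split`), the year-based war window, the 80/20 split, and the sample prices are all hypothetical, and the notebooks themselves would more likely do this with pandas.

```python
from datetime import date

# Assumption: the war indicator is year-based, matching the 2003-2011 span
# quoted above; exact start and end dates are not given in this README.
WAR_START_YEAR = 2003
WAR_END_YEAR = 2011


def war_flag(day: date) -> int:
    """Return 1 if the date falls in the assumed war period, else 0."""
    return int(WAR_START_YEAR <= day.year <= WAR_END_YEAR)


def chronological_split(rows, test_fraction=0.2):
    """Split rows into train and test sets by date, without shuffling."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    ordered = sorted(rows, key=lambda row: row["date"])
    n_test = max(1, round(len(ordered) * test_fraction))
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]


# Hypothetical rows; the closing prices are made-up illustration values.
rows = [
    {"date": date(2002, 6, 3), "oil_close": 25.1},
    {"date": date(2005, 6, 1), "oil_close": 54.6},
    {"date": date(2010, 6, 1), "oil_close": 72.6},
    {"date": date(2015, 6, 1), "oil_close": 60.2},
    {"date": date(2020, 6, 1), "oil_close": 35.4},
]
for row in rows:
    row["war"] = war_flag(row["date"])

train, test = chronological_split(rows, test_fraction=0.2)
```

Splitting by date rather than at random keeps future observations out of the training set, which matters for time-ordered data such as prices.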

Technologies

This project uses the following technologies:

pandas
numpy
datetime
pathlib
nltk
matplotlib
analyzer
dotenv
New York Times API
yfinance API
warnings
tensorflow

Instructions

To get the project started on your local machine, clone the GitHub repository (Prabhdyals/MachineLearning-Oil-Analysis).
The first notebook to run is crude_news_data. It retrieves New York Times API data for a set number of years, which may take around 45 minutes.
When it finishes, the notebook exports a combined_csv file to a headlines folder, along with the other article files for each month.
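For orientation, here is a minimal sketch of how one month of headlines might be pulled and filtered. It assumes the notebook uses the New York Times Archive API (one request per year and month); this README does not confirm which endpoint is used. The keyword list, the helper names, and the sample documents are hypothetical, and `fetch_month` performs a network call that is defined but not executed here.

```python
import json
import re
import urllib.request

# Assumption: the Archive API endpoint, which returns one month of articles.
ARCHIVE_URL = "https://api.nytimes.com/svc/archive/v1/{year}/{month}.json?api-key={key}"


def archive_url(year: int, month: int, api_key: str) -> str:
    """Build the request URL for one year and month."""
    return ARCHIVE_URL.format(year=year, month=month, key=api_key)


def fetch_month(year: int, month: int, api_key: str):
    """Download one month of article metadata (network call, not run here)."""
    with urllib.request.urlopen(archive_url(year, month, api_key)) as resp:
        return json.load(resp)["response"]["docs"]


def oil_headlines(docs, keywords=("oil", "crude")):
    """Keep (date, headline) pairs whose headline contains a keyword as a whole word."""
    rows = []
    for doc in docs:
        headline = (doc.get("headline") or {}).get("main") or ""
        words = set(re.findall(r"[a-z]+", headline.lower()))
        if words & set(keywords):
            rows.append((doc.get("pub_date", "")[:10], headline))
    return rows


# Hypothetical documents shaped like Archive API results.
sample_docs = [
    {"pub_date": "2003-03-20T05:00:00+0000", "headline": {"main": "Crude Oil Prices Jump"}},
    {"pub_date": "2003-03-21T05:00:00+0000", "headline": {"main": "How to Boil an Egg"}},
]
filtered = oil_headlines(sample_docs)
```

Matching on whole words avoids false hits such as "boil". The API key would normally be loaded from an environment file with dotenv rather than written into the notebook.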
Next, the crude_sentiment notebook reads the news data from combined_csv and runs a sentiment analysis, exporting an oil_sentiments CSV file.
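The sketch below shows one way per-headline sentiment scores could be rolled up into a daily series. It assumes each headline has already been given a compound score in the range -1 to 1, for example by NLTK's VADER `SentimentIntensityAnalyzer`; the README does not spell out the scoring or aggregation steps. The helper names, the 0.05 labeling threshold (a common VADER convention), and the sample scores are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean


def daily_sentiment(scored_headlines):
    """Average per-headline compound scores into one value per date."""
    by_day = defaultdict(list)
    for day, compound in scored_headlines:
        by_day[day].append(compound)
    return {day: mean(scores) for day, scores in sorted(by_day.items())}


def label(compound, threshold=0.05):
    """Map a compound score to a coarse label (threshold is an assumption)."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Hypothetical (date, compound score) pairs, not real results.
scored = [
    ("2003-03-20", 0.5),
    ("2003-03-20", -0.1),
    ("2003-03-21", -0.6),
]
daily = daily_sentiment(scored)
labels = {day: label(score) for day, score in daily.items()}
```

Averaging by date gives one sentiment value per day, which lines up with daily price data when the two are joined later.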
Once the sentiment data is available, the oil_series_analysis notebook loads historical oil data and applies time series analysis and modeling to determine whether oil prices show any predictable behavior.
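Since the project models both returns and return direction, a small sketch of how those series can be derived from closing prices may help. The function names and sample prices are hypothetical, and treating a zero return as "not up" is an assumption rather than something this README specifies.

```python
def simple_returns(closes):
    """Return the period-over-period percentage change as fractions."""
    return [(curr - prev) / prev for prev, curr in zip(closes, closes[1:])]


def direction_labels(returns):
    """Label each return 1 for up and 0 otherwise (zero counts as not up)."""
    return [1 if r > 0 else 0 for r in returns]


# Hypothetical closing prices, chosen only to make the arithmetic easy to follow.
closes = [50.0, 55.0, 55.0, 44.0]
returns = simple_returns(closes)
directions = direction_labels(returns)
```

The returns list is one element shorter than the price list because the first price has no predecessor, so any feature series joined to it needs the same alignment.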

Conclusion

Oil price prediction worked better with the LSTM model than with the Linear Regression and Bayesian Ridge models. While the Linear Regression model uses one feature to predict the price, the Bayesian Ridge model used all five features considered and produced a probabilistic prediction based on a normal distribution. For price direction, the random forest classifier performed slightly better than logistic regression. The feature importance of war in the price prediction was minimal compared to the other features considered, which may be because we considered only one war period (due to limited data availability).
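One reason an LSTM can behave differently from the regression models is that it is fed sequences of past values rather than a single row of features. This README does not say how the LSTM inputs were built; the sketch below shows a common sliding-window approach under that caveat. The function name `make_windows`, the window length, and the sample prices are hypothetical.

```python
def make_windows(series, window):
    """Pair each run of `window` consecutive values with the value that follows it."""
    features, targets = [], []
    for start in range(len(series) - window):
        features.append(series[start:start + window])
        targets.append(series[start + window])
    return features, targets


# Hypothetical price series for illustration.
prices = [10, 11, 12, 13, 14]
X, y = make_windows(prices, window=3)
```

Before training a Keras LSTM layer, the feature windows would typically be converted to an array of shape (samples, timesteps, features), here (2, 3, 1).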

Questions

1. How have oil prices behaved over the past 22 years?

2. What is the sentiment toward oil across the period, based on NLP analysis of news articles?

3. Identify other features for oil price movements (based on availability of data).

4. Compare model performances with each other when predicting oil prices.
- Linear Regression
- LSTM
- Bayesian Ridge

5. Compare model performances with each other when predicting oil returns direction.
- Logistic Regression
- Random Forest

6. Compare feature importance in the movement of oil prices.
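Questions 4 and 5 call for comparing models on two different tasks, which need different scores. The README does not state which metrics were used, so the sketch below assumes two common choices: root mean squared error for price predictions and accuracy for direction predictions. The function names and sample values are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between actual and predicted prices."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def accuracy(actual, predicted):
    """Fraction of direction labels predicted correctly."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for a, p in zip(actual, predicted) if a == p)
    return hits / len(actual)


# Hypothetical values, not results from this project.
price_error = rmse([2.0, 4.0], [1.0, 5.0])
direction_score = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
```

A lower RMSE indicates a better price model, while a higher accuracy indicates a better direction classifier, so the two scores should not be compared with each other.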

Contributors

Our team:

GuilleMGN
ksmaria
Prabhdyals
Satheeshbm
Nitin-Thomas
dmerkulenko

References and Resources

CNN Iraq War News
Yahoo Finance
How to Collect Data From The New York Times Over Any Period of Time
New York Times API
Introduction to Bayesian Linear Regression
Bayesian Ridge Regression