📄️ Introduction
This project was completed for the course Financial Forensics. We were given the financial statements and ratios of over 4000 stocks, along with their prices at two time periods, and were asked to build an investment portfolio within a given budget.
📄️ Data
The data provided includes the following, with the names and ids (join key) of 4668 stocks as common columns in all of them:
📄️ Preprocessing
The dataframe was next preprocessed.
📄️ Feature Selection
Mutual Info Regression is used to detect feature importance:
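As a rough illustration of the idea, the sketch below ranks two synthetic features by their mutual information with a target using scikit-learn's `mutual_info_regression`. The data, the variable names, and the choice of `random_state` are assumptions made for the example and are not taken from the project.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

# Synthetic stand-in data (hypothetical, not the project's dataset):
# one feature drives the target, the other is pure noise.
rng = np.random.default_rng(0)
n_samples = 500
informative = rng.normal(size=n_samples)
noise = rng.normal(size=n_samples)
X = np.column_stack([informative, noise])
y = 3.0 * informative + 0.1 * rng.normal(size=n_samples)

# Estimate the mutual information between each feature and the target.
mi = mutual_info_regression(X, y, random_state=0)

# Feature indices ordered from most to least informative.
ranking = np.argsort(mi)[::-1]
print(dict(zip(["informative", "noise"], mi)))
```

Features with a score close to zero carry little information about the target and are candidates for removal.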
📄️ Model
Now that the dataset is fully preprocessed and has the right features, the Ridge estimator is fitted and feature importance is computed.
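A minimal sketch of this step is shown below, assuming that feature importance is read off the absolute values of the Ridge coefficients after standardising the features. That reading of "importance", the synthetic data, and the value of `alpha` are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (hypothetical): the target depends strongly on
# feature 0, weakly on feature 1, and not at all on feature 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 5.0 * X[:, 0] + 1.0 * X[:, 1] + 0.1 * rng.normal(size=200)

# Standardise so that coefficient magnitudes are comparable across features.
X_scaled = StandardScaler().fit_transform(X)

# Fit the Ridge estimator (alpha=1.0 is an arbitrary choice here).
model = Ridge(alpha=1.0).fit(X_scaled, y)

# Assumed importance measure: absolute value of each coefficient.
importance = np.abs(model.coef_)
print(importance)
```

Without the scaling step, coefficient sizes would partly reflect the units of each feature rather than its influence on the prediction.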
📄️ Scoring
The fitted model is then used to predict t_2 prices, and each stock is scored on its prediction error and growth.
📄️ Shortlisting
Three different strategies are used to shortlist stocks. df is the dataframe containing all the features, built earlier.
📄️ Portfolio
A diversified portfolio is made through proportionate allocation of the budget in each of the shortlists.
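One possible reading of proportionate allocation is sketched below: the budget is split equally across the shortlists, and within each shortlist it is divided in proportion to each stock's score. The equal split, the score-weighted division, the function name `allocate_budget`, and the tickers are all hypothetical and may differ from the scheme actually used.

```python
def allocate_budget(budget, shortlists):
    """Split the budget equally across shortlists, then by score within each.

    shortlists maps a strategy name to a {ticker: score} dict.
    Returns a {ticker: amount} dict; a ticker that appears in several
    shortlists receives the sum of its allocations.
    """
    per_list = budget / len(shortlists)
    allocation = {}
    for scores in shortlists.values():
        total_score = sum(scores.values())
        for ticker, score in scores.items():
            share = per_list * score / total_score
            allocation[ticker] = allocation.get(ticker, 0.0) + share
    return allocation


# Hypothetical shortlists from three strategies.
shortlists = {
    "strategy_a": {"AAA": 3.0, "BBB": 1.0},
    "strategy_b": {"BBB": 1.0, "CCC": 1.0},
    "strategy_c": {"DDD": 1.0},
}
allocation = allocate_budget(900.0, shortlists)
print(allocation)
```

A stock picked by more than one strategy ends up with a larger position, which is one simple way to reward agreement between strategies.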