Data
The data provided includes the following, with the names and ids (join key) of 4668 stocks as common columns in all of them:
The data provided includes the following, with the names and ids (join key) of 4668 stocks as common columns in all of them:
Mutual Info Regression is used to detect feature importance:
This was the project for the course Financial Forensics, which gave us the financial statements and ratios of over 4000 stocks as well as prices at two time periods, and required us to make an investment portfolio, with a given investment budget.
Now that the dataset is fully preprocessed and has the right features, the Ridge estimator is fitted and feature importance is computed.
A diversified portfolio is made through proportionate allocation of the budget in each of the shortlists.
The dataframe was next preprocessed.
The fitted model is now used to predict t_2 prices and that is used to score each stock based on prediction error and growth.
Three different strategies are used to shortlist stocks. df is the dataframe contaning all the features that was made earlier.