Classification of documents
This is the core of the project: classifying each document into a subcategory.
We made a dashboard that shows all current documents in a table, sorted by expiry date (nearest expiry date first). Documents that have already expired are flagged in red, and the user is notified.
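The ordering and flagging logic behind such a table can be sketched in a few lines. This is only an illustration, not the dashboard's actual code: the function name build_dashboard_rows, the dictionary keys name and expiry, and the expired flag are all assumptions made for the example.

```python
from datetime import date


def build_dashboard_rows(documents, today):
    """Sort documents by expiry date (nearest first) and flag expired ones.

    `documents` is a list of dicts with hypothetical keys "name" and "expiry".
    """
    ordered = sorted(documents, key=lambda doc: doc["expiry"])
    # A document counts as expired when its expiry date is before `today`.
    return [{**doc, "expired": doc["expiry"] < today} for doc in ordered]


if __name__ == "__main__":
    docs = [
        {"name": "contract.pdf", "expiry": date(2025, 6, 1)},
        {"name": "invoice.pdf", "expiry": date(2024, 1, 1)},
    ]
    for row in build_dashboard_rows(docs, today=date(2024, 6, 1)):
        print(row["name"], row["expiry"], "EXPIRED" if row["expired"] else "")
```

A front end could then render rows whose expired flag is set in red and use the same flag to trigger the notification.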
The datasets consisted of Google Ads, Microsoft Ads and Meta Ads performance data. Each dataframe contains features for date, campaign type (each channel can have multiple campaigns), marketing cost, number of impressions generated, number of clicks and number of conversions.
This tool is meant to handle the retention workflow of all the documents in the company.
Once we have the subcategory of a document, computing the expiry is trivial.
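To illustrate why the expiry follows directly from the subcategory, here is a minimal sketch that assumes each subcategory maps to a fixed retention period. The subcategory names, the retention lengths and the function compute_expiry are hypothetical placeholders, not the project's real policy.

```python
from datetime import date, timedelta

# Hypothetical retention periods in days; real values depend on company policy.
RETENTION_DAYS = {
    "invoice": 30,
    "contract": 365,
}


def compute_expiry(created, subcategory):
    """Return the expiry date: creation date plus the subcategory's retention period."""
    return created + timedelta(days=RETENTION_DAYS[subcategory])


if __name__ == "__main__":
    print(compute_expiry(date(2024, 1, 1), "invoice"))
```

An unknown subcategory raises a KeyError here, which seems preferable to silently assigning a default retention period.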
Three features are engineered: Impression Rate (number of impressions / total cost), Click-Through Rate (CTR) (number of clicks / total impressions), and Conversion Rate (number of conversions / total clicks). Some of these ratios come out as NaN or Inf because of division by 0; those values are filled with 0.
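The three ratios can be sketched per row in plain Python. This is a simplified stand-in for the dataframe version: instead of producing NaN or Inf and filling them afterwards, the helper returns 0 directly when the denominator is 0, which gives the same final values. The key names (cost, impressions, clicks, conversions) and both function names are assumptions for the example.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is 0."""
    if denominator == 0:
        return 0.0
    return numerator / denominator


def engineer_features(row):
    """Add the three engineered ratios to one row of ad performance data."""
    return {
        **row,
        "impression_rate": safe_ratio(row["impressions"], row["cost"]),
        "ctr": safe_ratio(row["clicks"], row["impressions"]),
        "conversion_rate": safe_ratio(row["conversions"], row["clicks"]),
    }


if __name__ == "__main__":
    sample = {"cost": 50.0, "impressions": 1000, "clicks": 20, "conversions": 5}
    print(engineer_features(sample))
```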
We have only scratched the surface; much more can be done with this project.
This project was built for NetElixirAIgnition, an individual hackathon conducted by the marketing agency NetElixir digital solutions.
This project was started as part of an intra-company (Syngenta) Gen-AI hackathon, in a team of 4 members.
This section details the significant hackathons I have participated in and the corresponding projects.
The Platypus library's NSGAII algorithm is used for constrained multi-objective optimization. The genetic algorithm is run for a configurable number of generations (the generations parameter).
The first step is to extract all the text from a document and preprocess it into a format suitable for analysis.
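The preprocessing half of this step might look like the sketch below. It assumes the raw text has already been extracted from the document, and the specific choices (lowercasing, dropping punctuation, splitting on whitespace) are illustrative assumptions rather than the project's actual pipeline; preprocess_text is a hypothetical name.

```python
import re


def preprocess_text(raw):
    """Normalize extracted text into a list of lowercase alphanumeric tokens."""
    text = raw.lower()
    # Replace anything that is not a letter, digit or whitespace with a space.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # split() with no arguments also collapses runs of whitespace and newlines.
    return text.split()


if __name__ == "__main__":
    print(preprocess_text("Invoice No. 42:\n  Total   DUE!"))
```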