MQP Machine Learning Project



At my university (WPI), every student must complete a Major Qualifying Project (MQP), in which a group of students works with professors on a project worth 3 class credits. The project falls within the student's major and involves creating something (in this case, a program) and writing a paper about it. For my MQP, I worked in a team of four students with a sponsor, a financial company whose name I have left out for confidentiality. In essence, the project used machine learning techniques to help the sponsor sort through its spending data.




Project Specifications

The project involved developing and optimizing machine learning models for the sponsor's problem of predicting spending data. The sponsor already had models built on Linear SVM and Random Forests, but they were slow and less accurate than desired. We therefore worked on optimizing the existing code and testing new models.


We tried four new models: Decision Trees, K-Nearest Neighbors, Neural Networks, and Gaussian Processes. Of those four and the two models used before, Decision Trees and K-Nearest Neighbors performed best. We also experimented with tuning the parameters of the models and adjusted the pre-processing of the data, including the use of stop words. Finally, we attempted to implement Natural Language Processing (NLP), but found that it added overhead with no benefit.
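To give a rough idea of what this kind of comparison can look like, here is a minimal sketch using scikit-learn. It is not the project's actual code: the spending descriptions, the category labels, the TF-IDF features, the `build_pipeline` helper, and the parameter grid are all made up for illustration. It assumes the task is text classification, applies English stop words during pre-processing, compares a Decision Tree against K-Nearest Neighbors with cross-validation, and then tunes one K-Nearest Neighbors parameter with a grid search.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: free-text spending descriptions and invented categories.
descriptions = [
    "monthly rent for the downtown office",
    "rent payment for the warehouse",
    "office rent and the building fee",
    "lease and rent for the new office",
    "flight to the client site",
    "hotel and flight for the conference",
    "taxi and hotel for a client visit",
    "train and hotel for the workshop",
    "license for the database software",
    "annual software license renewal",
    "cloud software subscription for the team",
    "software license for the design tools",
]
categories = ["facilities"] * 4 + ["travel"] * 4 + ["software"] * 4


def build_pipeline(model):
    """Hypothetical helper: stop-word filtering + TF-IDF features, then the model."""
    return make_pipeline(TfidfVectorizer(stop_words="english"), model)


# Candidate models to compare (default-ish settings, chosen only for the sketch).
candidates = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "knn": KNeighborsClassifier(n_neighbors=3),
}

# 4-fold cross-validation: each fold holds out one example per category.
scores = {}
for name, model in candidates.items():
    cv_scores = cross_val_score(build_pipeline(model), descriptions, categories, cv=4)
    scores[name] = cv_scores.mean()
    print(f"{name}: mean accuracy {scores[name]:.2f}")

# Parameter tuning: search over the number of neighbors for KNN.
# make_pipeline names each step after its lowercased class name.
search = GridSearchCV(
    build_pipeline(KNeighborsClassifier()),
    param_grid={"kneighborsclassifier__n_neighbors": [1, 3, 5]},
    cv=4,
)
search.fit(descriptions, categories)
best_k = search.best_params_["kneighborsclassifier__n_neighbors"]
print("best n_neighbors:", best_k)
```

Wrapping the pre-processing and the model in one pipeline means the vocabulary is rebuilt on each training fold, so the held-out examples never influence the features. On a data set this small the scores mean very little; the point is only the shape of the workflow.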


Outcomes

Ultimately, we completed the project and produced improved code and models for the machine learning problem. As for what I learned, the biggest takeaway was working on a team on a full-time project. I learned how to use Agile Scrum, with its scrum meetings, retrospectives, sprints, and user stories. I also became familiar with industry software that I had never seen before, such as WinSCP, Snowflake, and Control-M, while gaining more experience with Python and scikit-learn.





Conclusion

Overall, I learned many important technologies and organizational practices from this project. The experience felt much more like real-world work, as I was in constant contact with my team and with mentors from the sponsor. It was a very rewarding project, especially when we got to see the machine learning models outperform expectations.