I am a Machine Learning Scientist with over 7 years of experience in applying ML techniques to various domains. I hold a PhD in Astrophysics and have over 12 years of experience in interdisciplinary scientific research. Throughout my career, I have contributed to several projects in astrophysics, neuroscience and banking. I am focused on developing cutting-edge ML algorithms and models to solve complex problems in diverse fields. I am passionate about using my skills and expertise to drive innovation and make a positive impact on society.
As a Decision Scientist (Machine Learning Engineer in credit risk) at Monzo, I played a pivotal role in developing and maintaining machine-learning models for automated decision-making processes. I led the development of a LightGBM-based model for evaluating overdraft applications, which was substantially more accurate than previous models. My feature selection analysis yielded new insights into customer behaviour, driving further feature engineering initiatives.
In addition to model development, I was also responsible for monitoring and maintaining a series of machine-learning models in production. I restructured the internal library, significantly improving test coverage and ensuring a more uniform monitoring approach across all Monzo products. These efforts resulted in more extensive monitoring checks, providing valuable insights into the behaviour of the models, especially on subgroups of customers, and enabling better decision-making.
Overall, my contributions helped Monzo provide better prices and promote products suitable for the financial needs of over 7 million customers, and improved the creditworthiness evaluation process.
I was responsible for developing a highly efficient model training pipeline using Docker containers and Google Cloud Platform. The data was managed and analysed using BigQuery, SQL, and dbt data models. I summarized the insights from this analysis in Jupyter notebooks, presenting them in an intuitive and accessible format for stakeholders and decision-makers.
While at King's College London, I played a crucial role in the development of Neurofind.ai, a cutting-edge machine-learning tool designed to aid in mental health diagnosis through neuroimaging. I was responsible for leading the technical side of the project and personally developed Neuroharmony, a machine-learning tool aimed at mitigating bias across different scanners. This innovative tool represented a significant step in bridging the gap between academic research and clinical implementation of neuroimaging-based diagnoses.
My responsibilities included hands-on machine learning implementation and development, brain imaging research, statistical analysis, data cleaning, preprocessing, scientific reporting, and supervision of MSc students. My contributions to the field were recognized through my authorship of an article in NeuroImage, one of the leading journals in the field, my authorship of one chapter in the book "Machine Learning: Methods and Applications to Brain Disorders", and my co-authorship of five additional chapters in the same book, among other publications. This work shows my ability to drive impactful research and development initiatives in neuroimaging and mental health. It also shows my ability to adapt quickly to new domains, as my previous research experience was in Astrophysics.
I used a wide range of tools to develop and implement these technologies. I created the data visualizations for the final product using Matplotlib, pandas, and NumPy, and developed secure and scalable AI solutions using Docker and Kubernetes. I also designed an API using Flask, Celery, Flower, and Redis to pass job requests from the frontend to the AI backend.
The quality of my work and its importance to the field were recognized with a £109,000 MRC research grant for the project "Using Artificial Intelligence to mitigate scanner bias in brain disorders." This grant further demonstrates my ability to contribute to successful research initiatives and deliver impactful results in the field of machine learning.
Throughout my PhD, I applied machine learning techniques to analyze a large, high-resolution spectroscopy dataset of over 250,000 stars within our galaxy. I focused on developing and implementing unsupervised learning algorithms to group stars based on their spectroscopic properties, with the ultimate goal of improving on traditional visual classification methods. Additionally, I employed dimensionality reduction techniques to improve machine learning performance on the dataset, which originally contained over 8,000 variables. Finally, I explored supervised learning algorithms to trace the origins of stars across various stellar clusters. Through this work, I gained extensive experience in applying machine learning to complex scientific datasets, which improved my skills in data analysis, algorithm development, and scientific communication.
Throughout my research, I have been actively involved in academic writing, co-authoring six articles and serving as main author of two publications in high-impact scientific journals. These publications have helped to disseminate my findings and contribute to the scientific community's understanding of the applications of machine learning in astrophysics.
I am proud to say that my PhD was awarded Cum Laude, the highest distinction for a PhD at my institution. This experience has strengthened my skills in data analysis, programming, and scientific communication, and has prepared me to tackle complex challenges in data-driven fields.
During my Master's program, I conducted research on the chemical composition and ages of stellar clusters, using a combination of data analysis and programming skills. Specifically, I conducted a meta-analysis of dozens of articles from the literature and extracted data to reanalyze over 60 stellar clusters in a homogeneous and automated manner. Prior to my work, these objects had been analyzed using a range of methods that relied on extensive human interaction and subjective interpretation. To make the process more efficient and consistent, I developed a Python algorithm that automated the analysis. Through this work, I was able to measure the gradient of chemical abundances across the Galaxy, gaining insights into the evolution of the Milky Way. Along the way, I gained valuable experience in developing complex programs in Python and learned Fortran, Bash scripting, version control, unit testing, data visualization, and data analysis.