Explainable Deep Clustering in Finance

Started as an entry in the BigData Competition

22.09 ~ 22.12

BigData Competition at NH Investment & Securities

Initiated and led a project for the BigData Competition at NH Investment & Securities
Topic: Advanced Customer Profiling and Personalized Investment Portfolio Curation
Devised a clustering technique to profile the investment proclivity of 7,348 customers, using internal customer data from NH Investment & Securities
- Autoencoder: Linear, Convolutional
- Dimensionality Reduction: t-SNE
- Clustering: K-means
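The three steps listed above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the data is synthetic, the autoencoder is a tiny linear one trained with plain gradient descent in NumPy, and the latent size, learning rate, perplexity, and number of clusters are all assumed values. scikit-learn's `TSNE` and `KMeans` stand in for the t-SNE and K-means steps.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Synthetic stand-in for the customer table (assumption): 3 groups, 20 features.
centers = rng.normal(scale=4.0, size=(3, 20))
X = np.vstack([c + rng.normal(size=(60, 20)) for c in centers])
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each feature


def train_linear_autoencoder(X, latent_dim=8, lr=0.05, epochs=500, seed=0):
    """Fit a linear encoder/decoder pair by gradient descent on reconstruction MSE."""
    r = np.random.default_rng(seed)
    d = X.shape[1]
    W_e = r.normal(scale=0.1, size=(d, latent_dim))  # encoder weights
    W_d = r.normal(scale=0.1, size=(latent_dim, d))  # decoder weights
    losses = []
    for _ in range(epochs):
        Z = X @ W_e                      # encode
        err = Z @ W_d - X                # reconstruction error
        losses.append(float((err ** 2).mean()))
        grad_out = 2.0 * err / err.size  # gradient of the MSE w.r.t. the reconstruction
        grad_W_d = Z.T @ grad_out
        grad_W_e = X.T @ (grad_out @ W_d.T)
        W_d -= lr * grad_W_d
        W_e -= lr * grad_W_e
    return W_e, losses


# Step 1: autoencoder compresses the features to a latent code.
W_e, losses = train_linear_autoencoder(X)
Z = X @ W_e

# Step 2: t-SNE maps the latent code to two dimensions.
embedding = TSNE(n_components=2, perplexity=30.0, init="pca",
                 random_state=0).fit_transform(Z)

# Step 3: K-means assigns each row to a cluster in the embedded space.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(embedding)
```

A convolutional or deeper autoencoder would replace `train_linear_autoencoder` while leaving steps 2 and 3 unchanged.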
Used high-dimensional data, including risk preference, income level, and securities preference features, together with external stock price data from S&P Capital IQ and the investpy package, to incorporate volatility measures derived from each customer's stock balance information.
The optimal clusters, the important features identified with SHapley Additive exPlanations (SHAP), and sample applications to financial service products are shown below.

[Figure: optimal clusters, important features, and sample applications]

23.01 ~ 23.08

Summary

We present the high-dimensional data and feature set using novel network-based visualization methods and identify the optimal configuration of the multi-stage process.
The approach segments 14,837 potential customers, each described by 163 categorical and 143 numerical features, drawn from the National Survey of Tax and Benefit data of the Korea Institute of Public Finance.
The first stage of the dimension reduction process employs deep neural network-based autoencoders.
The second and third stages use a non-neural dimension reduction algorithm and a clustering algorithm, respectively, each chosen according to clustering performance.
Subsequently, game theory-inspired Shapley values are computed for each feature to enhance explainability.
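To make the Shapley value idea concrete, the sketch below computes exact Shapley values for a toy three-feature "model". The feature names and the value function are hypothetical and chosen only so the result can be checked by hand. In practice, libraries such as SHAP approximate these values because exact enumeration grows exponentially with the number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a coalition value function value(frozenset)."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for size in range(n):
            # Probability weight of a coalition of this size preceding p.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of p when it joins coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def toy_value(coalition):
    """Hypothetical model output when only the features in `coalition` are known."""
    v = 0.0
    if "income" in coalition:
        v += 2.0
    if "risk" in coalition:
        v += 1.0
    if "income" in coalition and "risk" in coalition:
        v += 1.0  # interaction term, split equally between the two features
    return v      # "age" never changes the output


features = ["income", "risk", "age"]
phi = shapley_values(features, toy_value)
# phi == {"income": 2.5, "risk": 1.5, "age": 0.0}
```

The values sum to 4.0, the difference between the output with all features and with none, and the unused feature receives zero.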
The optimal approach involves an autoencoder, isometric mapping to three dimensions, and K-means clustering.
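A configuration search of this kind might look like the sketch below. It is an illustration under several assumptions: the latent codes are synthetic stand-ins for autoencoder output, PCA is included only as a hypothetical alternative to isometric mapping, the silhouette score is assumed as the clustering performance measure (the summary does not name the metric), and `n_neighbors=20` is picked so the neighbor graph stays connected on this toy data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)

# Synthetic stand-in for autoencoder latent codes (assumption): 4 groups, 10 dims.
centers = rng.normal(scale=1.0, size=(4, 10))
Z = np.vstack([c + rng.normal(size=(50, 10)) for c in centers])

# Candidate second-stage reducers, each mapping to three dimensions.
reducers = {
    "pca": PCA(n_components=3),
    "isomap": Isomap(n_neighbors=20, n_components=3),
}

results = []
for name, reducer in reducers.items():
    low = reducer.fit_transform(Z)  # second stage: non-neural reduction
    for k in range(2, 7):           # third stage: K-means with k clusters
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(low)
        results.append((silhouette_score(low, labels), name, k))

# Keep the configuration with the highest silhouette score.
best_score, best_reducer, best_k = max(results)
print(f"best: {best_reducer} with k={best_k} (silhouette={best_score:.3f})")
```

Other reducers or clustering algorithms can be added to the loop in the same way, as long as each candidate is scored with the same metric.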