Final_Fine_Bigdata_Analyst.pdf
Data Processing
- Systematic Data Processing and Preprocessing:
- Preprocessed the data so that the clustering model can accurately learn the relationships among features.
- Additional External Data:
- Used value-weighted volatility, computed from end-of-month stock balance information, to measure each customer's portfolio risk preference.
- Used S&P Capital IQ and the investpy package for this purpose.
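The value-weighted volatility measure above could be sketched as follows. This is a minimal illustration, not the original implementation: it assumes the measure is the average of per-stock volatilities weighted by each stock's share of the customer's end-of-month balance. The function name, the sample balances, and the volatility figures are all hypothetical, and no data is fetched from S&P Capital IQ or investpy.

```python
import math


def value_weighted_volatility(balances, volatilities):
    """Average per-stock volatility, weighted by each stock's share of holdings.

    balances:     end-of-month value held in each stock (assumed non-negative)
    volatilities: volatility of each stock, in the same order as balances
    """
    if len(balances) != len(volatilities):
        raise ValueError("balances and volatilities must have the same length")
    total = sum(balances)
    if total <= 0:
        return math.nan  # no holdings: the measure is undefined
    return sum(b / total * v for b, v in zip(balances, volatilities))


# Hypothetical customer holding three stocks at month end.
balances = [6000.0, 3000.0, 1000.0]
volatilities = [0.20, 0.35, 0.50]  # assumed annualized volatilities

vwv = value_weighted_volatility(balances, volatilities)
print(f"{vwv:.3f}")  # 0.275
```

A customer concentrated in high-volatility stocks gets a higher value, which is what lets the measure act as a proxy for risk preference.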
Clustering Approach
- Manifold Learning for Latent Variable Extraction and Dimensionality Reduction:
- Used Autoencoders and t-SNE for dimensionality reduction.
- Implemented both Linear and Convolutional Autoencoders.
- Ran six K-means variants, combining three encodings (none, Linear Autoencoder, Convolutional Autoencoder) with and without external data:
- Clustering without Autoencoder or external data (Silhouette: 0.390)
- Clustering with external data, without Autoencoder (Silhouette: 0.386)
- Clustering with Linear Autoencoder, without external data (Silhouette: 0.414)
- Clustering with Linear Autoencoder and external data (Silhouette: 0.415)
- Clustering with Convolutional Autoencoder, without external data (Silhouette: 0.583)
- Clustering with Convolutional Autoencoder and external data (Silhouette: 0.582)
- Optimization Results:
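- External data made no meaningful difference: within each encoding, the silhouette score changed by at most 0.004 (0.390 vs. 0.386, 0.414 vs. 0.415, 0.583 vs. 0.582).
- Autoencoders improved performance noticeably: the silhouette score rose from about 0.39 without an autoencoder to about 0.41 with the Linear Autoencoder and about 0.58 with the Convolutional Autoencoder.
- The Convolutional Autoencoder outperformed the Linear Autoencoder.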
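The comparison above can be sketched as a loop over feature variants, each clustered with K-means and scored with the silhouette coefficient. This is an illustration under stated assumptions, not the original pipeline: the data is synthetic, the external feature is random noise, `n_clusters=4` is borrowed from the cluster count reported later in this summary, and PCA is used as a stand-in for the encoder (a linear autoencoder trained with squared error learns the same subspace as PCA, but this is not the Linear or Convolutional Autoencoder used in the study).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def kmeans_silhouette(features, n_clusters=4, seed=0):
    """Cluster with K-means and return the silhouette score of the result."""
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = model.fit_predict(features)
    return silhouette_score(features, labels)


def encode(features, n_latent=3):
    """Stand-in encoder: PCA projection instead of a trained autoencoder."""
    return PCA(n_components=n_latent, random_state=0).fit_transform(features)


# Synthetic customer features and one hypothetical external feature.
base, _ = make_blobs(n_samples=300, n_features=8, centers=4, random_state=0)
external = np.random.default_rng(0).normal(size=(300, 1))

base_scaled = StandardScaler().fit_transform(base)
with_external = StandardScaler().fit_transform(np.hstack([base, external]))

variants = {
    "raw": base_scaled,
    "raw + external": with_external,
    "encoded": encode(base_scaled),
    "encoded + external": encode(with_external),
}

scores = {name: kmeans_silhouette(x) for name, x in variants.items()}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Each score is computed in that variant's own feature space, so the comparison shows how well separated the clusters are in each representation rather than measuring a shared ground truth.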
Clustering Visualization and Analysis
- Optimal Clustering and Visualization:
- Reduced the data to 2D using the Convolutional Autoencoder and t-SNE.
- Identified four as the optimal number of clusters.
- Cluster Analysis:
- The initial clusters separate customers mainly by age and income.
- The clusters could be refined further with additional data, such as spending habits.
- Future Enhancement:
- Analyze changes in total assets using open banking or transfer data, to distinguish changes caused by investment returns from changes caused by transfers.
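The 2D reduction and cluster-count selection described in this section could look roughly like the sketch below. Several things here are assumptions: the latent vectors are synthetic stand-ins for encoder output, the t-SNE settings are illustrative, and the cluster count is chosen by the highest silhouette score on the 2D embedding, which the summary does not confirm as the selection method.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the latent vectors produced by an encoder.
latent, _ = make_blobs(n_samples=200, n_features=16, centers=4, random_state=0)

# Project the latent vectors to 2D for visualization.
embedding = TSNE(
    n_components=2, perplexity=30, init="pca", random_state=0
).fit_transform(latent)

# Score a range of candidate cluster counts on the 2D embedding.
scores = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(embedding)
    scores[k] = silhouette_score(embedding, labels)

best_k = max(scores, key=scores.get)
print("silhouette by k:", {k: round(float(s), 3) for k, s in scores.items()})
print("best k:", best_k)
```

The `embedding` array can be passed directly to a scatter plot, colored by the labels from the chosen `best_k`, to inspect how the clusters separate.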
Proposal for Financial Service Product
- Investment Behavior Classification:
- Classify customers by investment behavior and use the resulting classes to recommend portfolio-based financial products.
- Focus on lifetime portfolio management approaches such as Target Date Funds.