Data Mining Applications in Banking Sector While Preserving Customer Privacy

Özge Doğuç

doi:10.28991/ESJ-2022-06-06-014

Authors

Özge Doğuç
oduguc@medipol.edu.tr
Department of Management Information Systems, Istanbul Medipol University, Istanbul, Turkey

Vol. 6 No. 6 (2022): December

Research Articles

Abstract

In real-life data mining applications, organizations often cooperate by using each other's data on the same data mining task to obtain more accurate results, even though they may have different security and privacy concerns. Privacy-preserving data mining (PPDM) comprises rules and techniques that allow parties to collaborate on data mining applications while keeping their data private. The objective of this paper is to present a number of PPDM protocols and to show how PPDM can be used in data mining applications in the banking sector. For this purpose, the paper discusses homomorphic cryptosystems and secure multiparty computing. Supported by experimental analysis, the paper demonstrates that data mining tasks commonly used in the banking sector, such as clustering and Bayesian networks (association rules), can be performed efficiently and securely. This is the first study that combines PPDM protocols with applications for banking data mining.
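To make the idea of a homomorphic cryptosystem concrete, the following is a minimal sketch of additive homomorphism, assuming a textbook Paillier-style scheme with toy parameters. The function names (`keygen`, `encrypt`, `decrypt`, `add_encrypted`), the small primes, and the two-bank scenario are illustrative assumptions and are not taken from the paper, whose protocols may differ. The sketch shows how one party could add two encrypted values without seeing either plaintext.

```python
import math
import secrets

def keygen(p, q):
    """Toy Paillier key generation from two distinct primes (assumed inputs)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                    # common simple choice of generator
    mu = pow(lam % n, -1, n)     # inverse of L(g^lam mod n^2) = lam mod n
    return (n, g), (lam, mu)

def encrypt(pub, m):
    """Encrypt an integer 0 <= m < n with fresh randomness r."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1   # r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    """Recover the plaintext from ciphertext c."""
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    n, _ = pub
    return (c1 * c2) % (n * n)

# Hypothetical scenario: two banks each encrypt a private value under the
# same public key; a third party combines the ciphertexts without decrypting.
pub, priv = keygen(1009, 1013)   # toy primes, far too small for real use
c_a = encrypt(pub, 1200)
c_b = encrypt(pub, 3400)
total = decrypt(pub, priv, add_encrypted(pub, c_a, c_b))
```

In this sketch only the holder of the private key learns the sum, and the individual inputs stay hidden from the party that combines the ciphertexts. Real deployments would use primes of cryptographic size and a vetted library rather than hand-written code.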


