Thematic Analysis of Big Data in Financial Institutions Using NLP Techniques with a Cloud Computing Perspective: A Systematic Literature Review
Abstract
1. Introduction
2. Basic Terminology
2.1. Big Data
2.2. Cloud Implementations and Benefits
2.2.1. Cloud Computing Environments
2.2.2. Cloud Computing Service Categories
2.3. Thematic Analysis
2.4. Financial Data Categories
3. Literature Review
3.1. Bibliometric Analysis
3.1.1. Annual Scientific Production
3.1.2. Citation Analysis
3.1.3. Key Stats
3.1.4. Co-Occurrence Network
4. Discussion
5. Conclusions and Future Work
5.1. Conclusions
5.2. Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Stage | Description |
---|---|
Stage 1 | Identifying keywords and databases and performing searches based on individual keywords: n ≈ 1825 for "bigdata finance"; n ≈ 693 for "nlp finance". |
Stage 2 | Assessing the results using the PRISMA approach and searching based on combinations of keywords: n = 73 (nlp, bigdata, finance) + 73 (ThematicAnalysis, nlp) + 63 (ThematicAnalysis, bigdata) = 209. |
Stage 3 | Excluding studies based on title and abstract: n = 123. |
Stage 4 | Excluding studies based on abstract, methodology, and conclusion: n = 98. |
Stage 5 | Including studies after reading the full text and discussion of results: n = 53. |
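The arithmetic in the screening table can be checked with a short sketch. It assumes that each n from Stage 2 onward is the number of records remaining after that stage; the table does not say so explicitly, but reading n as the number excluded would remove 123 + 98 = 221 records from a pool of 209. The function and variable names are illustrative only.

```python
# Counts copied from the screening table; names are hypothetical.
combined_searches = {
    ("nlp", "bigdata", "finance"): 73,
    ("ThematicAnalysis", "nlp"): 73,
    ("ThematicAnalysis", "bigdata"): 63,
}

# Assumption: each n is the number of records REMAINING after the stage.
stage_counts = [
    ("Stage 2: combined keyword searches", sum(combined_searches.values())),
    ("Stage 3: title and abstract screening", 123),
    ("Stage 4: abstract/methodology/conclusion screening", 98),
    ("Stage 5: full-text reading", 53),
]


def screening_funnel(stages):
    """Return (label, remaining, dropped_since_previous_stage) rows."""
    rows = []
    previous = None
    for label, remaining in stages:
        dropped = 0 if previous is None else previous - remaining
        rows.append((label, remaining, dropped))
        previous = remaining
    return rows


funnel = screening_funnel(stage_counts)
for label, remaining, dropped in funnel:
    print(f"{label}: n = {remaining} (dropped {dropped})")
```

Under this reading, the Stage 2 total reproduces the 209 reported in the table, and each later stage removes a non-negative number of records.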
Category | Description | Results |
---|---|---|
Main Information about Data | Timespan | 2013–2022 |
Main Information about Data | Sources (journals, books, etc.) | 116 |
Main Information about Data | Documents | 142 |
Main Information about Data | Average years from publication | 2.01 |
Main Information about Data | Average citations per document | 5.352 |
Main Information about Data | Average citations per year per doc | 1.524 |
Main Information about Data | References | 1 |
Document Types | Article | 91 |
Document Types | Book chapter | 1 |
Document Types | Conference paper | 35 |
Document Types | Conference review | 1 |
Document Types | Letter | 3 |
Document Types | Review | 11 |
Authors | Authors | 573 |
Authors | Author appearances | 615 |
Authors | Authors of single-authored documents | 10 |
Authors | Authors of multi-authored documents | 563 |
Author Collaboration | Single-authored documents | 10 |
Author Collaboration | Documents per author | 0.248 |
Author Collaboration | Authors per document | 4.04 |
Author Collaboration | Co-authors per document | 4.33 |
Author Collaboration | Collaboration index | 4.27 |
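The author collaboration ratios in the table can be recomputed from its raw counts. The sketch below assumes the definitions commonly used by bibliometric tools (for example, collaboration index as authors of multi-authored documents divided by the number of multi-authored documents); the source does not state its formulas, so these definitions and all names in the code are assumptions.

```python
def collaboration_indicators(documents, authors, author_appearances,
                             single_authored_documents,
                             authors_of_multi_authored_documents):
    """Recompute collaboration ratios from raw counts (assumed definitions)."""
    # Assumption: every document that is not single-authored is multi-authored.
    multi_authored_documents = documents - single_authored_documents
    return {
        "documents_per_author": round(documents / authors, 3),
        "authors_per_document": round(authors / documents, 2),
        "co_authors_per_document": round(author_appearances / documents, 2),
        "collaboration_index": round(
            authors_of_multi_authored_documents / multi_authored_documents, 2
        ),
    }


# Raw counts copied from the table.
indicators = collaboration_indicators(
    documents=142,
    authors=573,
    author_appearances=615,
    single_authored_documents=10,
    authors_of_multi_authored_documents=563,
)
for name, value in indicators.items():
    print(f"{name}: {value}")
```

With these assumed definitions, the recomputed values agree with the reported 0.248, 4.04, 4.33, and 4.27, which suggests the table's ratios are internally consistent with its counts.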
Journal Name | No. of Articles Produced |
---|---|
Journal of Medical Intern | 10 |
Lecture Notes in Computer | 6 |
JMR Formative Research | 3 |
JMR Public Health and Su | 3 |
PLOS One | 3 |
BMJ Open | 2 |
CEUR Workshop Proceedings | 2 |
Clinical Toxicology | 2 |
Journal of Advanced Research | 2 |
Journal of Hospitality and | 2 |
Lecture Notes in Electric | 2 |
Others | 1 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Sharma, R.K.; Bharathy, G.; Karimi, F.; Mishra, A.V.; Prasad, M. Thematic Analysis of Big Data in Financial Institutions Using NLP Techniques with a Cloud Computing Perspective: A Systematic Literature Review. Information 2023, 14, 577. https://doi.org/10.3390/info14100577