A COMPREHENSIVE DEEP LEARNING ALGORITHM TO UNDERSTAND THE ROLE OF SOCIAL MEDIA IN CONSUMER PERCEPTION OF GREEN CONSUMPTION

Un algoritmo de aprendizaje profundo comprensivo para comprender el papel de las redes sociales en la percepción del consumidor hacia el consumo verde

ABSTRACT

This research proposes a comprehensive deep-learning algorithm to understand the role of social media in consumer perception of green consumption. After the COVID-19 pandemic, society has shown increased focus on the relationship between people and nature. Achieving sustainable development goals requires promoting green consumption, which necessitates understanding and influencing public attitudes toward sustainability. While previous studies have explored green consumption using behavioral models and surveys, they often overlook the perspective of social media. This study uses deep learning techniques to analyze social media data, including text and video content, to gain insights into consumer behavior and preferences. The study entails collecting data from X (formerly Twitter) and YouTube, developing deep learning algorithms for text classification, and creating a visualization and reporting system. More specifically, this study aims to analyze the impact of social media information sharing on society’s green purchasing intentions and to propose advanced architectures for text mining, specifically the LDA method. This study highlights the valuable insights gained from analyzing social media discourse on green consumption. Trends, emotional attitudes, and engagement were examined using text mining and sentiment analysis. The study reveals platform-specific differences in sentiment and identifies influential keywords and phrases. The analysis also uncovers emotional responses and key factors associated with the discourse on green consumption. The findings can inform future strategies for promoting sustainable consumption. The study concludes by emphasizing the importance of further research to explore the discrepancies between platforms and harness the implications of these findings for sustainable consumption strategies.

Keywords:
deep learning; social media; sustainability; green consumption; sustainable development goals

RESUMEN

Este artículo de investigación propone un algoritmo de aprendizaje profundo integral para comprender el papel de las redes sociales en la percepción del consumidor hacia el consumo verde. Después del brote de la COVID-19, la sociedad ha mostrado un mayor enfoque en la relación entre las personas y la naturaleza. Lograr los objetivos de desarrollo sostenible requiere promover el consumo verde, lo que implica comprender e influir en las actitudes públicas hacia la sostenibilidad. Si bien investigaciones previas han explorado el consumo verde utilizando modelos de comportamiento y encuestas, a menudo pasan por alto la perspectiva de las redes sociales. Aprovechando técnicas de aprendizaje profundo, este estudio tiene como objetivo analizar datos de las redes sociales, incluido contenido de texto y video, para obtener información sobre el comportamiento y las preferencias del consumidor. El estudio implica la recopilación de datos de X (anteriormente Twitter) y YouTube, el desarrollo de algoritmos de aprendizaje profundo para la clasificación de texto y la creación de un sistema de visualización e informes. Más específicamente, este estudio tiene como objetivo analizar el impacto de compartir información en las redes sociales en las intenciones de compra verde de la sociedad y proponer arquitecturas avanzadas para la minería de texto, específicamente el método LDA. Este estudio destaca las valiosas ideas obtenidas al analizar el discurso en las redes sociales sobre el consumo verde. Se examinaron tendencias, actitudes emocionales y participación mediante minería de texto y análisis de sentimiento. El estudio revela diferencias específicas de la plataforma en el sentimiento e identifica palabras clave y frases influyentes. El análisis también descubre respuestas emocionales y factores clave asociados con el discurso sobre el consumo verde. Los hallazgos pueden informar estrategias futuras para promover el consumo sostenible. 
El estudio concluye enfatizando la importancia de investigaciones adicionales para explorar las discrepancias entre plataformas y aprovechar las implicaciones de estos hallazgos para las estrategias de consumo sostenible.

Palabras clave:
aprendizaje profundo; redes sociales; sostenibilidad; consumo verde; objetivos de desarrollo sostenible

RESUMO

Esta pesquisa propõe um algoritmo abrangente de aprendizado profundo para compreender o papel das redes sociais na percepção do consumidor em relação ao consumo sustentável. Após o surgimento da Covid-19, a sociedade tem apresentado um foco maior sobre a relação entre as pessoas e a natureza. Alcançar os objetivos de desenvolvimento sustentável requer a promoção do consumo verde, o que exige compreender e influenciar as atitudes públicas em relação à sustentabilidade. Enquanto estudos anteriores exploraram o consumo sustentável usando modelos comportamentais e pesquisas, muitas vezes negligenciaram a perspectiva das redes sociais. Utilizando técnicas de aprendizado profundo, este estudo visa analisar dados das redes sociais, incluindo conteúdo de texto e vídeo, para obter insights sobre o comportamento e preferências do consumidor. O estudo envolve a coleta de dados do X (antigo Twitter) e do YouTube, o desenvolvimento de algoritmos de aprendizado profundo para classificação de texto e a criação de um sistema de visualização e relatório. Mais especificamente, este estudo visa analisar o impacto do compartilhamento de informações nas redes sociais nas intenções de compra sustentável da sociedade e propor arquiteturas avançadas para mineração de texto, especificamente o método LDA. Este estudo destaca os insights obtidos da análise do discurso das redes sociais sobre o consumo sustentável. Tendências, atitudes emocionais e engajamento foram examinados usando mineração de texto e análise de sentimento. O estudo revela diferenças específicas da plataforma no sentimento e identifica palavras-chave e frases influentes. A análise também revela respostas emocionais e fatores-chave associados ao discurso sobre consumo sustentável. Os resultados podem apoiar na construção de futuras estratégias para promover o consumo sustentável. O estudo conclui enfatizando a importância de pesquisas adicionais para explorar as discrepâncias entre as plataformas e aproveitar as implicações dessas descobertas para estratégias de consumo sustentável.

Palavras-chave:
aprendizado profundo; redes sociais; sustentabilidade; consumo verde; objetivos de desenvolvimento sustentável

INTRODUCTION

Climate change and resource crises pose great challenges for the world. Faced with ecological damage and resource constraints, governments actively seek regenerative development pathways in which humans and nature can coexist harmoniously (Sharifi, 2021). Especially after the COVID-19 pandemic, society has increasingly focused on the relationship between people and nature (Jian et al., 2020; Sun et al., 2021). The public has begun to understand that while industrial civilization brings convenience to public life, it also brings many environmental problems, especially the high carbon emissions that hinder the achievement of the Sustainable Development Goals (SDGs) (Yang Y. et al., 2022; Yang et al., 2023).

The efforts of governments and businesses alone are not enough to achieve the SDGs. The task also requires society to choose a greener lifestyle, i.e., green consumption, in daily life (Akhtar et al., 2021). The key to promoting green consumption lies in changing public attitudes. Understanding society’s trend toward green consumption helps identify the key points at which consumer attitudes change and helps increase consumers’ perceived effectiveness, thereby influencing the behavior of consumer groups.

Most previous studies have used behavioral models and theories to explore consumer purchase intentions and behaviors (Costa et al., 2021; Mamun et al., 2018; Zaremohzzabieh et al., 2021). Although many scholars have examined the attitude-behavior gap in green consumption, most research relies on surveys or interviews and does not consider the social media perspective. With the development of deep learning, it is now possible to analyze consumer behavior using text mining and latent Dirichlet allocation (LDA) techniques. Analyzing text and video data from social media with deep learning algorithms gives governments and business executives insight into changes in consumer psychology over time, so that they can adjust their strategies and develop green marketing plans based on real needs.

This study proposes a multimodal deep learning algorithm that uses social media to measure people’s perspectives on green consumption. Few studies in the literature analyze social media data using text mining and deep learning techniques. First, this study uses social media data to understand people’s perceptions of and attitudes toward green consumption. By analyzing text and video data from X (formerly Twitter) and YouTube, we sought to determine what users think about environmentally friendly consumption and how they behave. Second, this research generates visualization and reporting outputs that present the data clearly and effectively. This enables governments and businesses to shape their green consumption strategies and marketing plans based on real needs. Decision-makers who review these outputs gain insights into consumer preferences and trends, allowing for informed choices that advance sustainable practices. Furthermore, clear communication of this data fosters collaboration and understanding among the stakeholders who promote environmentally conscious initiatives. Third, examining how information sharing on social media affects society’s green purchasing intentions is important for understanding how the public’s consumption habits and environmental attitudes have changed. Finally, this research contributes to the development of new and effective architectures for text analysis and LDA applications. Such techniques are important tools for better understanding consumer behavior and intentions regarding green consumption.

Overall, this study aims to measure the perception of green consumption using structured and semi-structured data from X and videos on YouTube. Three steps of the consumer behavior detection process are discussed: (1) collecting data on green consumption from X and YouTube, where large-scale sampling reduces the subjectivity introduced by surveys and interviews; (2) developing real-time deep learning and transformative algorithms for text classification of social media content; and (3) developing the visualization and reporting system. As a result, this study analyzes how information sharing on social media affects society’s green purchasing intentions and proposes cutting-edge architectures for text analysis and LDA applications that can analyze and detect consumer purchase intentions and behaviors.

LITERATURE REVIEW

Consumption behavior that places less burden on the environment is called green consumption (Li, 2020). In the literature, the most studied green consumption topics are consumer green behavior, corporate green production, and green marketing in social media (Yao et al., 2022). Consumers’ understanding of green consumption and their perspectives on environmental issues are key to helping them become aware of green consumption and adopt green behaviors (Wang, 2021). However, studies have observed that consumers show little engagement in sustainable practices despite a high level of knowledge of environmental issues (Ahamad & Ariffin, 2018; ElHaffar et al., 2020; Groening et al., 2018; Huang et al., 2022). Therefore, examining consumers’ perspectives on green consumption is important for understanding consumer behavior.

According to Sajeewanie et al. (2019), the earliest and most fundamental models of pro-environmental behavior were based on the linear development of environmental knowledge. Current studies on green consumer behavior have ignored the influence of the consumer’s environment. Also, recent research has favored quantitative regression models, viewing the problem as static rather than dynamic (Yang M. et al., 2022).

Studies on green consumption and its sub-branches have been carried out for different sectors (Zhao et al., 2020), from policy (Jiang & Gao, 2023) to logistics (Agyabeng-Mensah et al., 2020) to manufacturing (Kluczek, 2017). Based on such studies, which consider age, race, gender, belief, and personal characteristics, among others, it is possible to develop market segmentation strategies (Huseynov & Yıldırım, 2019) to increase green consumption. Gender differences exist in the amount of sustainable consumption (Saraç, 2022), with men in a certain age range being more distant than women toward green product consumption (Bedard & Reisdorf, 2018).

Most studies of people’s perspectives on green consumption have obtained their data through surveys (Jain et al., 2020; Jalali & Khalid, 2019; Tang et al., 2020).

Biswas (2016), Djafarova and Rushworth (2017), Bedard and Reisdorf (2018), and Xie and Madni (2023) used questionnaires to collect data on people’s perceptions of green consumption. Their findings complement each other: social media favorably influences consumers’ green choices (Biswas, 2016); Instagram celebrities, bloggers, YouTube channel owners, and “Instafamous” profile owners have the power to influence the purchasing behavior of young female users (Djafarova & Rushworth, 2017); and the millennial generation’s intent to make green purchases is positively correlated with social media and online interpersonal influence (Bedard & Reisdorf, 2018). Especially for young people (Xie & Madni, 2023), social media is the primary source of information about green consumption (Ahamad & Ariffin, 2018) and has the strongest influence on consumer behavior (Jain et al., 2020). Hence, studies on the impact of social media on green consumption and consumer behavior are important. Jalali and Khalid (2019) used survey data from Instagram users to study influencers’ green product purchasing habits, values, and precautions. They drew on the Uses and Gratifications Theory (UGT) and the Theory of Reasoned Action (TRA) to find out how consumers’ green cognition affects their purchasing behavior.

Even though machine learning is being used in research on green consumption and its sub-branches (Ma & Qiao, 2021; Tanveer et al., 2020), such studies are still few. Most researchers have used behavioral models and theories (Costa et al., 2021), and those who used machine learning drew their data from surveys and questionnaires (Jain et al., 2020).

Text mining is widely used on complex data because it produces fast and accurate results in the analysis of unstructured data. It is therefore applied in different fields, such as social network data analysis (Kunte & Panicker, 2019; Park et al., 2022), policy text analysis (Jiang & Gao, 2023; Lu & Park, 2022), medical analysis (Balcıoğlu, 2022; Yazdavar et al., 2020), and tourism analysis (Imamah et al., 2020). Because text mining works with first-hand information, it can avoid the subjectivity of surveys and thus yield more reliable results.

Sentiment analysis is another area where text mining is used. Sentiment analysis of social media users is one of the most effective methods for examining users’ emotions, preferences (Serrano et al., 2021), and behaviors. These studies can help decision-makers develop new strategies. Wu et al. (2021) and Huang et al. (2021) analyzed data from the Chinese social platform Sina Weibo with text mining and sentiment analysis. They observed that most Chinese people have positive thoughts about green consumption (Huang et al., 2021). However, recycling policies are received negatively because of factors such as fees, penalties, and irregular procedures (Wu et al., 2021). Jiang and Gao’s (2023) text mining study of Chinese policies also supports this, stating that the country pays less attention to the recycling chain.

Brzustewicz and Singh (2021) used text mining, specifically LDA topic modeling, to categorize posts from X. According to their data, the sustainable consumption topics that X users discuss can be grouped into organic food consumption, food waste, vegan food, sustainable tourism, sustainable transportation, and sustainable energy consumption, and most users have a positive view of the subject (Brzustewicz & Singh, 2021).

Our study will provide a different approach to the subject by using data sets from both X and YouTube. It will use machine learning algorithms to analyze both text and video data.

DATA COLLECTION

The data for this study was obtained from X and YouTube, two of the most prominent social media platforms. On X, two separate data sets were created using the X API (Application Programming Interface). The first data set was built based on a collection of post IDs acquired through keyword searches related to general consumption. The second data set consisted of posts about environmentally friendly consumption gathered over a year. To retrieve the required data associated with each post ID, we employed a technique known as text mining. The Tweepy Python library was utilized to interact with the X API and gather information about each post in JSON (JavaScript Object Notation) format. The second dataset from X was compiled using the X Streaming API, which provides real-time access to the platform’s data. For YouTube, the data was extracted from the comment sections of videos related to environmentally conscious consumption. This data collection process made use of YouTube’s API, employing the Google API client for Python to extract the data.

This research was conducted as part of a wider study of posts about environmentally conscious consumption on X. The data were originally acquired in two separate studies examining discussions of green consumption on the platform during the first several months of the pandemic. The first data set was constructed from a collection of post IDs obtained via keywords linked to general consumption. Using the X Search API on each post ID, we compiled all the relevant information available about each post into a file in JavaScript Object Notation (JSON) format, a technique we refer to as text mining. This procedure gathered information from the four million posts accessible on X (at that time, “Twitter”) in December 2020. The second data set consisted of roughly 6 million posts about environmentally friendly consumption gathered from December 2019 to December 2020 using the Streaming API on X. Both data sets were then filtered so that only the overlapping dates were included, and any posts or comments repeated across the two data sets were removed. The completed data set consisted of 1,227,170 distinct posts and comments. The entities object of each post’s JSON representation was parsed for URLs pointing to YouTube videos, and the videos found were collected and analyzed. Figure 1 shows a flow chart of the data selection and exclusion process.
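The combination steps described above (restricting both data sets to their overlapping dates, removing repeated posts, and scanning each post for YouTube links) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the record layout (`id`, `created_at`, and `entities.urls[].expanded_url`) only approximates the general shape of X API JSON, and the function names and sample records are hypothetical.

```python
from datetime import date
from urllib.parse import urlparse

def date_range(records):
    """Return the (earliest, latest) posting date in a list of records."""
    days = [date.fromisoformat(r["created_at"][:10]) for r in records]
    return min(days), max(days)

def combine(search_set, stream_set):
    """Keep only dates covered by both sets, then drop repeated post IDs."""
    start_a, end_a = date_range(search_set)
    start_b, end_b = date_range(stream_set)
    start, end = max(start_a, start_b), min(end_a, end_b)
    seen, merged = set(), []
    for record in search_set + stream_set:
        day = date.fromisoformat(record["created_at"][:10])
        if start <= day <= end and record["id"] not in seen:
            seen.add(record["id"])
            merged.append(record)
    return merged

def youtube_links(record):
    """Collect YouTube URLs from the entities object of one post."""
    links = []
    for url in record.get("entities", {}).get("urls", []):
        target = url.get("expanded_url", "")
        host = urlparse(target).netloc.lower()
        if host in ("youtube.com", "youtu.be") or host.endswith(".youtube.com"):
            links.append(target)
    return links

# Hypothetical sample records (not data from the study).
search_set = [
    {"id": "1", "created_at": "2020-11-30T10:00:00Z", "entities": {}},
    {"id": "2", "created_at": "2020-12-05T09:30:00Z",
     "entities": {"urls": [{"expanded_url": "https://www.youtube.com/watch?v=abc"}]}},
]
stream_set = [
    {"id": "2", "created_at": "2020-12-05T09:30:00Z", "entities": {}},
    {"id": "3", "created_at": "2020-12-01T12:00:00Z", "entities": {}},
    {"id": "4", "created_at": "2020-12-20T12:00:00Z", "entities": {}},
]

merged = combine(search_set, stream_set)
print([r["id"] for r in merged])
print(youtube_links(merged[0]))
```

In this sketch the overlap window is derived from the data itself (the later of the two start dates and the earlier of the two end dates). A fixed window could be passed in instead if the overlap is known in advance.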

Figure 1
Data Set Combination, Filtering, and Exclusion Process

Explanation of the data collection process

Data sources identification

X and YouTube

They were chosen due to their prominence as social media platforms and their significant role in shaping public discourse, especially regarding environmental issues.

Rationale for Platform Selection

These platforms were selected for their diverse user bases and the different types of engagement they foster (posts, reposts, comments).

Data retrieval methods

X

API Usage: The X API was utilized to collect posts and reposts. This involved using specific keywords related to green consumption.

Data Sets: We created two separate data sets - one from general consumption-related posts and another focusing on environmentally friendly consumption.

Streaming API: For real-time data, the X Streaming API was employed.

YouTube

API for Comments: The YouTube API was used to extract comments from videos related to environmentally conscious consumption.

Filtering Process: We focused on videos and comments that were directly relevant to green consumption.

Data collection parameters

Time frame

We covered a significant period to capture the evolution of the discourse, particularly highlighting changes during the first months of the COVID-19 pandemic.

Keyword-Based Collection: We used keywords and hashtags relevant to green consumption to filter and collect data.

Language and Demographic Considerations: If applicable, mention if the study was limited to specific languages or demographics.

Data processing

Text mining techniques

We employed text mining to parse and analyze the collected data, ensuring relevant information was extracted.

Data cleaning

This involved preprocessing steps like removing duplicates and irrelevant content to refine the data set.

Ethical considerations

Public data

We ensured that all collected data was publicly available, adhering to the platforms’ ethical guidelines and terms of service.

Anonymization and privacy

We took measures to anonymize data and respect user privacy.

Justification for data collection approach

Alignment with research goals

The primary objective of our study was to understand the role of social media in shaping consumer perception of green consumption. This involved analyzing public discourse, trends, and sentiments on platforms where these discussions are prevalent.

Selection of social media platforms (X and YouTube)

X

This was chosen for its dynamic and real-time nature, offering insights into immediate public reactions and discussions. X’s format encourages concise, focused expressions of opinions, making it a rich source for sentiment analysis and trend identification.

YouTube

YouTube was selected for its role as a platform for more detailed discussions, often found in comments on videos related to green consumption. YouTube comments provide depth to understanding consumer perceptions, complementing the concise data from X.

Data collection via APIs

Use of X API

This allowed us to gather a vast amount of data related to green consumption, including posts, reposts, and hashtags, ensuring a comprehensive view of the topic over time.

Use of YouTube API

This enabled extracting comments from relevant videos, providing insights into more detailed consumer opinions and perspectives on green consumption.

Data collection techniques

Keyword searches

This search targeted specific terms related to green consumption to filter the data, ensuring relevance to our study’s focus.

Text mining and sentiment analysis

Critical for analyzing large volumes of text data, these techniques allowed us to identify key themes, sentiments, and trends in the discourse around green consumption.

Temporal scope of data collection

The collection period was strategically chosen to cover significant phases of public discourse, especially during the early months of the COVID-19 pandemic, when environmental awareness was notably heightened.

Volume and diversity of data

We ensured a large and diverse sample size from both platforms to capture a wide array of viewpoints and discussions. This comprehensive approach allowed for robust analysis and more generalizable findings.

Comprehensiveness

We emphasize the thoroughness of the approach in capturing a representative sample of discourse on green consumption.

METHODS

This research employed a data-driven, computational approach using Python, a versatile and widely used programming language in data science and computational social science research. The study involved two primary stages: data collection and data analysis, each of which utilized specific Python libraries and techniques.

Theoretical perspective

Relevance to research objectives

The sample size was selected to ensure a comprehensive representation of the discourse on green consumption across X and YouTube. Given the vast amount of data generated daily on these platforms, a larger sample was necessary to capture a wide range of opinions and sentiments.

Alignment with study design

The study analyzes trends, emotional attitudes, and engagement over a significant period (covering the early months of the COVID-19 pandemic). This period saw increased public interest in environmental issues, thus necessitating a substantial sample size to accurately reflect the changes in discourse.

Statistical perspective

Data availability

The volume of data available on X and YouTube allowed for the collection of a large sample, providing robustness to the analysis.

Sampling method

The use of APIs from X and YouTube enabled the collection of a broad and varied dataset, including reposts, comments, and keywords related to green consumption.

Data diversity

Data was collected over a year to ensure a holistic view, capturing various phases and peaks in the discussion, which is critical for understanding long-term trends and patterns.

Statistical power

A larger sample size increases the statistical power of the study, reducing the likelihood of Type II errors (failing to detect an effect that is present) and enhancing the reliability of the findings.
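The relationship between sample size and Type II error can be illustrated with a standard normal approximation for comparing two proportions. The sketch below is only an illustration: the proportions 0.50 and 0.55, the group sizes, and the function name are hypothetical and are not values from this study.

```python
from statistics import NormalDist

def power_two_proportions(p1, p2, n, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions,
    with n observations in each group (unpooled normal approximation)."""
    z = NormalDist()
    se = ((p1 * (1 - p1) + p2 * (1 - p2)) / n) ** 0.5
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(p1 - p2) / se
    # Probability of rejecting the null hypothesis in either tail.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Power grows with n, so the Type II error rate (1 - power) shrinks.
for n in (100, 1_000, 10_000):
    print(n, round(power_two_proportions(0.50, 0.55, n), 3))
```

When the two proportions are equal, the function returns the significance level `alpha`, as expected for a test with no true effect.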

Generalizability

A substantial sample size contributes to the generalizability of the study results, allowing for more accurate extrapolation of the findings to the broader population of social media users interested in green consumption.

Addressing the theoretical and statistical rationale

Compliance with sampling norms

The sample size adheres to standard practices in computational social science research, where large datasets are often analyzed to discern patterns in digital communication.

Ensuring data richness

The chosen sample size ensures sufficient data richness and depth, allowing for a nuanced analysis of sentiment and keyword trends.

Data preprocessing

Before proceeding with the analysis, the data was cleaned and preprocessed. The two datasets were filtered so that only the dates of overlap were included, and any duplicate posts or comments across both data sets were removed, creating a final data set of distinct posts and comments. Python’s Pandas library was used for data cleaning and preprocessing tasks.

Robust data collection

Large sample size

We gathered extensive data sets, including millions of posts and YouTube comments, ensuring a substantial and varied pool of public opinions.

Time-frame consideration

The data collection spanned a significant period, capturing the evolution of discourse over time and ensuring that both transient and sustained trends were included.

Focused data retrieval

Keyword-driven collection

We used carefully chosen keywords related to green consumption so that the data were targeted and relevant to our research objectives.

API utilization

We used the platforms’ APIs to extract and compile data systematically, which made the collection process efficient and comprehensive.

In-depth data processing

Rigorous data cleaning

We implemented thorough data cleaning procedures, including the removal of duplicates and irrelevant entries, to refine the data set for analysis.

Text mining and sentiment analysis

We applied text mining techniques and sentiment analysis tools to distill and interpret large amounts of data, drawing out key themes, sentiments, and patterns.
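Lexicon-based sentiment scoring of the kind referred to here can be sketched as follows. This is a simplified illustration, not the scoring used in the study: the six-word lexicon, its scores, the zero threshold, and the sample posts are all hypothetical, and real sentiment lexicons contain thousands of scored entries.

```python
from collections import Counter

# Tiny illustrative lexicon (hypothetical scores).
LEXICON = {"great": 1.0, "love": 1.0, "clean": 0.5,
           "waste": -0.5, "expensive": -0.5, "hate": -1.0}

def polarity(text):
    """Average lexicon score over the tokens that appear in the lexicon."""
    scores = [LEXICON[t] for t in text.lower().split() if t in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0

def label(score):
    """Map a polarity score to a sentiment class."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

# Hypothetical sample posts.
posts = [
    "love this clean energy plan",
    "green products are too expensive",
    "i hate all this plastic waste",
    "new recycling bins arrived today",
]

labels = Counter(label(polarity(p)) for p in posts)
shares = {k: 100 * v / len(posts) for k, v in labels.items()}
print(shares)
```

The percentage shares computed at the end correspond to the kind of positive and negative proportions reported for each platform.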

Analytical rigor

Diverse analytical techniques

Our analysis was not limited to basic descriptive statistics but extended to advanced techniques like Latent Dirichlet Allocation (LDA) for topic modeling and sentiment analysis, providing a deeper understanding of the public discourse.

Data analysis

The analysis of the data set was performed using various methods in Python. We used Matplotlib, a widely used data visualization library in Python, to visualize the frequency of posts and YouTube comments over time. Word frequency analysis was conducted on posts and comments to understand the most discussed topics related to green consumption. The Python libraries NLTK (Natural Language Toolkit) and Scikit-learn were used for natural language processing tasks, including tokenization and stop word removal. Sentiment analysis was performed on both posts on X and YouTube comments to understand the emotional tone underlying the discussions. The sentiment analysis process involved text preprocessing, tokenization, and sentiment scoring using sentiment lexicons. We used the Python library TextBlob, which provides a simple API for common natural language processing tasks such as part-of-speech tagging, noun phrase extraction, and sentiment analysis. For more advanced text analysis, Latent Dirichlet Allocation (LDA), a type of probabilistic topic model, was used to extract the underlying topics from the posts and comments. Gensim, a Python library for topic modeling, was used to implement LDA.
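To make the topic modeling step more concrete, the sketch below fits LDA to a toy corpus with a small collapsed Gibbs sampler written in plain Python. It is a swapped-in illustrative technique: the study used Gensim, whose inference procedure differs, and the hyperparameters (`alpha`, `beta`), the number of topics, the iteration count, and the four sample documents are all assumptions chosen for readability.

```python
import random

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]       # tokens in doc d with topic k
    topic_word = [[0] * V for _ in range(n_topics)]  # times word v has topic k
    topic_total = [0] * n_topics                     # tokens assigned to topic k
    assignments = []
    # Give every token a random initial topic.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k, v = assignments[d][i], word_id[w]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_word[j][v] + beta) / (topic_total[j] + V * beta)
                    for j in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    # Smoothed topic-word and document-topic distributions.
    phi = [[(topic_word[k][v] + beta) / (topic_total[k] + V * beta)
            for v in range(V)] for k in range(n_topics)]
    theta = [[(doc_topic[d][k] + alpha) / (len(doc) + n_topics * alpha)
              for k in range(n_topics)] for d, doc in enumerate(docs)]
    return vocab, phi, theta

# Hypothetical toy corpus, already tokenized.
docs = [
    "solar wind energy power".split(),
    "wind energy solar panel".split(),
    "plastic waste recycling bin".split(),
    "recycling plastic bottle waste".split(),
]
vocab, phi, theta = lda_gibbs(docs, n_topics=2)
for k, row in enumerate(phi):
    top = sorted(zip(row, vocab), reverse=True)[:3]
    print(k, [w for _, w in top])
```

Each row of `phi` is a probability distribution over the vocabulary for one topic, and each row of `theta` gives the topic mixture of one document. The highest-probability words per topic are what an analyst would inspect to name the topics.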

In our study focusing on the role of social media in shaping consumer perception of green consumption, we employed various software tools and automated libraries. These tools were instrumental in efficiently and accurately cleaning the large volumes of data collected from X and YouTube.

Python, known for its robustness and versatility in handling large datasets, served as the backbone of our data cleaning process and provided a flexible platform for integrating various data cleaning libraries and custom scripts.

The Pandas library was used for its data manipulation capabilities. It enabled us to filter out irrelevant data, remove duplicates, and restructure the datasets, transforming large data sets into a format suitable for analysis.

Natural Language Toolkit (NLTK) was employed to preprocess textual data from social media posts. It was essential for tasks such as tokenization, stop word removal, and text normalization, which are critical for cleaning and preparing text data for sentiment analysis. Regular expressions (regex) were used to identify and remove unwanted text patterns, such as URLs, special characters, and non-standard symbols. This helped to refine the textual data to include only relevant content.
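The cleaning and counting steps described here (URL and symbol removal, tokenization, stop word removal, and word frequency counting) can be sketched with the Python standard library alone. This is a simplified stand-in for the NLTK-based preprocessing, not the study's code: the regular expressions, the short stop word list, and the sample posts are hypothetical, and the letter filter keeps only unaccented Latin letters.

```python
import re
from collections import Counter

URL_RE = re.compile(r"https?://\S+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")
# Small illustrative stop word list; NLTK ships a much longer one.
STOP_WORDS = {"the", "a", "is", "for", "and", "to", "of", "in", "our"}

def clean(text):
    """Lowercase the text, then strip URLs and non-letter characters."""
    text = URL_RE.sub(" ", text.lower())
    return NON_LETTER_RE.sub(" ", text)

def tokens(text):
    """Split cleaned text on whitespace and drop stop words."""
    return [t for t in clean(text).split() if t not in STOP_WORDS]

# Hypothetical sample posts.
posts = [
    "Sustainability is the future! https://example.com/a #green",
    "Green energy and sustainability for our planet :)",
]

counts = Counter(t for p in posts for t in tokens(p))
print(counts.most_common(3))
```

A frequency table like `counts` is also the usual input for a word cloud, where each term is drawn at a size proportional to its count.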

Scikit-learn was used to implement some of the more advanced data cleaning and preprocessing techniques, especially in preparing the data set for sentiment analysis and topic modeling. Tweepy (for X Data), a Python library for accessing the X API, was instrumental in the initial data collection phase, and it also played a role in the preliminary cleaning of X data.

Google API Client (for YouTube Data) facilitated efficient extraction and initial cleaning of data from YouTube, ensuring that we gathered relevant comments for analysis. Jupyter Notebooks provided an interactive environment for coding, data cleaning, and preliminary data analysis, allowing for a more streamlined and documented cleaning process.

Finally, we used WordCloud, a Python library, to create a word cloud. The word cloud was generated from the most frequently used terms in the posts related to green consumption, offering a visual representation of the central themes in the discourse. In summary, by integrating various Python libraries and data analysis techniques, this research offered an in-depth exploration of the public discourse surrounding green consumption on X and YouTube.

We recognize that Common Method Bias (CMB) is a critical concern in research, particularly in studies utilizing self-reported data or relying on single-source data collection methods. CMB can lead to inflated or spurious correlations between variables, thus potentially compromising the validity of research findings.

Although our study primarily analyzed data from social media platforms and did not rely heavily on self-reported measures, we understand the importance of mitigating CMB in all forms of research.

For studies involving self-reported data, we advocate for using diverse data collection methods and sources to reduce the risk of CMB. This includes triangulating data with other sources, employing mixed methods approaches, and integrating qualitative insights.

Methodological strategies to address CMB:

  • Anonymity and confidentiality: assuring respondents of anonymity and confidentiality to reduce social desirability biases.

  • Temporal separation: collecting predictor and outcome variables at different times to reduce common method variance.

  • Methodological separation: using different methods to measure different constructs, thereby reducing the likelihood of method-linked biases.

RESULTS AND DISCUSSION

Figure 2 shows the frequency of posts relevant to green consumption over time. The trend line climbed from January to February, when it reached its first high. The greatest peak in intensity began in January and was followed by a second peak in April, a third in June, and a fourth in November. These data show how intensely green consumption was discussed on X in the first period, which began in January and peaked at the end of the month, and they indicate that public awareness increased.

Figure 2
Number of posts over time

The frequency of comments on YouTube about green consumption is shown in Figure 3, where the trend line rose steadily from January until November, when it reached its highest point. The first peak in intensity began in January and was followed by a second peak in March, a third in July, and a fourth in November. These data reflect public awareness, as they represent the intensity of discussion of environmentally responsible consumption on YouTube from the first period beginning in January to the peak at the end of the year.

Figure 3
Comments over Time

Figure 4 displays the number of posts that include the term “sustainability,” a closely related but separate topic. The term’s upward trajectory reached its highest point in March, and between April and September its trend line moved downward. Figure 5 shows the corresponding levels on X for the term “green consumption.” Its upward trajectory reached its highest point in July, and between March and May its trend line shifted downward. As the number of confirmed cases of the disease grew, officials in China announced that they had discovered a novel virus belonging to the coronavirus family. The virus was first designated 2019-nCoV, and the disease it causes was subsequently named COVID-19. This marked the beginning of the COVID-19 pandemic, as shown by the significant increase in the number of posts that used the term “outbreak.”

Figure 4
Frequencies of the keyword Sustainability on X

Figure 5
Frequencies of the keyword Green Consumption on X

Figure 6 illustrates the level of engagement in YouTube’s comment sections for the related but separate keyword “sustainability.” The term’s upward trajectory reached its highest points in April and June, and a downward trend was seen between August and September. Figure 7 shows the corresponding levels for the term “green consumption” in YouTube comments. Its upward trajectory reached its highest point in June, while downward trends appeared in March, May, August, and September.

Figure 6
Frequencies of the keyword Sustainability on YouTube

Figure 7
Frequencies of the keyword Green Consumption on YouTube

Figure 8 displays trend lines for the word frequency counts of the key factors of green consumption on X. These counts may represent the perspectives and worries of X users about pollution. Sustainability and consumption are the two most important aspects of environmentally conscious consumption. Only the six most important factors were plotted, so other factors are omitted from Figure 8. The biggest decreases among the six factors occurred in March. In the same month, however, two factors showed their greatest increase: the environment and consumption behavior.

Figure 8
Trend lines indicating the word frequencies of six key green consumption factors on X

The word frequency counts for the key factors of green consumption on YouTube are shown as trend lines in Figure 9. These counts may be representative of the viewpoints and concerns of YouTube users about pollution. Ecologically sensitive consumption should place primary emphasis on preserving natural resources and reducing overall consumption. Figure 9 includes only the six factors deemed most significant. The biggest difference appears in June, when two of the six factors decreased while the others trended upward.

Figure 9
Trend lines indicating the word frequencies of six key green consumption factors on YouTube

The word clouds created from the most commonly used terms in this research provided a better understanding of X users’ posts concerning environmentally conscious consumption. As seen in Figure 10, the terms that appeared most often were connected to the concept of being sustainable. The terms “conservation,” “green,” “climate,” “clean,” “waste,” and “energy” are the secondary words in the word cloud. The terms “sustainability” and “consumption” were used to investigate a variety of viewpoints concerning the various eco-friendly forms of consumption while considering the frequency of certain keywords in a particular search.

Figure 10
Word cloud showing the keywords appearing most frequently in posts related to green consumption

Sentiment-level analysis further improved the results by making it possible to distinguish clearly between negative and positive content about environmentally conscious consumption. According to the sentiment analysis of the posts, only 39.62% expressed positive sentiment, while 60.38% expressed negative sentiment. This indicates that X users have an unfavorable attitude toward environmentally conscious purchases. Figure 11 shows that throughout January, negative sentiment climbed more than positive sentiment. After February, positive sentiment rose more than negative sentiment, and from February to April positive sentiment appears to have had a more significant influence than negative sentiment.

Figure 11
Sentiment analysis of negative and positive posts related to green consumption

According to the sentiment analysis of the comments, 57.19% expressed positive sentiment, while 42.81% expressed negative sentiment. This suggests that YouTube viewers have a favorable attitude toward environmentally conscious purchases. According to Figure 12, the highest levels of negative sentiment were recorded in February, early March, April, June, and September, while the highest levels of positive sentiment were recorded in January, late February, throughout March, and in April, July, October, November, and December.

Figure 12
Sentiment analysis of negative and positive YouTube comments related to green consumption

Figure 13 illustrates the temporal dynamics of the various topics broken down by week. It reveals discernible shifts in the relative frequency of some themes over time. In particular, videos supporting green consumption dominated the first week but were swiftly displaced as the dominant frame after the second week. This frame became less prevalent over the subsequent weeks but did not disappear entirely and remained the fifth most common frame overall.

Figure 13
Temporal heatmap of the YouTube transcript topics used in each week of the data
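The week-by-topic relative frequencies that underlie a temporal heatmap of this kind could be computed along the following lines. This is only a sketch under assumptions: the topic labels, the dates, and the use of ISO calendar weeks are hypothetical choices and are not taken from the study.

```python
from collections import Counter, defaultdict
from datetime import date

def weekly_topic_shares(items):
    """Map each (ISO year, ISO week) to the share of items per topic.

    `items` is an iterable of (date, topic_label) pairs.
    """
    counts = defaultdict(Counter)
    for day, topic in items:
        iso = day.isocalendar()
        counts[(iso[0], iso[1])][topic] += 1
    return {
        week: {t: n / sum(c.values()) for t, n in c.items()}
        for week, c in sorted(counts.items())
    }

# Hypothetical labelled items: one dominant topic per video transcript.
items = [
    (date(2020, 1, 6), "pro_green"),
    (date(2020, 1, 7), "pro_green"),
    (date(2020, 1, 8), "cost"),
    (date(2020, 1, 13), "cost"),
    (date(2020, 1, 14), "cost"),
]

shares = weekly_topic_shares(items)
for week, row in shares.items():
    print(week, row)
```

Each row of the resulting table sums to 1, so a topic that dominates one week and fades in the next shows up as a change in its share rather than in its raw count.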

After analyzing the posts using LDA to determine their emotional quotient, we found that more than half of the posts worldwide could be characterized by one of three feelings: anger, trust, or sadness. As seen in Figure 14, a significant portion of the posts reviewed referred to the feeling of trust. Sadness, the next most common feeling after trust, suggested that individuals were looking forward to recovery or to receiving answers from professionals. Similarly, anger was linked to about 17.5% of all posts, which further supports the pessimistic attitudes held by the majority of the population. Some posts displayed negative feelings such as disgust and fear, which appeared in 5.5% and 3.4% of the posts, respectively. Delight, a positive feeling, applied to only a very small percentage (0.5%) of the posts.

Figure 14
Sentiment wheel showing the emotional quotients of the studied posts

LDA was also used to study the emotional quotient of the YouTube comments. The findings revealed that more than half of the comments worldwide were characterized by one of three emotions: surprise, fear, or trust. As shown in Figure 15, comments expressing surprise made up most of the comments examined, followed by fear. Similarly, trust was connected to roughly 15% of the comments, which bolsters the favorable views held by commenters. Some comments displayed negative feelings such as disgust, anger, and sadness, which appeared in 14.3%, 10.4%, and 3.5% of the comments, respectively.

Figure 15
Sentiment wheel showing the emotional quotients of the studied YouTube comments

Impact on study findings

  • The cleaned data set was free from noise and irrelevant information, enabling a more focused and in-depth analysis.

  • The quality of the data set post-cleaning allowed for a clearer and more accurate identification of trends, sentiments, and patterns in the discourse on green consumption.

  • By analyzing a refined data set, we gained deeper insights into how social media users perceive and discuss green consumption, particularly about sustainability and environmental impact.

  • The reliability of our study’s findings was significantly bolstered by the rigorous data cleaning process.

  • Cleaned data reduced the risk of erroneous conclusions and ensured that our interpretations accurately reflected the actual public discourse.

  • Our findings, derived from a meticulously cleaned data set, offer trustworthy insights that can inform strategies for promoting sustainable consumption and understanding consumer behavior in the digital age.

CONCLUSION

In conclusion, the data collected in this study has provided valuable insight into the discourse surrounding environmentally conscious consumption on the social media platforms X and YouTube. Using innovative methods such as text mining and sentiment analysis, we were able to discern trends, emotional attitudes, and engagement with the topic at various times, particularly during the first months of the COVID-19 pandemic. The trend analysis showcased the public’s growing awareness and interest in green consumption. The data from both X and YouTube demonstrated several peaks in the intensity of discussions, showing increased public engagement with the topic. However, the sentiment attached to these discussions varied between the platforms. Our sentiment analysis yielded contrasting results: X users predominantly expressed negative sentiment toward green consumption (60.38%), while YouTube comments were more positive (57.19%). This suggests that the perception of green consumption differs across platforms and warrants further research to explore these discrepancies.

Detailed explanation of analytical methods

Data analysis overview

Our study involved a multi-faceted analysis approach, utilizing both quantitative and qualitative methods to extract meaningful insights from the social media data.

Sentiment analysis

We detailed how sentiment analysis was conducted, including the specific algorithms or models used (e.g., TextBlob or custom sentiment analysis models). We explained the process of categorizing sentiments into positive, negative, and neutral and how these categories were quantitatively analyzed.

Topic modeling

We described the use of Latent Dirichlet Allocation (LDA) to uncover prevalent topics in the data set. We explained how topics were identified and categorized and the criteria used for determining the relevance and significance of each topic.

Trend analysis

We outlined the methods used for identifying and analyzing trends over time in the data set. We clarified how peaks and troughs in the data were correlated with external events or time frames.
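One simple way to identify peaks in a monthly series is sketched below. It is an illustration only: the timestamps and monthly counts are hypothetical, and the rule used here (a month counts as a peak when it is strictly higher than both neighbouring months) is an assumption, not the criterion applied in the study.

```python
from collections import Counter

def monthly_counts(timestamps):
    """Count items per 'YYYY-MM' month from ISO 8601 timestamps."""
    return dict(sorted(Counter(ts[:7] for ts in timestamps).items()))

def local_peaks(series):
    """Return the keys whose value is strictly greater than both neighbours."""
    keys = list(series)
    vals = [series[k] for k in keys]
    return [keys[i] for i in range(1, len(vals) - 1)
            if vals[i - 1] < vals[i] > vals[i + 1]]

# Hypothetical timestamps and monthly totals (not data from the study).
sample = monthly_counts([
    "2020-01-03T08:00:00Z", "2020-01-20T17:45:00Z", "2020-02-01T09:10:00Z",
])
series = {"2020-01": 120, "2020-02": 310, "2020-03": 150, "2020-04": 280,
          "2020-05": 200, "2020-06": 260, "2020-07": 90}

print(sample)
print(local_peaks(series))
```

Because the first and last months have only one neighbour, this rule never reports them as peaks. A series that peaks in its final month would need a separate check.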

Data visualization techniques

We provided details on the types of visualizations used (e.g., line graphs, word clouds) and how they were employed to represent data findings effectively. We discussed the rationale behind the choice of each visualization type in relation to the data being presented.

Moreover, the analysis of keywords and phrases related to green consumption brought forward noteworthy trends. The frequency analysis revealed the ebbs and flows of public interest in terms like “sustainability” and “consumption.” These shifts often coincided with global events and news cycles, indicating that the public discourse around green consumption is susceptible to external influences. The emotional content of the discussions, as revealed through the Latent Dirichlet Allocation (LDA) model, further nuanced our understanding of the public discourse. X users’ emotional responses were primarily characterized by feelings of trust, anger, and sadness. On the other hand, YouTube comments were primarily marked by surprise, fear, and trust. This underlines the complex range of emotional responses to the topic of environmentally responsible consumption. Our study also indicated the key factors associated with the discourse on green consumption. Among the six factors considered, the environment and consumption behavior appeared to be paramount. These findings may aid in shaping future strategies to promote sustainable consumption. Finally, this study successfully employed word clouds to visualize common language in the examined posts. The word cloud for this research underscored terms such as “conservation,” “green,” “climate,” “clean,” “waste,” and “energy,” suggesting a broad range of concerns associated with the discourse on green consumption.

Our research has several implications. First, it highlights the important role of social media platforms in driving discussions around green consumption. This serves as a beacon for companies, policymakers, and social activists, highlighting the need to engage and respond to these conversations. The findings underscore the potential for leveraging social media as a powerful tool for promoting sustainable practices and influencing consumer behavior positively. As we navigate the challenges of environmental sustainability, the active involvement of all stakeholders in shaping these online dialogues becomes increasingly crucial to foster a collective commitment to environmentally friendly choices. The data also highlights the importance of sustainability and environmental concerns in consumption behavior, highlighting the increasing consumer demand for green products and sustainable practices. Moreover, it signals a paradigm shift in the market dynamics, urging businesses to reevaluate their strategies and incorporate eco-friendly initiatives into their products and services. In light of this shift, companies that proactively embrace sustainable practices not only align themselves with consumer preferences but also position themselves as leaders in responsible and ethical business practices, fostering long-term brand loyalty and positive societal impact. Second, analysis of emotion patterns reveals emotional responses to arguments. This provides a deeper understanding of public sentiment toward green consumption, which can guide the creation of more effective and targeted marketing strategies and public policies. Furthermore, understanding the emotional aspects of environmental discussions allows businesses and policymakers to adjust their messages and projects to connect with specific emotional triggers. This increases the chances of more people adopting green practices and policies. 
This emotional awareness in decision-making has the potential to build a more empathetic and connected relationship among stakeholders, ultimately supporting the overall success of green initiatives. Third, this study indicates that discussions and sentiments about green consumption on social media have a decisive effect on consumers’ tendencies to purchase environmentally friendly products. Positive and informative content shared on social media platforms, in particular, was observed to increase the likelihood of consumers choosing sustainable products. Social media analysis therefore contributes significantly to raising public awareness about sustainability by providing valuable insights into consumers’ purchasing behavior and preferences. In light of these findings, businesses and advocacy groups can strategically use social media platforms to disseminate positive and informative content, thereby steering consumer choices toward more sustainable options. Additionally, policymakers can use these insights to design targeted campaigns that promote environmentally friendly behaviors and products, fostering a culture of environmental responsibility.

Our research has highlighted the intricate dynamics of the public discourse on green consumption across two popular social media platforms. It elucidates how the platforms shape the discourse and, in turn, the public’s attitudes toward environmentally conscious consumption. As interest in this area grows, we expect to see further research exploring the implications of these findings and their potential to inform strategies promoting sustainable consumption.

  • NOTE

    This study is one of the outputs of the project numbered 2022-A-113-07, funded by Gebze Technical University Research Fund.
  • Evaluated through a double-anonymized peer review.
  • Reviewers

    Chin Chee Hua https://orcid.org/0000-0001-7807-0496, University of Technology Sarawak, Sarawak, Malaysia, who did not authorize the disclosure of his peer review report. The second reviewer did not authorize disclosure of their identity and peer review report.

REFERENCES

  • Agyabeng-Mensah, Y., Afum, E., & Ahenkorah, E. (2020). Exploring financial performance and green logistics management practices: Examining the mediating influences of market, environmental and social performances. Journal of Cleaner Production, 258, 120613. https://doi.org/10.1016/j.jclepro.2020.120613
    » https://doi.org/10.1016/j.jclepro.2020.120613
  • Ahamad, N. R., & Ariffin, M. (2018). Assessment of knowledge, attitude, and practice towards sustainable consumption among university students in Selangor, Malaysia. Sustainable Production and Consumption, 16, 88-98. https://doi.org/10.1016/j.spc.2018.06.006
    » https://doi.org/10.1016/j.spc.2018.06.006
  • Akhtar, R., Sultana, S., Masud, M. M., Jafrin, N., & Al-Mamun, A. (2021). Consumers’ environmental ethics, willingness, and green consumerism between lower and higher income groups. Resources, Conservation and Recycling, 168, 105274. https://doi.org/10.1016/j.resconrec.2020.105274
    » https://doi.org/10.1016/j.resconrec.2020.105274
  • Balcıoğlu, Y. S. (2022). Detection of depression and anxiety synmptoms via Twitter after Covid-19 with machine learning., 2. In Başkent International Conference On Multidisciplinary Studies (pp. 261-265).
  • Bedard, S., & Reisdorf, C. A. (2018). Millennials’ green consumption behaviour: Exploring the role of social media. Corporate Social Responsibility and Environmental Management, 25(1), 1388-1396. https://doi.org/10.1002/csr.1654
    » https://doi.org/10.1002/csr.1654
  • Biswas, A. (2016). Impact of social media usage factors on green consumption behavior based on technology acceptance model. Journal of Advanced Management Science, 4(2), 92-97. https://doi.org/10.12720/joams.4.2.92-97
    » https://doi.org/10.12720/joams.4.2.92-97
  • Brzustewicz, P., & Singh, A. (2021). Sustainable consumption in consumer behavior in the time of covid-19: Topic modeling on twitter data using LDA. Energies, 14(18), 5787. https://doi.org/10.3390/en14185787
    » https://doi.org/10.3390/en14185787
  • Costa, C. S. R., da Costa, M. F., Maciel, R. G., Aguiar, E. C., & Wanderley, L. O. (2021). Consumer antecedents towards green product purchase intentions. Journal of Cleaner Production, 313, 127964. https://doi.org/10.1016/j.jclepro.2021.127964
    » https://doi.org/10.1016/j.jclepro.2021.127964
  • Djafarova, E., & Rushworth, C. (2017). Exploring the credibility of online celebrities’ Instagram profiles in influencing the purchase decisions of young female users. Computers in Human Behavior, 68, 1-7. https://doi.org/10.1016/j.chb.2016.11.009
    » https://doi.org/10.1016/j.chb.2016.11.009
  • ElHaffar, G., Durif, F., & Dubé, L. (2020). Towards closing the attitude-intention-behavior gap in green consumption: A narrative review of the literature and an overview of future research directions. Journal of Cleaner Production, 275, 122556. https://doi.org/10.1016/j.jclepro.2020.122556
    » https://doi.org/10.1016/j.jclepro.2020.122556
  • Groening, C., Sarkis, J., & Zhu, Q. (2018). Green marketing consumer-level theory review: A compendium of applied theories and further research directions. Journal of Cleaner Production, 172, 1848-1866. https://doi.org/10.1016/j.jclepro.2017.12.002
    » https://doi.org/10.1016/j.jclepro.2017.12.002
  • Huang, H., Long, R., Chen, H., Sun, K., & Li, Q. (2022). Exploring public attention about green consumption on Sina Weibo: Using text mining and deep learning. Sustainable Production and Consumption, 30, 674-685. https://doi.org/10.1016/j.spc.2021.12.017
    » https://doi.org/10.1016/j.spc.2021.12.017
  • Huseynov, F., & Yıldırım, S. O. (2019). Online consumer typologies and their shopping behaviors in B2C e-commerce platforms. Sage Open, 9(2), 1-19. https://doi.org/10.1177/2158244019854639
    » https://doi.org/10.1177/2158244019854639
  • Imamah, I., Husni, H., Rachman, E. M., Suzanti, I. O., & Mufarroha, F. A (2020). Text mining and support vector machine for sentiment analysis of tourist reviews in Bangkalan Regency. Journal of Physics, 1477, 022023. https://doi.org/10.1088/1742-6596/1477/2/022023
    » https://doi.org/10.1088/1742-6596/1477/2/022023
  • Jain, V. K., Gupta, A., Tyagi, V., & Verma, H. (2020). Social media and green consumption behavior of millennials. Journal of Content, Community & Communication, 11, 221-230. https://doi.org/10.31620/JCCC.06.20/16
    » https://doi.org/10.31620/JCCC.06.20/16
  • Jalali, S. S., & Khalid, H. (2019). Understanding Instagram influencers values in green consumption behaviour: A review paper. Open International Journal of Informatics, 7(Special Issue 1), 47-58. https://oiji.utm.my/index.php/oiji/article/view/115
    » https://oiji.utm.my/index.php/oiji/article/view/115
  • Jian, Y., Yu, I. Y., Yang, M. X., & Zeng, K. J. (2020). The impacts of fear and uncertainty of Covid-19 on environmental concerns, brand trust, and behavioral intentions toward green hotels. Sustainability, 12(20), 8688. https://doi.org/10.3390/su12208688
    » https://doi.org/10.3390/su12208688
  • Jiang, Z., & Gao, X. (2023). Text mining and quantitative evaluation of China’s green consumption policies based on green consumption objects. Environment, Development and Sustainability, 26(3), 6601-6622. https://doi.org/10.1007/s10668-023-02976-w
    » https://doi.org/10.1007/s10668-023-02976-w
  • Kluczek, A. (2017). Quick green scan: A methodology for improving green performance in terms of manufacturing processes. Sustainability, 9(1), 88. https://doi.org/10.3390/su9010088
    » https://doi.org/10.3390/su9010088
  • Kunte, A. V., & Panicker, S. (2019). Using textual data for personality prediction:A machine learning approach. 4th International Conference on Information Systems and Computer Networks (ISCON).
  • Li, M. (2020). Review of consumers’ green consumption behavior. American Journal of Industrial and Business Management, 10, 585-599. https://doi.org/10.4236/ajibm.2020.103039
    » https://doi.org/10.4236/ajibm.2020.103039
  • Lu, Y., & Park, S. D. (2022). Time series analysis of policy discourse on green consumption in China: Text mining and network analysis. Sustainability, 14(22), 14704. https://doi.org/10.3390/su142214704
    » https://doi.org/10.3390/su142214704
  • Ma, Y., & Qiao, E. (2021). Research on Accurate prediction of operating energy consumption of green buildings based on improved machine learning IEEE International Conference on Industrial Application of Artificial Intelligence (IAAI).
  • Al Mamun, A., Mohamad, M. R., Yaacob, M. R. B., & Mohiuddin, M. (2018). Intention and behavior towards green consumption among low-income households. Journal of Environmental Management, 227, 73-86. https://doi.org/10.1016/j.jenvman.2018.08.061
    » https://doi.org/10.1016/j.jenvman.2018.08.061
  • Park, J. Y., Mistur, E., Kim, D., Mo, Y., & Hoefer, R. (2022). Toward human-centric urban infrastructure: Text mining for social media data to identify the public perception of Covid-19 policy in transportation hubs. Sustainable Cities and Society, 76, 103524. https://doi.org/10.1016/j.scs.2021.103524
    » https://doi.org/10.1016/j.scs.2021.103524
  • Sajeewanie, L. A. C., Ab Yajid, M. S., Khatibi, A., Azam, F., & Tham, J.(2019). Integrated model for green purchasing intention and green adoption: Future research direction. Journal of Sociological Research, 10(2), 23-66. https://doi.org/10.5296/jsr.v10i2.14996
    » https://doi.org/10.5296/jsr.v10i2.14996
  • Saraç, Ö. (2022). Kültür Turistlerinin Sürdürülebilir Tüketim Davranışlarının Cinsiyete Göre Farklılıkları Safranbolu Üzerinde Bir Araştırma. Journal of Humanities and Tourism Research, 12(2), 265-283.
  • Serrano, L., Ariza-Montes, A., Nader, M., Sianes, A., & Law, R. (2021). Exploring preferences and sustainable attitudes of Airbnb green users in the review comments and ratings: A text mining approach. In Sustainable Consumer Behaviour and the Environment (pp. 114-132). Routledge.
  • Sharifi, A. (2021). Co-benefits and synergies between urban climate change mitigation and adaptation measures: A literature review. Science of the Total Environment, 750, 141642. https://doi.org/10.1016/j.scitotenv.2020.141642
    » https://doi.org/10.1016/j.scitotenv.2020.141642
  • Sun, X., Su, W., Guo, X., & Tian, Z. (2021). The impact of awe induced by Covid-19 pandemic on green consumption behavior in China. International Journal of Environmental Research and Public Health, 18(2), 543. https://doi.org/10.3390/ijerph18020543
    » https://doi.org/10.3390/ijerph18020543
  • Tang, H., Xu, Y., Lin, A., Heidari, A. A., Wang, M., Chen, H., ... & Li, C. (2020). Predicting green consumption behaviors of students using efficient firefly grey wolf-assisted k-nearest neighbor classifiers. IEEE Access. https://doi.org/10.1109/ACCESS.2020.2973763
    » https://doi.org/10.1109/ACCESS.2020.2973763
  • Tanveer, M., Richhariya, B., Khan, R. U., Rashid, A. H., Khanna, P., Prasad, M., & Lin, C. T. (2020). Machine learning techniques for the diagnosis of Alzheimer’s disease: A review. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 16(1s), 1-35. https://doi.org/10.1145/3344998
    » https://doi.org/10.1145/3344998
  • Wang, Y. (2021). Research on the influence mechanism of green cognition level on consumers’ green consumption behavior: An empirical study based on SPSS International Conference on Management Science and Software Engineering (ICMSSE). https://doi.org/10.1109/ICMSSE53595.2021.00044
    » https://doi.org/10.1109/ICMSSE53595.2021.00044
  • Wu, Z., Zhang, Y., Chen, Q., & Wang, H. (2021). Attitude of Chinese public towards municipal solid waste sorting policy: A text mining study. Science of the Total Environment, 756, 142674. https://doi.org/10.1016/j.scitotenv.2020.142674
    » https://doi.org/10.1016/j.scitotenv.2020.142674
  • Xie, S., & Madni, G. (2023). Impact of social media on young generation’s green consumption behavior through subjective norms and perceived green value. Sustainability, 15, 3739. https://doi.org/10.3390/su15043739
    » https://doi.org/10.3390/su15043739
  • Yang, M., Chen, H., Long, R., & Yang, J. (2022). The impact of different regulation policies on promoting green consumption behavior based on social network modeling. Sustainable Production and Consumption, 32, 468-478. https://doi.org/10.1016/j.spc.2022.05.007
    » https://doi.org/10.1016/j.spc.2022.05.007
  • Yang, W., Feng, L., Wang, Z., & Fan, X. (2023). Carbon emissions and national sustainable development goals coupling coordination degree study from a global perspective: Characteristics, heterogeneity, and spatial effects. Sustainability, 15(11), 9070. https://doi.org/10.3390/su15119070
    » https://doi.org/10.3390/su15119070
  • Yang, Y., Li, Y., & Guo, Y. (2022). Impact of the differences in carbon footprint driving factors on carbon emission reduction of urban agglomerations given SDGs: A case study of the Guanzhong in China. Sustainable Cities and Society, 85, 104024. https://doi.org/10.1016/j.scs.2022.104024
    » https://doi.org/10.1016/j.scs.2022.104024
  • Yao, J., Guo, X., Wang, L., & Jiang, H. (2022). Understanding green consumption: A literature review based on factor analysis and bibliometric method. Sustainability, 14, 8324. https://doi.org/10.3390/su14148324
    » https://doi.org/10.3390/su14148324
  • Yazdavar, A. H., Mahdavinejad, M. S., Bajaj, G., Romine, W., Sheth, A., Monadjemi, A. H., ... & Hitzler, P. (2020). Multimodal mental health analysis in social media. Plos ONE, 15(4), 1-27. https://doi.org/10.1371/journal.pone.0226248
    » https://doi.org/10.1371/journal.pone.0226248
  • Zaremohzzabieh, Z., Ismail, N., Ahrari, S., & Samah, A. A. (2021). The effects of consumer attitude on green purchase intention: A meta-analytic path analysis. Journal of Business Research, 132, 732-743. https://doi.org/10.1016/j.jbusres.2020.10.053
    » https://doi.org/10.1016/j.jbusres.2020.10.053
  • Zhao, G., Geng, Y., Sun, H., Tian, X., Chen, W., & Wu, D. (2020). Mapping the knowledge of green consumption: A meta-analysis. Environmental Science and Pollution Research, 27, 44937-44950. https://doi.org/10.1007/s11356-020-11029-y
    » https://doi.org/10.1007/s11356-020-11029-y

Edited by

Associate Editor:

Cham Tat-Huei

Publication Dates

  • Publication in this collection
    26 Aug 2024
  • Date of issue
    2024

History

  • Received
    12 Sept 2023
  • Accepted
    29 Apr 2024
Fundação Getulio Vargas, Escola de Administração de Empresas de S.Paulo, Avenida Nove de Julho, 2.029, Bela Vista, CEP: 01313-902, Telephone: +55 (11) 3799-7718 - São Paulo - SP - Brazil
E-mail: rae@fgv.br