TY - GEN
T1 - Detecting Rumors Transformed from Hong Kong Copypasta
AU - Fung, Yin Chun
AU - Lee, Lap Kei
AU - Chui, Kwok Tai
AU - Lee, Ian Cheuk Yin
AU - Chan, Morris Tsz On
AU - Cheung, Jake Ka Lok
AU - Lam, Marco Kwan Long
AU - Wu, Nga In
AU - Lu, Markus
N1 - Publisher Copyright: © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2023
Y1 - 2023
N2 - A copypasta is a piece of text that is repeatedly copied and pasted in online forums and social networking sites (SNSs), usually for a humorous or mocking purpose. In recent years, copypasta has also been used to spread rumors and false information, which not only damages the reputation of individuals or organizations but also misleads many netizens. This paper presents a tool for Hong Kong netizens to detect text messages that are copypasta or their variants (created by transforming an existing copypasta with new subjects and events). We exploit the Encyclopedia of Virtual Communities in Hong Kong (EVCHK), which contains a database of 315 commonly occurring copypasta in Hong Kong, and a CNN model to determine whether a text message is a copypasta or its variant, with an accuracy rate of around 98%. We also present a prototype of a Google Chrome browser extension that provides a user-friendly interface for netizens to identify copypasta and their variants directly in a selected text message (e.g., in an online forum or SNS). The tool can show the source of the corresponding copypasta and, for a variant, highlight the differences from it. In a survey, users agreed that our tool can effectively help them identify copypasta and hence help stop the spread of this kind of online rumor.
AB - A copypasta is a piece of text that is repeatedly copied and pasted in online forums and social networking sites (SNSs), usually for a humorous or mocking purpose. In recent years, copypasta has also been used to spread rumors and false information, which not only damages the reputation of individuals or organizations but also misleads many netizens. This paper presents a tool for Hong Kong netizens to detect text messages that are copypasta or their variants (created by transforming an existing copypasta with new subjects and events). We exploit the Encyclopedia of Virtual Communities in Hong Kong (EVCHK), which contains a database of 315 commonly occurring copypasta in Hong Kong, and a CNN model to determine whether a text message is a copypasta or its variant, with an accuracy rate of around 98%. We also present a prototype of a Google Chrome browser extension that provides a user-friendly interface for netizens to identify copypasta and their variants directly in a selected text message (e.g., in an online forum or SNS). The tool can show the source of the corresponding copypasta and, for a variant, highlight the differences from it. In a survey, users agreed that our tool can effectively help them identify copypasta and hence help stop the spread of this kind of online rumor.
KW - Copypasta
KW - Natural language processing
KW - Rumor detection
UR - http://www.scopus.com/inward/record.url?scp=85149625805&partnerID=8YFLogxK
UR - https://www.mendeley.com/catalogue/554ea22a-800d-3b84-9c87-695abfcba0bf/
U2 - 10.1007/978-3-031-22018-0_2
DO - 10.1007/978-3-031-22018-0_2
M3 - Conference contribution
AN - SCOPUS:85149625805
SN - 9783031220173
T3 - Lecture Notes in Networks and Systems
SP - 11
EP - 23
BT - International Conference on Cyber Security, Privacy and Networking, ICSPN 2022
A2 - Nedjah, Nadia
A2 - Martínez Pérez, Gregorio
A2 - Gupta, B.B.
PB - Springer Science and Business Media Deutschland GmbH
T2 - International Conference on Cyber Security, Privacy and Networking, ICSPN 2022
Y2 - 9 September 2022 through 11 September 2022
ER -