2021 Workshop Organizers

  • Giovanni Da San Martino, University of Padova. dasan[at]math.unipd.it

  • Anna Feldman, Montclair State University. feldmana[at]montclair.edu

  • Chris Leberknight, Montclair State University. leberknightc[at]montclair.edu

  • Preslav Nakov, Qatar Computing Research Institute, HBKU. pnakov[at]hbku.edu.qa


Past Workshops

NLP4IF 2018

NLP4IF 2019

NLP4IF 2020


Shared Task

We are happy to formally announce this year's two shared tasks, co-located with the NLP4IF workshop at NAACL 2021.

Task 1: Fighting the COVID-19 Infodemic (in English, Arabic, and Bulgarian)

Predict several binary properties of a tweet about COVID-19: whether it is harmful, whether it contains a verifiable claim, whether it may be of interest to the general public, whether it appears to contain false information, etc.: https://gitlab.com/NLP4IF/nlp4if-2021

Task 2: Censorship Detection (in Chinese)

Predict from its text whether a tweet is going to be censored: https://gitlab.com/NLP4IF/nlp4-if-censorship-detection

Further details are available here: https://gitlab.com/NLP4IF


Important Dates
    Regular Papers
  • March 22, 2021: Workshop Papers Due Date
  • April 15, 2021: Notification of Acceptance
  • April 26, 2021: Camera-ready papers due (hard deadline)
    Shared Task
  • February 24, 2021: Training data released
  • April 6, 2021: Test input released
  • April 8, 2021: Test submissions due
  • April 12, 2021: System descriptions due
  • April 19, 2021: System descriptions notification of acceptance
  • April 26, 2021: Camera-ready system descriptions due (hard deadline)

NLP4IF 2021 Workshop

    NLP4IF is dedicated to NLP methods that potentially contribute (either positively or negatively) to the free flow of information on the Internet, or to our understanding of the issues that arise in this area. We hope that our workshop will have a transformative impact on society by getting closer to achieving Internet freedom in countries where accessing and sharing of information are strictly controlled by censorship.

    The workshop is supported by the U.S. National Science Foundation, award No. 1828199.

    The topics of interest include (but are not limited to) the following:

  • Censorship detection: detecting deleted or edited text; detecting blocked keywords/banned terms;
  • Censorship circumvention techniques: linguistically inspired countermeasures for Internet censorship such as keyword substitution, expanding coverage of existing banned terms, text paraphrasing, linguistic steganography, generating information morphs, etc.;
  • Detection of self-censorship;
  • Identifying potentially censorable content;
  • Disinformation/Misinformation detection: fake news, fake accounts, rumor detection, etc.;
  • Identification of propaganda at document and fragment level
  • Identification of hate speech
  • (Comparative) analysis of the language of propagandistic and biased texts
  • Automatic generation of persuasive content
  • Automatic debiasing of news content
  • Tools to facilitate the flagging, either automatic or manual, of propaganda and bias in social media
  • Automatic detection of coordinated propaganda campaigns such as the use of social bots, botnets, and water armies
  • Analysis of diffusion and consumption of propagandistic, hyperpartisan, and extremely biased content in social networks
  • Techniques to empirically measure Internet censorship across communication platforms;
  • Investigations on covert linguistic communication and its limits;
  • Identity and private information detection;
  • Passive and targeted surveillance techniques;
  • Ethics in NLP;
  • “Walled gardens”, personalization and fragmentation of the online public space;

    We accept submissions of short and long papers. See the guidelines here: https://2021.naacl.org/calls/style-and-formatting/

    Submission page: https://www.softconf.com/naacl2021/nlp4if2021/

    Schedule Detail

    Please note all times are PDT (Los Angeles time, GMT-7)

    • 8:00-8:10 (PDT)

      Introduction

      Opening Remarks

    • 8:10-9:10 (PDT)

      Invited Speaker: Filippo Menczer

      "4 Reasons Why Social Media Make Us Vulnerable to Manipulation"

    • 9:10-9:25 (PDT)

      "Leveraging Community and Author Context to Explain the Performance and Bias of Text-Based Deception Detection Models."

      Galen Weld, Ellyn Ayton, Tim Althoff and Maria Glenski
    • 9:25-9:40 (PDT)

      "Improving Hate Speech Type and Target Detection with Hateful Metaphor Features"

      Jens Lemmens, Ilia Markov and Walter Daelemans
    • 9:40-9:55 (PDT)

      “An Empirical Assessment of the Qualitative Aspects of Misinformation in Health News”

      Chaoyuan Zuo, Qi Zhang and Ritwik Banerjee
    • 10:10-11:10 (PDT)

      Invited Speaker: Margaret E. Roberts

      "Resilience to Online Censorship"

    • 11:10-11:25 (PDT)

      “Generalisability of Topic Models in Cross-corpora Abusive Language Detection”

      Tulika Bose, Irina Illina and Dominique Fohr
    • 11:25-11:40 (PDT)

      “Extractive and Abstractive Explanations for Fact-Checking and Evaluation of News”

      Ashkan Kazemi, Zehua Li, Verónica Pérez-Rosas and Rada Mihalcea
    • 11:40-11:55 (PDT)

      “Findings of the NLP4IF-2021 Shared Tasks on Fighting the COVID-19 Infodemic and Censorship Detection”

      Shaden Shaar, Firoj Alam, Giovanni Da San Martino, Alex Nikolov, Wajdi Zaghouani, Preslav Nakov and Anna Feldman
    • 11:55-12:00 (PDT)

      Best Paper Award

    • 12:00-1:25 (PDT)

      Poster Session Poster and Program Schedule

    • 1:25-2:25 (PDT)

      Panel

    • 2:25-2:30 (PDT)

      Ending Remarks

    Keynote Speakers


    Filippo Menczer

    (Indiana University)
    fil [at] iu.edu

    Bio: Filippo Menczer is a distinguished professor of informatics and computer science and director of the Observatory on Social Media at Indiana University. He holds a Laurea in Physics from the Sapienza University of Rome and a Ph.D. in Computer Science and Cognitive Science from the University of California, San Diego. Dr. Menczer is an ACM Fellow and a board member of the IU Network Science Institute. His research interests span Web and data science, computational social science, science of science, and modeling of complex information networks. In the last ten years, his lab has led efforts to study online misinformation spread and to develop tools to detect and counter social media manipulation. http://cnets.indiana.edu/fil/bio/sketch/

    Title: "4 Reasons Why Social Media Make Us Vulnerable to Manipulation"

    Abstract: As social media become major channels for the diffusion of news and information, it becomes critical to understand how the complex interplay between cognitive, social, and algorithmic biases triggered by our reliance on online social networks makes us vulnerable to manipulation and disinformation. This talk overviews ongoing network analytics, modeling, and machine learning efforts to study the viral spread of misinformation and to develop tools for countering the online manipulation of opinions. Joint work with collaborators at the Indiana University Observatory on Social Media (osome.iu.edu). This research is supported by the National Science Foundation, McDonnell Foundation, DARPA, Democracy Fund, Craig Newmark Philanthropies, and Knight Foundation. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of these funding agencies.


    Margaret E. Roberts

    (University of California - San Diego)
    meroberts [at] ucsd.edu

    Bio: Margaret Roberts is an Associate Professor at UC San Diego in the Department of Political Science and the Halıcıoğlu Data Science Institute. Her research interests lie in the intersection of political methodology and the politics of information, specifically focused on censorship and propaganda, digital politics, and the use of text analysis in social science. Her work has appeared in venues such as the American Journal of Political Science, American Political Science Review, Political Analysis and Science. Her recent book “Censored: Distraction and Diversion Inside China’s Great Firewall” was listed as one of the Foreign Affairs Best Books of 2018, was honored with the Goldsmith Book Award, and has been awarded the Best Book Award in the Human Rights Section and Information Technology and Politics Section of the American Political Science Association. She received a Ph.D. from Harvard University in 2014 and M.S. in Statistics from Stanford University in 2009. She holds the Chancellor's Associates Endowed Chair I at UCSD.

    Title: "Resilience to Online Censorship"

    Abstract: To what extent are Internet users resilient to online censorship? When does censorship influence consumption of information and when does it create backlash? Drawing on data reflecting censorship evasion of the Great Firewall of China, I examine the extent to which individuals affected by censorship seek out ways to route around it. Using censorship events of Wikipedia and Instagram, I examine how changes in the censorship and political environment influence censorship evasion. I find that crisis, as well as censorship of very popular and addictive websites, can create incentives for censorship evasion that in turn provides a gateway to long censored and sensitive political information. But, in the absence of a strong incentive to evade censorship, censorship events cut off access not only to political information, but also to opportunities for exploration and learning. Based on joint work with Jennifer Pan and Will Hobbs.


    Panel: “Misinformation, Disinformation, and Internet Freedom in the Time of the First Global Infodemic”


    Panelists



    Anjalie Field (anjalief [at] andrew.cmu.edu)

    Anjalie Field is a PhD student at the Language Technologies Institute at Carnegie Mellon University, advised by Yulia Tsvetkov, and a member of TsvetShop. Her primary interests involve using Natural Language Processing (NLP) to model social science concepts. Her current work is focused on framing and media bias. She received her bachelor's degree from Princeton University in Computer Science with minors in Latin and Ancient Greek. Her undergraduate thesis advisor was Christiane Fellbaum, and her thesis focused on an automated analysis of Latin.


    Antoinette Pole (polea [at] montclair.edu)

    Antoinette Pole is an Associate Professor and Deputy Chair of Political Science & Law at Montclair State University. She studies the intersection of information technology and politics, exploring theoretical questions related to representation and political participation. Professor Pole has authored a book on blogs titled Blogging the Political: Politics and Participation in a Networked Society (Routledge, 2010) and coauthored New York Politics: A Tale of Two States, Second Edition (ME Sharpe, 2010). Her work appears in peer-reviewed journals such as New Media & Society, Health Education & Behavior, British Food Journal, Journal of Agriculture and Human Values, Public Choice, American Journal of Public Health, and the Journal of Health Communication. Currently, Professor Pole's research focuses on craft beer consumption, fish consumption, and how the 2016 election shaped interactions on Facebook. https://www.montclair.edu/profilepages/view_profile.php?username=polea


    Paolo Rosso (prosso [at] dsic.upv.es)

    Paolo Rosso is Full Professor at the Universitat Politècnica de València, where he is also a member of the PRHLT research center (http://personales.upv.es/prosso/). His research interests are focused on social media data analysis, mainly on author profiling, sarcasm detection, fake news and hate speech detection. He has published 50+ articles in journals and 400+ articles in conferences and workshops, and he is among the top 25 most cited computer science researchers in Spain (H-index: 60). Since 2014 he has been Deputy Steering Committee Chair of the CLEF Association. Currently, he is the PI of the Spanish research project on MIsinformation and Miscommunication: FAKEnHATE, and a member of the European Digital Media Observatory IBERIFIER on monitoring the threats of disinformation. He gave keynotes on fake news and hate speech at CICLing-2019 and TSD-2020, and a tutorial at CIKM-2020 on online harmful information. He was co-organiser of the shared task at PAN on Fake news spreaders on Twitter. His work on the detection of false information and hate speech was covered by Spanish media, and recently he gave a webinar at Spain AI on Detection of harmful information: fake news, conspiracy theories and hate speech. He has advised 23 PhD students; the most recent PhD thesis (2020) was On the detection of false information: from rumors to fake news.


    Roberto Di Pietro (rdipietro [at] hbku.edu.qa)

    Dr. Roberto Di Pietro, ACM Distinguished Scientist, is Full Professor in Cybersecurity at HBKU-CSE. Previously, he was Global Head of Cybersecurity Research at Nokia Bell Labs, and Associate Professor (with tenure) of Computer Science at University of Padova, Italy. He has been working in the security field for 24+ years, leading both technology-oriented and research-focused teams in the private sector, government, and academia (MoD, United Nations HQ, EUROJUST, IAEA, WIPO). His main research interests include AI-driven cybersecurity, security and privacy for wired and wireless distributed systems (e.g. Blockchain technology, Cloud, IoT, OSNs), virtualization security, applied cryptography, computer forensics, and data science. Besides being involved in M&A of start-ups, and having founded one (exited), he has produced 230+ scientific papers and patents on the cited topics, has co-authored three books, edited one, and contributed to a few others. In 2011-2012 he was awarded a Chair of Excellence from University Carlos III, Madrid. In 2020 he received the Jean-Claude Laprie Award for having significantly influenced the theory and practice of Dependable Computing.


    VENUE

    Co-located with NAACL 2021 (https://2021.naacl.org/), currently scheduled to be held in Mexico City, Mexico.

    Important Dates (Regular Papers)

    March 22, 2021: Workshop Papers Due Date

    April 15, 2021: Notification of Acceptance

    April 26, 2021: Camera-ready papers due (hard deadline)

    June 6, 2021: NLP4IF Workshop

    Note: All deadlines are 11:59 pm UTC-12 (anywhere on Earth).

    Program

    According to a recent report produced by Freedom House (freedomhouse.org), an independent watchdog organization dedicated to the expansion of freedom and democracy around the world, political rights and civil liberties around the world deteriorated to their lowest point in more than a decade in 2017. Online manipulation and disinformation tactics played an important role in elections in at least 18 countries over the past year, including the United States (see Freedom House reports). Disinformation tactics contributed to a seventh consecutive year of overall decline in internet freedom, as did a rise in disruptions to mobile internet service and increases in physical and technical attacks on human rights defenders and independent media. A record number of governments have restricted mobile internet service for political or security reasons, often in areas populated by ethnic or religious minorities. The use of “fake news,” automated “bot” accounts, and other manipulation methods gained particular attention in the United States. While the country’s online environment remained generally free, it was troubled by a proliferation of fabricated news articles, divisive partisan vitriol, and aggressive harassment of many journalists, both during and after the presidential election campaign. Venezuela, the Philippines, and Turkey were among 30 countries where governments were found to employ armies of “opinion shapers” to spread government views, to drive particular agendas, and to counter government critics on social media. The number of governments attempting to control online discussions in this manner has risen each year since Freedom House began systematically tracking the phenomenon in 2009. Various barriers prevent citizens of many countries around the world from accessing information.
    Some involve infrastructural and economic barriers; others include violations of user rights, such as surveillance, privacy breaches, and repercussions for online speech and activities (imprisonment, extralegal harassment, or cyberattacks). Yet another area is limits on content, which involves legal regulations on content, technical filtering and blocking of websites, and (self-)censorship. Large Internet service providers (ISPs) are effective monopolies, and have the power to use NLP techniques to control the information flow. Users have been suspended or banned, sometimes without human intervention, and with little opportunity for redress. Users have reacted by using coded, oblique, or metaphorical language and by taking steps to conceal their identity, such as using multiple accounts, which raises questions about who the real originating author of a post actually is.

    Submissions should be written in English and anonymized with regard to the authors and/or their institution (no author-identifying information on the title page nor anywhere in the paper), including referencing style as usual. Authors should also ensure that identifying meta-information is removed from files submitted for review.

    Dual submission policy: NAACL-HLT 2021 will not consider any paper that is under review in a journal or another conference at the time of submission. This policy covers all refereed and archival conferences and workshops (including ACL workshops). For example, a paper under review at an EACL workshop cannot be dual-submitted to NAACL-HLT 2021. In addition, we will not consider any paper that overlaps significantly (more than 25%) in content or results with papers that will be (or have been) published elsewhere. Papers may not be submitted elsewhere during the NAACL-HLT 2021 review period. Authors submitting more than one paper to NAACL-HLT 2021 must ensure that the submissions do not overlap significantly (more than 25%) with each other in content or results.

    FAQ (virtual attendance and LaTeX templates): https://2021.naacl.org/faq/

  • Giovanni Da San Martino, Scientist, Qatar Computing Research Institute. gmartino[at]qf.org.qa
  • Anna Feldman, Professor of Linguistics and Computer Science at Montclair State University. feldmana[at]montclair.edu
  • Chris Leberknight, Associate Professor of Computer Science at Montclair State University. leberknightc[at]montclair.edu
  • Preslav Nakov, Senior Scientist, Qatar Computing Research Institute. pnakov[at]qf.org.qa
  • Tariq Alhindi, Columbia University (USA)
  • Alberto Barrón-Cedeño, University of Bologna (Italy)
  • Jed Crandall, University of New Mexico, NM (USA)
  • Anjalie Field, Carnegie Mellon University, PA (USA)
  • Yiqing Hua, Cornell Tech (USA)
  • Jeffrey Knockell, The Citizen Lab, University of Toronto (Canada)
  • Henrique Lopes Cardoso, University of Porto (Portugal)
  • Hannah Rashkin, University of Washington (USA)