2020 Workshop Organizers

  • Giovanni Da San Martino, Qatar Computing Research Institute, HBKU. gmartino[at]hbku.edu.qa

  • Chris Brew, LivePerson. christopher.brew[at]gmail.com

  • Giovanni Luca Ciampaglia, University of South Florida. glc3[at]mail.usf.edu

  • Anna Feldman, Montclair State University. feldmana[at]montclair.edu

  • Chris Leberknight, Montclair State University. leberknightc[at]montclair.edu

  • Preslav Nakov, Qatar Computing Research Institute, HBKU. pnakov[at]hbku.edu.qa

Past Workshops

NLP4IF 2018

NLP4IF 2019

NLP4IF 2020 Workshop on Underline Platform

(you need to register for COLING in order to access it)

NLP4IF is dedicated to NLP methods that potentially contribute (either positively or negatively) to the free flow of information on the Internet, or to our understanding of the issues that arise in this area. We hope that our workshop will have a transformative impact on society by getting closer to achieving Internet freedom in countries where accessing and sharing of information are strictly controlled by censorship.

The workshop is supported by the U.S. National Science Foundation, award no. 1828199.

The topics of interest include (but are not limited to) the following:

  • Censorship detection: detecting deleted or edited text; detecting blocked keywords/banned terms;
  • Censorship circumvention techniques: linguistically inspired countermeasures for Internet censorship, such as keyword substitution, expanding coverage of existing banned terms, text paraphrasing, linguistic steganography, generating information morphs, etc.;
  • Detection of self-censorship;
  • Identifying potentially censorable content;
  • Disinformation/misinformation detection: fake news, fake accounts, rumor detection, etc.;
  • Identification of propaganda at the document and fragment levels;
  • Identification of hate speech;
  • (Comparative) analysis of the language of propagandistic and biased texts;
  • Automatic generation of persuasive content;
  • Automatic debiasing of news content;
  • Tools to facilitate the flagging, either automatic or manual, of propaganda and bias in social media;
  • Automatic detection of coordinated propaganda campaigns, such as the use of social bots, botnets, and water armies;
  • Analysis of the diffusion and consumption of propagandistic, hyperpartisan, and extremely biased content in social networks;
  • Techniques to empirically measure Internet censorship across communication platforms;
  • Investigations of covert linguistic communication and its limits;
  • Identity and private information detection;
  • Passive and targeted surveillance techniques;
  • Ethics in NLP;
  • “Walled gardens”, personalization, and fragmentation of the online public space.


    Schedule Detail

    Please note all times are Central European Time (CET).

    • 2:00-2:05

    • 2:05-2:20 Rutvik Vijjali, Prathyush Potluri, Siddharth Kumar and Sundeep Teki. Two Stage Transformer Model for COVID-19 Fake News Detection and Fact Checking

    • 2:20-2:35 Timothy Niven and Hung-Yu Kao. Measuring Alignment to Authoritarian State Media as Framing Bias

    • 2:35-2:50 Mingi Shin, Sungwon Han, Sungkyu Park and Meeyoung Cha. A Risk Communication Event Detection Model via Contrastive Learning

    • 2:50-3:00

    • 3:00-3:15 Coffee Break

    • 3:15-4:15 Keynote talk: Aylin Caliskan

    • 4:15-4:20 Low Tea

    • 4:20-4:35 Anushka Prakash and Harish Tayyar Madabushi. Incorporating Count-Based Features into Pre-Trained Models for Improved Stance Detection

    • 4:35-4:50 Lily Li, Or Levi and Pedram Hosseini. A Multi-Modal Method for Satire Detection using Textual and Visual Cues

    • 4:50-5:00

    • 5:00-6:00 Keynote talk: Stephan Lewandowsky

    • 6:00-6:15 High Tea

    • 6:15-6:45

    • 6:45-7:00

    Keynote Speakers

    Aylin Caliskan

    (George Washington University)

    Bio: Aylin Caliskan is an Assistant Professor of Computer Science at George Washington University. Her research interests lie in AI ethics, bias in AI, machine learning, and the implications of machine intelligence on fairness and privacy. She investigates the reasoning behind biased AI representations and decisions by developing explainability methods that uncover and quantify biases of machines. Building these transparency-enhancing algorithms involves the heavy use of machine learning, natural language processing, and computer vision in novel ways to interpret AI and gain insights about bias in machines as well as society. In her recent publication in Science, she demonstrated how semantics derived from language corpora contain human-like biases. Prior to that, she developed novel privacy attacks to de-anonymize programmers using code stylometry. Her presentations on both de-anonymization and bias in machine learning received best talk awards. Her work on semi-automated anonymization of writing style furthermore received the Privacy Enhancing Technologies Symposium Best Paper Award. Aylin holds a PhD in Computer Science from Drexel University and a Master of Science in Robotics from the University of Pennsylvania. Before joining the faculty at George Washington University, she was a Postdoctoral Researcher and a Fellow at Princeton University's Center for Information Technology Policy.

    Title: Implications of Biased AI on Democracy, Equity, and Justice

    Abstract: Billions of people on the internet are exposed to the outputs of downstream natural language processing (NLP) applications on a daily basis. Many of these NLP applications use word embeddings as general-purpose language representations. In this talk, Aylin Caliskan will introduce the Word Embedding Association Test (WEAT) to demonstrate that word embeddings trained on language corpora embed the biases and associations documented by the Implicit Association Test in social psychology. In particular, WEAT measures how statistical regularities of language capture biases and stereotypes, such as racism, sexism, and attitudes toward social groups. Word embeddings are trained on text collected from the internet, which includes society's organic natural language data in addition to text from information influence operations. The adaptation of WEAT to the information influence domain automatically characterizes overall attitudes and biases associated with emerging information influence operations. Accurate analysis of these emerging topics usually requires laborious, manual analysis by experts to annotate a large set of data points to identify biases in new topics. We validate our practical and non-parametric method using known information operation-related tweets from Twitter's Transparency Report. We perform a case study on the COVID-19 pandemic to evaluate our method's performance on non-labeled Twitter data, demonstrating its usability in emerging domains.
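    The WEAT effect size mentioned in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the toy 2-d vectors stand in for real trained word embeddings, and the permutation test that WEAT uses for statistical significance is omitted.

```python
import math

def cos(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def assoc(w, A, B):
    """s(w, A, B): difference of mean similarities of word w
    to attribute sets A and B."""
    return (sum(cos(w, a) for a in A) / len(A)
            - sum(cos(w, b) for b in B) / len(B))

def weat_effect_size(X, Y, A, B):
    """Cohen's-d-style effect size over target sets X and Y."""
    sx = [assoc(x, A, B) for x in X]
    sy = [assoc(y, A, B) for y in Y]
    all_s = sx + sy
    mean_all = sum(all_s) / len(all_s)
    # sample standard deviation over all target-word associations
    std = math.sqrt(sum((s - mean_all) ** 2 for s in all_s) / (len(all_s) - 1))
    return (sum(sx) / len(sx) - sum(sy) / len(sy)) / std

# Toy embeddings: X is constructed to align with attribute A, Y with B,
# so the effect size comes out strongly positive (max possible is 2).
A = [(1.0, 0.0)]                  # attribute set A (e.g., "pleasant")
B = [(0.0, 1.0)]                  # attribute set B (e.g., "unpleasant")
X = [(1.0, 0.1), (0.9, 0.0)]      # target set X
Y = [(0.1, 1.0), (0.0, 0.9)]      # target set Y

print(weat_effect_size(X, Y, A, B))  # ≈ 1.73
```

    In real use, the tuples would be replaced by embedding vectors looked up for curated word lists (e.g., names associated with social groups as targets, pleasant/unpleasant words as attributes).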

    Stephan Lewandowsky

    (University of Bristol)

    Bio: Professor Stephan Lewandowsky is a cognitive scientist at the University of Bristol. He was an Australian Professorial Fellow from 2007 to 2012, and was awarded a Discovery Outstanding Researcher Award from the Australian Research Council in 2011. His research examines people’s memory, decision making, and knowledge structures, with a particular emphasis on how people update information in memory. His most recent research interests examine the potential conflict between human cognition and the physics of the global climate, which has led him into research in climate science and climate modeling. He has published more than 200 scholarly articles, chapters, and books, including numerous papers on how people respond to corrections of misinformation and what variables determine people’s acceptance of scientific findings. Professor Lewandowsky is an award-winning teacher and was Associate Editor of the Journal of Experimental Psychology: Learning, Memory, and Cognition from 2006 to 2008. He has also contributed around 50 opinion pieces to the global media on issues related to climate change “skepticism” and the coverage of science in the media. He is currently serving as Digital Content Editor for the Psychonomic Society and blogs routinely on cognitive research.

    Title: Diversionary agenda setting and micro-targeted persuasion

    Abstract: We are said to live in a “post-truth” era in which “fake news” has replaced real information, denial has compromised science, and the ontology of knowledge and truth has taken on a relativist element. I argue that to defend evidence-based reasoning and knowledge against those attacks, we must understand the strategies by which the post-truth world is driven forward. I depart from the premise that the post-truth era did not arise spontaneously but is the result of a highly effective political movement that deploys a large number of rhetorical strategies. I focus on two strategies: Diversionary agenda-setting by social media and the use of “micro-targeting” of political messages online. I present evidence for the existence of each strategy and its impact, and how it might be countered.

    The First Global Infodemic: Censorship, Disinformation, and Propaganda in the Era of COVID-19


    Stephan Lewandowsky (SEE ABOVE)

    Andreas Vlachos

    Andreas Vlachos is a senior lecturer at the Natural Language and Information Processing group at the Department of Computer Science and Technology at the University of Cambridge. Current projects include dialogue modelling, automated fact checking and imitation learning. He has also worked on semantic parsing, natural language generation and summarization, language modelling, information extraction, active learning, clustering, and biomedical text mining. His research is supported by ERC, EPSRC, ESRC, Facebook, Amazon, Google and Huawei.

    Joshua Tucker

    Joshua A. Tucker is Professor of Politics, affiliated Professor of Russian and Slavic Studies, and affiliated Professor of Data Science at New York University. He is the Director of NYU’s Jordan Center for Advanced Study of Russia, a co-Director of the NYU Center for Social Media and Politics, and a co-author/editor of the award-winning politics and policy blog The Monkey Cage at The Washington Post. He serves on the advisory board of the American National Election Study, the Comparative Study of Electoral Systems, and numerous academic journals, and was the co-founder and co-editor of the Journal of Experimental Political Science. His original research was on mass political behavior in post-communist countries, including voting and elections, partisanship, public opinion formation, and protest participation. More recently, he has been at the forefront of the newly emerging field of study of the relationship between social media and politics. His research in this area has included studies on the effects of network diversity on tolerance, partisan echo chambers, online hate speech, the effects of exposure to social media on political knowledge, online networks and protest, disinformation and fake news, how authoritarian regimes respond to online opposition, and Russian bots and trolls, and he is currently the co-Chair of the independent academic advisory board of the 2020 Facebook Election Research Study. An internationally recognized scholar, he has served as a keynote speaker for conferences in Sweden, Denmark, Italy, Brazil, the Netherlands, Russia, and the United States, and has given more than 100 invited research presentations at top domestic and international universities and research centers over the past decade. His research has appeared in over two-dozen scholarly journals and has been supported by a wide range of philanthropic foundations, as well as multiple grants from the National Science Foundation. 
His most recent books are the co-authored Communism’s Shadow: Historical Legacies and Contemporary Political Attitudes (Princeton University Press, 2017), and the co-edited Social Media and Democracy: The State of the Field (Cambridge University Press, 2020).

    Veronica Perez-Rosas

    Veronica Perez-Rosas is an assistant research scientist at the University of Michigan. Her research interests include machine learning, natural language processing, computational linguistics, affect recognition, and multimodal analysis of human behavior. Her research focuses on developing computational methods to analyze, recognize, and predict human affective responses during social interactions. She has authored papers in leading conferences and journals in Natural Language Processing and Computational linguistics and served as a program committee member for multiple international journals and conferences in the same fields.

    Roya Ensafi

    Roya Ensafi is an assistant professor in computer science and engineering at the University of Michigan, where her research focuses on computer networking, security, and privacy. Her notable projects with real-world impact include founding Censored Planet, a global censorship observatory, and researching the Kazakhstan HTTPS MitM interception, the Great Cannon of China, and a large-scale study of server-side geoblocking. She has received the NSF CISE Research Initiation Initiative award, the Google Faculty Research Award, and a Consumer Reports Digital Lab fellowship. Roya’s work has appeared in popular press publications such as The New York Times, Wired, Business Insider, and Ars Technica.



    The NLP4IF Workshop is held in conjunction with COLING 2020 (Barcelona, Spain), which will be held virtually from December 8th through the 11th.

    The workshop will be run virtually on the Underline Platform (you need to register for COLING in order to access it).

    Important Dates

    Submission deadline: September 1, 2020 (23:59 Pacific Standard Time)

    Notification of acceptance: October 1, 2020

    Camera-ready papers due: October 20, 2020

    Workshop: December 12, 2020


    According to a recent report produced by Freedom House (freedomhouse.org), an independent watchdog organization dedicated to the expansion of freedom and democracy around the world, political rights and civil liberties around the world deteriorated to their lowest point in more than a decade in 2017. Online manipulation and disinformation tactics played an important role in elections in at least 18 countries over the past year, including the United States (see Freedom House reports). Disinformation tactics contributed to a seventh consecutive year of overall decline in internet freedom, as did a rise in disruptions to mobile internet service and increases in physical and technical attacks on human rights defenders and independent media. A record number of governments have restricted mobile internet service for political or security reasons, often in areas populated by ethnic or religious minorities. The use of “fake news,” automated “bot” accounts, and other manipulation methods gained particular attention in the United States. While the country’s online environment remained generally free, it was troubled by a proliferation of fabricated news articles, divisive partisan vitriol, and aggressive harassment of many journalists, both during and after the presidential election campaign. Venezuela, the Philippines, and Turkey were among 30 countries where governments were found to employ armies of “opinion shapers” to spread government views, to drive particular agendas, and to counter government critics on social media. The number of governments attempting to control online discussions in this manner has risen each year since Freedom House began systematically tracking the phenomenon in 2009.

    Various barriers prevent citizens of many countries around the world from accessing information. Some are infrastructural and economic; others involve violations of user rights, such as surveillance, invasions of privacy, and repercussions for online speech and activities, including imprisonment, extralegal harassment, and cyberattacks. Yet another area is limits on content, which involves legal regulations on content, technical filtering and blocking of websites, and (self-)censorship. Large Internet service providers (ISPs) are effective monopolies and have the power to use NLP techniques to control the flow of information. Users have been suspended or banned, sometimes without human intervention and with little opportunity for redress. Users have reacted by using coded, oblique, or metaphorical language and by taking steps to conceal their identities, such as using multiple accounts, raising questions about who the real originating author of a post actually is.

    Submissions should be written in English and anonymized with regard to the authors and/or their institution (no author-identifying information on the title page or anywhere in the paper, including in the referencing style). Authors should also ensure that identifying meta-information is removed from files submitted for review.

    Dual submission policy: papers being submitted to other conferences or workshops can be submitted in parallel to COLING, on condition that submissions at other conferences will be withdrawn if the paper is accepted for COLING. Authors must clearly specify the other conferences or workshops to which the paper is being submitted and also declare that they will withdraw these other submissions if the paper is accepted for COLING, or alternatively, withdraw the paper from COLING 2020. Please list the names and dates of conferences, workshops or meetings where you have submitted or plan to submit this paper in addition to COLING 2020.

    We invite long papers of up to nine (9) pages and short papers of up to four (4) pages, plus bibliography in both cases. The COLING 2020 templates must be used; these are provided in LaTeX and also in Microsoft Word format. Submissions will only be accepted in PDF format.

    Submission page: https://www.softconf.com/coling2020/NLP4IF/

    Formatting requirements: https://coling2020.org/coling2020.zip

  • Tariq Alhindi, Columbia University (USA)
  • Alberto Barrón-Cedeño, University of Bologna (Italy)
  • Jed Crandall, University of New Mexico, NM (USA)
  • Anjalie Field, Carnegie Mellon University, PA (USA)
  • Yiqing Hua, Cornell Tech (USA)
  • Jeffrey Knockel, The Citizen Lab, University of Toronto (Canada)
  • Henrique Lopes Cardoso, University of Porto (Portugal)
  • Hannah Rashkin, University of Washington (USA)
  • Mailing list for the workshop: https://groups.google.com/forum/#!forum/nlp4if