The U.S. National Science Foundation (NSF) has awarded $1.8 million in grants to the University of California-Los Angeles (UCLA) and the University of Illinois at Urbana-Champaign (UIUC) to investigate Joint Image-Text Parsing and Reasoning for Analyzing Social and Political News Events. The four-year project will be led by Professors Song-Chun Zhu at UCLA and ChengXiang Zhai at UIUC. The research aims to go "beyond traditional object detection, segmentation, and recognition by studying framing and persuasion techniques in images, an untouched topic in computer vision." In addition, "the team is studying image parsing to fill the semantic gap - a long standing technical barrier in image retrieval."
Here is the complete abstract of the research proposal, courtesy of the NSF:
Summary: Rapidly changing technologies of multi-modal communication, from the global reach of international satellite TV, the proliferation of Internet news outlets, to YouTube, are transforming the news industry. In parallel, "citizen journalism" is on the rise, enabled by smart phones, social networks, and blogs. The Internet is becoming a vast information ecosystem driven by mediated events - elections, social movements, natural disasters, disease epidemics - with rich heterogeneous data: text, image, and video. Meanwhile, the tools and methodologies for users and researchers are not keeping pace: it remains prohibitively labor-intensive to systematically access and study the vast amount of emerging news data.
Leveraging UCLA's ongoing digital collection of 85,000 hours of news videos, including 8.1 billion image frames and 530 million words of closed captioning, the research team is developing a new computational paradigm for analyzing massive datasets of social and political news events: (i) Studying joint image-text parsing to categorize news by topics and events, and analyzing selection and presentation biases across networks and media spheres in a statistical and quantitative manner never before possible; (ii) Studying joint image-text mining to reason about persuasion intents, and modeling the techniques of verbal and visual persuasion; (iii) Discovering spatio-temporal patterns in the interactions of multiple mediated events, and analyzing agenda-setting patterns; and (iv) Developing an interactive multi-perspective news interface, vrNewsScape, for visualizing and interacting with the project's computational and statistical results.
Intellectual merit: This interdisciplinary project makes innovative contributions to three disciplines. Transforming social science research. The project develops a data-driven paradigm for transforming communication research in the social sciences. By enabling quantitative studies of massive visual datasets, the research team identifies and characterizes large-scale patterns of news mediation and persuasion currently inaccessible to researchers, due to the prohibitive cost of manual analysis. The research team goes beyond traditional object detection, segmentation, and recognition by studying framing and persuasion techniques in images, an untouched topic in computer vision. The team studies semantic associations and meanings for object and scene categories in their social context. Also, the team is studying image parsing to fill the semantic gap - a long standing technical barrier in image retrieval, and will generate narrative text descriptions from the parse trees so that they can be fused with the input text and closed captioning for topic mining.
The research goes beyond conventional topic mining from text to perform integrative text-image mining, bias detection, and pattern discovery in the spatio-temporal evolution of mediated news events. The research detects and summarizes controversy and mines user-generated content to analyze communicative intent and persuasive effects.
Broader impacts: vrNewsScape is being made publicly available to researchers and graduate students. Because the news media report on events in multiple different expert domains - including congressional and presidential politics, international relations, war and public uprisings, natural disasters and humanitarian aid missions, disease epidemics and health initiatives, criminal activity and court cases, celebrities and cultural events - the analytical tools in development are not limited to a particular research domain in social, political and computer sciences, but permit for the first time a systematic and quantitative examination of the massive datasets required to understand today's mediated society.
In education, the project extends UCLA's Digital Civic Learning initiative (dcl.sscnet.ucla.edu), a program involving college and high-school students in the analysis of news, thus delivering education benefits to potentially a huge number of students nationwide in Communication Studies (in 2004, 433,000 college students were enrolled in Communication and Journalism and 209,000 in Political Science[153]), exposing them to a new generation of high-level tools for handling multimodal data and inspiring them to pursue computational thinking, in line with the NSF's objectives.