Hi everyone,
Please join us for our next DM-Meeting.
****DM-Meeting Fall 2017****
WHO: Yao Zhang
*WHEN: Monday, Sep. 18, 2017 @ 1:15 pmWHERE: Torgerson Hall, Room 3160*
WHAT: Yao will give his prep talk for his final defense.
Please, note that we will skip our meeting next week on Thursday.
The following is the title and the abstract of Yao's talk,
=================
Title: Optimizing and Understanding Network Structure for Diffusion
Abstract
Given a population contact network and electronic medical records of
patients, how to distribute vaccines to individuals to effectively control
a flu epidemic? Similarly, given the Twitter following network and tweets,
how to choose the best communities/groups to stop rumors from spreading?
How to find the best accounts that bridge celebrities and ordinary users?
These questions are related to diffusion (aka propagation) phenomena.
Diffusion can be treated as a behavior of spreading contagions (like
viruses, ideas, memes, etc.) on some underlying network. It is omnipresent
in areas such as social media, public health, and cyber security. Examples
include diseases like flu spreading on person-to-person contact networks,
memes disseminating by online adoption over online friendship networks, and
malware propagating among computer networks. When a contagion spreads,
network structure (like nodes/edges/groups, etc.) plays a major role in
determining the outcome. For instance, a rumor, if propagated by
celebrities, can go viral. Similarly, an epidemic can die out quickly, if
vulnerable demographic groups are successfully targeted for vaccination.
Hence in this thesis, we aim to optimize and understand network structure
better in light of diffusion. We optimize graph topologies by removing
nodes/edges for controlling rumors/ viruses from spreading, and gain a
deeper understanding of a network in terms of diffusion by exploring how
nodes group together for similar roles of dissemination. We develop several
novel graph mining algorithms, with different levels of granularity
(node/edge level to group/community level), from model-driven and
data-driven perspectives, focusing on topics like immunization on networks,
graph summarization, community detection and graph embedding. In contrast
to previous work, we are the first to systematically develops more
realistic, implementable and data-based graph algorithms to control
contagions. In addition, our thesis is also the first work to use diffusion
to effectively summarize graphs and understand communities/groups of
networks in a general way
1. Model-driven. Diffusion processes are usually described using
mathematical models, e.g., the Independent Cascade (IC) model in social
media, and the Susceptible-Infectious-Recovered (SIR) model in
epidemiology. Given such models, we propose to optimize network structure
for controlling propagation (the immunization problem) in several practical
and implementable settings, taking into account the presence of infections,
the uncertain nature of the data and group structure of the population. We
develop efficient algorithms for different interventions, such as
vaccination (node removal) and quarantining (edge removal). In addition, we
study the graph coarsening problem to obtain a better understanding of
relations among nodes when a contagion is propagating. We seek to get a
much smaller representation of a large network, while preserving its
diffusive properties.
2. Data-driven. Model-driven approaches can provide ideal results if
underlying diffusion models are given. However, in many situations,
diffusion processes are very complicated, and it is challenging or even
impossible to pick the most suited model to describe them. In addition,
rapid technological development has provided an abundance of data such as
tweets and electronic medical records. Hence, in the second part of the
thesis, we explore data-driven approaches for diffusion in networks, which
can directly work on propagation data by relaxing modeling assumptions of
diffusion. To be specific, we first develop data-driven immunization
strategies to stop rumors or allocate vaccines by optimizing network
topologies, using large-scale national-level diagnostic patient data with
billions of flu records. Second, we propose a novel community detection
problem to discover "bridge" and "celebrity" communities from social media
data, and design case studies to understand roles of nodes/communities
using diffusion. Second, Finally, we study the subgraph embedding problem,
which seeks to map subgraphs such as a snapshot of a cascade into low
dimensional feature space to facilitate network analysis.
Our work has many applications in multiple areas such as epidemiology,
sociology and computer science. For example, our work on efficient
immunization algorithms, such as data-driven immunization, can help CDC
better allocate vaccines to control flu epidemics in major cities.
Similarly, in social media, our work on understanding network structure
using diffusion can lead to better community discovery, such as finding
media accounts that can boost tweet promotions in Twitter.
Best regards,
Sorour