Hi all,
We had some interesting questions in the last meeting regarding learning
features for words, paragraphs, and nodes. The answers to these questions
are as follows:
1) Input to SkipGram and Continuous Bag of Words Models
In the Continuous Bag of Words model, the training data consists of tuples
(input, output), where the input is the context (several words) and the
output is a single word. For example, for the sentence "Sam lives in
Blacksburg", a possible training example is input: (Sam, lives, in) and
output: (Blacksburg).
In the SkipGram model, the situation is reversed: the goal is to predict
the context given a word. Therefore, the input is a single word and the
output is the context.
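To make the difference concrete, here is a small sketch (my own illustration, not from either paper) of how the two models' (input, output) training pairs can be generated from one sentence; the function names and window size are assumptions.

```python
# Sketch: (input, output) training pairs for CBOW vs. SkipGram,
# using a context window of 1 word on each side.
def cbow_pairs(tokens, window=1):
    """CBOW: input = context words, output = the center word."""
    pairs = []
    for i, center in enumerate(tokens):
        context = [tokens[j]
                   for j in range(max(0, i - window),
                                  min(len(tokens), i + window + 1))
                   if j != i]
        pairs.append((tuple(context), center))
    return pairs

def skipgram_pairs(tokens, window=1):
    """SkipGram: input = the center word, output = each context word."""
    return [(center, ctx)
            for context, center in cbow_pairs(tokens, window)
            for ctx in context]

sentence = "Sam lives in Blacksburg".split()
print(cbow_pairs(sentence))      # e.g. (('lives',), 'Sam'), ...
print(skipgram_pairs(sentence))  # e.g. ('Sam', 'lives'), ...
```

Note the symmetry: every CBOW pair (context, center) corresponds to several SkipGram pairs (center, context word), which is exactly the "reversed" relationship described above.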
2) Does sequence matter in DeepWalk?
The answer is no. DeepWalk uses the SkipGram model, in which the order of
items in the context does not matter. Therefore, as long as two nodes
co-occur within the SkipGram window of the same random walk (of fixed
length), they are considered close to each other, regardless of their
order in the walk.
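A minimal sketch of this idea (the function names and the toy graph are my own, not DeepWalk's actual code): generate a uniform random walk, then emit the co-occurrence pairs that SkipGram would train on. The pairs are symmetric, which is what "order does not matter" means in practice.

```python
import random

def random_walk(graph, start, length, rng=random.Random(0)):
    """Uniform random walk of a fixed length over an adjacency-list graph."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        if not neighbors:
            break
        walk.append(rng.choice(neighbors))
    return walk

def cooccurrence_pairs(walk, window=2):
    """All (node, context-node) pairs within the SkipGram window."""
    pairs = set()
    for i, u in enumerate(walk):
        for j in range(max(0, i - window), min(len(walk), i + window + 1)):
            if i != j:
                pairs.add((u, walk[j]))
    return pairs

graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
walk = random_walk(graph, start=0, length=5)
pairs = cooccurrence_pairs(walk)
# Symmetric: whenever (u, v) is a training pair, (v, u) is too.
assert all((v, u) in pairs for (u, v) in pairs)
```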
3) Difference between DeepWalk and Node2Vec
The major difference between Node2Vec and DeepWalk is that DeepWalk uses
uniform random walks while Node2Vec uses second-order random walks. The
idea behind using random walks in DeepWalk is that more central nodes in
"scale-free" networks tend to appear more frequently in random walks, so
the walks are a natural means of capturing graph structure. (Word frequency
in natural languages also follows a power-law distribution.)
In Node2Vec, the idea is to capture both a macroscopic and a microscopic
view of the network structure around a node. This is enabled by
second-order random walks, which are a generalization of random walks.
This difference has implications for the features learned. The features
learned by Node2Vec depend on the second-order random walk parameters P
(return) and Q (in-out). Therefore, for different values of P and Q, we can
get very different distances between nodes in the feature space. For
DeepWalk, by contrast, the distances between nodes in the feature space are
more or less fixed.
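Here is a sketch of a second-order (biased) random walk in the style of Node2Vec, writing the parameters in lowercase as p and q. The graph, function name, and unweighted-edge assumption are mine; the bias rule is the one from the Node2Vec paper: the unnormalized weight of stepping to a node depends on where the walk just came from (1/p to return, 1 to a common neighbor of the previous node, 1/q to move farther away).

```python
import random

def node2vec_walk(graph, start, length, p, q, rng=random.Random(0)):
    """Second-order random walk over an unweighted adjacency-list graph."""
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbors = graph[cur]
        if not neighbors:
            break
        if len(walk) == 1:
            # First step has no "previous" node, so it is uniform.
            walk.append(rng.choice(neighbors))
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbors:
            if nxt == prev:
                weights.append(1.0 / p)   # return to the previous node
            elif nxt in graph[prev]:
                weights.append(1.0)       # stay close to the previous node
            else:
                weights.append(1.0 / q)   # move outward, away from prev
        walk.append(rng.choices(neighbors, weights=weights)[0])
    return walk

graph = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
# A small q pushes the walk outward (DFS-like, macroscopic view);
# a small p encourages backtracking (BFS-like, microscopic view).
print(node2vec_walk(graph, start=0, length=6, p=1.0, q=0.5))
```

This makes the dependence on p and q concrete: changing them changes which node pairs co-occur in walks, and hence the distances between nodes in the learned feature space.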
Please let me know if you have further queries.
Best regards,
Bijaya Adhikari.