word2vec part 2: graph building and training

Posted on Tue 03 April 2018 in blog • Tagged with python, machine learning, tensorflow, nlp, prediction, word2vec

In the last post we built the data preprocessing pipeline required for word2vec training. In this post we will build the network and train it on the text8 dataset (source), a Wikipedia dump of ~17 million tokens.

Note that we are implementing the skip-gram version of word2vec, rather than the alternative CBOW architecture, since it has superior performance.
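As a rough illustration of what skip-gram trains on, the sketch below turns a token list into (center, context) pairs, where each word is paired with its neighbours inside a fixed window. The function name `skipgram_pairs`, the `window` parameter and the toy sentence are assumptions made for this example; they are not taken from the post's own code, which may sample windows or batch the data differently.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for each token and its neighbours.

    Hypothetical helper for illustration only: `window` is the maximum
    distance between the center word and a context word.
    """
    pairs = []
    for i, center in enumerate(tokens):
        # Clip the window at the start and end of the token list.
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


tokens = "the quick brown fox".split()
pairs = skipgram_pairs(tokens, window=1)
print(pairs)
```

In skip-gram the network receives the center word as input and is trained to predict the context word, so each pair here would correspond to one (input, label) training example.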