0

Suppose I have a corpus of short sentences of which the number of words ranges from 1 to around 500 and the average number of words is around 9. If I train a Gensim Word2vec model using window=5(which is the default), should I use all of the sentences? or I should remove sentences with low word count? If so, is there a rule of thumb for the minimum number of words?

Anonymous Asked question May 14, 2021