jorge@home:~$

  • Processing PDF Files

    Up to this point, we have been able to extract topics from the Bible by treating each chapter as an article and then extracting topics from the resulting collection of articles. But the initial purpose of this little project of mine was to process and analyze PDF books, articles, and papers in bulk; so, let’s...

  • Topic Visualization

    Notebook g_b_jupyter_visualization: Visualizing a Gensim model. Fitting the LDA model:

        from tinydb import TinyDB, Query
        import c_a_parameters_bible as pb
        import sqlite3
        from sqlite3 import Error
        from nltk.corpus import wordnet as wn
        from gensim import corpora
        import pickle

        def get_lemma(word):
            lemma = wn.morphy(word)
            if lemma is None:
                return word...

  • Topic Modeling

    Some considerations to begin with topic modeling, learned from the guide articles: if one seeks to create a topic model that humans can interpret, one would typically choose a low number of topics (e.g., between 10 and 50). If, in contrast, one seeks the topic model to serve as...