jorge@home:~$

  • Milestone 1

    Scripts Source Code

    This is what we have so far:

    Data: The whole Bible text, stored as documents in a TinyDB database.
    Code: A function that returns the raw text of a whole chapter when given its ChapterID.
    An NLTK environment up and running.

    The code used to retrieve this information is inside...

  • Database support

    Much of the data retrieved from the API arrives as JSON, and NLTK outputs (lists of strings and tuples) serialize naturally to JSON as well. It therefore makes sense to use a JSON document-based database to store the partial results and to help process and analyze the data, something along the lines of...

  • Tokenizing with NLTK

    Using our function to get the whole chapter text, we can start trying some of the NLTK tokenizers.

    Tokenizing text into sentences:

    >>> import sys
    >>> import json
    >>> from nltk.tokenize import sent_tokenize
    >>> from TextFromChapter import wholeChapter
    >>> with open('GEN.1.json', 'r') as f:
    ...     chapter_dict = json.load(f)
    ......
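    For readers who want to try sentence tokenizing without the chapter data, here is a minimal self-contained sketch. It is not the post's code: it assumes the third-party `nltk` package is installed, uses a placeholder string (`sample_text`) in place of the chapter text, and swaps `sent_tokenize` for an untrained `PunktSentenceTokenizer` so that no model download is needed. An untrained tokenizer knows no abbreviations, so results on real text may differ from `sent_tokenize`.

```python
# Assumes the third-party `nltk` package is installed (pip install nltk).
from nltk.tokenize.punkt import PunktSentenceTokenizer

# Placeholder text standing in for whatever the chapter function returns.
sample_text = "This is the first sentence. Here is a second one. And a third."

# Untrained Punkt tokenizer: needs no downloaded model data,
# but has learned no abbreviations or collocations.
tokenizer = PunktSentenceTokenizer()
sentences = tokenizer.tokenize(sample_text)

# Print each sentence with its position in the text.
for number, sentence in enumerate(sentences, start=1):
    print(number, sentence)
```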