θεόφιλος Journey

Milestone 1
04 Aug 2020
Scripts Source Code This is what we have so far: Data: The whole Bible text, stored as documents in a TinyDB database. Code: A function that returns the raw text of a whole chapter given its ChapterID. An NLTK environment up and running. The code used to retrieve this information is inside...
Database support
03 Aug 2020
Since much of the data retrieved from the API comes in JSON format, and NLTK outputs are plain Python lists and strings that serialize readily to JSON, it makes sense to use a JSON document database to store partial results and to help process and analyze the data, something along the lines of...
Tokenizing with NLTK
28 Jul 2020
Using our function to get the whole chapter text, we can start trying some of the NLTK tokenizers. Tokenizing text into sentences:

    >>> import sys
    >>> import json
    >>> from nltk.tokenize import sent_tokenize
    >>> from TextFromChapter import wholeChapter
    >>> with open('GEN.1.json', 'r') as f:
    ...     chapter_dict = json.load(f)
    ......
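As a self-contained illustration of sentence tokenizing, the sketch below uses NLTK's `PunktSentenceTokenizer` directly, with its default untrained parameters. This is a deliberate substitution: `sent_tokenize` relies on a pre-trained Punkt model that has to be downloaded first, while an untrained instance needs no extra data and is enough for simple text. The sample string is a placeholder standing in for the chapter text, not output from the post's `wholeChapter` function.

```python
from nltk.tokenize.punkt import PunktSentenceTokenizer

# Placeholder text standing in for the raw text of a chapter.
sample_text = (
    "In the beginning God created the heaven and the earth. "
    "And the earth was without form, and void. "
    "And God said, Let there be light: and there was light."
)

# Untrained Punkt tokenizer: splits on sentence-final punctuation
# without needing any downloaded model data.
tokenizer = PunktSentenceTokenizer()
sentences = tokenizer.tokenize(sample_text)

for number, sentence in enumerate(sentences, start=1):
    print(number, sentence)
```

With the pre-trained model installed, `sent_tokenize(sample_text)` can be used in the same place and handles abbreviations better.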

jorge@home:~$