...

Our sentence segmentation code uses a lexicon-based approach. When a user contributes a new sentence, then provided its words are in the lexicon, the sentence is properly segmented, and it can subsequently be found by making any of the words it was segmented into the study focus. Because our lexicon is large and comprehensive (several megabytes), the words in a contributed sentence are usually covered, so segmentation generally succeeds. This also means that a contributor does not need to segment the sentence themselves or provide any tagging for the words: they simply input the sentence, and it is automatically segmented and annotated with its vocab words. The trade-off is that out-of-lexicon terms cannot serve as vocab words in contributed sentences; users can still include them, but the sentence will not be annotated with those words.
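As an illustration of the idea (not the project's actual implementation, whose details are not given here), a minimal lexicon-based segmenter can be sketched with greedy longest-match, assuming a space-free script such as Chinese; the lexicon contents below are invented toy data:

```python
def segment(sentence, lexicon, max_word_len=8):
    """Greedy longest-match segmentation against a lexicon.

    Characters not covered by any lexicon entry fall through as
    single-character tokens, mirroring the out-of-lexicon behaviour
    described above: the text survives, but no vocab annotation is
    produced for it.
    """
    tokens = []
    i = 0
    while i < len(sentence):
        # Try the longest candidate first, shrinking until a match is found;
        # a single character always "matches" as a fallback.
        for length in range(min(max_word_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + length]
            if length == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += length
                break
    return tokens

# Toy lexicon; the real one is several megabytes of entries.
lexicon = {"我们", "喜欢", "学习", "中文"}
print(segment("我们喜欢学习中文", lexicon))  # ['我们', '喜欢', '学习', '中文']
```

Under this scheme, looking up which vocab words a contributed sentence should be annotated with is just the set of returned tokens that appear in the lexicon, which is why no manual tagging is needed.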

The server-side code, which stores each user's login credentials, study focus, study focus history, words currently allowed to be displayed, sentences currently displayed, and contributed sentences, is implemented as a set of ASP.NET scripts that communicate with a MySQL database instance.

...