The first project of this saga will be a simple flash card system. Well, not as simple as it might be, actually. I'd like to personalize it per user: basically, keep track of the words the user has gotten wrong and emphasize those words accordingly.
This document serves two purposes: 1. brainstorming, and 2. getting my thoughts down where I can better arrange them. The application is already fairly big, and it will only become harder to keep track of.
The reason some of these objects seem too detailed is that this will roll into the next project, which will actually build sample sentences and, some day, even handle conjugation. For now, though, we can just cycle through conjugated forms of the word for recognition. Also, I think it might be easier to use the nested set model for categorization. I have a class written for it in PHP; I'll just have to port it, since this project is in Java.
I was originally going to use XML documents for data; however, it seems that a lot more numerical comparison is going to be involved than before, so SQL tables may be the best bet.
Categories
Languages
-Korean
Types
-Noun
-Verb
-Infinitive
-Causative
-Passive
-Conjugations
-Adjective
Lessons
Conjugation Levels
Objects\Beans
Word Object - Serializable
-Id
-Word
-English Meaning
-Foreign Format
-Conjugation List
* List of regular Word Objects just the conjugated forms
-Type (Noun, Verb, Adjective) (Handled Through Category)
-Image
-Category List(Associated Categories)
-Session Error Weight
-Global Error Weight
Category Object
-Id
-Name
* Still probably doing nested set model.
-Left
-Right
Conjugation Level Object
-Language
-Level
-Description
Language Object
-Id
-Name
-Native Name
-Description
-Encoding Id
Revision Object
-WordId
-UserId
-WordObject
* Entire Word Object for the revision will be stored,
* so Word Object needs to be serializable.
User Object
* Good chance this data will be linked from website data.
-Id
-Name
-Email
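A minimal sketch of how the Left/Right values on the Category Object could work under the nested set model. This is not the PHP class mentioned above; the class and method names (Category, CategoryTree, isAncestorOf, descendantsOf) are made up for illustration, and the sample numbering assumes Passive sits under Verb.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical Category bean: Id, Name, Left, Right as in the outline. */
final class Category {
    final int id;
    final String name;
    final int left;
    final int right;

    Category(int id, String name, int left, int right) {
        this.id = id;
        this.name = name;
        this.left = left;
        this.right = right;
    }

    /** In a nested set, a descendant's (left, right) interval lies strictly inside its ancestor's. */
    boolean isAncestorOf(Category other) {
        return left < other.left && other.right < right;
    }

    /** A node with no children has right == left + 1. */
    boolean isLeaf() {
        return right == left + 1;
    }
}

final class CategoryTree {
    /** Returns every category nested anywhere below the given parent. */
    static List<Category> descendantsOf(Category parent, List<Category> all) {
        List<Category> result = new ArrayList<>();
        for (Category c : all) {
            if (parent.isAncestorOf(c)) {
                result.add(c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed sample tree: Types > (Noun, Verb > Passive, Adjective)
        Category types = new Category(1, "Types", 1, 10);
        Category noun = new Category(2, "Noun", 2, 3);
        Category verb = new Category(3, "Verb", 4, 7);
        Category passive = new Category(4, "Passive", 5, 6);
        Category adjective = new Category(5, "Adjective", 8, 9);
        List<Category> all = Arrays.asList(types, noun, verb, passive, adjective);

        for (Category c : descendantsOf(verb, all)) {
            System.out.println(c.name);
        }
    }
}
```

The appeal for SQL storage is that "everything under Verb" becomes a single range comparison on two integer columns instead of a recursive walk.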
Patterns
Word Factory
-Word List
-Add Word
-Remove Word
DAO
-GetWordList
-By Language
-By Category
-SaveWord
* Save Word based on object.
* Would have to ensure all associated categories,
* languages, and types are added while obsolete
* relations removed.
-RetrieveUser
-VerifyUser
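One way the Session Error Weight and Global Error Weight on the Word Object might feed into emphasizing missed words is a weighted random pick. This is only a sketch: the names WeightedWord and WordPicker, and the formula of adding both weights to a base of 1.0, are assumptions for illustration, not a settled design.

```java
import java.util.List;
import java.util.Random;

/** Hypothetical slice of the Word Object: just the word and its two error weights. */
final class WeightedWord {
    final String word;
    final double sessionErrorWeight;
    final double globalErrorWeight;

    WeightedWord(String word, double sessionErrorWeight, double globalErrorWeight) {
        this.word = word;
        this.sessionErrorWeight = sessionErrorWeight;
        this.globalErrorWeight = globalErrorWeight;
    }

    /**
     * Assumed formula: a base of 1.0 so words never missed can still appear,
     * plus both error weights (assumed non-negative).
     */
    double selectionWeight() {
        return 1.0 + sessionErrorWeight + globalErrorWeight;
    }
}

final class WordPicker {
    /** Picks one word with probability proportional to its selection weight. */
    static WeightedWord pick(List<WeightedWord> words, Random rng) {
        if (words.isEmpty()) {
            throw new IllegalArgumentException("word list is empty");
        }
        double total = 0.0;
        for (WeightedWord w : words) {
            total += w.selectionWeight();
        }
        // Draw a point in [0, total) and walk the list until it is used up.
        double r = rng.nextDouble() * total;
        for (WeightedWord w : words) {
            r -= w.selectionWeight();
            if (r < 0) {
                return w;
            }
        }
        // Only reachable through floating point rounding; fall back to the last word.
        return words.get(words.size() - 1);
    }
}
```

With this formula, a word carrying a combined error weight of 9 would come up about ten times as often as a word that has never been missed.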
Not sure what you're planning but I support it.
Allow public wiki-style flash card additions?
Track global right/wrong statistics to rate the individual words by difficulty? The theory being that we can, given enough data, set levels? If people in Korea practice on it, the words that are most practical and common will be correct more often, and we can group those in the "easy" level. Words everyone gets wrong would, in theory, be less common as well as less mnemonic, and could go in an advanced level. And include/exclude/weight selection accordingly. Even if you don't implement that at first, just collecting the data would be good, so you could make changes later.
When I'm off at grad school with library access I can go through the linguistics texts and get frequency counts based on corpora, also. Or maybe dragoon a K linguistics student into pitching in. At any rate, hanging onto the data would be a good start.
I'd add 한글 and 한자 characteristics to the class (haven't coded in, oh, seven years, so I'm probably getting the terminology wrong). A 한자 flashcard system would be freaking useful.
Maybe, for certain verbs link passive/causative forms to the original? Like 피동? 놓다/놓이다/놓기다? Even if you don't implement it in the main app, setting up a brief lesson for that would be useful. Students need to memorize that crap anyways.
Probably going to be storing the actual words in binary form, so 한글 and 한자 characters aren't going to be a problem. And I do not intend to use any romanization on this.
As for the passive/causative, I'm not sure yet (since it wouldn't be one of the first things done), but I'm thinking we could just treat them as their own words, then handle the separation with categories.
Unicode! So it actually works on everyone's freaking computer. So sick of the lack of cross-platform multi-lingual support.
It would be cool if there were some open-ended way of linking word relations like passive forms, etymologies, situational relationships, etc. to make special lesson modules. Just tossing it out as a wish-list item.
You just reminded me of a good point. Like I said, rather than trying to store character data, I'll be storing it in binary, which makes Unicode irrelevant as far as storage goes.
One problem with browser display is that when you are using multiple languages with Unicode, sometimes the browser doesn't know which language a character is supposed to belong to. Not sure if you've seen this, but sometimes you'll see Chinese/Japanese characters where Korean characters should be, or vice versa. I'll have to declare the browser encoding type by language. So we'll do page encoding per language rather than Unicode.
Heh, strip-poker style memorization games?
You volunteering?
Type should be according to the Korean system. Little pet peeve. 명, 형, 동, 부 . . . need to look up the complete list. Which reminds me, I need to find the massive grammar book . . .
Marking irregulars like ㅎ, ㄷ, ㄹ, ㄴ 받침 (you know, the final consonant) might save you trouble later on with conjugation rules.
Also, maybe have short and long English definitions? Short definitions are optimal for rapid memorization, but they're often misleading, imprecise, or to some degree inaccurate.
Antonym/synonym relationships would be sweet, too, but that's probably beyond the scope of the project.
There may be existing comp-linguistics libraries out there with this info, but I'd need to find a K linguist who would know.
Was talking with a linguist acquaintance and he mentioned SuperMemo, which is not being updated but was an attempt to do this in the most universal way possible.
Are you going to make it open source?