[MAICE Dev Log 6] Korean curriculum terminology and LLM limits

1. LLM limit: translation-style wording and term drift

Global LLMs are trained on multilingual corpora, so in Korean school contexts they can produce translation-style wording or terms that differ from the ones used in Korean textbooks.

In mathematical induction, terminology precision matters. For students, textbook-consistent terms are not cosmetic; they affect comprehension and assessment alignment.

2. CurriculumTermAgent (design scope in this study)

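In this study, CurriculumTermAgent was designed but not fully implemented or evaluated.

Its intended role is a post-generation validator in the answer pipeline: it checks the terms in a generated answer against retrieved textbook passages (a RAG-style check) and returns correction feedback when a term does not match.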

sequenceDiagram
    participant AG as AnswerGenerator
    participant CT as CurriculumTermAgent
    participant VDB as VectorDB (Textbooks)

    AG->>CT: Generated answer text
    CT->>CT: Extract math keywords
    CT->>VDB: Search(keyword, grade level)
    VDB-->>CT: Standard term context

    alt Term mismatch
        CT->>CT: Generate correction
        CT-->>AG: Correction feedback
    else Valid
        CT-->>AG: OK
    end
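
The flow in the diagram can be sketched in a few lines of Python. This is a minimal illustration, not the agent's actual implementation: TEXTBOOK_TERMS, extract_keywords, search, and review_answer are hypothetical names, the term table holds made-up English placeholder entries rather than real curriculum data, and an exact-match dictionary stands in for the vector database.

```python
from dataclasses import dataclass

# Hypothetical data: variant -> (textbook-standard term, grade introduced).
# Placeholder entries for illustration only, not real curriculum data.
TEXTBOOK_TERMS = {
    "induction assumption": ("inductive hypothesis", 11),
    "inductive hypothesis": ("inductive hypothesis", 11),
    "base case": ("base case", 11),
}


@dataclass
class Feedback:
    found: str     # term as it appeared in the generated answer
    standard: str  # textbook-standard term to use instead


def extract_keywords(answer: str) -> list[str]:
    """Naive extraction: report every known term that occurs in the answer."""
    text = answer.lower()
    return [term for term in TEXTBOOK_TERMS if term in text]


def search(keyword: str, grade_level: int):
    """Stand-in for the VectorDB lookup: exact match, filtered by grade."""
    entry = TEXTBOOK_TERMS.get(keyword)
    if entry is None or entry[1] > grade_level:
        return None
    return entry[0]


def review_answer(answer: str, grade_level: int) -> list[Feedback]:
    """Return correction feedback; an empty list corresponds to the 'Valid' branch."""
    feedback = []
    for keyword in extract_keywords(answer):
        standard = search(keyword, grade_level)
        if standard is not None and standard != keyword:
            feedback.append(Feedback(found=keyword, standard=standard))
    return feedback
```

With these placeholder entries, an answer containing "induction assumption" yields one Feedback item pointing to "inductive hypothesis", while an answer that only uses "base case" yields an empty list.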

3. Prototype direction

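from dataclasses import dataclass

@dataclass
class Recommendation:
    status: str                    # "OK", "UNKNOWN", or "TOO_DIFFICULT"
    alternate: str | None = None   # suggested replacement term, if any

def validate_term(term, grade_level):
    curriculum_data = vector_db.search(term)  # vector_db: textbook index, defined elsewhere
    if not curriculum_data:
        return Recommendation(status="UNKNOWN")
    if curriculum_data.grade > grade_level:  # term is taught above the student's grade
        return Recommendation(status="TOO_DIFFICULT", alternate=curriculum_data.easier_synonym)
    return Recommendation(status="OK")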
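
To show how the three statuses arise, here is a self-contained sketch that runs the same logic against an in-memory stand-in for the vector store. InMemoryTermIndex and TermEntry are hypothetical helpers, the index is passed as a parameter rather than read from a global, and the grade numbers and the easier synonym are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TermEntry:
    grade: int                     # grade at which the term is introduced (illustrative)
    easier_synonym: Optional[str]  # simpler wording for lower grades, if any


@dataclass
class Recommendation:
    status: str
    alternate: Optional[str] = None


class InMemoryTermIndex:
    """Hypothetical stand-in for the vector store: exact-match lookup only."""

    def __init__(self, entries):
        self._entries = entries

    def search(self, term):
        return self._entries.get(term)


def validate_term(term, grade_level, index):
    entry = index.search(term)
    if entry is None:
        return Recommendation("UNKNOWN")
    if entry.grade > grade_level:
        return Recommendation("TOO_DIFFICULT", entry.easier_synonym)
    return Recommendation("OK")


index = InMemoryTermIndex({
    "mathematical induction": TermEntry(grade=11, easier_synonym="step-by-step proof"),
})

print(validate_term("mathematical induction", 11, index).status)  # OK
print(validate_term("mathematical induction", 9, index).status)   # TOO_DIFFICULT
print(validate_term("transfinite induction", 11, index).status)   # UNKNOWN
```

Passing the index explicitly makes the function easy to test with a stub like this one before a real textbook index exists.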

The goal is grade-appropriate, curriculum-consistent terminology support.

4. Why terminology consistency matters

In school settings, term consistency reduces cognitive noise and teacher-AI mismatch.

Pilot observations suggested recurring issues such as translation-style terms, non-standard wording, and grade-level mismatch. These should be read as trend-level observations until they are measured under controlled conditions.

5. Planned implementation notes

Planned pipeline:

textbook corpus indexing by grade
term extraction + vector retrieval
mismatch detection and correction recommendation
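
The three planned steps can be sketched end to end as follows. This is a toy under stated assumptions, not the planned implementation: character-bigram counts stand in for a real embedding model, a Python list stands in for the vector database, and build_index, nearest_term, check_term, and the 0.5 threshold are all hypothetical choices.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: character-bigram counts (stand-in for a real model)."""
    t = text.lower()
    return Counter(t[i:i + 2] for i in range(len(t) - 1))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def build_index(terms_by_grade: dict[int, list[str]]):
    """Step 1: index textbook terms together with their grade."""
    return [(grade, term.lower(), embed(term))
            for grade, terms in terms_by_grade.items() for term in terms]


def nearest_term(index, query: str, grade_level: int):
    """Step 2: retrieve the most similar term taught at or below grade_level."""
    q = embed(query)
    candidates = [(cosine(q, vec), term)
                  for grade, term, vec in index if grade <= grade_level]
    return max(candidates, default=(0.0, None))


def check_term(index, query: str, grade_level: int, threshold: float = 0.5):
    """Step 3: flag a mismatch and recommend the closest textbook term."""
    score, term = nearest_term(index, query, grade_level)
    if term is None or score < threshold:
        return ("UNKNOWN", None)
    if term == query.lower():
        return ("OK", term)
    return ("MISMATCH", term)
```

For example, with an index built from {11: ["inductive hypothesis", "base case"]}, the query "induction hypothesis" at grade 11 is flagged as a mismatch with "inductive hypothesis" as the recommendation, while the same query at grade 9 comes back as unknown because no indexed term is available at that level.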

Since full implementation and formal evaluation were outside this study's scope, no definitive performance claims are made here.

6. Practical implication

For educational AI, quality is not only about correctness but also about curricular appropriateness and terminology consistency.

CurriculumTermAgent remains a clear extension path for future iterations.