1. Math chat can fail at the input box
In Post 2, we looked at how MAICE classifies, clarifies, and answers student questions. But in a real classroom, students meet one problem even before that flow begins: the input box.
Math questions are not made of ordinary sentences alone. They contain fractions, exponents, sigma notation, recurrence relations, and proof steps. If students have to type all of that with an ordinary keyboard, they end up fighting the input method rather than thinking about mathematics.
The point of this post is not that the UI became visually fancy. It is about the features that helped students start asking math questions instead of giving up at the input stage.
This issue also appeared in the second DBR cycle beta test (N=11). Students found LaTeX input difficult and wanted to upload photos of problems or their own solution attempts. So the goal of the MAICE frontend was not just to build a pretty chat window. It had four concrete aims:
- Lower the barrier to formula input
- Support photo-based input
- Keep AI-generated LaTeX from breaking during streaming
- Remove UI friction that interrupts learning flow
2. Why SvelteKit was chosen
The MAICE frontend was implemented with SvelteKit. The reason was practical. A small team needed to experiment quickly and handle frequently changing UI states such as formula input and real-time rendering.
Formula-input components involve many DOM changes. Streaming answers also update the screen repeatedly at short intervals. Svelte's reactive state handling, which SvelteKit builds on, made these state changes relatively simple to express.
React or Vue could also have been used. The important point was not the framework name. The goal was to keep the UI from becoming an obstacle while students were trying to write math questions.
3. Letting students write formulas directly
The MAICE input box is not just a plain text area. Students need to write ordinary language and then switch into formula input when needed. MAICE used MathLive-based input for that purpose.
For developers, MathLive is a formula-editing component. For students, it is closer to a tool that makes fractions and exponents a little less painful to enter. On mobile, students can open a math-symbol keyboard and enter fractions, exponents, parentheses, and related notation directly.
The number of features is not the point. The important point is that students do not abandon the question. If formula input is too hard and students shorten the question itself, the clarification structure from Post 2 never gets a chance to work.
4. Turning photos into question drafts
After the beta test, MAICE introduced OCR support. Here OCR means reading text and formulas from an image so that students can begin from a photo instead of retyping everything. Gemini Vision is the vision model used for that image-to-text and image-to-LaTeX step. From a student's point of view, this means they can start from a workbook problem or a handwritten solution attempt.
The actual MAICE flow is as follows. A student can choose an image file, take a photo with the camera, or paste an image from the clipboard. The image is not sent to the server immediately. It first goes through a crop screen, because one photo can contain problem numbers, blank space, other questions, or extra handwriting.
Select / capture / paste image
-> Crop the needed area
-> Recognize formulas and text on the server
-> Insert the result at the current input cursor
-> Student checks and edits it
As a development note, this flow is handled in the frontend by InlineMathInput.svelte and ImageCropModal.svelte. The cropped image is sent as the image field in FormData to POST /student/convert-image-to-latex. On the backend, image_to_latex_service.py accepts only JPG, PNG, and WebP images of up to 10 MB. It opens the image with PIL, converts it to RGB, and resizes it so that the longest side does not exceed 1536 pixels. It then sends the image to Gemini Vision with an instruction to return the mathematical content as LaTeX without unnecessary explanation.
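To make the backend limits concrete, here is a minimal sketch of the checks described above. It is not the actual code in image_to_latex_service.py. The function names, the MIME-type strings, and the reading of 10 MB as 10 * 1024 * 1024 bytes are all assumptions made for illustration. The Pillow calls that would actually open, convert, and resize the image are left out so the sketch runs with the standard library alone.

```python
from typing import Tuple

# Assumed MIME types for the JPG, PNG, and WebP formats named in the post.
ALLOWED_TYPES = {"image/jpeg", "image/png", "image/webp"}
MAX_BYTES = 10 * 1024 * 1024  # assumed interpretation of the 10 MB limit
MAX_SIDE = 1536               # longest side allowed after resizing


def validate_upload(content_type: str, size_bytes: int) -> None:
    """Raise ValueError if the upload violates the type or size limit."""
    if content_type not in ALLOWED_TYPES:
        raise ValueError(f"unsupported image type: {content_type}")
    if size_bytes > MAX_BYTES:
        raise ValueError("image exceeds the size limit")


def target_size(width: int, height: int, max_side: int = MAX_SIDE) -> Tuple[int, int]:
    """Return dimensions scaled so the longest side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # small images are left alone, never upscaled
    scale = max_side / longest
    # Keep the aspect ratio and never let a side collapse to zero pixels.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, a 3072 x 2048 phone photo would be reduced to 1536 x 1024, while an 800 x 600 crop would pass through unchanged.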
The API returns a LaTeX/text result that can be inserted back into the input field. That result is not used blindly. The service removes wrappers such as Markdown code fences and normalizes some LaTeX expressions for more stable MathLive handling. For example, \dots, \cdots, and \vdots are unified as \ldots, and \times is converted to \cdot. That kind of replacement is not always the best choice in every mathematical context, so the final check still belongs to the student.
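The cleanup step can be sketched roughly as follows. This is an illustration of the replacements named above, not the service's real implementation. The function name normalize_latex and the exact regular expressions are assumptions.

```python
import re

# A Markdown code fence wrapping the whole reply, e.g. ```latex ... ```
_FENCE = re.compile(r"^```[a-zA-Z]*\s*\n(.*?)\n?```\s*$", re.DOTALL)

# Replacements named in the post. The lookahead keeps longer command
# names that merely start with the same letters from being rewritten.
_REPLACEMENTS = [
    (re.compile(r"\\(?:dots|cdots|vdots)(?![A-Za-z])"), r"\\ldots"),
    (re.compile(r"\\times(?![A-Za-z])"), r"\\cdot"),
]


def normalize_latex(raw: str) -> str:
    """Strip a surrounding code fence and unify a few LaTeX commands."""
    text = raw.strip()
    fenced = _FENCE.match(text)
    if fenced:
        text = fenced.group(1).strip()
    for pattern, replacement in _REPLACEMENTS:
        text = pattern.sub(replacement, text)
    return text
```

With this sketch, a fenced model reply containing `a \times b` comes back as `a \cdot b`. That also shows why the student's final check matters: in a cross-product or a dimension such as 3 \times 4, the dot is not the notation the student meant.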
This point matters. MAICE's OCR is not an automatic solving feature. Uploading a photo does not immediately produce the final answer. The image is turned into a question draft and inserted into the input box. The student can check the recognized text and formulas, fix what is wrong, and add where they are stuck.
OCR can also misread content. A 1 can be confused with l, a 0 with o, and handwritten exponents or subscripts can be ambiguous. So OCR should be interpreted as a condition that lowers the input barrier, not as a direct cause of learning gains.
5. Answers also need to render without breaking
Even if students enter a good question, learning flow breaks again if the AI answer is hard to read. Formula-heavy answers are especially fragile during streaming. An unfinished LaTeX delimiter can appear before it is closed, or formula rendering can fail in the middle of a sentence.
MAICE does not wait for the entire answer to finish before showing it. It streams the response gradually. For students, this makes the interaction feel more immediate. For developers, it means the UI must tolerate incomplete formula states while the answer is still arriving.
So rendering was not just a matter of converting Markdown to HTML. Streaming-safe LaTeX means formatting and handling formulas so that a live, partially generated answer does not break the screen before the full expression arrives.
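One way to get this behavior is to hold back any formula whose closing delimiter has not arrived yet and render only the text before it. The sketch below shows that idea in Python for consistency with the backend examples. It is not MAICE's actual rendering code, and the function name split_renderable and the set of delimiters are assumptions. It re-scans the whole buffer on every chunk, and it does not distinguish a bare currency sign like $5 from the start of a formula.

```python
from typing import Tuple

# Opening delimiter -> closing delimiter. "$$" must be tried before "$".
_DELIMS = [("$$", "$$"), ("\\[", "\\]"), ("\\(", "\\)"), ("$", "$")]


def split_renderable(buffer: str) -> Tuple[str, str]:
    """Split streamed text into (safe_to_render, held_back).

    Text before the first still-open math delimiter is safe to render.
    The unfinished formula is held back until its closer arrives.
    """
    i = 0
    open_start = -1  # index where the currently open formula began
    closer = ""      # closing delimiter we are waiting for
    while i < len(buffer):
        if buffer.startswith("\\$", i):  # escaped dollar sign, not math
            i += 2
        elif open_start < 0:
            for opener, close in _DELIMS:
                if buffer.startswith(opener, i):
                    open_start, closer = i, close
                    i += len(opener)
                    break
            else:
                i += 1  # ordinary character outside math
        elif buffer.startswith(closer, i):
            i += len(closer)
            open_start, closer = -1, ""  # formula is complete
        else:
            i += 1  # character inside an open formula
    if open_start >= 0:
        return buffer[:open_start], buffer[open_start:]
    return buffer, ""
```

In use, the UI would call this on the accumulated answer after each chunk, render the first part, and keep the second part hidden or shown as plain text until the next chunk closes the formula.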
6. How to interpret the UX improvements
MathLive, OCR, and streaming-safe rendering were all important. But we should not say that they directly caused the learning effects.
The beta test and the main experiment differed in timing, participants, system stability, and UI state. OCR was also introduced together with other changes. Therefore, it would be inaccurate to say, "The results improved because OCR was added."
A safer interpretation is this: interface improvements created the participation conditions that allowed students to start and continue asking questions. The main intervention being evaluated was the question-clarification flow, and the UX helped that flow operate in the classroom.
7. Keeping technology from interrupting learning
One thing became clear while building a math AI: a good model is not enough. If students cannot enter a question, the model cannot help. If the answer breaks visually, students cannot follow the explanation.
So the core of Post 3 is not frontend technology itself. It is about helping students continue the act of asking through formula input, photo input, and stable rendering.
The next question is about evaluation. Once conversations are possible, how can we tell whether they are actually good learning conversations? Post 4 introduces the QAC checklist.