[MAICE Dev Log 2] How the agents collaborate (Multi-Agent System)

1. At first, one prompt seemed enough

The problem from Post 1 looked simple: if a student question is ambiguous, do not answer immediately. Help the student make the question clearer first.

So at first, we tried putting every role into one prompt.

"Understand the student's level, classify the Bloom knowledge dimension, ask back if the question is unclear, and generate an answer. Also keep textbook terminology when possible."

As a sentence, it sounded plausible. But in real conversations, the roles collided: the same prompt had to decide whether to ask back and also produce a final answer. Some questions needed a follow-up question, yet the AI started solving them right away. In other cases, the question was already specific enough, but the AI still asked back unnecessarily.

The issue was not simply that the model was not smart enough. The kinds of judgment were different. Identifying the question, asking back, generating an answer, and observing the dialogue flow are different responsibilities.

So MAICE chose to separate the roles instead of making the prompt longer.

The point of this post is not to memorize agent names. It is to understand why "judging a question" and "making an answer" had to be separated.

2. Splitting the system into five roles

MAICE runs five core agents. For readers, the roles matter before the file names.

Role	What it does	Educational meaning
Question classification	Judges what the question is asking	Checks the nature of the question before answering
Question clarification	Asks back when the question is ambiguous	Turns Dewey's problem-definition stage into dialogue
Answer generation	Answers a question that is clear enough	Generates explanations fitted to student level and context
Learning observation	Observes the session flow	Tracks what happened in the conversation
Freepass response	Answers immediately without clarification	Handles the comparison condition and immediate response flow

As a development note, the actual code names are QuestionClassifier, QuestionImprovement, AnswerGenerator, Observer, and FreeTalker. agent/worker.py launches these five agents as separate processes and restarts any process that has died.
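The launch-and-restart behavior can be sketched roughly as follows. This is a minimal illustration, not the contents of worker.py: run_agent, spawn, restart_dead, supervise, and the polling interval are all hypothetical names and values chosen for the example. Only the five agent names come from the project.

```python
import multiprocessing as mp
import time

AGENT_NAMES = [
    "QuestionClassifier",
    "QuestionImprovement",
    "AnswerGenerator",
    "Observer",
    "FreeTalker",
]


def run_agent(name: str) -> None:
    """Hypothetical agent entry point; a real agent would consume messages here."""
    while True:
        time.sleep(1)


def spawn(name: str) -> mp.Process:
    """Start one agent in its own OS process."""
    process = mp.Process(target=run_agent, args=(name,), name=name, daemon=True)
    process.start()
    return process


def restart_dead(processes: dict, spawn_fn=spawn) -> list:
    """Replace every process that is no longer alive; return the restarted names."""
    restarted = []
    for name, process in list(processes.items()):
        if not process.is_alive():
            processes[name] = spawn_fn(name)
            restarted.append(name)
    return restarted


def supervise(poll_seconds: float = 5.0) -> None:
    """Launch all agents, then keep checking for dead ones (runs until interrupted)."""
    processes = {name: spawn(name) for name in AGENT_NAMES}
    while True:
        time.sleep(poll_seconds)
        for name in restart_dead(processes):
            print(f"restarted {name}")
```

To try it, call supervise() from a script's `if __name__ == "__main__":` block. Keeping restart_dead separate from the loop makes the restart rule easy to check without starting real processes.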

This post covers only the five agents that were part of the active thesis-experiment pipeline. Code that remains in the repository but was not used in the experiment or in validating its effects plays no part in interpreting the results.

3. Where does a student question flow?

When a student submits a question, MAICE does not start by generating an answer. It first checks whether the question is ready to be answered.

For example, "How do I prove this with mathematical induction?" can be treated as a procedural question. It is still rough, but we can at least see the unit and the activity the student is asking about.

"I don't get this" is different. We do not know which concept is difficult, what solution attempt the student made, or what kind of explanation is needed. If the system gives a solution immediately, the student may never revisit the actual point of confusion.

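So MAICE branches at this point. A question judged clear enough goes to answer generation. A question judged ambiguous goes to clarification first.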

The key is not only whether the AI can answer well. The system first asks whether the question is prepared well enough to answer.
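As a rough sketch, the branch comes down to a small routing function. The Classification shape and the freepass flag are assumptions made for illustration. The post does not show the real classifier output or how the comparison condition is selected.

```python
from dataclasses import dataclass


@dataclass
class Classification:
    """Hypothetical classifier output; the real schema is not shown in this post."""
    answerable: bool
    reason: str = ""


def route(classification: Classification, freepass: bool = False) -> str:
    """Return the name of the agent that should handle the question next."""
    if freepass:
        # Assumption: the comparison condition skips clarification entirely.
        return "FreeTalker"
    if classification.answerable:
        return "AnswerGenerator"
    return "QuestionImprovement"


# "How do I prove this with mathematical induction?" -> treated as answerable
print(route(Classification(answerable=True, reason="procedural question")))
# "I don't get this" -> needs clarification first
print(route(Classification(answerable=False, reason="no concept named")))
```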

4. How agents pass messages to one another

Once roles are separated, a new problem appears: each agent must safely pass its decision to the next part of the system.

If everything were a single function call, this would be easy. But in MAICE, answers are streamed, agent processes run separately, and session logs must be stored. The system also has to survive deployments and process restarts.

MAICE therefore uses Redis Streams for message passing. A Redis stream is an append-only message log. With consumer groups, several workers can read from the same stream while Redis tracks which consumer received each message and whether that message has been acknowledged.

A classroom analogy helps. A student's question sheet is not thrown randomly onto a desk in the teachers' office. It goes into a work tray, the responsible teacher takes it, marks it as handled, and passes it to the next person when needed.

As a development note, this flow is gathered in agent/utils/redis_streams_client.py. The system uses streams such as maice:backend_to_agent_stream for backend-to-agent messages, maice:agent_to_backend_stream for agent-to-backend messages, and maice:agent_to_backend_stream_session_{session_id} to separate session-specific responses.
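The publish, read, and acknowledge cycle might look something like the sketch below. It is not the code in redis_streams_client.py. The helper names, the group and consumer names, and the single "data" field holding JSON are assumptions. The client argument is assumed to be a redis-py redis.Redis instance created with decode_responses=True, and the only calls used on it are xadd, xreadgroup, and xack.

```python
import json

BACKEND_TO_AGENT = "maice:backend_to_agent_stream"
AGENT_TO_BACKEND = "maice:agent_to_backend_stream"


def session_stream(session_id: str) -> str:
    """Build the session-specific response stream name."""
    return f"{AGENT_TO_BACKEND}_session_{session_id}"


def publish(client, stream: str, payload: dict) -> str:
    """Append one message to a stream; returns the ID Redis assigned to it."""
    return client.xadd(stream, {"data": json.dumps(payload)})


def consume_once(client, group: str, consumer: str, handler, block_ms: int = 1000) -> int:
    """Read new messages for this consumer, handle each one, then acknowledge it."""
    handled = 0
    # ">" asks for messages never delivered to any consumer in the group.
    batches = client.xreadgroup(
        group, consumer, {BACKEND_TO_AGENT: ">"}, count=10, block=block_ms
    )
    for stream, messages in batches or []:
        for message_id, fields in messages:
            handler(json.loads(fields["data"]))
            # Acknowledge only after the handler returns without raising.
            client.xack(stream, group, message_id)
            handled += 1
    return handled
```

The consumer group has to exist before the first read, for example via client.xgroup_create(BACKEND_TO_AGENT, group, id="0", mkstream=True). Acknowledging after the handler returns means a crash mid-handling leaves the message pending instead of losing it.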

Redis Streams does not magically solve every problem. During failures or restarts, a message can be delayed or delivered again. That is why session state, processing status, and handlers that can safely process the same message twice (idempotent handling) are also necessary.
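One common way to tolerate repeated delivery is to remember which message IDs have already been handled. The sketch below is an assumption about how that could look, not a description of MAICE's code. handle_once is a hypothetical helper, and the in-memory set stands in for whatever persistent store a real deployment would need.

```python
def handle_once(message_id: str, payload: dict, processed: set, handler) -> bool:
    """Run handler only if this message ID has not been handled before."""
    if message_id in processed:
        return False  # duplicate delivery: skip it
    handler(payload)
    # Mark as processed only after the handler returned without raising,
    # so a failed attempt can be retried on redelivery.
    processed.add(message_id)
    return True


processed_ids = set()
answers = []
handle_once("1-0", {"question": "What is induction?"}, processed_ids, answers.append)
handle_once("1-0", {"question": "What is induction?"}, processed_ids, answers.append)
print(len(answers))  # the second delivery was ignored
```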

5. Where the system actually got stuck

The most delicate problem after splitting agents was loops. The classifier can decide that clarification is needed, the clarification agent can create a follow-up question, and that follow-up can trigger another judgment. Without a limit, the flow can become too long.

So the code separates responsibilities clearly. QuestionClassifier decides whether a question is answerable. If clarification is needed, it sends the question to QuestionImprovement. If the question is clear enough, it sends it to AnswerGenerator. QuestionImprovement tracks clarification_count against max_clarifications, which is set to 3. This cap prevents endless follow-up questions and keeps the dialogue from becoming frustrating for the student.
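The loop limit can be illustrated with a small state object. Only clarification_count, max_clarifications, and the value 3 come from the code described above. ClarificationState and next_step are hypothetical names, and falling back to AnswerGenerator once the limit is reached is an assumption about what happens next.

```python
from dataclasses import dataclass


@dataclass
class ClarificationState:
    """Per-session counter for follow-up questions."""
    clarification_count: int = 0
    max_clarifications: int = 3


def next_step(state: ClarificationState, answerable: bool) -> str:
    """Decide which agent handles the question, respecting the clarification cap."""
    if answerable:
        return "AnswerGenerator"
    if state.clarification_count >= state.max_clarifications:
        # Assumed fallback: stop asking back and answer with what is known.
        return "AnswerGenerator"
    state.clarification_count += 1
    return "QuestionImprovement"


state = ClarificationState()
steps = [next_step(state, answerable=False) for _ in range(4)]
print(steps)  # three clarification rounds, then an answer
```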

Another issue was context. When agents are separated, each agent may see a different slice of the conversation. If the classifier sees only the question and the answer generator sees only a late summary, the answer can become inconsistent. MAICE therefore manages shared session context in the backend and passes each agent the context it needs for its role.
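One way to picture role-specific context is a single session record from which each agent receives only the fields it needs. Everything in this sketch is hypothetical: the field names, the ROLE_FIELDS mapping, and context_for are illustrative and do not reflect the backend's actual schema.

```python
# Hypothetical mapping from agent role to the session fields it is given.
ROLE_FIELDS = {
    "QuestionClassifier": ("question", "history"),
    "QuestionImprovement": ("question", "history", "clarification_count"),
    "AnswerGenerator": ("question", "history", "student_level"),
    "Observer": ("history",),
}


def context_for(role: str, session: dict) -> dict:
    """Return only the session fields that the given role is meant to see."""
    return {key: session[key] for key in ROLE_FIELDS[role] if key in session}


session = {
    "question": "How do I prove this with mathematical induction?",
    "history": ["I don't get this"],
    "clarification_count": 1,
    "student_level": "beginner",
}
print(context_for("AnswerGenerator", session))
```

Because every slice is cut from the same record, two agents can see different fields without seeing contradictory versions of the conversation.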

6. Why this structure matters educationally

The purpose of the multi-agent structure was not to look technically impressive. It was to separate educational interventions and observe what happened at each stage.

Question classification looks at the state of the student's question. Question clarification helps the student redefine the problem. Answer generation happens only after that. The observer records how this flow appeared in actual sessions.

With the roles separated, we can move beyond the vague statement that "the AI gave a good answer." We can ask separately whether the question became clearer, whether the answer was appropriate, and whether the conversation supported the learning process.

But a system structure alone does not make students ask good questions. In mathematics, the moment of input already involves formulas, photos, and rendering. The next post looks at the interface that made the conversation possible in the first place.