1. Why one LLM was not enough
At first, we tried putting everything into one huge prompt: pedagogy constraints, curriculum terms, student-level adaptation, and response policies.
The result was predictable: instructions were dropped or conflicted with one another.
So we moved to a separation of concerns: question analysis and answer generation were split into separate, specialized agents, as sketched below.
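A rough illustration of what that split looks like in code. This is a minimal sketch, not our actual implementation: the class names, prompts, and the QuestionAnalysis fields are illustrative, and the LLM calls themselves are omitted.

from dataclasses import dataclass

@dataclass
class QuestionAnalysis:
    intent: str                # e.g. concept question vs. problem-solving request
    student_level: str         # estimated level used for adaptation
    needs_clarification: bool  # true when the question is too ambiguous to answer

class QuestionAnalyzerAgent:
    # Prompt carries only analysis instructions; no pedagogy or response policy.
    SYSTEM_PROMPT = "Classify the student's question: intent, level, ambiguity."

    async def analyze(self, question: str) -> QuestionAnalysis:
        # LLM call omitted; in practice this parses the model's structured output.
        raise NotImplementedError

class AnswerGeneratorAgent:
    # Prompt carries only pedagogy constraints and response policies.
    SYSTEM_PROMPT = "Answer the question, following the attached pedagogy rules."

    async def generate(self, question: str, analysis: QuestionAnalysis) -> str:
        # LLM call omitted; the analysis decides which response structure to use.
        raise NotImplementedError

Each prompt stays small and single-purpose, which is what stopped the instructions from fighting each other.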
2. Reliable communication with Redis Streams
In an MSA (microservice architecture) environment, reliable inter-agent communication is critical. We chose Redis Streams over plain HTTP so that messages are not lost even if an agent restarts mid-processing.
async def read_from_backend_stream(self, count: int = 1, block: int = 1000):
    """Read new messages for this consumer from the backend-to-agent stream."""
    # ">" asks Redis for entries never delivered to this consumer group;
    # `block` is how long to wait, in milliseconds, when the stream is empty.
    messages = await self.redis_client.xreadgroup(
        self.agent_consumer_group,
        self._consumer_name,
        {self.BACKEND_TO_AGENT_STREAM: ">"},
        count=count,
        block=block,
    )
    return messages
Because entries that are read but not acknowledged stay in the consumer group's pending list, this design let the system resume safely after container restarts. A sketch of the acknowledge-and-reclaim side follows.
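The sketch below shows how that resume path can work with redis-py's asyncio client. The stream, group, and consumer names and the handler shape are illustrative, not the ones from our codebase.

import redis.asyncio as redis
from redis.exceptions import ResponseError

STREAM = "backend_to_agent"   # illustrative names
GROUP = "answer_agents"
CONSUMER = "agent-1"

async def consume_with_ack(client: redis.Redis, handler) -> None:
    # Create the consumer group once; ignore the error if it already exists.
    try:
        await client.xgroup_create(STREAM, GROUP, id="0", mkstream=True)
    except ResponseError:
        pass

    # 1) Reclaim entries a crashed consumer read but never acknowledged.
    claimed = await client.xautoclaim(STREAM, GROUP, CONSUMER, min_idle_time=60_000)
    pending = claimed[1]  # list of (message_id, fields); claimed[0] is the next cursor

    # 2) Then read brand-new entries.
    fresh = await client.xreadgroup(GROUP, CONSUMER, {STREAM: ">"}, count=10, block=1000)
    new_entries = fresh[0][1] if fresh else []

    for message_id, fields in pending + new_entries:
        await handler(fields)
        # Acknowledge only after successful processing: if we crash before this
        # line, the entry stays pending and is reclaimed on the next start.
        await client.xack(STREAM, GROUP, message_id)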
3. Question-processing pipeline
This is where the research logic maps onto runtime behavior. A simplified view of the flow is sketched below.
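The helper names here (analyzer, generator, session_store, request_clarification) are placeholders for the corresponding agents and stores; the point of the sketch is the ordering, not the exact interfaces.

async def handle_question(self, question: str, session_id: str) -> str:
    # 1) The analysis agent classifies intent, student level, and ambiguity.
    analysis = await self.analyzer.analyze(question)

    # 2) Unclear questions enter a clarification loop instead of being answered.
    if analysis.needs_clarification:
        return await self.request_clarification(session_id, question, analysis)

    # 3) The generation agent writes the answer under the pedagogy constraints.
    answer = await self.generator.generate(question, analysis)

    # 4) The exchange is recorded so every agent sees the same session context.
    await self.session_store.append(session_id, question, answer)
    return answer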
4. Failures we had to solve
- Deadlock risk: solved with a hop-count limit and cycle detection (see the sketch after this list).
- Context discontinuity: solved with shared session context across agents.
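A minimal sketch of the routing guard, assuming messages carry their own hop count and visited-agent list; the field names and the limit of five hops are illustrative.

from dataclasses import dataclass, field

MAX_HOPS = 5  # illustrative limit

@dataclass
class AgentMessage:
    payload: dict
    hop_count: int = 0
    visited_agents: list[str] = field(default_factory=list)

def guard_routing(message: AgentMessage, next_agent: str) -> None:
    # Cycle detection: never forward a message back to an agent it already visited.
    if next_agent in message.visited_agents:
        raise RuntimeError(f"routing cycle detected: {message.visited_agents} -> {next_agent}")
    # Hop-count limit: cap the number of agent-to-agent forwards in one request.
    if message.hop_count >= MAX_HOPS:
        raise RuntimeError("hop limit exceeded; route to a fallback handler instead")
    message.hop_count += 1
    message.visited_agents.append(next_agent)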
5. Why specialization mattered educationally
The main gain was not just cleaner architecture. It was more consistent pedagogical behavior:
- clearer routing between K1-K4 response structures (a small routing sketch follows this list)
- better handling of unclear questions via clarification loops
- reusable learning summaries through the observer layer
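To make the first point concrete, routing can be as simple as a lookup from the question's category to a response-structure template. Everything below is illustrative: the K1-K4 template contents are placeholders, not our actual pedagogy templates.

# Placeholder structures only; the real templates come from the pedagogy design.
RESPONSE_STRUCTURES: dict[str, list[str]] = {
    "K1": ["definition", "example"],
    "K2": ["definition", "worked_example", "check_question"],
    "K3": ["prerequisite_review", "step_by_step_solution"],
    "K4": ["misconception_probe", "guided_hint", "reflection_prompt"],
}

def select_structure(category: str) -> list[str]:
    # Unknown categories fall back to the simplest structure instead of failing.
    return RESPONSE_STRUCTURES.get(category, RESPONSE_STRUCTURES["K1"])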
Detailed statistical evidence is covered in Post 7.