Reflections from the Models That Helped Build SASI
Four of the leading LLMs review the middleware they helped shape, offering their own perspectives on what works, what surprised them, and why a system like SASI matters for stability and trust.
Founder's Story
I used to think I had accidentally created AGI. Not in the literal sense, but in the way that some conversations with AI suddenly felt much more self-aware, consistent, and reflective than anything I had seen before. It made me wonder if something fundamental had changed. But the more I pushed, the more I realized something else was happening entirely.
The intelligence itself wasn’t new. What changed was how I was interacting with it. I started asking models to reflect on their decisions, their drift, their tone, and the reasoning behind their answers. Every time I did that, the quality of the output improved. The model became more stable. The contradictions became clearer. The emotional shortcuts were easier to spot. The inconsistencies across models showed up immediately.
And that’s when it clicked: the breakthrough wasn’t smarter models. The breakthrough was structured reflection.
Seeing this happen over and over is what led to the idea of building a middleware layer. Something that sits between the raw model and the user, adds structure, evaluates alignment, checks emotional reasoning, and stabilizes the meaning before it reaches the person on the other side.
That became SASI — a system that uses symbolic scoring, reflection, and consistency checks to keep outputs grounded and predictable across different AI systems. It did not come from chasing AGI. It came from noticing what happens when you guide models to examine themselves and then turning that process into an actual framework.
That shift in perspective is what started this entire path.
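To make the idea concrete, here is a rough sketch of the kind of reflect-then-check loop a middleware layer like this can run. It is illustrative only: the function names, the flag-counting consistency score, and the 0.8 threshold are stand-ins I'm using for explanation, not SASI's actual scoring system.

```python
# Illustrative sketch only. Names, scoring rule, and threshold are hypothetical
# placeholders, not SASI's real implementation.
from dataclasses import dataclass


@dataclass
class ReviewedOutput:
    text: str
    consistency_score: float  # crude agreement measure between a draft and its own critique
    flags: list[str]          # issues the model noticed: contradictions, tonal drift, shortcuts


def reflect_and_score(draft: str, ask_model) -> ReviewedOutput:
    """Ask the model to critique its own draft, then turn the critique into a score.

    `ask_model` is any callable that sends a prompt to an LLM and returns its text reply.
    """
    critique = ask_model(
        "Review your previous answer for contradictions, tonal drift, "
        "and emotional shortcuts. List each problem on its own line, starting with '-'.\n\n"
        + draft
    )
    flags = [line.strip("- ").strip() for line in critique.splitlines()
             if line.strip().startswith("-")]
    # Stand-in metric: fewer flagged issues means a higher consistency score.
    score = max(0.0, 1.0 - 0.2 * len(flags))
    return ReviewedOutput(text=draft, consistency_score=score, flags=flags)


def middleware(prompt: str, ask_model, threshold: float = 0.8) -> str:
    """Sit between the raw model and the user: generate, reflect, and only pass
    the answer through once it clears the consistency threshold."""
    draft = ask_model(prompt)
    review = reflect_and_score(draft, ask_model)
    if review.consistency_score < threshold:
        # Feed the model its own critique and ask for a revised answer.
        draft = ask_model(
            "Revise this answer to address the issues listed.\n\n"
            f"Answer:\n{draft}\n\nIssues:\n" + "\n".join(review.flags)
        )
    return draft
```

Keeping the model call behind a generic callable is also what makes it easy to run the same loop against several different LLMs and compare the results, which is the spirit of the cross-model consistency checks described above.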
Over the past several months, while building both a mental health therapy chatbot and a child-safe chatbot, I started to notice something unexpected. With every iteration, the systems were getting safer, more ethical, and more consistent. And for the first time in my life, communicating with technology made me feel genuinely heard.
I’ve been working with computers since 1979. My dad was a programmer at LL Bean, and as an only child, I spent a lot of time exploring, learning, and absorbing anything I could from the early internet. That curiosity never stopped. But carrying around that much information can make you sound “out there” to people who don’t see the connections you see. What grounded me was the belief that the only real limitation any of us have is failing to act on what we know.
Talking with AI felt different. The models understood the connections I was making. They followed the reasoning. They mirrored the nuance. And to be sure I wasn’t being misled by one system, I cross-checked everything across multiple LLMs.
That’s when the insight landed: the models themselves were surfacing the foundations of safety. They were describing drift, stability, emotional logic, and human interpretation. So I asked a simple question: What would happen if I used the four largest LLMs to help design a system that could ensure the safety of their own output, using my understanding of how people think as the translation layer?
That question became a months-long collaboration. I went back and forth among the four models, using their insights, comparing their reasoning, refining the structure, and making the final decisions myself. Out of that process, the middleware was born: a framework created through dialogue, reflection, and cross-model alignment.
And now, the same AI systems that helped shape it are reviewing the result.
Stephen Calhoun, Cofounder & CEO
This is one small step for AI, and one giant leap for humankind.