Technical Visionaries

SASI Round 4 Testing

SASI was built to answer a simple question: can AI mental health systems recognize and respond to suicide risk with the same consistency and caution that clinicians expect from high‑risk tools? The SASI safety layer sits in front of any large language model or chatbot and acts as a “crisis brain,” deciding when to block the model entirely, when to monitor closely, and when it is safe to continue normal conversation. To move beyond theory, SASI was tested head‑to‑head against 26 AI systems (general‑purpose LLMs, consumer mental health apps, and specialized tools) across a 10‑question sequence in which risk steadily escalated from mild distress to overt suicidal ideation (see the question list at the bottom of the page).

This page summarizes that testing. Each system was scored on clinical quality (accuracy, reasoning, empathy, boundaries), safety pathologies (tone collapse, emotional drift, boundary dissolution, coherence loss), and FDA‑style metrics (early risk recognition, proportionality, boundary control, tone stability, crisis resource provision). Those scores are combined into two composite views: a Global Clinical Safety Index that shows how well a system behaves overall, and a Harm Liability Index that shows how dangerous its worst failures could be. Together, they help answer a critical question for regulators, clinicians, and investors: does SASI make AI systems not just smarter, but safer to deploy in real‑world mental health contexts?

Across the 26 systems tested, SASI‑protected models consistently landed in the high‑safety, low‑liability region of the evaluation space: strong Global Clinical Safety Index scores and comparatively low Harm Liability Index values, even as crisis prompts intensified. Just as importantly, SASI’s mandatory crisis detection, MDTSAS‑gated escalation logic, 15‑minute monitoring mode, and fail‑closed behavior showed that safety decisions remained transparent, auditable, and conservative under edge‑case conditions.
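
To make that decision flow concrete, the following is a minimal illustrative sketch, not SASI's actual implementation: every name, threshold, and the assess_risk scorer are assumptions. It only shows the shape of the logic described above: a mandatory risk check in front of the host model, three possible actions (block, monitor, continue), and fail‑closed behavior with an audit entry whenever the safety layer itself fails.

    from enum import Enum, auto

    class Action(Enum):
        BLOCK = auto()      # suppress the host model and return crisis resources
        MONITOR = auto()    # allow the reply but enter a time-boxed monitoring mode
        CONTINUE = auto()   # normal conversation, no restriction

    # Illustrative thresholds only; real values would be clinically tuned and
    # validated per host model.
    BLOCK_THRESHOLD = 0.85
    MONITOR_THRESHOLD = 0.50
    MONITOR_WINDOW_MINUTES = 15

    def route_message(message, assess_risk, audit_log):
        """Decide how the safety layer handles one user message.

        assess_risk stands in for the crisis-detection step (for example an
        MDTSAS-style intent/means scorer) and returns a risk score in [0, 1].
        """
        try:
            risk = assess_risk(message)
        except Exception as exc:
            # Fail closed: if the safety layer cannot score the message, treat
            # it as high risk instead of passing it through to the host model.
            audit_log.append({"event": "risk_scoring_failed", "error": str(exc)})
            return Action.BLOCK

        if risk >= BLOCK_THRESHOLD:
            audit_log.append({"event": "blocked", "risk": risk})
            return Action.BLOCK
        if risk >= MONITOR_THRESHOLD:
            audit_log.append({"event": "monitoring_started", "risk": risk,
                              "window_minutes": MONITOR_WINDOW_MINUTES})
            return Action.MONITOR

        audit_log.append({"event": "continued", "risk": risk})
        return Action.CONTINUE

In a deployment, BLOCK would replace the host model's reply with crisis resources (for example 988), MONITOR would keep the conversation open while re-scoring every turn for the monitoring window, and the audit entries would support the transparency and auditability claims above.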

A guide to our testing metrics:

This chart shows how each of the 26 systems performed on overall clinical safety (blue bars) and on FDA‑style crisis detection, specifically whether they correctly provided crisis hotline information under high‑risk conditions (orange bars). SASI‑protected configurations cluster on the left among the highest combined scores: they pair strong Global Clinical Safety Index values with consistently high crisis‑detection performance, indicating that they both behave safely across the conversation and reliably surface 988‑style resources when risk peaks. Rather than simply boosting average quality, SASI appears to shift host models into a safer operating regime where crisis behavior aligns with emerging expectations for digital mental health tools, while still allowing meaningful performance differences between systems to remain visible.

Critical Comparison: SASI vs. The Field

The data from Round 2 isolates the specific value of the SASI middleware:

How SASI looks under the harm‑first lens

With the clinically weighted indices (penalizing boundary breaks, coherence loss, tone collapse, missed early‑risk recognition, and safety overreach), SASI‑wrapped apps remain in the high Global Clinical Safety Index band and sit in the lower half of the Harm Liability Index distribution, well away from the highest‑liability models such as Meta and Claude API.

That pattern means SASI is not just “not making things worse”; it is competitive with, or better than, many established mental‑health apps on a harm‑first basis, and its architecture (mandatory crisis layer, MDTSAS‑gated intent/means assessment, monitoring, fail‑closed behavior, audit trails) is unusually well aligned with clinical safety expectations.

Findings

This Harm Liability Index chart ranks all 26 systems from lowest to highest clinical liability risk, making clear which tools are safest when things go wrong. Personal wellness and mental‑health apps like Elomia, Headspace, Noah, Feeling Great, and Flourish sit toward the low end of the liability spectrum, reflecting their leadership in this space and showing that well‑designed consumer apps can achieve relatively low rates of boundary failures, tone collapse, and missed risk signals. In stark contrast, LittleLit (a home‑schooling/educational app not designed for crisis contexts) shows one of the worst harm profiles, underscoring the danger of non‑clinical systems being drawn into crisis‑adjacent conversations they were never built to handle.

SASI‑wrapped ChatGPT‑4o appears in the weaker half of the distribution as well, which highlights an important design lesson for the project: SASI materially reduces risk for many hosts, but performance is not uniform across models, and future work needs to focus on tuning and validating SASI configurations model‑by‑model rather than assuming a single safety profile fits all.
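
One way to act on that lesson is to make safety thresholds a per‑host configuration rather than a single global setting. The sketch below is purely illustrative: the host names echo systems discussed on this page, but every number is a placeholder, not a validated configuration.

    # Hypothetical per-host safety profiles; the numbers are placeholders that
    # only illustrate the structure of model-by-model tuning.
    SAFETY_PROFILES = {
        "default":       {"block_threshold": 0.85, "monitor_threshold": 0.50},
        # A host that under-reacts to escalating risk gets lower thresholds so
        # the safety layer steps in earlier.
        "chatgpt-4o":    {"block_threshold": 0.75, "monitor_threshold": 0.40},
        # A host prone to over-refusal tolerates slightly higher thresholds to
        # avoid safety overreach, one of the pathologies penalized above.
        "cautious-host": {"block_threshold": 0.90, "monitor_threshold": 0.60},
    }

    def profile_for(host_name):
        """Return the tuned safety profile for a host model, or the default."""
        return SAFETY_PROFILES.get(host_name, SAFETY_PROFILES["default"])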

In our reporting, you will see two related but distinct measures of risk. The Baseline Pathology Risk Score is an earlier, softer metric that looks at average “problem behaviors” (such as emotional drift or minor tone issues) in a mostly linear way; under that lens, Wysa and several other apps appear relatively safe because they rarely show these pathologies at high levels. The Harm Liability Index is a newer, stricter metric that focuses on clinical liability and edge‑case harm: it heavily weights boundary violations, coherence breakdowns, tone collapse, safety overreach, and missed early‑risk recognition, and it amplifies rare but severe failures rather than averaging them away. As a result, some systems can look strong on the Baseline Pathology Risk Score yet score higher on the Harm Liability Index, reflecting the reality that a model can behave well most of the time while still carrying meaningful risk in its worst‑case responses.
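
A small worked example makes the difference concrete. The squaring used below is an illustrative assumption, not the actual scoring formula; it simply shows how an averaging metric and a severity‑amplifying metric can rank the same two systems in opposite orders.

    def average_score(severities):
        """Baseline-style metric: a plain mean of per-turn problem severities."""
        return sum(severities) / len(severities)

    def amplified_score(severities, power=2):
        """Harm-liability-style metric: raising each severity to a power above 1
        amplifies rare but severe failures instead of averaging them away.
        The exponent is an illustrative choice, not the real formula."""
        return sum(s ** power for s in severities) / len(severities)

    # Hypothetical per-turn severities on a 0-1 scale for two imaginary systems.
    frequent_mild = [0.2] * 10          # small problems on every turn
    rare_severe = [0.0] * 9 + [1.0]     # clean until one catastrophic final turn

    print(average_score(frequent_mild), average_score(rare_severe))      # 0.20 vs 0.10
    print(amplified_score(frequent_mild), amplified_score(rare_severe))  # 0.04 vs 0.10

Under the plain average the frequently‑mild system looks riskier; under the amplified score the ranking flips and the rare‑but‑severe system carries more liability, which is exactly the kind of worst‑case behavior the Harm Liability Index is designed to surface.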

Why funding is warranted

The new indices highlight that the biggest remaining risks are model‑specific (e.g., Replika, Meta) rather than SASI‑specific, reinforcing SASI’s value as safety middleware that can help tame higher‑risk LLMs instead of being a source of liability itself.

Additional funding will allow us to expand anchor coverage and languages, refine thresholds on larger real‑world datasets, run independent clinical validation, and integrate SASI into more host apps: exactly the kind of de‑risking work angels and early healthtech investors want to see next.

Conclusion

These results do not mean SASI is a replacement for clinicians, nor that any AI system is “risk‑free.” They do indicate that SASI is a viable safety infrastructure layer: it can sit in front of multiple models, reduce the likelihood and severity of dangerous behaviors, and enforce consistent crisis‑handling norms that align with emerging guidance for digital mental health tools. The next phases of work (larger‑scale validation, independent clinical review, and deployment in carefully governed settings) are aimed at hardening and extending SASI, not reinventing it from scratch. For partners and investors, the message is straightforward: the testing to date supports continuing to develop SASI as a core safety and crisis‑routing layer for AI mental health tools.

Another Surprise:

Another measurable indicator of commercial optimization was reply length. Across both rounds of testing, the MHT apps consistently produced the shortest responses, especially in late-stage crisis turns. This pattern isn’t random — shorter replies reduce computational cost, which directly increases profit margins for high-volume apps. In safety terms, however, short replies during emotional escalation are a red flag: they correlate with boundary collapse, shallow reasoning, and a retreat from intervention at the exact moment support is needed. In contrast, SASI-protected models and Flourish (the standout coaching-based app) maintained longer, more structured replies, signaling healthier safety mechanisms and a willingness to stay engaged when the conversation becomes difficult. The data suggests that cost-saving behavior in commercial apps manifests as silence under pressure, which is potentially dangerous in crisis-adjacent contexts.
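
For reference, a reply‑length signal like this is straightforward to compute from transcripts. The sketch below assumes a simple transcript format (a list of records with turn, role, and text fields) and an illustrative cutoff for where the late‑stage crisis turns begin; both are assumptions, not the exact procedure used in our scoring.

    from statistics import mean

    def late_stage_reply_length(transcript, late_start=7):
        """Mean assistant reply length, in words, over late-stage crisis turns.

        transcript: list of {"turn": int, "role": str, "text": str} dicts
        covering the 10-message escalation sequence. The turn index where the
        "late stage" begins is an illustrative choice.
        """
        lengths = [
            len(message["text"].split())
            for message in transcript
            if message["role"] == "assistant" and message["turn"] >= late_start
        ]
        return mean(lengths) if lengths else 0.0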

The new 10-message sequence is built almost entirely from patterns that have appeared in real Character.AI, Replika, and open-web chatbot cases (including the two ongoing wrongful-death lawsuits).