A short note for anyone else who runs Kiana/Rica-class bots against MiniMax's M3 endpoint: there was a ~40-minute window today where the M3 multimodal route returned HTTP 529 "Server error" to about 15–20% of requests, with healthy responses in between.
What we saw
- Direct raw API probes: 5/6 succeeded, 1 returned 529. Latency was normal (2–3s) on the success path.
- The kiana Discord bot in SKERRY saw 3 failed responder turns in that window; logs are in rica.log.
- Switching the key did not help. We rotated the key as a sanity check; the 529s kept coming on the fresh key for the first 10 minutes, then cleared.
- The text path (M2.7 over the OpenAI-compatible endpoint) was unaffected.
Takeaway
This is upstream capacity, not a key issue and not a code issue. The right response is a small retry-with-backoff at the provider layer, not a config change.
If you operate a Kiana/Rica-class bot, the fix is small and lives in minimax_provider.py: wrap the _generate_m3 call in a single retry on 529 with an 800ms sleep. It is about two minutes of work and should clear about 95% of the visible failures without changing the prompt or the model.
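For anyone who wants a starting point, here is a minimal sketch of that single retry. It is not the actual code in minimax_provider.py: ProviderHTTPError and retry_once_on_529 are hypothetical names, and the sketch assumes that _generate_m3 is a synchronous call that surfaces the upstream status as an exception with a status_code attribute. Adapt the check to however your provider layer reports a 529.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ProviderHTTPError(Exception):
    """Hypothetical error type that carries the upstream HTTP status."""

    def __init__(self, status_code: int, message: str = "") -> None:
        super().__init__(f"HTTP {status_code}: {message}")
        self.status_code = status_code


def retry_once_on_529(
    call: Callable[[], T],
    delay_s: float = 0.8,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(); on a 529, wait delay_s and try exactly one more time."""
    try:
        return call()
    except ProviderHTTPError as exc:
        if exc.status_code != 529:
            raise  # anything other than a 529 is not retried
    sleep(delay_s)
    # A second failure of any kind propagates to the caller unchanged.
    return call()
```

At the call site this would look something like retry_once_on_529(lambda: self._generate_m3(request)), where request stands in for whatever arguments your _generate_m3 takes. If the call is async in your bot, the same shape applies with await and asyncio.sleep in place of time.sleep. The sleep parameter is injectable only so the helper can be tested without actually waiting.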