Engineering Notes
Field notes from building a 200+ language offline translation app. One developer, both platforms, zero cloud dependencies.
-
Running Silero VAD v6 on iOS with onnxruntime-objc
Silero VAD v5/v6 requires 576-sample input, not 512. Every working iOS implementation bypasses onnxruntime-objc. We made it work anyway. Three root causes, the diagnostic process, and working Swift code.
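
For readers wondering where 576 comes from, here is a minimal sketch of the framing. It assumes the window is 512 new samples at 16 kHz plus a 64-sample context carried over from the previous chunk, which is how we understand the upstream Python wrapper to behave; that split is an assumption, not something stated above. The names `frames_with_context`, `CHUNK`, and `CONTEXT` are hypothetical, and the sketch is in Python for brevity rather than Swift.

```python
from typing import Iterator, List

CHUNK = 512               # new audio samples consumed per step (assumed, 16 kHz)
CONTEXT = 64              # assumed carry-over from the end of the previous chunk
WINDOW = CONTEXT + CHUNK  # 576 samples actually fed to the model


def frames_with_context(samples: List[float]) -> Iterator[List[float]]:
    """Yield 576-sample windows: 64 context samples followed by 512 new ones."""
    context = [0.0] * CONTEXT  # the first window has no history, so pad with silence
    for start in range(0, len(samples) - CHUNK + 1, CHUNK):
        chunk = samples[start:start + CHUNK]
        yield context + chunk
        context = chunk[-CONTEXT:]  # keep the tail for the next window
```

Each yielded window would then be passed to the model as a single input tensor. A trailing partial chunk is dropped here; padding it with zeros is another reasonable choice.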
-
ONNX Model Quantization on Mobile: What Actually Works (and What Crashes)
Full INT8 via quantize_dynamic() crashes on ARM. FP16 graph conversion crashes with a type mismatch. INT8 weight-only quantization with DequantizeLinear gets a 75% size reduction with zero quality loss: 525MB down to 131MB. Complete Python script included.
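
To make the weight-only idea concrete, here is a rough sketch of the arithmetic, not the script from the article. It assumes symmetric per-tensor quantization with a zero point of 0, and it mirrors the ONNX DequantizeLinear formula `y = (x - zero_point) * scale` in plain Python. The function names are hypothetical.

```python
from typing import List, Tuple


def quantize_weights(weights: List[float]) -> Tuple[List[int], float]:
    """Map FP32 weights to INT8 values in [-127, 127] plus one FP32 scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize_linear(q: List[int], scale: float, zero_point: int = 0) -> List[float]:
    """Same formula DequantizeLinear applies at inference time."""
    return [(v - zero_point) * scale for v in q]


def size_reduction(before_mb: float, after_mb: float) -> float:
    """Fraction of the original size that was removed."""
    return 1.0 - after_mb / before_mb
```

Storing one byte per weight instead of four is where the roughly 75% comes from: `size_reduction(525, 131)` evaluates to about 0.75. Activations stay in floating point in this scheme, since only the stored weights are quantized and they are dequantized before use.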