A practical, end-to-end guide to ONNX: what it is, how to export models from PyTorch and TensorFlow, how to run fast inference with ONNX Runtime, and how to ship to production.
A decision framework for choosing between AutoML platforms and hand-built models — covering cost, control, accuracy, and the trade-offs that actually matter in production.
A real, honest production architecture: an ONNX image classifier served by FastAPI on Railway, loaded from object storage at startup, with one shared inference session on CPU — and what we'd improve.