Posts

VQ-VAE vs. FSQ

VQ-VAE (Vector-Quantized Variational Autoencoder) is a standard approach in the ML literature for quantizing data [1]. Quantization is critical whenever we want to run an autoregressive transformer on data that isn’t naturally tokenized, which is the case in most production models for image, video, and audio generation. In this blog post we demonstrate an alternative to VQ-VAE named FSQ (Finite Scalar Quantization) [2], which works better on the MNIST dataset....

Tech Thresholds

In 2016, optimistic founders thought that general self-driving cars were 2-3 years away. In 2024, we still don’t have general self-driving cars. What happened? General self-driving cars are bottlenecked by the intelligence of the autonomy system. Founders thought that they just needed intelligence level x, but they actually needed (a) a scalable algorithm to get intelligence from compute, and (b) an intelligence level of 5x (the number is made up). Neither was possible at the time....