The SLM Roundup #1
It's Friday, and welcome to the first issue of The SLM Roundup! While AI development as a whole has been fast and furious over the past two years, this newsletter will focus on small language model (SLM) news.
Here are three SLM stories from this week that stood out to me!
Quantized Llama 1B and 3B models
Meta has released quantized versions of its Llama 3.2 1B and 3B models that offer a 2-4x speedup while meeting the same quality and safety requirements as the original bf16 models.
They used two techniques: Quantization-Aware Training (QAT) with LoRA adaptors, which prioritizes accuracy, and SpinQuant, a post-training method that prioritizes portability.
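If you're curious what QAT actually does under the hood, here's a toy PyTorch sketch of the core trick: fake-quantize the weights in the forward pass so the network learns to tolerate low precision, with a straight-through estimator so gradients still flow. To be clear, this is my own illustration, not Meta's recipe (theirs layers LoRA adaptors on top; the bit-width, per-tensor scaling, and single linear layer here are all simplifications):

```python
# Toy sketch of the QAT idea, NOT Meta's actual pipeline.
import torch
import torch.nn as nn

def fake_quantize(w: torch.Tensor, bits: int = 4) -> torch.Tensor:
    qmax = 2 ** (bits - 1) - 1                 # e.g. 7 for int4
    scale = w.abs().max() / qmax               # simple per-tensor scale
    w_q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
    # Straight-through estimator: forward sees w_q, backward sees w.
    return w + (w_q - w).detach()

class QATLinear(nn.Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Weights are fake-quantized every forward pass, so training
        # nudges them toward values that survive the int4 grid.
        return x @ fake_quantize(self.weight).t()

# Train as usual; the quantization noise is baked into the loss.
layer = QATLinear(16, 16)
opt = torch.optim.SGD(layer.parameters(), lr=1e-2)
x = torch.randn(8, 16)
loss = layer(x).pow(2).mean()
loss.backward()
opt.step()
```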
The quantized models target Qualcomm and MediaTek SoCs with Arm CPUs, which means Android users stand to benefit first.
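And if you just want to get a feel for running a small Llama at 4-bit on your own GPU, here's a minimal sketch using the generic transformers + bitsandbytes post-training path. Note this is not Meta's QAT/SpinQuant checkpoints or their on-device stack, and the model ID and prompt are placeholders:

```python
# Generic 4-bit loading sketch (requires: pip install transformers bitsandbytes accelerate).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-3.2-1B-Instruct"  # assumed model ID

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",                 # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

prompt = "Summarize this email in one sentence: ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```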
We are seeing small models proliferate onto edge devices, and I'm really excited to see their applications develop beyond text and email summaries, grammar editing, and the like.
DeepSeek releases Janus 1.3B, a small multimodal model with image generation capabilities
It's insane how fast small language models are improving. About a year ago, capable small models were scarce; just this week, DeepSeek released Janus 1.3B, a small multimodal model that can actually generate images. While Reddit user u/CheatCodesOfLife bluntly described the generated images as "shit", Janus 1.3B serves as a proof of concept for what's possible at this scale.
GPU Poor LLM Gladiator Arena
Karol Danisz is working on a chatbot arena specifically for small models (maxing out at 12B parameters). I love this idea: LM Arena is great for an initial assessment of new models, but the big-boy models are still way ahead in general intelligence, so small models get lost in the noise there.
Having a GPU Poor arena helps us better track improvements in SLMs. I've played around with it; it's a little slow, but I was really surprised by the quality of the responses when I asked for some general 'life advice'. Give it a try!