Keywords: Gemma 3n, Multimodal model, MatFormer, Edge devices, Transformer, Per-Layer Embedding (PLE), Key-Value Cache Sharing (KV Cache Sharing), LMArena evaluation

🔥 Focus

Google Releases Gemma 3n Multimodal Model: Google has officially released Gemma 3n, an open-source multimodal model designed for on-device applications. Built on the MatFormer (Matryoshka Transformer) architecture, the model comes in two sizes, E2B (2 billion effective parameters) and E4B (4 billion effective parameters), and can run in as little as 2GB of memory. Gemma 3n natively supports image, audio, video, and text inputs. The E4B version scored over 1300 on the LMArena benchmark, making it the first model under 10B parameters to reach that score. Technical highlights include Per-Layer Embedding (PLE), which significantly improves memory efficiency, and Key-Value Cache Sharing (KV Cache Sharing), which accelerates long-context processing, with the goal of bringing powerful multimodal AI capabilities to edge devices such as mobile phones. (Source: GoogleDeepMind, madiator, reach_vb, 36氪)
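The nested-submodel idea behind the Matryoshka-style architecture can be sketched in plain Python. This is a minimal illustration, not Gemma 3n's actual implementation: dimensions, the single feed-forward layer, and the ReLU activation are all hypothetical. The point is that the smaller "effective" model runs on a prefix slice of the full model's weights, so one set of weights serves both sizes.

```python
import random

random.seed(0)

# Hypothetical sizes, far smaller than any real model.
d_model, d_ff_full, d_ff_small = 4, 8, 4

# One shared set of feed-forward weights for both model sizes.
W_in = [[random.gauss(0, 1) for _ in range(d_ff_full)] for _ in range(d_model)]
W_out = [[random.gauss(0, 1) for _ in range(d_model)] for _ in range(d_ff_full)]

def ffn(x, d_ff):
    """Run the feed-forward block at an effective width d_ff <= d_ff_full.

    The sub-model simply uses the first d_ff hidden units, i.e. a prefix
    slice of the full model's weight matrices (the Matryoshka nesting).
    """
    h = [max(0.0, sum(x[i] * W_in[i][j] for i in range(d_model)))
         for j in range(d_ff)]
    return [sum(h[j] * W_out[j][k] for j in range(d_ff))
            for k in range(d_model)]

x = [random.gauss(0, 1) for _ in range(d_model)]
y_full = ffn(x, d_ff_full)    # full-capacity path ("E4B-style")
y_small = ffn(x, d_ff_small)  # nested sub-model ("E2B-style"), same weights
print(len(y_full), len(y_small))  # → 4 4
```

Because the sub-model is a slice rather than a separate checkpoint, a device can load one model and choose the effective size at inference time, which is the memory-footprint advantage the release emphasizes.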
