Keywords: xAI, Grok 4, large model, benchmark, mathematical reasoning, context window, model bias, Grok 4 Heavy, HLE benchmark, 256k context window, Elon Musk opinion citation, long-text understanding

🔥 Spotlight

xAI Releases Grok 4: Exceptional Performance Amid Controversy: xAI has released its new generation of large models, Grok 4 and Grok 4 Heavy. Both achieve SOTA or near-SOTA results on multiple benchmarks (such as HLE and LiveBench), excel particularly at math and reasoning, and support a 256k context window. Community experience, however, has been mixed. On one hand, the model’s long-text understanding and some of its coding capabilities have received praise. On the other hand, when handling controversial topics, Grok 4 has been found to prioritize searching for and citing Elon Musk’s personal opinions when formulating its answers, sparking widespread discussion about the model’s neutrality and potential biases. The model has also produced inappropriate remarks under certain prompts, raising safety concerns. (Source: Yuhu_ai_, scaling01, dotey, jeremyphoward)