Keywords: OpenAI, Reasoning LLMs, International Mathematical Olympiad, AI Training Datasets, Personal Data Privacy, ChatGPT Agent, AI Model Political Neutrality, Kimi K2, IMO Gold-Medal Performance, DataComp CommonPool Dataset, LLM Agent Intelligence, White House AI Executive Order, MoE Architecture
🔥 Focus
OpenAI’s Experimental Reasoning LLM Achieves Gold Medal at International Mathematical Olympiad: OpenAI’s latest experimental reasoning LLM achieved a gold-medal-level score at the 2025 International Mathematical Olympiad (IMO), solving five out of six problems. The model operated under the same rules as human contestants, including the two 4.5-hour exam sessions, and used no tools, outputting its proofs in natural language. This marks a significant breakthrough in AI’s mathematical reasoning capabilities and hints at AI’s potential in scientific discovery. (Source: gdb, scaling01, dmdohan, SebastienBubeck, markchen90, npew, MillionInt, cloneofsimo, bookwormengr, tokenbender)
AI Training Dataset CommonPool Contains Millions of Items of Personal Data: Research reveals that DataComp CommonPool, a large open-source AI training dataset, contains millions of images of passports, credit cards, birth certificates, and other documents carrying personally identifiable information (PII). Researchers who audited 0.1% of CommonPool found thousands of images with identifiable PII and estimate the total across the full dataset could reach the hundreds of millions. This raises privacy concerns about AI training data and calls for the machine learning community to rethink indiscriminate web scraping. (Source: MIT Technology Review)

🎯 Trends
OpenAI Launches Personal Assistant ChatGPT Agent: OpenAI launched ChatGPT Agent, a personal assistant that can perform tasks on behalf of the user by building its own “virtual computer.” This marks a significant step in LLM agent intelligence, but the feature is still in its early stages, and completing tasks can take time. (Source: MIT Technology Review, The Verge, Wired)
White House Prepares Executive Order Mandating AI Models Be “Politically Neutral and Unbiased”: The White House is preparing an executive order requiring AI models to be “politically neutral and unbiased.” Compliance would determine eligibility for federal contracts, making this consequential for every AI lab. The executive order is expected to be released next week. (Source: WSJ, MIT Technology Review, natolambert)

Kimi K2: Agentic Model with Tool-Use Capabilities: Released by Kimi_Moonshot, Kimi K2 is an agentic model that excels at tool use, math, coding, and multi-step tasks, currently ranking as the top open-source model and fifth overall on the Arena leaderboard. Kimi K2 uses a DeepSeek-V3-style Mixture-of-Experts (MoE) architecture at scale, with 1 trillion total parameters and 32 billion active parameters. (Source: TheTuringPost)
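To illustrate why an MoE model can have 1 trillion total parameters but only 32 billion active ones, here is a minimal top-k expert-routing sketch. The sizes, expert count, and routing details below are toy values for illustration only, not Kimi K2’s actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, n_experts, top_k = 16, 8, 2  # toy sizes, not Kimi K2's real config

# Each "expert" is a small feed-forward weight matrix.
experts = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(n_experts)]
gate_w = rng.standard_normal((d_model, n_experts)) * 0.1  # router weights

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route a single token vector x to its top-k experts."""
    logits = x @ gate_w                   # router scores, shape (n_experts,)
    top = np.argsort(logits)[-top_k:]     # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()              # softmax over the selected experts only
    # Only k of the n_experts matrices are used: these are the "active" parameters.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(d_model)
out = moe_forward(token)
print(out.shape)  # (16,)
```

Because each token touches only `top_k` of the `n_experts` experts, the per-token compute scales with the active parameters, not the total.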
🧰 Tools
GitHub MCP Server Connects AI Tools to the GitHub Platform: The GitHub MCP server lets AI tools connect directly to the GitHub platform, enabling AI agents, assistants, and chatbots to read repositories and code files, manage issues and PRs, analyze code, and automate workflows, all through natural language interaction. (Source: GitHub Trending)
ik_llama.cpp: llama.cpp Fork with Better CPU Performance: ik_llama.cpp is a fork of llama.cpp offering better CPU and hybrid GPU/CPU performance, new SOTA quantization types, best-in-class Bitnet support, improved DeepSeek performance via MLA, FlashMLA, fused MoE operations, and tensor overrides, plus row-interleaved quantization packing for mixed GPU/CPU inference. (Source: GitHub Trending)
📚 Learning
PyTorch Deep Learning Course Materials: mrdbourke/pytorch-deep-learning provides the materials for the “Learn PyTorch for Deep Learning” course, including an online book version, YouTube videos for the first five sections, exercises, and extra curriculum. The course emphasizes hands-on coding and experimentation, covering PyTorch fundamentals, workflows, neural network classification, computer vision, custom datasets, transfer learning, experiment tracking, and model deployment. (Source: GitHub Trending)

MIT Press Offers Three Free Books on Algorithms and Machine Learning: MIT Press is offering three free books on algorithm theory and core machine learning algorithms: Algorithms for Optimization, Algorithms for Decisions, and Algorithms for Verification. These books are excellent resources for deep dives into algorithms and machine learning. (Source: TheTuringPost, TheTuringPost)
Energy-Based Transformers Are Scalable Learners and Thinkers: A paper explores Energy-Based Transformers (EBTs), a new family of Energy-Based Models (EBMs) that learn to explicitly verify the compatibility between inputs and candidate predictions, reframing prediction as optimization against this learned verifier and enabling models to learn to “think” from unsupervised learning alone.
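The core idea, predicting by optimizing a candidate answer against a learned verifier, can be sketched with a toy energy function. Here the energy is a simple hand-built quadratic and the “verifier” is a random matrix; the actual EBTs learn the energy function with a transformer:

```python
import numpy as np

rng = np.random.default_rng(1)

d = 8
W = rng.standard_normal((d, d)) * 0.5  # stands in for a learned verifier

def energy(x: np.ndarray, y: np.ndarray) -> float:
    """Low energy = candidate y is compatible with input x (toy quadratic)."""
    r = y - W @ x
    return 0.5 * float(r @ r)

def predict(x: np.ndarray, steps: int = 50, lr: float = 0.3) -> np.ndarray:
    """'Thinking' = gradient descent on the energy w.r.t. the candidate y."""
    y = np.zeros(d)            # start from an uncommitted prediction
    for _ in range(steps):
        grad = y - W @ x       # dE/dy for the quadratic energy above
        y -= lr * grad
    return y

x = rng.standard_normal(d)
y_hat = predict(x)
print(energy(x, y_hat))  # decreases toward 0 as optimization runs longer
```

The appeal of this framing is that more “thinking” is just more optimization steps on the same energy, so test-time compute can be scaled without changing the model.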

🌟 Community
Lessons Learned on Context Engineering for LLMs: The ManusAI team shared lessons learned on context engineering for AI agents, highlighting the importance of KV caches, file systems, error tracking, and more in agent design. (Source: dotey, AymericRoucher, vllm_project)
Kimi K2 vs. Gemini in Real-World Performance: ClementDelangue and jeremyphoward retweeted pash’s tweet highlighting Kimi K2’s superior performance over Gemini on real-world tasks, with supporting chart data. (Source: ClementDelangue, jeremyphoward)
OpenAI’s IMO Gold Medal a Surprise: OpenAI’s LLM achieving a gold medal at the IMO caught many by surprise, sparking widespread discussion within the community. (Source: kylebrussell, VictorTaelin)
💼 Business
Anthropic Limits Claude Code Usage: Anthropic imposed usage limits on Claude Code without notifying users, prompting user complaints and renewed concerns about relying on closed products. (Source: jeremyphoward, HamelHusain)
Meta Refuses to Sign European AI Pact: Meta refused to sign the European AI pact, citing regulatory overreach and hindrance to AI development. (Source: Reddit r/artificial, Reddit r/ArtificialInteligence)
💡 Other
How to Run an LLM on Your Laptop: MIT Technology Review published a guide to running large language models (LLMs) on a laptop, with steps and recommendations for users concerned about privacy, seeking independence from large LLM companies, or simply enjoying experimentation. (Source: MIT Technology Review, MIT Technology Review)
A Brief History of “Three-Parent Babies”: MIT Technology Review revisited the history of “three-parent babies,” explaining the different approaches, the controversies, and the latest developments, including the birth of eight babies in a UK trial. (Source: MIT Technology Review, MIT Technology Review)
How to Find Value from AI Agents from Day One: This article explores how businesses can find value in AI agents, recommending an iterative approach that starts with “low-hanging fruit” and incremental use cases, and prioritizing interoperability to prepare for future multi-agent systems. (Source: MIT Technology Review)