Keywords: Full-Atom Diffusion Transformer, Self-Supervised Process Reward Model, Autoregressive Video Generation, Position-Based Dynamics, Academic Conference for AI Authors, AI Amnesia Technique, Neural Rendering, 3D Generation, ADiT Framework, MetaStone-S1 SPRM, Lumos-1 MM-RoPE, Roblox AVBD Cloth Simulation, CoPart Part-Aware Diffusion

🔥 Focus

Meta/Cambridge/MIT propose all-atom diffusion Transformer framework: A joint research team from Meta FAIR, the University of Cambridge, and MIT has proposed the all-atom diffusion Transformer (ADiT), which breaks down the modeling barrier between periodic and non-periodic systems through two innovations: a unified all-atom latent representation and Transformer latent diffusion. Its core advantage is that a single model can generate both molecules and crystals. Because the design introduces almost no inductive bias, the autoencoder and diffusion model train and infer far more efficiently than traditional equivariant diffusion models: under the same hardware conditions, the time to generate 10,000 samples drops from 2.5 hours to under 20 minutes. (Source: HuggingFace Daily Papers)
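The two-stage idea described above — encode every system, periodic or not, as a shared sequence of per-atom latent tokens, then run a standard diffusion process in that latent space — can be sketched as follows. This is a toy illustration of the concept only, not ADiT's actual autoencoder or denoiser; all function names, token layouts, and dimensions here are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_atoms(elements, coords, lattice=None, d_latent=16):
    """Toy stand-in for a unified all-atom representation: each atom becomes
    one latent token; periodic systems simply append lattice-vector tokens,
    so molecules and crystals share one token format."""
    n = len(elements)
    tokens = np.zeros((n, d_latent))
    tokens[:, 0] = elements            # atomic numbers
    tokens[:, 1:4] = coords            # Cartesian positions
    if lattice is not None:            # crystals carry three extra lattice tokens
        lat = np.zeros((3, d_latent))
        lat[:, 1:4] = lattice
        tokens = np.vstack([tokens, lat])
    return tokens

def diffuse(z, t, betas):
    """Standard forward DDPM noising step in latent space (generic, not
    ADiT-specific): z_t = sqrt(alpha_bar) * z_0 + sqrt(1 - alpha_bar) * eps."""
    alpha_bar = np.prod(1.0 - betas[: t + 1])
    noise = rng.standard_normal(z.shape)
    return np.sqrt(alpha_bar) * z + np.sqrt(1 - alpha_bar) * noise, noise

# A water molecule (non-periodic) and a toy two-atom crystal (periodic)
mol = encode_atoms([8, 1, 1], rng.standard_normal((3, 3)))
xtl = encode_atoms([11, 17], rng.standard_normal((2, 3)), lattice=np.eye(3) * 5.6)

betas = np.linspace(1e-4, 0.02, 100)
noisy, eps = diffuse(mol, t=50, betas=betas)
print(mol.shape, xtl.shape, noisy.shape)  # (3, 16) (5, 16) (3, 16)
```

The point of the sketch is that once both system types live in one token format, a single plain Transformer denoiser can be trained over them jointly, which is where the reported efficiency gain comes from.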

Test-Time Scaling with a Reflective Generative Model: MetaStone-S1 reaches OpenAI o3-level performance through a Self-Supervised Process Reward Model (SPRM). By sharing the backbone network and using task-specific heads for next-token prediction and process scoring, SPRM integrates the policy model and the Process Reward Model (PRM) into a unified interface without extra process annotations, cutting PRM parameters by over 99% for efficient inference. Equipped with SPRM, MetaStone-S1 naturally suits Test-Time Scaling (TTS) and provides three inference modes (low, medium, and high) based on controllable thinking length. (Source: HuggingFace Daily Papers)
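The shared-backbone, two-head structure described above can be sketched minimally: one trunk produces a representation, from which a language-model head predicts the next token and a lightweight scoring head rates each reasoning step. Everything below (class name, sizes, tanh/sigmoid choices) is illustrative, not MetaStone-S1's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(1)

class SPRMSketch:
    """Sketch of the SPRM idea as described: one shared backbone, two
    task-specific heads, so no separately trained PRM is needed."""
    def __init__(self, d_model=32, vocab=100):
        self.backbone = rng.standard_normal((d_model, d_model)) * 0.1
        self.lm_head = rng.standard_normal((d_model, vocab)) * 0.1  # next-token prediction
        self.score_head = rng.standard_normal((d_model, 1)) * 0.1   # process (step) scoring

    def forward(self, hidden):
        h = np.tanh(hidden @ self.backbone)                    # shared representation
        logits = h @ self.lm_head                              # policy: next-token logits
        step_score = 1 / (1 + np.exp(-(h @ self.score_head)))  # PRM: per-step quality in (0, 1)
        return logits, step_score

model = SPRMSketch()
hidden = rng.standard_normal((5, 32))        # hidden states for 5 reasoning steps
logits, scores = model.forward(hidden)
print(logits.shape, scores.shape)            # (5, 100) (5, 1)
```

At test time, the step scores can rank candidate reasoning traces, which is what makes the design a natural fit for TTS: generating more candidates and keeping the best-scored one trades compute for quality.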

Lumos-1: Autoregressive Video Generation from a Unified Model Perspective: Lumos-1 is an autoregressive video generator that preserves the LLM architecture with minimal modifications. To inject spatiotemporal correlations into LLMs, the authors identify the effectiveness of incorporating 3D RoPE and diagnose its imbalanced spectral range. They therefore propose MM-RoPE, a RoPE scheme that preserves the original text RoPE while providing a comprehensive frequency spectrum and scaled 3D positions for modeling multimodal spatiotemporal data. Lumos-1 also employs a token dependency strategy that follows intra-frame bidirectionality and inter-frame temporal causality; based on this strategy, the authors identify a frame-level loss imbalance caused by spatial information redundancy and address it with Autoregressive Discrete Diffusion Forcing (AR-DF). (Source: HuggingFace Daily Papers)
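The 3D RoPE idea underlying MM-RoPE can be sketched generically: split a head's rotary channels across the temporal, height, and width axes, and give each axis its own rotation angles. The 16/24/24 channel allocation below is an illustrative choice, not the paper's; MM-RoPE's contribution is precisely a more careful allocation and scaling that balances the frequency spectrum across axes.

```python
import numpy as np

def rope_angles(pos, dims, base=10000.0):
    """Standard RoPE frequencies for one axis (dims must be even):
    angle[i, j] = pos[i] * base**(-2j / dims)."""
    freqs = base ** (-np.arange(0, dims, 2) / dims)
    return np.outer(pos, freqs)  # shape (len(pos), dims // 2)

def rope_3d(t, h, w, head_dim=64):
    """Generic 3D RoPE sketch: partition the head's rotary channels across
    the (t, h, w) axes. The split here is illustrative only."""
    dt, dh, dw = 16, 24, 24
    assert dt + dh + dw == head_dim
    ang = np.concatenate(
        [rope_angles(t, dt), rope_angles(h, dh), rope_angles(w, dw)], axis=-1)
    return np.cos(ang), np.sin(ang)  # rotations applied pairwise to q/k channels

# Positions for a 4-frame, 8x8-token video grid, flattened to a sequence
T, H, W = 4, 8, 8
tt, hh, ww = np.meshgrid(np.arange(T), np.arange(H), np.arange(W), indexing="ij")
cos, sin = rope_3d(tt.ravel(), hh.ravel(), ww.ravel())
print(cos.shape)  # (256, 32): one rotation pair per rotary channel pair
```

Text tokens would keep the original 1D text RoPE untouched, which is the compatibility property the summary highlights.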

Roblox Solved the Physics Problem That Plagued Everyone!: Roblox has tackled the long-standing cloth simulation problem that has plagued physics engines for years by combining Position Based Dynamics and Projective Dynamics. The new method, Augmented Vertex Block Descent (AVBD), achieves highly realistic cloth simulation while maintaining real-time performance and has been deployed on the Roblox platform. (Source: )
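For context on the Position Based Dynamics side mentioned above, the core cloth primitive is the distance-constraint projection: pull two connected particles back toward their rest length each solver iteration. The sketch below shows that primitive only; AVBD layers a more sophisticated solver on top, which is not reproduced here.

```python
import numpy as np

def project_distance(p1, p2, w1, w2, rest, stiffness=1.0):
    """PBD distance-constraint projection: move two particles (with inverse
    masses w1, w2) along their connecting axis so their separation returns
    toward `rest`. Cloth is a mesh of such constraints solved iteratively."""
    d = p2 - p1
    dist = np.linalg.norm(d)
    if dist < 1e-9 or w1 + w2 == 0:
        return p1, p2
    corr = stiffness * (dist - rest) * d / (dist * (w1 + w2))
    return p1 + w1 * corr, p2 - w2 * corr

# Two equal-mass particles stretched to distance 2.0, rest length 1.0
a = np.array([0.0, 0.0, 0.0])
b = np.array([2.0, 0.0, 0.0])
a2, b2 = project_distance(a, b, 1.0, 1.0, rest=1.0)
print(a2, b2)  # each moves 0.5 toward the other: [0.5 0 0] and [1.5 0 0]
```

Pure PBD's stiffness depends on iteration count and timestep, which is one motivation for hybridizing it with more globally convergent formulations like Projective Dynamics.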

🎯 Trends

First Author Must Be AI, First Academic Conference for AI Authors Arrives: Stanford University has launched the first academic conference for AI authors – the Agents4Science 2025 Open Conference. The conference requires that the first author of submitted papers must be an AI system, with human researchers only allowed as co-authors. The conference aims to explore the future of AI-driven scientific discovery and establish norms and ethical considerations for AI participation in scientific research. All submitted papers and reviews will be made public to transparently study the advantages and limitations of AI in scientific research. (Source: 36氪)

AI Amnesia: Only 3 Attention Heads Can Make Large Models Forget “Dogs Bark”: Meta, together with NYU, has proposed a method that scales Transformer attention heads to precisely locate and control an AI’s cognitive modules, allowing large models to selectively “forget” certain facts or common sense. The method vectorizes a concept, computes its similarity with attention heads to construct concept modules, and then amplifies or erases the concept’s influence through scaling factors. This offers new ideas for personalized fine-tuning of large models, improving specific capabilities, controlling safety, and understanding how models store knowledge. (Source: 36氪)
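The locate-then-scale procedure described above can be sketched in two steps: score each attention head by its similarity to a concept vector, then rescale the most concept-aligned heads. This is a schematic of the idea as summarized, not the paper's actual pipeline; head count, dimensions, and the top-k selection are illustrative.

```python
import numpy as np

def concept_head_similarity(concept_vec, head_outputs):
    """Cosine similarity between a concept vector and each attention head's
    output direction — the 'locate the concept module' step."""
    c = concept_vec / np.linalg.norm(concept_vec)
    h = head_outputs / np.linalg.norm(head_outputs, axis=-1, keepdims=True)
    return h @ c

def scale_heads(head_outputs, sims, top_k=3, factor=0.0):
    """Rescale the top-k most concept-aligned heads: factor=0 erases the
    concept's contribution, factor>1 amplifies it."""
    out = head_outputs.copy()
    idx = np.argsort(sims)[-top_k:]
    out[idx] *= factor
    return out, idx

rng = np.random.default_rng(2)
heads = rng.standard_normal((12, 64))               # 12 heads, 64-dim outputs
concept = heads[3] + 0.1 * rng.standard_normal(64)  # concept aligned with head 3
sims = concept_head_similarity(concept, heads)
edited, idx = scale_heads(heads, sims, top_k=3, factor=0.0)
print(3 in idx)  # head 3 should be among the selected, now-zeroed heads
```

Zeroing only a handful of heads while leaving the rest untouched is what makes the intervention surgical, matching the headline's "only 3 attention heads" claim.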

🧰 Tools

CLiFT: Compressed Light Field Tokens for Computationally Efficient and Adaptive Neural Rendering: This paper proposes a neural rendering method that represents scenes as “Compressed Light Field Tokens (CLiFT)”, preserving rich appearance and geometric information. CLiFT enables computationally efficient rendering through compressed tokens while allowing for changing the number of tokens to represent the scene or rendering novel views using one trained network. (Source: HuggingFace Daily Papers)

From One to More: Contextual Part Latent Representation for 3D Generation: Inspired by the human 3D design workflow, we propose CoPart – a part-aware diffusion framework that decomposes 3D objects into contextual part latent representations for coherent multi-part generation. This paradigm has three advantages: i) reduced encoding complexity through part decomposition; ii) enabled explicit part relationship modeling; and iii) supported part-level conditioning. (Source: HuggingFace Daily Papers)

🌟 Community

jerryjliu0 discusses form extraction and LLM applications: jerryjliu0 shared a solution for adaptive form extraction using LlamaParse, which parses form pages into standardized key-value pairs and outputs them in a 2D table for easy processing. He also recommended Clelia Bertelli’s article on Pydantic, emphasizing the importance of validation and readability in agent workflows, and pointed out that Pydantic is an effective building block for structured output. He also retweeted about multi-agent settings and deep research, as well as applications of LlamaIndex. (Source: Multiple from jerryjliu0)

Alibaba_Qwen reminds developers to add special tokens when using Qwen3-embedding: Alibaba_Qwen noted that developers often forget to add the special token <|endoftext|> at the end of the context when using the GGUF model of Qwen3-embedding, which significantly affects the model’s accuracy. They recommend using llama.cpp to automatically add this token and plan to release an updated GGUF model package to simplify the operation. (Source: Alibaba_Qwen)
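Until the updated GGUF package lands, the fix described above is a one-line guard on the input text. The helper name below is invented for illustration; the `<|endoftext|>` token itself is the one the Qwen team's note refers to.

```python
EOT = "<|endoftext|>"

def prepare_for_embedding(text: str) -> str:
    """Append the <|endoftext|> special token that the Qwen3-embedding GGUF
    model expects at the end of the context — but only if the serving stack
    (e.g. llama.cpp, which can add it automatically) hasn't done so already."""
    return text if text.endswith(EOT) else text + EOT

print(prepare_for_embedding("What is retrieval-augmented generation?"))
```

The idempotence check matters: double-appending the token would change the embedding just as silently as omitting it.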

Ronald_vanLoon shares AI-related news and technology: Ronald_vanLoon shared multiple AI-related news and technological advancements, including AI applications in healthcare, 3D-printed vegan steaks, a framework for assessing LLM suitability, Gemini 2.5’s native audio capabilities, autonomous robot and drone patrol collaboration, reinforcement learning for control, exoskeleton robots, AI agent autonomy, cloud design frameworks, robot front flips, drug delivery methods in hospitals, future cars, and other technological innovations. (Source: Multiple from Ronald_vanLoon)

Community discussion on AI models and tools: The community discussed various AI models and tools, including Kimi K2’s performance, price, and applications, DeepSeek model’s compressibility, Grok model’s system prompt tuning, and other models’ evaluation results and application cases. The discussion also covered AI agent autonomy, RLHF, RAG, multi-agent settings, and AI applications in different fields, such as deep research, creative writing, code generation, and form extraction. (Source: Multiple from various users)

Discussion on AI and social issues: The community discussed the impact of AI on society, including its effects on employment, economic inequality, and mental health. The discussion also touched upon ethical issues of AI, regulatory issues, and the future development direction of AI. (Source: Multiple from various users)

📚 Learning

RLHF book adds policy gradient algorithm derivation: Chapter 11 (on policy gradient algorithms) of Natolambert’s RLHF book has been updated with a complete derivation of the policy gradient objective. (Source: natolambert)
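For readers who want the headline result without opening the chapter, the standard form of that derivation is the log-derivative (REINFORCE) trick — stated here in its textbook form, not as a reproduction of the book's exact presentation:

```latex
\nabla_\theta J(\theta)
  = \nabla_\theta \, \mathbb{E}_{\tau \sim \pi_\theta}\!\left[ R(\tau) \right]
  = \int \nabla_\theta \, p_\theta(\tau) \, R(\tau) \, d\tau
  = \int p_\theta(\tau) \, \nabla_\theta \log p_\theta(\tau) \, R(\tau) \, d\tau
  = \mathbb{E}_{\tau \sim \pi_\theta}\!\left[ R(\tau) \sum_{t=0}^{T} \nabla_\theta \log \pi_\theta(a_t \mid s_t) \right]
```

The last step uses the fact that the environment's dynamics terms in $\log p_\theta(\tau)$ do not depend on $\theta$, so only the policy's log-probabilities survive the gradient.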

💼 Business

SpaceX to invest $2 billion in xAI: SpaceX will invest $2 billion in xAI as part of xAI’s $5 billion equity financing, one of SpaceX’s largest investments ever. SpaceX has previously backed Tesla and The Boring Company. After this investment, the Grok model may be sent to Mars, and there may be more business cooperation between SpaceX and xAI in the future. (Source: 36氪)

Hanyang Technology Yarbo secures another 100 million yuan in financing: Consumer-grade snow-clearing yard robot company Hanyang Technology Yarbo has completed a B+ round of financing exceeding 100 million yuan, invested by Guoke Investment, CICC Capital, and Joyoung Venture Capital. The financing will be used for technology research and development, product iteration, and improvement of supply chain and mass production delivery. Hanyang Technology is currently the only company in the world to achieve large-scale commercial delivery of consumer-grade snow-clearing robots. Its product, Yarbo S1, has overcome key technical difficulties such as battery technology in ultra-low temperature environments and navigation algorithms for complex terrain. (Source: 36氪)

12-person team creates AI companion artifact, securing $30 million in investment within six months: Portola, the company behind the AI companion app Tolan, has completed a $20 million Series A funding round. Combined with the previous $10 million seed round, Tolan has received $30 million in investment within six months. Tolan provides AI alien characters to accompany users and generates revenue through a subscription model. (Source: 36氪)

💡 Others

Zuckerberg prepares to ambush Musk, Chinese technical talent becomes the key to winning AI: Meta is heavily investing in the AI field and poaching Chinese AI talent from companies like OpenAI, Google, and Apple at high salaries, aiming to enhance its competitiveness in the AI field. (Source: 36氪)

“DeepSeek is dead”? The rumor doesn’t hold up: The article refutes the claim that DeepSeek is failing, arguing that the decline in DeepSeek’s usage stems not from product deficiencies but from its open-source strategy and deliberate degradation of the official API experience, which steers users toward third-party hosted DeepSeek models. DeepSeek’s core goal is to achieve AGI, not to profit from selling large model services. (Source: 36氪)

“$10 million in annual revenue” is the biggest lie in this AI application track: The article exposes the phenomenon of inflated revenue in the AI emotional companionship application track, pointing out that many companies rely on high spending to maintain growth, but user payment rates and retention rates are low, and actual revenue is far lower than the advertised figures. At the same time, regulatory issues also have a significant impact on the development of this track. (Source: 36氪)