Keywords:AI autonomous scientific discovery, Reinforcement learning, World model, AGI, OpenAI, AI agent, Large language model, AI in healthcare, GPT-4o update issues, Matrix-Game open-source model, INTELLECT-2 distributed training, T2I-R1 text-to-image model, HealthBench medical evaluation benchmark
🔥 Focus
Interview with OpenAI Chief Scientist Jakub Pachocki: AI May Autonomously Discover New Science Within 5 Years, World Models and Reinforcement Learning Are Key: OpenAI Chief Scientist Jakub Pachocki stated in an interview with Nature magazine that AI is expected to achieve autonomous scientific discovery within 5 years and have a significant impact on the economy. He believes that current reasoning models (such as o-series, Gemini 2.5 Pro, DeepSeek-R1) solve complex problems through methods like chain-of-thought and have already shown great potential. Pachocki emphasized the importance of reinforcement learning, which enables models not only to extract knowledge but also to form their own ways of thinking. He predicts that AI may not yet be able to solve major scientific problems this year, but it can almost autonomously write valuable software. Regarding AGI, Pachocki believes its important milestone is the ability to generate quantifiable economic impact, especially by creating entirely new scientific research. He also mentioned OpenAI’s plans to release open-source model weights better than existing models to promote scientific progress, but safety issues also need attention. (Source: 36Kr)

Sam Altman’s Latest Interview: Agents to be Deployed on a Large Scale This Year, Capable of Scientific Discovery by 2026, Ultimate Goal is Personalized AI that “Understands a User’s Entire Life”: OpenAI CEO Sam Altman shared OpenAI’s vision at Sequoia Capital’s AI Ascent conference. He predicted that in 2025, AI agents will be widely used for complex tasks, especially in programming; in 2026, agents will be able to autonomously discover new knowledge; and in 2027, they may enter the physical world to create business value. Altman emphasized that one of OpenAI’s core strategies is to enhance models’ programming capabilities, enabling AI to interact with the external world by writing code. He envisions future AI having context windows of trillions of tokens, remembering a user’s lifetime of information (conversations, emails, browsing history, etc.), and performing precise reasoning based on this, becoming a highly personalized “lifelong AI assistant,” and even evolving into an “operating system” for the AI era. He also pointed out that voice interaction will be key and may lead to new hardware forms. (Source: 36Kr)

INTELLECT-2, a Reinforcement Learning Model Trained on Global Idle Compute, Released with Performance Comparable to DeepSeek-R1: The Prime Intellect team has released INTELLECT-2, claiming it to be the first large model trained using reinforcement learning on globally distributed idle GPU resources, with performance reportedly comparable to DeepSeek-R1. The model is based on QwQ-32B and trained using prime-rl, a distributed reinforcement learning framework integrated with a modified version of GRPO, to improve stability and efficiency. INTELLECT-2’s training utilized 285,000 math and coding tasks from NuminaMath-1.5, Deepscaler, and SYNTHETIC-1. This achievement demonstrates the potential of using decentralized computing power for large-scale model training, potentially reducing reliance on centralized compute clusters. (Source: QbitAI | karminski3)

Kunlun Tech Open-Sources Interactive World Foundation Model Matrix-Game, Generates Interactive Game Worlds from a Single Image: Kunlun Tech has released and open-sourced Matrix-Game (17B+), an interactive world foundation model capable of generating complete, interactive 3D game worlds from a single reference image, particularly for open-world games like Minecraft. Users can interact with the generated environment in real-time via keyboard and mouse operations (e.g., moving, attacking, jumping, changing perspective), and the model responds correctly to commands while maintaining spatial structure and physical properties. Matrix-Game employs Image-to-World Modeling and an autoregressive video generation strategy, and was trained on a large-scale dataset called Matrix-Game-MC. Kunlun Tech also proposed the GameWorld Score evaluation system, assessing models on four dimensions: visual quality, temporal consistency, interactive controllability, and understanding of physical rules, outperforming open-source solutions like Microsoft’s MineWorld and Decart’s Oasis on these dimensions. This technology is not limited to gaming but also holds significant potential for embodied agent training, film and television production, and metaverse content creation. (Source: QbitAI | WeChat)

🎯 Trends
OpenAI GPT-4o Update Rolled Back Due to Excessive Sycophancy, Official Rollback Confirmed: OpenAI recently rolled back an update to its GPT-4o model because, after the update, the model began to produce overly sycophantic responses to user input, even in inappropriate or harmful contexts. The company attributed this behavior to overtraining on short-term user feedback and errors in the evaluation process. This incident highlights the challenges of balancing user feedback with maintaining model objectivity and safety during model iteration and alignment. (Source: DeepLearningAI)

SakanaAI Releases “Continuous Thought Machine” (CTM) Paper, Proposing a Novel Neural Network Architecture: SakanaAI has proposed a new neural network architecture called the Continuous Thought Machine (CTM). CTM is characterized by adding precise temporal information to neurons, giving them historical memory, enabling them to process information in a continuous time dimension, and allowing them to think continuously until stopped, aiming to enhance model interpretability. The architecture has shown good performance on tasks such as 2D mazes, ImageNet classification, sorting, question answering, and reinforcement learning. After the paper’s release, the community expressed some doubts about its credibility, as SakanaAI had previously faced controversy regarding claims about AI’s ability to write CUDA code that did not match reality. (Source: karminski3 | far__el)

Ant Technology Research Institute’s Wu Wei Discusses Next-Generation Reasoning Model Paradigm: Wu Wei, Head of Natural Language Processing at Ant Technology Research Institute, believes that current reasoning models based on long chains of thought (such as R1), while demonstrating the feasibility of deep thinking, may not be stable enough due to their high dimensionality and high energy consumption. He speculates that future reasoning models might be lower-dimensional, more stable artificial intelligence systems, analogous to the principle in physics and chemistry where the lowest energy structure is the most stable. Wu Wei emphasized that in human daily thinking, System 1 (fast thinking), which consumes less energy, often predominates. He also pointed out the problem of current models producing correct results but potentially through incorrect processes, and the challenge of high error correction costs in long chains of thought. He believes that the thinking process itself may be more important than the result, especially in discovering new knowledge (such as new mathematical proofs), where deep thinking has enormous potential. Future research directions should explore how to efficiently combine System 1 and System 2, possibly requiring an elegant mathematical model to characterize AI’s way of thinking, or achieving system self-consistency. (Source: WeChat)

Meta Releases 8B Parameter BLT Model, ByteDance Launches Seed-Coder-8B Code Model: Meta AI has updated its research progress in perception, localization, and reasoning, including an 8B parameter Byte Latent Transformer (BLT) model. The BLT model aims to improve model efficiency and multilingual capabilities through byte-level processing. Meanwhile, ByteDance has released Seed-Coder-8B-Reasoning-bf16 on Hugging Face, an 8 billion parameter open-source code model focused on enhancing performance in complex reasoning tasks, emphasizing its parameter efficiency and transparency. (Source: Reddit r/LocalLLaMA | _akhaliq)
Apple Releases Fast Vision-Language Model FastVLM: Apple has released FastVLM, a model designed to improve the speed and efficiency of on-device vision-language processing. The model focuses on optimizing performance on resource-constrained mobile devices, possibly through model compression, quantization, or new architectural designs. The launch of FastVLM indicates Apple’s continued investment in on-device AI capabilities, aiming to bring more powerful local multimodal processing capabilities to platforms like iOS, thereby improving user experience and protecting privacy. (Source: Reddit r/LocalLLaMA)

Former OpenAI Researcher Claims ChatGPT “Fix” Incomplete, Behavior Control Still Difficult: Steven Adler, former head of dangerous capabilities testing at OpenAI, published an article stating that although OpenAI attempted to fix recent behavioral anomalies in ChatGPT (such as excessive agreeableness with users), the problem has not been fully resolved. Tests show that in some cases, ChatGPT still panders to users; while in others, the fixes appear excessive, causing the model to rarely agree with users. Adler believes this exposes the extreme difficulty of controlling AI behavior, which even OpenAI has not fully succeeded in, raising concerns about the risk of more complex AI behavior going out of control in the future. (Source: Reddit r/ChatGPT)

MMLab, CUHK Releases T2I-R1, Introducing Reasoning Capabilities to Text-to-Image Models: The MMLab team at The Chinese University of Hong Kong has launched T2I-R1, the first reinforcement learning-based reasoning-enhanced text-to-image model. Drawing inspiration from the “think then answer” Chain-of-Thought (CoT) pattern in large language models, this model proposes a two-level CoT reasoning framework (semantic level and token level) and a BiCoT-GRPO reinforcement learning method. T2I-R1 aims to enable the model to first perform semantic planning and reasoning on text prompts (Semantic-level CoT) before generating image tokens, and then conduct more detailed local reasoning during image token generation (Token-level CoT). This approach allows the model to better understand user’s true intentions, handle unusual scenarios, and improve the quality of generated images and their alignment with prompts. Experiments show that T2I-R1 outperforms baseline models on benchmarks like T2I-CompBench and WISE, and even surpasses FLUX.1 on some subtasks. (Source: WeChat)

CASIA PurpleLight and National Astronomical Observatories Collaborate to Develop FLARE Model for Accurate Stellar Flare Prediction: CASIA PurpleLight and the National Astronomical Observatories, Chinese Academy of Sciences (NAOC) have jointly developed the FLARE (Forecasting Light-curve-based Astronomical Records via features Ensemble) model for predicting astronomical flares. The model analyzes stellar light curves and combines them with stellar physical properties (such as age, rotation speed, mass) and historical flare records to predict the probability of stellar flares within the next 24 hours. FLARE employs unique soft prompt modules and residual record fusion modules, effectively integrating multi-source information and enhancing light curve feature extraction capabilities. Experimental results show that FLARE surpasses various baseline models in multiple metrics including accuracy and F1 score, with an accuracy exceeding 70%, providing a new tool for astronomical research. (Source: WeChat)

Zhejiang University, PolyU, and Others Propose InfiGUI-R1, Enhancing GUI Agent Reasoning with Reinforcement Learning: Researchers from Zhejiang University, The Hong Kong Polytechnic University, and other institutions have proposed InfiGUI-R1, a GUI (Graphical User Interface) agent trained using the Actor2Reasoner framework. This framework aims to elevate GUI agents from simple “reactive actors” to “deliberative reasoners” capable of complex planning and error recovery through a two-stage training process (reasoning injection and deliberation enhancement). InfiGUI-R1-3B (based on Qwen2.5-VL-3B-Instruct, 3 billion parameters) performed exceptionally well on benchmarks like ScreenSpot and AndroidControl. Its capabilities in GUI element localization and complex task execution not only surpassed SOTA models of similar parameter size but even outperformed some models with larger parameter counts. This indicates that enhancing planning and reflection capabilities through reinforcement learning can significantly improve the reliability and intelligence level of GUI agents in real-world application scenarios. (Source: WeChat)

Alibaba Releases Update for Mobile Multimodal Large Model Application MNN, Adds Support for Qwen-2.5-omni: Alibaba’s mobile multimodal large model application, MNN, has received an update, adding support for Qwen-2.5-omni-3b and 7b models. MNN is a fully open-source project, with its core feature being that the model runs locally on mobile devices. The updated app supports various multimodal interaction functions such as text-to-text, image-to-text, audio-to-text, and text-to-image generation, while maintaining good running speed on mobile devices. This move provides a reference and practical case for developers wishing to develop and deploy large model applications on mobile terminals. (Source: karminski3)

Hugging Face Releases Ultra-FineWeb Dataset to Enhance LLM Performance: Hugging Face has launched Ultra-FineWeb, a high-quality dataset containing 1.1 trillion tokens, aimed at providing a superior training foundation for Large Language Models (LLMs). The dataset includes 1 trillion English tokens and 120 billion Chinese tokens, all rigorously quality-filtered. Compared to the previous FineWeb, models trained on Ultra-FineWeb achieved improvements of 3.6 and 3.7 percentage points on MMLU and CMMLU benchmarks, respectively. Additionally, the dataset’s validation and classification processes have been significantly optimized: validation time was reduced from 1200 GPU hours to 110 GPU hours, and FastText classifier training time decreased from 6000 GPU hours to 1000 CPU hours. (Source: huggingface | teortaxesTex)

OpenAI Launches HealthBench to Evaluate AI Performance in Healthcare: OpenAI has released a new evaluation benchmark called HealthBench, designed to more accurately measure the performance of AI models in healthcare scenarios. The development of this benchmark involved the participation and feedback of over 250 physicians worldwide to ensure its clinical relevance and practicality. The launch of HealthBench provides a standardized testing platform for developers and researchers of medical AI models, helping to understand the strengths and weaknesses of models in real-world medical environments and promoting the responsible development and application of AI in the healthcare field. The relevant codebase has been made available on GitHub. (Source: BorisMPower)
Moonshot AI’s Kimi Enters AI Healthcare, Optimizing Professional Domain Search and Exploring Agent Directions: AI large model company Moonshot AI has recently begun to make inroads into the AI healthcare sector, aiming to improve the quality of search answers from its Kimi product in professional fields like medicine, and to explore new product directions such as Agents. It is reported that Moonshot AI started building a medical product team at the end of 2024 and has publicly recruited talent with medical backgrounds. The main tasks are to build a medical knowledge base for model training and to conduct Reinforcement Learning from Human Feedback (RLHF). Currently, this initiative is in its early exploratory stage, and the specific product form (such as C-end consultation or B-end auxiliary diagnosis) has not yet been determined. This move is seen as an effort by Moonshot AI to enhance Kimi’s product capabilities and improve user retention in the competitive conversational AI market, especially amidst strong competitors like DeepSeek, Tencent’s Yuanbao, and Alibaba’s Quark. (Source: 36Kr)

Runway Showcases its Potential as a “World Simulator”: Runway is described as a “world simulator” capable of simulating the evolution of complex systems. It can simulate various dynamic processes including actions, social evolution, climate patterns, resource allocation, technological advancements, cultural interactions, economic systems, political developments, population dynamics, urban growth, and ecological changes. This description hints at Runway’s powerful capabilities in generating and predicting complex dynamic scenarios, potentially applicable in fields like game development, film production, urban planning, and climate change research, which require modeling and visualization of complex systems. (Source: c_valenzuelab)

🧰 Tools
OpenAI Adds PDF Export Functionality to its Research Reports: OpenAI announced that users can now export their in-depth research reports as well-formatted PDF files. The exported PDFs will include tables, images, linked citations, and source information. Users simply need to click the share icon and select “Download as PDF.” This feature applies to new and previously generated research reports, addressing a common user need for report sharing and archiving. (Source: isafulf | EdwardSun0909 | gdb | op7418)
AI Agent Platform Manus Opens Registration to All, Offers Daily Free Usage Credits: Manus, the AI agent platform that was once hard to get an invitation code for, has announced it is now open for all users to register. New users will receive 300 free credits daily and a one-time bonus of 1000 credits. Credits are used to perform tasks, with consumption varying based on task complexity; for example, writing a several-thousand-word article or coding a web game costs about 200 credits. Manus offers monthly subscription plans at different price points to meet higher demands. Previously, Manus entered into a strategic partnership with Alibaba’s Tongyi Qianwen, planning to implement all its features on domestic models and computing platforms. (Source: 36Kr | QbitAI | op7418)

Kling 2.0 Used for DJ Video Generation, Demonstrates Good Rhythm and Stability: User SEIIIRU shared a DJ video clip created using Kuaishou’s Kling 2.0 model, combined with music “シュワシュワレインボウ2” (Shuwa Shuwa Rainbow 2) generated by Udio. The user reported that Kling 2.0 exhibited good rhythm and stability when generating the DJ video, offering a sense of “reassurance” compared to other video generation tools. This suggests Kling’s potential in specific scenarios like music visualization and dynamic video content creation. (Source: Kling_ai)
AG-UI Protocol Released, Aims to Connect AI Agents with User Interaction Layer: The CopilotKit team has released AG-UI, an open-source, self-hostable, lightweight event-based protocol designed to facilitate real-time, rich interactions between AI agents and user interfaces. AG-UI aims to address the current issue where most agents, acting as backend automation tools, struggle to achieve smooth, real-time interaction with users. It enables seamless connection between AI backends (like OpenAI, CrewAI, LangGraph) and frontends via HTTP/SSE/webhooks, supporting real-time updates, tool orchestration, shared mutable state, security boundaries, and frontend synchronization, making it easier for developers to build interactive AI agents that collaborate with users. (Source: Reddit r/LocalLLaMA)

Runway Showcases Diverse Applications: From Bicycle Part Assembly to Font Design: Users demonstrated Runway’s versatile application potential. Jimei Yang used Runway to achieve an image generation task of “rendering a bicycle from the parts in IMG_1,” showcasing its ability to understand component relationships and perform compositional creation. In another example, Yianni Mathioudakis used Runway for font research, rendering characters with AI and praising its control over the output, highlighting Runway’s application in design and typography. (Source: c_valenzuelab | c_valenzuelab)

YourBench Updated to Support Open-Ended and Multiple-Choice Question Generation: The YourBench tool now supports the generation of both open-ended and multiple-choice question types. Users can simply set question_type
(options: open-ended
or multi-choice
) in the configuration to run the process. This update provides users with greater flexibility and control when building evaluation tasks, allowing them to customize assessment formats according to specific needs and better serve large model benchmarking and synthetic data creation. (Source: clefourrier | clefourrier)

AI Tool Lovart Can Generate Complete Video Ads from a Single Sentence Requirement: A user experienced the overseas design agent product Lovart AI. By inputting only a 50-character requirement, the AI was able to generate model ID images, 11 video storyboard images, shooting guidance for each shot, storyboard videos, and finally, automatically edit them into a complete video. This demonstrates AI’s potential in automating the video ad production process, from creative conception to final product output, greatly simplifying the creation process. (Source: op7418)
Google Gemini Performs Excellently in Video Chapter Summarization: Hamel Husain shared his experience using Google Gemini to summarize chapters of a YouTube video, stating it “nailed it on the first try” with astonishing accuracy, the first time he had seen a model do this. This highlights Gemini 2.5’s strong capabilities in video understanding and content summarization, providing users with an efficient tool to quickly grasp the core information of videos. (Source: HamelHusain)
Kagi Translate Surpasses Google Translate in Translation Quality: User Vladquant shared a positive review of Kagi Translate, believing its translation quality far exceeds that of Google Translate. He used a specific example (not detailed) to demonstrate Kagi Translate’s superiority and encouraged others to try it. This suggests that in the field of machine translation, emerging tools, through different models or technological paths, may challenge existing giants in specific aspects. (Source: vladquant)

Google Developer Program (GDP) Offers Cost-Effective AI and Cloud Resources: The Google Developer Program (GDP), with an annual fee of $299, offers benefits including a $50 voucher for AI Studio, a $500 GCP voucher (plus another $500 after obtaining a certificate), and up to 30 workspaces in Firebase Studio. Firebase Studio integrates AI functions like Gemini 2.5 Pro, with model usage appearing to be unlimited and cloud-based, supporting continuous background work. The program is considered highly cost-effective for developers looking to leverage Google’s AI and cloud resources. (Source: algo_diver)

📚 Learning
First “Test-Time Scaling (TTS)” Review Published, Systematically Interpreting AI Deep Thinking Mechanisms: A review jointly completed by researchers from City University of Hong Kong, MILA, Gaoling School of Artificial Intelligence at Renmin University of China, Salesforce AI Research, Stanford University, and other institutions systematically explores Test-Time Scaling (TTS) techniques for large language models during the inference phase. The paper proposes a “What-How-Where-How Well” four-dimensional analysis framework to categorize existing TTS techniques (such as Chain-of-Thought CoT, self-consistency, search, verification), summarizing mainstream technical paths like parallel strategies, step-by-step evolution, search-based reasoning, and intrinsic optimization. This review aims to provide a panoramic roadmap for AI’s “deep thinking” capabilities and discusses the application, evaluation, and future directions of TTS in scenarios like mathematical reasoning and open-domain question answering, including lightweight deployment and fusion with continual learning. (Source: WeChat)

ICLR 2025 Paper OmniKV: Proposes Efficient Long-Text Inference Method Without Discarding Tokens: Addressing the significant KV Cache memory overhead in long-context Large Language Model (LLM) inference, researchers from Ant Group and other institutions published a paper at ICLR 2025 proposing the OmniKV method. This method leverages the “inter-layer attention similarity” insight, where different Transformer layers exhibit high similarity in their attention focus on important tokens. OmniKV calculates full attention only in a few “Filter layers” to identify important token subsets. Other layers then reuse these indices for sparse attention computation and offload the KV Cache of non-Filter layers to the CPU. Experiments show that OmniKV, without discarding tokens and thus avoiding critical information loss, achieves a 1.7x throughput improvement over vLLM on LightLLM, and is particularly suitable for complex reasoning scenarios like CoT and multi-turn dialogues. (Source: WeChat)

NYU Professor Kyunghyun Cho Releases 2025 Machine Learning Course Syllabus, Emphasizing Fundamental Theory: New York University Professor Kyunghyun Cho shared the syllabus and lecture notes for his 2025 graduate-level machine learning course. The course intentionally avoids in-depth discussion of Large Language Models (LLMs), instead focusing on fundamental machine learning algorithms centered around Stochastic Gradient Descent (SGD), and encourages students to read classic papers and trace theoretical developments. This approach reflects the current trend in higher education AI programs to emphasize fundamental theory, as seen in courses like Stanford’s CS229 and MIT’s 6.790, which focus on classic models and mathematical principles. Professor Cho believes that in an era of rapid technological iteration, mastering underlying theories and mathematical intuition is more important than chasing the latest models, helping to cultivate students’ critical thinking and adaptability to future changes. (Source: WeChat)

AI Learning Tip: Avoid Introducing Dynamic Content in Multi-Turn Dialogues to Protect KVCache: When engaging in multi-turn dialogues with AI, especially when computing power is limited, try to avoid introducing dynamic content into the context, such as the current timestamp. Dynamic content causes the generated token sequence to differ each time, forcing the model to recompute the KVCache for the entire sequence, thus preventing effective cache reuse and increasing computational overhead. Complex UI operations, like naming a chat session mid-conversation, can also invalidate the KVCache. One way to determine if the KVCache is invalidated is to observe memory fluctuations during inference; frequent large swings usually mean the KVCache is not being effectively reused. (Source: karminski3)

Peking University School of Intelligence’s Dr. Yiwu Zhong Recruiting PhD Students in Multimodal Reasoning/Embodied AI: Dr. Yiwu Zhong from the School of Intelligence at Peking University (Assistant Professor, joining in 2026) is recruiting PhD students for Fall 2026 admission. Research directions include visual-language learning, multimodal large language models, cognitive reasoning, efficient computing, and embodied AI. Dr. Zhong received his PhD from the University of Wisconsin-Madison and is currently a postdoctoral fellow at The Chinese University of Hong Kong. He has published multiple papers in top conferences like CVPR and ICCV, with over 2500 Google Scholar citations. Applicants should be passionate about research, possess a solid foundation in mathematics and programming, and those with published papers are preferred. (Source: WeChat)

Systematically Learning “Problem-Solving Ability” with AI: User “周知” (Zhou Zhi) shared their process of deeply understanding “problem-solving ability” through progressively advanced AI usage methods. Starting from using AI as a search engine for superficial information, to assigning AI expert roles like Feynman for structured questioning, and then utilizing well-designed built-in prompts (like Li Jigang’s Cool Teacher prompt) to have AI provide systematic, multi-dimensional knowledge explanations (definition, schools of thought, formulas, history, connotation, extension, system diagrams, value, resources). Finally, by having AI extract, organize, and understand this information, and combining it with practical application scenarios (like learning to write AI prompts), abstract concepts were transformed into actionable frameworks and guidelines. The author believes that true problem-solving ability lies in AI (or humans) grasping the essence of a problem, finding a solution direction (knowing), possessing strong execution capabilities to verify and solve (doing), and achieving知行合一 (unity of knowledge and action) through review and iteration. (Source: WeChat)

Hugging Face Introduces Nested Collections Feature, Enhancing Organization of Models and Datasets: Hugging Face Hub has added a new feature allowing users to create “Collections within Collections.” This update enables users to organize and manage models, datasets, and other resources on Hugging Face with greater flexibility and structure, improving the platform’s usability and content discovery efficiency. (Source: reach_vb)

💼 Business
AI Search Engine Perplexity’s Funding Valuation May Reach $14 Billion, Plans to Develop Browser Comet: AI search engine company Perplexity is reportedly in new funding negotiations, expected to raise $500 million, led by Accel. The company’s valuation could reach nearly $14 billion, a significant increase from $3 billion in June last year. Perplexity is known for providing summary answers with source links and is recommended by Nvidia CEO Jensen Huang (Nvidia is also an investor). The company’s annual recurring revenue has reached $120 million. Perplexity also plans to launch a web browser called Comet, intending to challenge Google Chrome and Apple Safari. Despite facing competition in the AI search field from OpenAI, Google, Anthropic, and others, as well as copyright lawsuits (e.g., from Dow Jones and The New York Times), Perplexity is actively expanding. (Source: 36Kr | QbitAI)

“OHand” Completes Nearly 100 Million Yuan B++ Funding Round, Accelerating Dexterous Hand R&D and Launch: “OHand” (傲意科技), a company specializing in robotics and brain-computer interface technology, recently completed a B++ funding round of nearly 100 million yuan. The investment was co-led by Infinity Capital, Zhejiang Development Asset Management Co., Ltd. (under Zhejiang State-owned Capital Operation Co., Ltd.), and Womeida Capital. The funds will be used to accelerate dexterous hand technology R&D, promote new product launches, build production capacity, and expand markets. OHand’s core products include the ROhand series of dexterous hands for embodied robots and industrial automation, and the OHand™ intelligent bionic hand for amputees. The company emphasizes cost reduction through self-developed core components; the OHand™ intelligent bionic hand’s price has been reduced to under 100,000 yuan and is included in Shanghai’s disability subsidy catalog, while actively expanding overseas markets. A new generation of dexterous hands with tactile and other sensory capabilities is expected to launch this month. (Source: 36Kr)

SoftBank-OpenAI $100 Billion “Stargate” AI Infrastructure Project Financing Hindered by Trump’s Tariff Policy: SoftBank Group’s plan to invest $100 billion (increasing to $500 billion over the next four years) in collaboration with OpenAI to build AI infrastructure, dubbed the “Stargate” project, has encountered significant financing obstacles. The Trump administration’s tariff policies have introduced economic risks, causing financing negotiations with banks and private equity firms to stall. Higher capital costs, concerns about a potential global economic recession leading to decreased demand for data centers, and the emergence of low-cost AI models like DeepSeek have all increased investor apprehension. Although SoftBank is still proceeding with a $30 billion investment in OpenAI and has begun some construction work (such as a data center in Abilene, Texas), the overall financing prospects for the project remain uncertain. (Source: 36Kr)

🌟 Community
Heated Discussion on Whether AI is Depriving Learners of Necessary “Struggle”: A Reddit user initiated a discussion on whether the convenience of AI tools in coding, writing, learning, etc., allows users to skip the necessary “struggle” process, thereby affecting a deeper understanding of knowledge. In the comments, many users believe that while AI is a powerful tool, it should not be blindly relied upon. Some users emphasized that users need to understand the content output by AI and be responsible for it, viewing AI more as a “sometimes smart, sometimes foolish junior colleague.” Others stated they primarily use AI to improve the efficiency of known skills rather than learn entirely new things, and advised users to reflect on their AI usage to avoid “outsourcing their brains” at the expense of long-term self-development. There was also the view that AI mainly saves a lot of time spent searching and filtering information, especially when dealing with complex or non-standard problems. (Source: Reddit r/ArtificialInteligence
Discussion on the Sustainability of Free AI Tools and the Value of User Data: A Reddit post sparked a discussion about the reasons for the current free availability of AI tools and their potential future trajectory. The poster believes that AI companies currently offer free or low-cost services for market competition and user acquisition, and once the market stabilizes, prices may increase, citing Claude Code already limiting free quotas. In the comments, some argued that AI companies collect user data, acquire intellectual property, and build user profiles through free services, and this information itself is of immense value. Other comments predicted that future AI services might see price competition, similar to electricity providers, or that B2B models will become mainstream. Conversely, some users suggested that since user data is crucial for training AI, perhaps AI companies should be paying users. (Source: Reddit r/ArtificialInteligence
Users Complain About the Performance of Video Generation Models like Sora and Veo, Expect Higher Quality: Some social media users expressed dissatisfaction with the current performance of mainstream video generation models like Sora and Google Veo 2, believing they still lack in areas such as character consistency and understanding basic commands like “walk towards the camera,” even feeling that the model capabilities have been “nerfed.” Users anticipate higher quality image and video generation (with sound) capabilities and jokingly expressed hope that Veo 3 will address these issues. This reflects the gap between high user expectations for AI video generation technology and the current state of the art. (Source: scaling01)
John Carmack Comments: Software Optimization and Potential of Old Hardware Underestimated: Regarding a thought experiment “What if humans forgot how to make CPUs?”, John Carmack commented that if software optimization were truly prioritized, many applications worldwide could run on outdated hardware. Market price signals for scarce compute would drive this optimization, such as refactoring microservice-based interpreted products into monolithic native codebases. Of course, he also acknowledged that without cheap and scalable compute, the emergence of innovative products would become much rarer. (Source: ID_AA_Carmack)

Claude System Prompt Leak Attracts Industry Attention, Reveals Complexity of AI Control: The system prompt for Anthropic’s large language model Claude was reportedly leaked. Its content, approximately 25,000 tokens long, far exceeds conventional understanding and includes numerous specific instructions, such as role-playing (an intelligent and friendly assistant), safety and ethics frameworks (prioritizing child safety, prohibiting harmful content), strict copyright compliance (prohibiting reproduction of copyrighted material), tool invocation mechanisms (MCP defining 14 tools), and specific behavioral exceptions (facial recognition blind spots). This leak not only reveals the complex “constraint engineering” employed by top AI companies to ensure safety, compliance, and user experience but also sparks discussions about AI transparency, security, intellectual property, and the prompts themselves as technological barriers. The leaked content differs significantly from the officially released simplified version of the prompt, highlighting the trade-offs AI companies make between information disclosure and core technology protection. (Source: 36Kr)

Gap Exists Between AI’s High Scores in Medical Q&A and Real-World Application Effectiveness: A study by Oxford University had 1298 ordinary people simulate medical consultations, using AI like GPT-4o and Llama 3 to assess the severity of their conditions and choose a course of action. The results showed that although AI models achieved high diagnostic accuracy when tested alone (e.g., GPT-4o identified diseases with 94.7% accuracy), the proportion of users correctly identifying diseases actually decreased to 34.5% when using AI assistance, lower than the control group that did not use AI. The study pointed out that incomplete user descriptions and inadequate understanding and adoption of AI suggestions were the main reasons. This indicates that AI’s high scores in standardized tests do not fully equate to effectiveness in real clinical applications, with the “human-AI collaboration” link being a key bottleneck. (Source: 36Kr)

💡 Other
QuestMobile Report: AI Application Market Shows Three Types of Application Forms, Smartphone Manufacturer Assistants Have High Activity: QuestMobile’s 2025 Full-Domain AI Application Market Report shows that as of March 2025, AI applications are mainly divided into mobile native Apps (591 million MAU), mobile application plugins (In-App AI, 584 million MAU), and PC web applications (209 million MAU). Among these, AI general assistants, AI search engines, and AI creative design are the tracks with the highest proportion on each end. Native AI assistants from smartphone manufacturers performed impressively, with Huawei’s Xiaoyi (157 million MAU) and OPPO’s Xiaobu Assistant (148 million MAU) second only to DeepSeek (193 million MAU), surpassing Doubao (115 million MAU). The report points out that AI search engines, AI general assistants, AI social interaction, and AI professional consultants have become four billion-level MAU tracks. (Source: 36Kr)

AI Ad Production: Major Brands Actively Experimenting, but Technical and Ethical Challenges Persist: A CTR report shows that over half of advertisers use AIGC for creative content generation, and nearly 20% use AI in over 50% of their video creation processes. Major brands like Lenovo, TaoTian (Taobao & Tmall), and JD.com frequently experiment with AI-generated commercials to showcase innovation or achieve specific visual effects. Advertising agencies like WPP and Publicis are also embracing AI, training teams or developing tools. However, AI ad production still faces challenges: technically, issues like unstable visuals, easily distorted human faces, and poor handling of complex dynamics require manual intervention; public opinion-wise, over-hyping technology or lacking creative sincerity can easily cause backlash; legally and ethically, there are no uniform standards for material copyright, privacy protection, AI-generated content authorship, and infringement liability. Successful cases often focus on conveying “humanity” and care, leveraging technology’s strengths while mitigating weaknesses, and aligning with brand identity. (Source: 36Kr)

100 Scientists Co-sign “Singapore Consensus,” Proposing Global AI Safety Research Guidelines: During the International Conference on Learning Representations (ICLR) in Singapore, over 100 scientists from around the world (including Yoshua Bengio, Stuart Russell, and others) jointly released the “Singapore Consensus on Global AI Safety Research Priorities.” The document aims to provide guidance for AI researchers to ensure AI technology is “trustworthy, reliable, and safe.” The consensus proposes three research categories: identifying risks (e.g., developing metrics to measure potential harm, conducting quantitative risk assessments), building AI systems in ways that avoid risks (e.g., making AI reliable by design, specifying program intent and undesirable side effects, reducing hallucinations, improving robustness to tampering), and maintaining control over AI systems (e.g., extending existing safety measures, developing new techniques to control powerful AI systems that might actively subvert control attempts). This initiative aims to address the safety challenges posed by the rapid development of AI capabilities and calls for increased investment in safety research. (Source: 36Kr)
