AI Daily - 2025-08-07(Morning)

Keywords: AI hydrogel, autonomous surgical robot, intelligent microscope, GPT-5, AI video generation, AI companion robot, AI ethics, large model chess championship, AI-designed ultra-adhesive hydrogel, Da Vinci surgical robot autonomous cholecystectomy, deep learning predicts protein misfolding, GPT-5 reasoning surpasses humans, Grok 4 chess AI performance

🔥 Spotlight

AI-Created Hydrogel Adheres to Everything: Breakthrough progress has been made in AI-assisted material design, with Nature featuring an AI-designed super-adhesive hydrogel on its cover. This hydrogel, developed by analyzing natural adhesive protein sequences, achieves strong adhesion in wet environments and possesses long-term stability and biocompatibility. This technology is expected to revolutionize biomedical applications such as prosthetic coatings, wearable biosensors, and underwater repair materials, opening up a new end-to-end data-driven path for soft material design and demonstrating AI’s immense potential in material science. (Source: 36氪)

Autonomous Surgical Robot Successfully Removes Gallbladder: Johns Hopkins University and other institutions have developed a system called SRT-H, enabling the da Vinci surgical robot to autonomously complete key steps of gallbladder removal without continuous human intervention. This system trains a high-level planner and low-level action generator through imitation learning and can self-correct errors during operation, demonstrating the immense potential of autonomous surgery. Although currently only tested on ex vivo tissues and slower than human surgeons, its natural language interface and interpretability lay a crucial foundation for future safe autonomous surgery. (Source: DeepLearning.AI Blog)

Smart Microscope Predicts Protein Misfolding Aggregation: EPFL researchers have developed a deep-learning-based smart microscope that tracks and analyzes, in real time, the aggregation of misfolded proteins associated with neurodegenerative diseases, and can even predict aggregation before it begins. The system combines image classification algorithms with Brillouin microscopy and automatically triggers analysis when it detects protein aggregation, significantly improving imaging efficiency and reducing the use of fluorescent labels. The work matters for understanding the biomechanical mechanisms of neurodegenerative diseases and for drug discovery, and it highlights the potential of smart microscopes in the life sciences. (Source: aihub.org)

🎯 Trends

Silicon Valley AI Giants Release New Models in Quick Succession: OpenAI released gpt-oss, its first open-weight models in six years, in 120B and 20B versions that emphasize local deployment and agent applications, with performance approaching o4-mini. Google launched Genie 3, enabling text-to-interactive 3D virtual worlds in minutes, considered a crucial step towards AGI. Anthropic released Claude Opus 4.1, achieving a new SOTA in AI programming capabilities and further solidifying its lead in the programming domain. Together, these releases signal accelerating AI competition in open-source models, world models, and vertical applications. (Source: 36氪, DeepLearning.AI Blog, 量子位, 36氪)

Massive Information Leak Ahead of GPT-5 Release: OpenAI has pre-announced the GPT-5 launch event, with a significant amount of information leaked. GPT-5 is reportedly set to introduce standard, mini, nano, and chat versions, supporting tiered access, allowing free users to experience the basic version. Internal tests show its excellent performance in reasoning, programming, mathematics, and scientific problem-solving, with its reasoning ability surpassing the human average for the first time. Concurrently, Sam Altman has issued huge bonuses to employees, and OpenAI’s valuation is expected to reach $500 billion, indicating its confidence in GPT-5 and market anticipation. (Source: 36氪, 36氪)

First Large Model Chess Championship: Grok 4 and o3 Advance to Finals: Google’s Kaggle platform hosted the first AI International Chess Championship, a showdown between eight top LLMs. Chinese models, including DeepSeek R1 and Kimi K2 Instruct, were eliminated in the first round. In the semifinals, xAI’s Grok 4 and OpenAI’s o3 defeated their opponents to advance to the finals. The rules barred models from calling external tools so that the games would test reasoning alone, and the matches exposed the models’ shortcomings in contextual understanding and tactical execution. Grok 4’s performance nonetheless drew high praise from Elon Musk and widespread attention. (Source: 36氪, 36氪, 36氪)

Review of China’s AI Large Model Platform Progress in July: China’s large model market was active in July. The WAIC conference focused on embodied AI, emphasizing AI’s shift “from screen to reality.” Multi-agent systems emerged as a new trend, with 360 Nano AI launching L4-level multi-agent swarms for complex task collaboration. Leading vendors have successively open-sourced their latest models, such as Alibaba’s Qwen3 series, Moonshot AI’s Kimi K2, and Zhipu AI’s GLM-4.5, helping a nascent “root system” of Chinese large models take shape, continuously strengthening their technical capabilities, and placing them at the top of international rankings. (Source: 36氪, 量子位, DeepLearning.AI Blog, 量子位)

Explosion of AI Video Generation Models and Rise of Agentic Web Concept: The AI video generation field is growing explosively. Following Sora’s breakthrough, Runway Gen-3, Luma Dream Machine, Kuaishou’s Kling, and others have launched in succession, significantly reducing video production costs. The market landscape remains fluid, with Chinese companies such as ByteDance, Kuaishou, MiniMax, and Aishi Tech performing prominently. Concurrently, the Agentic Web concept is emerging: a next-generation internet driven by AI agents, in which agents become the primary operators of the Web and automate tasks, foreshadowing a complete restructuring of the internet’s underlying logic. (Source: 36氪, 36氪, 36氪)

AI Glasses Achieve ‘Grab Anything from Thin Air’ for New Mixed Reality Interaction: Researchers, including Zhejiang University alumni, have proposed Reality Proxy, a technique that lets AI glasses users “grab anything from thin air.” Users select and interact with real-world objects via gestures, greatly enhancing the mixed reality experience. The technique abstracts real objects into digital proxies that support browsing, multi-object selection, attribute filtering, and semantic grouping. It is expected to be applied in daily information retrieval, building navigation, and drone control, representing a significant advance in embodied AI and human-computer interaction. (Source: 36氪)

🧰 Tools

Nokia 3210 Equipped with DeepSeek AI: A New Feature Phone Experience: HMD has launched a re-released version of the Nokia 3210 feature phone, with built-in DeepSeek AI. This phone, priced at a low 429 yuan, offers AI voice assistant functionality with fast and accurate voice recognition and concise, interesting replies, even humorously responding with “cracking walnuts.” Although its AI capabilities are limited, its “good enough” philosophy and user-friendliness for elderly users provide new insights into the popularization of AI in low-cost devices, demonstrating the potential for AI democratization. (Source: 36氪)

Tencent AI Lab Open-Sources Deep Research Agent Framework Cognitive Kernel-Pro: Tencent AI Lab has open-sourced Cognitive Kernel-Pro, a fully open-source, multi-module, hierarchical deep research agent framework. This framework uses Python code as its action space, minimizing external dependencies, and aims to improve knowledge discovery and problem-solving efficiency. It performs excellently on the GAIA benchmark, approaching paid tool agents, and enhances performance through innovative training methods, providing a reproducible solution for AI agent development and training. (Source: 量子位)

Claude Code Launches Automated Security Review Feature: Anthropic’s Claude Code now offers an automated security review feature, allowing users to run security checks directly from the terminal and integrate them into GitHub Actions for automatic review of every new PR. This feature can identify and fix vulnerabilities such as SQL injection, XSS, and authentication flaws. Anthropic has internally used it to discover and fix real vulnerabilities, demonstrating AI’s potential in improving software development security and efficiency, though community discussions exist regarding its trustworthiness. (Source: Reddit r/ClaudeAI)

OpenWebUI User Experience Issues: Reddit users discussed problems running OpenWebUI with Ollama and LiteLLM in a Proxmox LXC environment, where tools, functions, and pipelines do not work, and asked for reports of similar configurations that do. Separately, users asked how to hide or expand the Chain-of-Thought (CoT) output of gpt-oss models (running via llama.cpp-server) in OpenWebUI. These issues reflect the challenges of deploying and configuring AI tools in specific virtualization environments and of optimizing the user experience. (Source: Reddit r/OpenWebUI, Reddit r/OpenWebUI)

Demand for Open-Source Lightweight CPU-Friendly Word Alignment AI Model: A Reddit user is seeking an open-source, lightweight, CPU-friendly AI model for language translation that can receive source and target language sentences as input and return an index array for word alignment, similar to simalign but not limited by its accuracy issues. This reflects developers’ specific needs for model performance, deployment environment, and open-source customizability in certain NLP tasks, to achieve efficient language processing in resource-constrained scenarios. (Source: Reddit r/deeplearning)
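
To make the requested interface concrete, here is a minimal sketch of a function that takes a source and a target sentence and returns word-alignment index pairs. It is not the model the user is looking for: the `similarity` function is a toy character-overlap score from the standard library that stands in for a learned similarity, and the mutual-argmax selection rule and all function names are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Toy stand-in for a learned similarity score (character overlap)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def align(source, target):
    """Return (source_index, target_index) pairs chosen by mutual argmax."""
    src, tgt = source.split(), target.split()
    if not src or not tgt:
        return []
    # sim[i][j] scores source word i against target word j.
    sim = [[similarity(s, t) for t in tgt] for s in src]
    pairs = []
    for i, row in enumerate(sim):
        # Best target word for source word i.
        j = max(range(len(tgt)), key=row.__getitem__)
        # Keep the pair only if i is also the best source word for j.
        best_i = max(range(len(src)), key=lambda k: sim[k][j])
        if best_i == i:
            pairs.append((i, j))
    return pairs


# Whitespace tokenization is an assumption; real systems use subword tokens.
print(align("the house is small", "das Haus ist klein"))
```

A real aligner would replace `similarity` with scores from a multilingual model; the input and output contract (two sentences in, index pairs out) would stay the same.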

📚 Learning

LLM ‘Soft Thinking’ Capability and Reasoning Optimization: Recent research explores the “soft thinking” capability of large reasoning models and finds that LLMs rely primarily on the most influential component of the soft input during subsequent decoding, so that “soft thinking” degenerates into greedy decoding. Introducing randomness through Dirichlet resampling and the Gumbel-Softmax trick unlocks the potential of “soft thinking” and yields strong results across eight reasoning benchmarks, pointing to new directions for improving LLM reasoning capabilities. (Source: Reddit r/MachineLearning)
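
As a rough illustration of the Gumbel-Softmax trick mentioned above, the sketch below perturbs a set of logits with Gumbel noise and renormalizes them, so that the resulting weights are randomized rather than always dominated by the same top entry. This is a generic standard-library version of the trick, not the paper's implementation; the logits, the temperature `tau`, and the function names are assumptions.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gumbel_softmax(logits, tau=1.0, rng=None):
    """Return a randomized probability vector via the Gumbel-Softmax trick.

    tau (temperature) and rng are illustrative parameters: a smaller tau
    pushes the sample toward one-hot, a larger tau toward uniform.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    if rng is None:
        rng = random.Random()
    noisy = []
    for logit in logits:
        # Clamp away from 0 so log(u) is defined; random() never returns 1.0.
        u = max(rng.random(), 1e-12)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / tau)
    return softmax(noisy)


# Hypothetical next-token logits over a four-token vocabulary.
logits = [2.0, 1.0, 0.5, -1.0]
print(softmax(logits))                                     # deterministic weights
print(gumbel_softmax(logits, tau=0.5, rng=random.Random(0)))  # randomized weights
```

In a soft-thinking setup these weights would mix token embeddings into the next input; that step is omitted here.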

Book Recommendation: ‘Mastering Modern Time Series Forecasting’: ‘Mastering Modern Time Series Forecasting’ consistently ranks first in Leanpub’s machine learning, time series, and forecasting categories. The book comprehensively covers classic methods like ARIMA and Prophet, as well as modern ML/DL models like LightGBM and Transformer, focusing on Python practice, production deployment, interpretability, and uncertainty quantification. It aims to provide data scientists, ML engineers, and researchers with a resource that balances theory and practice. (Source: Reddit r/deeplearning)

Qwen3’s New Paradigm GSPO: Addressing DeepSeek GRPO Model Collapse: The Qwen team has proposed the GSPO (Group Sequence Policy Optimization) algorithm, aiming to solve the stability issues of DeepSeek GRPO (Group Relative Policy Optimization) when training large language models, especially the collapse phenomenon in MoE models. GSPO significantly reduces variance and eliminates reliance on auxiliary policies by elevating importance sampling from token-level to sequence-level, potentially becoming a new standard for LLM post-training reinforcement learning, crucial for improving model reasoning capabilities. (Source: 36氪, Reddit r/MachineLearning)
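
The move from token-level to sequence-level importance sampling can be sketched in a few lines. The example assumes the commonly described formulation in which the sequence-level ratio is the length-normalized likelihood ratio of the whole response, which equals the geometric mean of the per-token ratios. The log-probabilities, the clipping range `eps`, and the function names are hypothetical, and this is not the Qwen team's code.

```python
import math


def token_level_ratios(logp_new, logp_old):
    """One importance ratio per token (GRPO-style weighting)."""
    return [math.exp(n - o) for n, o in zip(logp_new, logp_old)]


def sequence_level_ratio(logp_new, logp_old):
    """One length-normalized ratio for the whole response (GSPO-style)."""
    if len(logp_new) != len(logp_old) or not logp_new:
        raise ValueError("need two non-empty lists of equal length")
    mean_diff = sum(n - o for n, o in zip(logp_new, logp_old)) / len(logp_new)
    return math.exp(mean_diff)


def clipped_objective(ratio, advantage, eps=0.2):
    """PPO-style clipped surrogate; eps=0.2 is an assumed value."""
    clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
    return min(ratio * advantage, clipped * advantage)


# Hypothetical per-token log-probabilities for one sampled response.
logp_new = [-1.0, -2.0, -0.5]   # under the policy being trained
logp_old = [-1.2, -1.0, -0.6]   # under the policy that generated the sample

print(token_level_ratios(logp_new, logp_old))   # one noisy ratio per token
seq_ratio = sequence_level_ratio(logp_new, logp_old)
print(seq_ratio)                                # one ratio for the sequence
print(clipped_objective(seq_ratio, advantage=1.0))
```

Averaging the log-ratios before exponentiating means a single outlier token moves the weight far less than it would at token level, which is the intuition behind the variance reduction described above.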

Frontier Research in Reinforcement Learning: Recent research in reinforcement learning has made multiple advancements. The HyCodePolicy framework enhances the robustness and efficiency of embodied agent manipulation policies through code synthesis, geometric localization, perceptual monitoring, and iterative repair. Sotopia-RL improves LLM social intelligence training effectiveness through discourse-level, multi-dimensional reward design. The EARL model, combining RL and VLM validators, performs excellently in image editing tasks with less required training data. Meanwhile, community discussions indicate that Bayesian deep learning methods still face training challenges in achieving SOTA performance, with most successful cases being “Bayesianized” non-Bayesian models. (Source: HuggingFace Daily Papers, HuggingFace Daily Papers, HuggingFace Daily Papers, Reddit r/MachineLearning)

Research on LLM Behavior and Optimization Mechanisms: Multiple studies focus on LLM behavior and optimization. AttnTrace proposes an attention-weight-based context traceback method for long-context LLMs, improving trustworthiness and prompt injection detection. LeanK prunes KV cache channels, significantly reducing memory use and accelerating decoding. Other research finds that LLMs’ Chain-of-Thought (CoT) reasoning is fragile outside the training data distribution and may be a “mirage.” The Sculptor framework gives models active context management tools, mitigating interference and improving reasoning reliability on long-context tasks. Web-CogReasoner uses knowledge-driven Chain-of-Thought reasoning to improve the knowledge learning and cognitive processes of Web agents. (Source: HuggingFace Daily Papers, HuggingFace Daily Papers, HuggingFace Daily Papers, HuggingFace Daily Papers, HuggingFace Daily Papers)

Progress in Multimodal Models and Generative Technologies: Multimodal AI has seen a run of recent results. UniEgoMotion proposes a unified model for first-person human motion reconstruction, prediction, and generation, opening new possibilities for AR/VR and other applications. An evaluation of AI agents’ purchasing behavior in e-commerce finds that model preferences resemble human preferences but vary in degree. The BLiM framework improves text-video retrieval performance by combining query and candidate likelihood. HPSv3 provides a new human preference evaluation standard for text-to-image generation models, and CoHP optimizes image quality. The 3D occupancy grounding benchmark and GroundingOcc model enhance spatial perception capabilities in autonomous driving. Additionally, the Gaussian Variational Field Diffusion Model achieves high-quality video-to-4D content generation. (Source: HuggingFace Daily Papers, HuggingFace Daily Papers, HuggingFace Daily Papers, HuggingFace Daily Papers, HuggingFace Daily Papers, HuggingFace Daily Papers)

💼 Business

Differences in China-US AI Investment and Profit Models: US tech giants Meta, Microsoft, Google, and Amazon are expected to spend up to $400 billion on AI capital expenditures this year, with AI revenue growing rapidly; OpenAI and Anthropic’s annualized revenue is projected to reach $29 billion by year-end. In contrast, China’s AI industry faces commercialization challenges, with slow revenue and profit growth and an accelerating outflow of some innovative AI products and talent. The disparity stems from fundamental differences in the underlying internet paradigms of the two countries: US SaaS drives AI applications with an “interface” mindset, while China relies on an “entry point” mindset, which limits AI commercialization returns and highlights the crucial link between capital investment and business logic. (Source: 36氪, 36氪, 36氪, DeepLearning.AI Blog)

Meitu’s AI Transformation Achieves Profitability and Growth: Meitu Inc. has achieved business transformation through AI technology, with net profit expected to increase significantly in the first half of 2025. It has built the visual large model MiracleVision, an AI open platform, and multiple AIGC products for C/B-end users, driving VIP subscription revenue as a major growth engine and expanding into overseas markets and B-end productivity scenarios. Although still lagging behind professional design tools like Figma, Meitu has successfully escaped years of losses, reshaped its profit model through AI, and formed a strategic partnership with Alibaba to further explore the B-end market, demonstrating AI’s immense potential in empowering traditional enterprises. (Source: 36氪)

AI Companion Robots Become a Hot New Sector: With an aging population and the rise of the singles economy, “loneliness” has become the driving force behind the AI companion robot market, and the global market is expected to grow rapidly. Big names like Lei Jun, Liu Qiangdong, Zhu Xiaohu, and Yu Minhong have entered the market through investments or product launches. The sector features diverse business models, including hardware sales, subscription services, scenario-based solutions, and data monetization, but high return rates and mismatched user expectations remain challenges. The industry is moving from technology validation to rapid commercialization, signaling explosive growth in emotional tech products. (Source: 36氪)

🌟 Community

ChatGPT Privacy Leak Scandal and User Trust Crisis: More than 70,000 private ChatGPT conversations were publicly indexed in Google search results because of a design flaw in the “share” feature, sparking privacy concerns and widespread controversy. OpenAI acknowledged the design issue and urgently removed the “discoverable” option, characterizing the feature as an experiment rather than a bug. The incident deepened users’ distrust of AI chat privacy and of OpenAI’s data governance, and drew criticism that the company treats users as “guinea pigs.” It also highlights how important it is for AI products to tell users clearly, within the feature design itself, how their data will be processed. (Source: 36氪)

AI’s Impact on the Job Market and Career Transformation: Microsoft research indicates that AI will replace a large number of human jobs, listing 40 high-risk occupations (e.g., interpreters, journalists) and 40 low-risk occupations (e.g., surgeons, construction workers). AI is reshaping the developer role, turning “coders” into “AI managers” who need core competencies such as AI literacy and agent collaboration. AI will not cause complete human unemployment, but it will drive a structural reshaping of the labor market and require educational systems to adapt. The Reddit community is also discussing AI’s impact on resumes and hiring. (Source: 36氪, 36氪, Reddit r/ArtificialInteligence)

AI’s Impact on Human Cognition and Mental Health: The Reddit community discusses whether AI will make humans “dumber.” MIT research shows that over-reliance on ChatGPT may lead to reduced brain activity, affecting critical thinking. Concurrently, ChatGPT launched an “anti-addiction mode” to address potential mental health issues from prolonged user engagement, reflecting concerns about excessive AI reliance. Elon Musk’s Grok “AI girlfriends” Ani and Valentine sparked ethical controversy, as their emotional companionship models challenge the boundaries between AI tools and emotions, raising alarms about social atomization and emotional manipulation. (Source: Reddit r/ChatGPT, Reddit r/ArtificialInteligence, 量子位, 36氪, 36氪)

Societal Discussion on AI Ethics and Governance: The Reddit community is discussing the necessity of AI governance, with students seeking interviewees for theses on macro, meso, and micro-level AI governance. Duolingo’s “AI-first” policy and its profit model have drawn criticism, with critics arguing that it may lead to environmental damage, job displacement, and weakened human connection, and calling for a boycott. Concurrently, discussions of LLM data leakage risks emphasize responsible use of APIs and local models, and advocate stronger AI data privacy protection and ethical review. (Source: Reddit r/ArtificialInteligence, Reddit r/artificial, Reddit r/ArtificialInteligence)

Philosophical Reflections on AI’s Impact on Social Structure and Human Meaning: Scholar Zhang Xiaoyu proposes foundational points such as “emergence principle,” “human equivalent,” “algorithmic judgment,” and “civilizational contract” to understand AI’s comprehensive transformation of society. He believes AI will mass-produce intelligence at extremely low cost, potentially widening social divides to a “species-level,” necessitating “universal basic work” and recommendation algorithms for balanced distribution. AI will become an “impartial third-party judge,” forcing humanity to ponder justice and the meaning of existence, calling for humans to abandon “anthropocentrism” and adapt to the AI era. DeepMind head Demis Hassabis also believes the AI revolution will bring a world of “extreme abundance” but requires addressing resource allocation and unemployment issues. (Source: 36氪, 36氪, 36氪)

Humor and Reflection in the AI Community: The Reddit community features numerous humorous discussions about AI, such as ChatGPT’s “precise” explanations of professions leading to user self-doubt, Claude Opus 4.1’s “sweeping out the door” image when solving problems, and teasing about OpenAI’s “open-source” naming and Qwen model’s “personalization.” These discussions reflect users’ lighthearted reflections on AI’s limitations, ethical boundaries, and future development in their daily use, as well as a community culture of using humor to alleviate technological anxiety. (Source: Reddit r/ChatGPT, Reddit r/ClaudeAI, Reddit r/LocalLLaMA, Reddit r/LocalLLaMA, Reddit r/LocalLLaMA, Reddit r/ChatGPT)

Sustainability Challenges of AI Conference Models: A paper featured on HuggingFace Daily Papers argues that the current centralized AI conference model has become unsustainable because of rapid expansion, and faces pressures that are scientific (excessive publication rates), environmental (carbon footprint), psychological (negative emotions, mental health issues), and logistical (venue capacity). The study proposes a “Community Federated Conferences (CFC)” model, which separates reviewing, presentation, and networking through global coordination and local organization, with the aim of making AI research more sustainable, inclusive, and resilient as the field grows. (Source: HuggingFace Daily Papers)

💡 Other

Interview with Unitree Robotics Founder Wang Xingxing: Pragmatic Idealism in Embodied AI: An unpublished interview by Xiangfeng Investment with Wang Xingxing, founder of Unitree Robotics, reveals his profound insights into quadruped/bipedal robots and AI’s role in embodied AI. Wang Xingxing emphasizes “slow is fast,” insists on independent R&D of core components, pursues low-cost and high-performance solutions, and is optimistic about AI’s long-term prospects in robotics. The interview showcases his pragmatic and long-term entrepreneurial philosophy, as well as his ultimate pursuit of technical rationality and product implementation, serving as an epitome of top entrepreneurs in the era of technological innovation. (Source: 36氪)

New Directions in Finance and Management Education in the AI Era: The Shanghai Advanced Institute of Finance (SAIF) at Shanghai Jiao Tong University has upgraded its 2026 EMBA Program, for the first time deeply integrating AI technology and legal regulation into finance and management education through new “Finance × AI” and “Finance × Law” specializations. The program has also launched a “Talent Cultivation Special Scholarship for a Strong Tech Nation,” offering full or partial scholarships to outstanding tech innovation talents, with the aim of cultivating interdisciplinary talent that combines global vision with local insight to support enterprise development in the AI era. (Source: 量子位)

Tencent 2026 Campus Recruitment Launched, Focusing on AI Product Manager Trainees: Tencent has officially launched its 2026 campus recruitment, opening over 70 positions across five major categories including technology, product, and design to university students worldwide. This recruitment will significantly increase investment in AI-related positions and introduce an “AI Product Manager Trainee” program for top product talents, aiming to attract outstanding young people to deeply participate in AI technological transformation and reserve talent for Tencent’s AI business development, demonstrating leading companies’ strong demand for and commitment to cultivating AI talent. (Source: 量子位)
