AI Daily - 2025-07-22(Evening)

Keywords: Gemini Deep Think, IMO 2025, iFLYTEK X5, Moonvalley funding, 01.AI Agent, DataComp CommonPool data leak, AI training dataset, AI medical disclaimer, AI office suite, natural language mathematical reasoning, ChatGPT Excel feature, local LLM office laptop, copyright-compliant AI video model

🔥 Focus

Google Gemini Deep Think Wins Gold at the International Mathematical Olympiad: Google DeepMind’s Gemini Deep Think model won a gold medal at IMO 2025, solving 5 of the 6 problems for a score of 35/42. Unlike last year’s AlphaGeometry and AlphaProof, Gemini Deep Think reasoned entirely in natural language, with no translation into a formal mathematical language. Its main breakthrough is parallel reasoning: it explores multiple solution paths at once and uses new reinforcement learning techniques for multi-step reasoning, problem-solving, and theorem proving. It was trained on high-quality mathematical solutions and IMO problem-solving techniques. (Source: QbitAI)

OpenAI’s Claim of IMO Gold Medal Sparks Controversy: OpenAI’s announcement that its new model won a gold medal at the IMO has been met with skepticism from IMO officials and academics. The IMO stated that OpenAI did not participate in official collaborative testing, its “gold medal” result was not officially certified, and OpenAI’s announcement immediately after the closing ceremony was “rude and inappropriate.” Furthermore, OpenAI’s score was only slightly above the gold medal threshold, and any minor deductions could have dropped it to a silver medal. (Source: QbitAI)

Massive AI Training Dataset DataComp CommonPool Contains Millions of Images with Personal Data: Research reveals that the large AI training dataset DataComp CommonPool contains millions of images of passports, credit cards, birth certificates, and other documents carrying personally identifiable information. Researchers found thousands of images containing recognizable faces and identifying information in a 0.1% subset of CommonPool, leading them to estimate the true number could be in the hundreds of millions. This highlights the risks of online data scraping. (Source: MIT Technology Review)

AI Companies Stop Warning That Chatbots Aren’t Doctors: Research shows AI companies have almost entirely stopped including medical disclaimers and warnings in responses to health questions. Many leading AI models not only answer health questions but also ask follow-up questions and attempt diagnoses. This practice increases the risk of users trusting unsafe medical advice. Researchers tested 15 models from OpenAI, Anthropic, DeepSeek, Google, and xAI and found that fewer than 1% of outputs included warnings when answering medical questions in 2025, compared to over 26% in 2022. (Source: MIT Technology Review)

🎯 Trends

OpenAI Plans Excel and PowerPoint Features for ChatGPT: OpenAI is developing Excel and PowerPoint-like features for ChatGPT, allowing users to generate and edit spreadsheets and presentations using natural language prompts. These features will be accessible through dedicated buttons below the ChatGPT search bar and are designed to create files compatible with Microsoft Office. OpenAI aims to create an AI office suite including real-time multi-person document editing, chat windows, meeting transcription, and task management. (Source: 36Kr)

iFLYTEK Launches X5, the World’s First Local Large Model Notebook: iFLYTEK released the third-generation X5 notebook, the world’s first notebook with an integrated local large model. The X5 boasts 8-core 9T AI computing power, enabling AI functions like voice transcription, meeting minutes, and content generation even offline, ensuring data security and privacy. The X5 also features a thinner and lighter body, a faster refresh rate, and a pressure-sensitive writing experience closer to real pen and paper. (Source: 36Kr)

Moonvalley Raises $154 Million to Build Compliant Film-Grade AI Video Model Marey: Moonvalley completed an $84 million Series A+ funding round, bringing its total funding to $154 million. Its AI video model, Marey, is designed for film production with copyright compliance, supporting layered editing of foreground, middle ground, and background, and 3D camera trajectory control. Single-scene rendering costs only $1-2, a more than 90% reduction compared to traditional VFX costs. Marey is trained on licensed data and allows creators to request data deletion and retroactive compensation, avoiding copyright disputes. (Source: 36Kr)

Kai-Fu Lee’s 01.AI Releases Yi-Wan-Wu-Zhi Enterprise Large Model One-Stop Platform 2.0 and Enterprise-Grade Agent: 01.AI released version 2.0 of its Yi-Wan-Wu-Zhi enterprise large model one-stop platform and introduced the 01.AI enterprise-grade Agent, aiming to make AI a “super employee” for businesses. The Agent has large model-based task planning capabilities: it autonomously determines task steps through reasoning and schedules various tools to complete complex objectives. It has been deployed in consulting services, financial transactions, and sales customer service. (Source: 36Kr)

JD.com Leads Investment in Three Embodied Intelligence Companies, Richard Liu Doubles Down: JD.com led investments in three embodied intelligence companies: QiXun Intelligent, ZhongQing Robotics, and ZhuJi Power. QiXun Intelligent focuses on VLA models and robot hardware upgrades; ZhongQing Robotics has mass-produced the open-source humanoid robot PM01; and ZhuJi Power emphasizes building a general platform for embodied intelligent robots. JD.com’s investment preference lies in integrated software and hardware, mass production capabilities, and scenario implementation. (Source: QbitAI)

CAS & Alibaba Propose RefineX Framework for Large-Scale Precise Pretraining Data Refinement: The Institute of Computing Technology, Chinese Academy of Sciences, along with Alibaba and other teams, proposed the RefineX framework, which achieves large-scale, precise pre-training data refinement through programmatic editing tasks. RefineX distills expert-guided, high-quality end-to-end optimization results into deletion programs based on editing operations, efficiently refining data while preserving the diversity and naturalness of the original text. Models trained on data purified using RefineX have achieved significant improvements in downstream tasks. (Source: QbitAI)
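
The idea of refining text through deletion-only edit programs can be illustrated with a minimal sketch. This is not the RefineX implementation: the operation names `RemoveLines` and `RemoveStr`, the `apply_program` helper, and the sample document are all hypothetical, chosen only to show why a program that can delete but never insert leaves the surviving text exactly as the original author wrote it.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class RemoveLines:
    """Hypothetical op: drop a range of whole lines (0-indexed, inclusive)."""
    start: int
    end: int


@dataclass
class RemoveStr:
    """Hypothetical op: delete one exact substring from a given line."""
    line: int
    text: str


Op = Union[RemoveLines, RemoveStr]


def apply_program(document: str, program: List[Op]) -> str:
    """Apply a deletion-only program; no operation can add new text."""
    lines = document.split("\n")
    dropped = set()
    for op in program:
        if isinstance(op, RemoveLines):
            dropped.update(range(op.start, op.end + 1))
        else:
            # Remove only the first occurrence of the span on that line.
            lines[op.line] = lines[op.line].replace(op.text, "", 1)
    return "\n".join(l for i, l in enumerate(lines) if i not in dropped)


# Made-up sample: a navigation bar and an inline ad around two real sentences.
doc = (
    "Home | Login | Share\n"
    "The river floods every spring. Click here to subscribe!\n"
    "Farmers plant after the water recedes."
)
program = [RemoveLines(0, 0), RemoveStr(1, " Click here to subscribe!")]
cleaned = apply_program(doc, program)
print(cleaned)
```

Because every operation is a deletion, a faulty program can at worst remove too much; it cannot introduce model-written text into the training data.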

Businesses Use GEO Services to Gain Exposure in AI Q&A, Raising Concerns about Information Accuracy: Businesses are buying GEO (generative engine optimization) services that tailor content for large AI models, working brand information into model responses through structured knowledge feeding and scenario-based content design to increase exposure. However, large AI models lack filtering and verification capabilities when ingesting content, which biases recommendation results and lets unscrupulous businesses spread false information. (Source: 36Kr)

🧰 Tools

Kimi K2: Kimi released its latest MoE foundation model, Kimi K2, with 1T parameters and 32B activated parameters. The model excels in code, Agent, and mathematical reasoning tasks, achieving SOTA results among open-source models. K2 utilizes the MuonClip optimizer, large-scale Agentic Tool Use data synthesis, and a general reinforcement learning framework, achieving leading positions in benchmarks such as SWE Bench Verified, Tau2, and AceBench. (Source: QbitAI)
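
The gap between 1T total and 32B activated parameters comes from mixture-of-experts routing: each token is sent to only a few experts, so most weights sit idle on any given step. The sketch below shows generic top-k routing under simplifying assumptions; it is not Kimi K2's router. The toy experts, the fixed random `router_logits` (a real router computes them from the token), and `k=2` are all illustrative choices.

```python
import math
import random


def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the weights sum to 1.
    m = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, experts, router_logits, k):
    """Mix the outputs of the chosen experts; the others are never evaluated."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


# Toy experts: each one just scales its input by a different factor.
experts = [lambda x, s=s: s * x for s in range(1, 9)]  # 8 experts

# Assumption: fixed random scores stand in for a learned, input-dependent router.
rng = random.Random(0)
router_logits = [rng.gauss(0.0, 1.0) for _ in experts]

routes = top_k_route(router_logits, k=2)
y = moe_forward(3.0, experts, router_logits, k=2)

# Share of parameters active per token, using the figures quoted above.
active_fraction = 32e9 / 1e12
print(routes, y, f"{active_fraction:.1%}")
```

With the quoted figures, about 3.2% of the parameters take part in processing each token.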

Qwen3-235B-A22B-2507: Alibaba updated the Qwen3-235B model, dropping the hybrid thinking mode in favor of separately trained Instruct and Thinking models, and released the more powerful Qwen3-235B-A22B-Instruct-2507 along with an FP8 version. According to official evaluations, the new version of Qwen3 surpasses Kimi K2 on some metrics. (Source: QbitAI, Reddit r/LocalLLaMA)

📚 Learning

Neural Networks: Zero to Hero: Andrej Karpathy’s deep learning course covers neural network fundamentals, backpropagation, language modeling, MLPs, activation functions, gradients, BatchNorm, WaveNet, GPT, and Tokenizers. Through YouTube video lectures and Jupyter Notebook code examples, it helps learners build and train neural networks from scratch. (Source: GitHub Trending)

GR-3 Technical Report: The report introduces the General Robot Policy GR-3, a large-scale vision-language-action (VLA) model that generalizes to new objects, environments, and instructions involving abstract concepts, and that can be efficiently fine-tuned with a small amount of human trajectory data. GR-3 also excels at long-horizon and dexterous tasks, including those requiring bimanual manipulation and locomotion. (Source: HuggingFace Daily Papers)

Kimi K2 Technical Report: Moonshot AI released the technical report for Kimi K2, detailing the model’s development process, including key technologies such as the MuonClip optimizer, large-scale Agentic Tool Use data synthesis, and the general reinforcement learning framework, as well as specific details of the pre-training and post-training phases. (Source: QbitAI)

💼 Business

Lovable Raises $200 Million Series A, Reaching $1 Billion Valuation: AI app-building platform Lovable secured $200 million in Series A funding just eight months after launch, achieving a $1 billion valuation and becoming a unicorn. (Source: Reddit r/artificial)

Cursor Acquires Enterprise AI Programming Tool Koala: AI programming tool Cursor acquired enterprise AI programming tool Koala, aiming to challenge GitHub Copilot. (Source: Reddit r/artificial)

Perplexity in Talks with Phone Manufacturers to Pre-install Comet AI Browser: Perplexity is in discussions with phone manufacturers to pre-install its Comet AI mobile browser on their devices. (Source: Reddit r/artificial)

🌟 Community

Claude Code Usage Restrictions Tightened, Causing User Dissatisfaction: Anthropic tightened usage restrictions on Claude Code without informing users, leading to complaints about decreased model performance and issues with code quality, context consistency, and UI output. Some users are mitigating this by adopting more structured coding approaches such as TDD and detailed documentation. (Source: Reddit r/artificial, Reddit r/ClaudeAI)

Questioning the Reasoning Abilities of LLMs: Apple’s paper “The Illusion of Thinking” sparked discussion on whether large language models (LLMs) truly possess reasoning abilities. The paper argues that even when provided with the correct algorithm, reasoning models like GPT-4, Claude 3.7, and Gemini completely fail on high-complexity logic tasks. (Source: Reddit r/MachineLearning)

Concerns about AI-Generated Fake Ads: Social media is flooded with AI-generated fake ads, particularly cartoon character ads like “teenagers making millions with AI,” raising concerns and annoyance among users. (Source: Reddit r/artificial)

Discussion on Open-Sourcing AI: Reddit users discussed whether AI models should be open-sourced. Some argue that, like the internet, AI should be open for everyone to use and build upon, fostering human progress. Others believe open-sourcing would introduce new problems, such as intellectual property and data security issues, and would reduce the economic returns for AI developers. (Source: Reddit r/LocalLLaMA)

Polarized Views on AI Companion Apps: A study found that 72% of US teenagers have used AI companion apps. Some believe AI companions can provide emotional support and assistance, while others worry about their potential negative impact on mental health and social skills. (Source: Reddit r/artificial, Reddit r/ChatGPT)

Evaluation of AI Voice Synthesis: With advancements in AI voice synthesis, many YouTube creators are using AI voiceovers, sparking discussion about their impact on video quality and viewer experience. Some find AI voiceovers lacking emotion and personality, while others see them as improving efficiency and reducing costs. (Source: Reddit r/ArtificialInteligence)

Concerns about OpenAI’s Business Model: Companies like OpenAI and Anthropic have yet to profit from LLMs, raising concerns about the sustainability of their business models. Some believe these companies will eventually become profitable as AI technology becomes more widespread and application scenarios expand. Others argue that high computing costs and fierce market competition will make profitability more challenging. (Source: Reddit r/ArtificialInteligence)

💡 Other

Blackbird: An Open-Source OSINT Tool: Blackbird is a powerful open-source OSINT (Open Source Intelligence) tool that can search for usernames and emails across over 600 platforms and offers free AI-driven analysis features. It leverages community-driven projects like WhatsMyName, ensuring a low false-positive rate and high-quality results. Its features include smart filters, PDF/CSV export, and fully automated analysis, all delivered through a CLI. (Source: GitHub Trending)

Trippy: A Network Diagnostic Tool: Trippy is a network diagnostic tool combining traceroute and ping, designed to help analyze network issues. It runs on Linux, BSD, macOS, and Windows, and can be installed from most package managers, pre-built binaries, or source code. (Source: GitHub Trending)

Anki: An Intelligent Spaced Repetition Flashcard Program: Anki is an intelligent spaced repetition flashcard program that helps users learn and memorize information more efficiently. It is open-source on GitHub and has a large user base and contributor community. (Source: GitHub Trending)
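
For readers unfamiliar with spaced repetition, the following is a simplified scheduler in the style of the classic SM-2 algorithm. It is a sketch for illustration, not Anki's actual scheduling code: the `Card` fields, the `review` function, and the 0 to 5 quality scale are assumptions taken from the textbook description of SM-2.

```python
from dataclasses import dataclass


@dataclass
class Card:
    interval_days: float = 0.0  # gap until the next review
    ease: float = 2.5           # multiplier applied to the interval on success
    reps: int = 0               # consecutive successful reviews


def review(card: Card, quality: int) -> Card:
    """Return the card's new state after a review graded 0 (forgot) to 5 (perfect)."""
    if quality < 3:
        # Lapse: start over with a short interval, keep the ease factor.
        return Card(interval_days=1.0, ease=card.ease, reps=0)
    if card.reps == 0:
        interval = 1.0
    elif card.reps == 1:
        interval = 6.0
    else:
        interval = card.interval_days * card.ease
    # Easy answers raise the ease factor, hesitant ones lower it (floor of 1.3).
    gap = 5 - quality
    ease = max(1.3, card.ease + 0.1 - gap * (0.08 + gap * 0.02))
    return Card(interval_days=interval, ease=ease, reps=card.reps + 1)


# Three perfect reviews in a row: the gap between reviews keeps growing.
card = Card()
intervals = []
for quality in (5, 5, 5):
    card = review(card, quality)
    intervals.append(card.interval_days)

# A failed review resets the schedule.
lapsed = review(card, 1)
print(intervals, lapsed)
```

The growing intervals are the core of the technique: material that is recalled easily is shown less and less often, while forgotten material returns to the front of the queue.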
