Keywords: AI protection, digital art, AI regulation, LightShed, Glaze, Nightshade, clean energy, China’s energy advantage, digital art copyright protection, AI training data removal, US AI regulation policy, Kimi K2 MoE model, Mercury code generation LLM
🔥 Focus
LightShed Tool Weakens AI Protections for Digital Art: A new technique, LightShed, can identify and remove the “poison” that tools like Glaze and Nightshade add to digital artwork, making those works easier for AI models to train on. The finding has heightened artists’ concerns and underscores the ongoing struggle between AI training and copyright protection. The researchers say LightShed’s purpose is not to steal artwork, but to warn against a false sense of security in existing protection tools and to encourage the development of more effective defenses. (Source: MIT Technology Review)

New Era of AI Regulation: US Senate Rejects AI Regulation Moratorium: The US Senate rejected a 10-year moratorium on state-level AI regulation, a move seen as a victory for regulation advocates and a sign of a broader political shift. A growing number of politicians are focusing on the risks of unregulated AI and leaning toward stricter oversight. The vote foreshadows a new political era for AI governance, with more debate and legislation likely to follow. (Source: MIT Technology Review)
China’s Dominant Position in the Energy Sector: China holds a dominant position in next-generation energy technologies, investing heavily in wind, solar, electric vehicles, energy storage, and nuclear power, and has already achieved significant results. Meanwhile, recently passed US legislation has cut credits, grants, and loans for clean energy technologies, which could slow American progress in the sector and further solidify China’s lead. Experts believe the US is relinquishing its leadership in the development of crucial future energy technologies. (Source: MIT Technology Review)

🎯 Trends
Kimi K2: 1 Trillion Parameter Open-Source MoE Model Released: Moonshot AI released Kimi K2, a 1 trillion parameter open-source Mixture of Experts (MoE) model with 32 billion parameters activated. Optimized for code and agent tasks, the model achieved state-of-the-art performance on benchmarks like HLE, GPQA, AIME 2025, and SWE. Kimi K2 offers both base and instruction-fine-tuned versions and supports inference engines like vLLM, SGLang, and KTransformers. (Source: Reddit r/LocalLLaMA, HuggingFace, X)

Mercury: Diffusion-Based Fast Code Generation LLM: Inception Labs introduced Mercury, a commercial-grade diffusion-based LLM for code generation. Mercury predicts tokens in parallel, generating code 10 times faster than autoregressive models, achieving a throughput of 1109 tokens/second on NVIDIA H100 GPUs. It also features dynamic error correction and modification capabilities, effectively improving code accuracy and usability. (Source: 量子位, HuggingFace Daily Papers)
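For intuition, below is a minimal sketch of the kind of parallel, iterative unmasking used by diffusion-style code LLMs, where the most confident positions are committed at each step. It is not Mercury’s actual (proprietary) decoding algorithm; `model` is a hypothetical placeholder that returns per-position logits.

```python
import torch

# Conceptual sketch of parallel, diffusion-style decoding (not Mercury's actual
# implementation). `model` is a hypothetical network that returns logits for
# every position at once; we start from a fully masked sequence and, at each
# step, commit the most confident predictions in parallel.
def parallel_denoise(model, prompt_ids, gen_len=64, steps=8, mask_id=0):
    device = prompt_ids.device
    seq = torch.full((gen_len,), mask_id, dtype=torch.long, device=device)
    for step in range(steps):
        logits = model(torch.cat([prompt_ids, seq]))[-gen_len:]  # (gen_len, vocab)
        probs = logits.softmax(-1)
        conf, pred = probs.max(-1)                               # per-position confidence
        still_masked = seq == mask_id
        if not still_masked.any():
            break
        # Commit the top fraction of still-masked positions this step.
        k = max(1, int(still_masked.sum().item() / (steps - step)))
        idx = torch.topk(conf * still_masked.float(), k).indices
        seq[idx] = pred[idx]
    return seq
```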

vivo Releases On-Device Multimodal Model BlueLM-2.5-3B: vivo AI Lab released BlueLM-2.5-3B, a 3B parameter multimodal model designed for on-device deployment. The model can understand GUI interfaces, supports switching between long and short thinking modes, and introduces a thinking budget control mechanism. It performs well in over 20 evaluation tasks, leading similar-sized models in text and multimodal understanding, and outperforming similar products in GUI understanding. (Source: 量子位, HuggingFace Daily Papers)

Feishu Upgrades Multi-Dimensional Spreadsheet and Knowledge Q&A AI Features: Feishu released upgraded multi-dimensional spreadsheet and knowledge Q&A AI features, significantly improving work efficiency. The spreadsheet supports drag-and-drop creation of project boards, expands capacity to tens of millions of rows, and can integrate with external AI models for data analysis. Feishu’s knowledge Q&A can now integrate all internal company documents, providing more comprehensive information retrieval and Q&A services. (Source: 量子位)

Meta AI Proposes “Mental World Model”: Meta AI released a report proposing the concept of a “mental world model,” placing the inference of human mental states on par with physical world models. The model aims to enable AI to understand human intentions, emotions, and social relationships, thereby improving human-computer interaction and multi-agent interaction. Currently, the model’s success rate in tasks such as goal inference still needs improvement. (Source: 量子位, HuggingFace Daily Papers)

🧰 Tools
Agentic Document Extraction Python Library: LandingAI released the Agentic Document Extraction Python library, which can extract structured data from visually complex documents (such as tables, images, and charts) and return JSON with precise element locations. The library supports long documents, automatic retries, pagination, visual debugging, and other features, simplifying the document data extraction process. (Source: GitHub Trending)
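A minimal usage sketch follows; the import path, function name, and result fields are assumptions inferred from the description above rather than the library’s confirmed API, so consult the project’s README for the real interface.

```python
# Minimal usage sketch. The import path, function name, and result fields below
# are assumptions based on the description above, not the library's confirmed
# API; check the project's documentation for the real interface.
from agentic_doc.parse import parse  # hypothetical import path

results = parse("quarterly_report.pdf")      # hypothetical entry point
doc = results[0]
for chunk in doc.chunks:                     # hypothetical structured output
    print(chunk.chunk_type, chunk.text[:80])
    for grounding in chunk.grounding:        # element location on the page
        print("  page", grounding.page, "box", grounding.box)
```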
📚 Learning
Geometry Forcing: A paper on Geometry Forcing, a method that combines video diffusion models with 3D representations for consistent world modeling. The research found that video diffusion models trained solely on raw video data often fail to capture meaningful geometry-aware structure in their learned representations. Geometry Forcing encourages video diffusion models to internalize underlying 3D representations by aligning the model’s intermediate representations with features from a pretrained geometric foundation model. (Source: HuggingFace Daily Papers)
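As a rough illustration of the alignment idea, here is a sketch assuming a learned projection and a frozen geometry encoder; all module names are placeholders, not the paper’s architecture.

```python
import torch
import torch.nn.functional as F

# Conceptual sketch of the alignment idea behind Geometry Forcing: encourage the
# video diffusion model's intermediate features to match features from a frozen,
# pretrained geometric encoder. All modules here are placeholders.
def geometry_alignment_loss(diffusion_feats, geometry_feats, proj):
    """diffusion_feats: (B, N, D_diff) intermediate activations of the diffusion model
    geometry_feats: (B, N, D_geo) features from a frozen geometry foundation model
    proj: a small learned projection mapping D_diff -> D_geo."""
    pred = proj(diffusion_feats)
    # Negative cosine similarity, averaged over tokens: minimizing it pulls the
    # diffusion features toward the geometry-aware representation.
    return -F.cosine_similarity(pred, geometry_feats.detach(), dim=-1).mean()

# Typical use: total_loss = diffusion_loss + lambda_geo * geometry_alignment_loss(...)
```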
Machine Bullshit: A paper on “machine bullshit,” exploring the disregard for truth exhibited by large language models (LLMs). The research introduces a “bullshit index” to quantify LLMs’ disregard for truth and proposes a taxonomy analyzing four qualitative forms of bullshit: empty rhetoric, hedging, equivocation, and unsubstantiated claims. The study finds that fine-tuning models with Reinforcement Learning from Human Feedback (RLHF) significantly exacerbates bullshit, while chain-of-thought (CoT) prompting at inference time amplifies specific forms of bullshit. (Source: HuggingFace Daily Papers)
LangSplatV2: A paper on LangSplatV2, which achieves fast splatting of high-dimensional features, 42 times faster than LangSplat. LangSplatV2 eliminates the need for a heavyweight decoder by treating each Gaussian as a sparse code in a global dictionary and achieves efficient sparse coefficient splatting through CUDA optimizations. (Source: HuggingFace Daily Papers)
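A toy illustration of the sparse-coding idea, with all sizes chosen arbitrarily and no relation to the paper’s CUDA implementation:

```python
import torch

# Toy illustration of the sparse-coding idea described above (not LangSplatV2's
# CUDA implementation): each Gaussian keeps k indices into a shared global
# dictionary plus k weights, and its high-dimensional language feature is
# reconstructed only when needed.
num_gaussians, dict_size, feat_dim, k = 10_000, 256, 512, 4
dictionary = torch.randn(dict_size, feat_dim)           # global codebook
indices = torch.randint(0, dict_size, (num_gaussians, k))
weights = torch.softmax(torch.randn(num_gaussians, k), dim=-1)

# Reconstruct per-Gaussian features: (N, k, D) weighted sum -> (N, D)
features = (weights.unsqueeze(-1) * dictionary[indices]).sum(dim=1)
print(features.shape)  # torch.Size([10000, 512])
```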
Skip a Layer or Loop it?: A paper on depth adaptation of pretrained LLMs at test time. The research finds that layers of pretrained LLMs can be operated as individual modules to construct better or even shallower models tailored to each test sample. Each layer can be skipped/pruned or repeated multiple times, forming a Chain-of-Layers (CoLa) for each sample. (Source: HuggingFace Daily Papers)
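A minimal sketch of executing such a per-sample chain over a pretrained layer stack (how the chain is chosen for each sample, e.g. by search, is not shown):

```python
import torch.nn as nn

# Sketch of executing a per-sample "Chain-of-Layers": `layers` is the pretrained
# model's layer list, and `chain` is a sequence of layer indices in which a layer
# may be omitted (skipped) or appear several times (looped).
def run_chain_of_layers(layers: nn.ModuleList, hidden, chain):
    for idx in chain:
        hidden = layers[idx](hidden)
    return hidden

# Example: skip layers 1 and 3, repeat layer 2 twice, on a 5-layer stack.
# hidden = run_chain_of_layers(model.layers, hidden, chain=[0, 2, 2, 4])
```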
OST-Bench: A paper on OST-Bench, a benchmark for evaluating the online spatio-temporal scene understanding capabilities of Multimodal Large Language Models (MLLMs). OST-Bench emphasizes the need to process and reason over incrementally acquired observations and requires integrating current visual input with historical memory to support dynamic spatial reasoning. (Source: HuggingFace Daily Papers)
Token Bottleneck: A paper on Token Bottleneck (ToBo), a simple self-supervised learning procedure that compresses a scene into a bottleneck token and uses minimal patches as prompts to predict subsequent scenes. ToBo facilitates the learning of sequential scene representations by conservatively encoding the reference scene into a compact bottleneck token. (Source: HuggingFace Daily Papers)
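A conceptual sketch of the two-part idea, using generic placeholder encoder/decoder modules rather than the paper’s architecture:

```python
import torch
import torch.nn as nn

# Conceptual sketch of the Token Bottleneck idea: squeeze the reference scene
# into one bottleneck token, then predict the next scene from that token plus a
# handful of prompt patches. Encoder/decoder here are generic placeholders.
class ToBoSketch(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True), num_layers=2)
        self.bottleneck_query = nn.Parameter(torch.randn(1, 1, dim))
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(dim, nhead=4, batch_first=True), num_layers=2)

    def forward(self, ref_patches, prompt_patches):
        # ref_patches: (B, N, D) patches of the reference scene
        # prompt_patches: (B, M, D) a few patches from the subsequent scene
        B = ref_patches.size(0)
        tokens = torch.cat([self.bottleneck_query.expand(B, -1, -1), ref_patches], dim=1)
        bottleneck = self.encoder(tokens)[:, :1]   # (B, 1, D) compressed scene
        # Predict subsequent-scene patch features from prompts, attending only to the bottleneck.
        return self.decoder(prompt_patches, bottleneck)
```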
SciMaster: A paper on SciMaster, an infrastructure designed to be a general-purpose scientific AI agent. Its capabilities are validated by achieving leading performance on “Humanity’s Last Exam” (HLE). SciMaster introduces X-Master, a tool-augmented reasoning agent designed to mimic human researchers by flexibly interacting with external tools during its reasoning process. (Source: HuggingFace Daily Papers)
Multi-Granular Spatio-Temporal Token Merging: A paper on multi-granular spatio-temporal token merging for training-free acceleration of video LLMs. The method leverages the local spatial and temporal redundancy in video data, first converting each frame into multi-granular spatial tokens using a coarse-to-fine search, and then performing directed pairwise merging across the temporal dimension. (Source: HuggingFace Daily Papers)
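A toy sketch of the temporal merging step under an assumed cosine-similarity threshold (the coarse-to-fine spatial stage and the paper’s exact merging rule are omitted):

```python
import torch
import torch.nn.functional as F

# Toy sketch of the temporal merging step described above: for two consecutive
# frames' tokens, merge each token in frame t+1 into its most similar token in
# frame t when cosine similarity exceeds a threshold.
def merge_across_time(tokens_t, tokens_next, threshold=0.9):
    sim = F.cosine_similarity(tokens_next.unsqueeze(1), tokens_t.unsqueeze(0), dim=-1)
    best_sim, best_idx = sim.max(dim=1)          # best frame-t match per next-frame token
    keep = best_sim < threshold                  # tokens distinct enough to stay separate
    merged = tokens_t.clone()
    for j in torch.nonzero(~keep).flatten():     # average redundant tokens into frame t
        i = best_idx[j]
        merged[i] = (merged[i] + tokens_next[j]) / 2
    return torch.cat([merged, tokens_next[keep]], dim=0)
```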
T-LoRA: A paper on T-LoRA, a timestep-dependent low-rank adaptation framework specifically designed for personalization of diffusion models. T-LoRA incorporates two key innovations: 1) a dynamic fine-tuning strategy that adjusts the rank constraint updates based on the diffusion timestep; and 2) a weight parameterization technique that ensures independence between adapter components through orthogonal initialization. (Source: HuggingFace Daily Papers)
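A simplified sketch of the timestep-dependent rank idea, with an assumed linear schedule rather than the paper’s exact strategy:

```python
import torch
import torch.nn as nn

# Sketch of the timestep-dependent LoRA idea: use fewer active rank components at
# high-noise diffusion timesteps and more at low-noise ones. The schedule and the
# orthogonal parameterization are simplified placeholders, not the paper's method.
class TimestepLoRA(nn.Module):
    def __init__(self, in_dim, out_dim, rank=16, max_t=1000):
        super().__init__()
        self.A = nn.Parameter(torch.empty(rank, in_dim))
        self.B = nn.Parameter(torch.zeros(out_dim, rank))
        nn.init.orthogonal_(self.A)    # orthogonal init, echoing the paper's motivation
        self.rank, self.max_t = rank, max_t

    def forward(self, x, t):
        # Higher timestep (more noise) -> fewer active rank components.
        active = max(1, int(self.rank * (1 - t / self.max_t)))
        mask = torch.zeros(self.rank, device=x.device)
        mask[:active] = 1.0
        return x @ (self.A * mask.unsqueeze(1)).T @ self.B.T
```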
Beyond the Linear Separability Ceiling: A paper on going beyond the linear separability ceiling. The research finds that most state-of-the-art Vision-Language Models (VLMs) appear to be limited by the linear separability of their visual embeddings on abstract reasoning tasks. This work investigates this “linear reasoning bottleneck” by introducing the Linear Separability Ceiling (LSC), the performance of a simple linear classifier on the VLM’s visual embeddings. (Source: HuggingFace Daily Papers)
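A minimal sketch of measuring such a ceiling with a linear probe, using random placeholder embeddings and labels in place of real VLM features and task annotations:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Sketch of measuring a "linear separability ceiling": the accuracy of a plain
# linear classifier on frozen visual embeddings. The embeddings and labels here
# are random placeholders; in practice they would come from the VLM's vision
# encoder and the reasoning task's ground truth.
embeddings = np.random.randn(2000, 768)        # placeholder VLM visual embeddings
labels = np.random.randint(0, 2, size=2000)    # placeholder task labels

X_tr, X_te, y_tr, y_te = train_test_split(embeddings, labels, test_size=0.2, random_state=0)
lsc = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).score(X_te, y_te)
print(f"Linear separability ceiling (probe accuracy): {lsc:.3f}")
```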
Growing Transformers: A paper on Growing Transformers, exploring a constructive approach to model building that builds upon non-trainable, deterministic input embeddings. The research shows that this fixed representational substrate acts as a universal “docking port,” enabling two powerful and efficient scaling paradigms: seamless modular composition and progressive hierarchical growth. (Source: HuggingFace Daily Papers)
Emergent Semantics Beyond Token Embeddings: A paper on emergent semantics beyond token embeddings. The research constructs Transformer models with completely frozen embedding layers, whose vectors are derived not from data, but from the visual structure of Unicode glyphs. The results suggest that high-level semantics are not inherent to input embeddings but are emergent properties of the Transformer’s compositional architecture and data scale. (Source: HuggingFace Daily Papers)
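A toy sketch of deriving a frozen embedding table from rendered glyph bitmaps; the font, resolution, and normalization are arbitrary choices, not the paper’s recipe:

```python
import numpy as np
from PIL import Image, ImageDraw, ImageFont

# Toy sketch of deriving frozen token embeddings from the visual structure of
# Unicode glyphs: render each character to a small bitmap and flatten it.
def glyph_embedding(char, size=16):
    img = Image.new("L", (size, size), color=0)
    draw = ImageDraw.Draw(img)
    draw.text((0, 0), char, fill=255, font=ImageFont.load_default())
    vec = np.asarray(img, dtype=np.float32).flatten()
    return vec / (np.linalg.norm(vec) + 1e-8)

vocab = [chr(c) for c in range(32, 127)]                            # printable ASCII subset
embedding_table = np.stack([glyph_embedding(ch) for ch in vocab])   # frozen, shape (95, 256)
print(embedding_table.shape)
```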
Re-Bottleneck: A paper on Re-Bottleneck, a post-hoc framework for modifying the bottleneck of pretrained autoencoders. The method introduces a “Re-Bottleneck,” an internal bottleneck trained solely through latent space loss, to instill user-defined structure. (Source: HuggingFace Daily Papers)
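A conceptual sketch of inserting and training such an inner bottleneck purely with latent-space losses, with placeholder dimensions and a user-supplied structure penalty:

```python
import torch
import torch.nn as nn

# Conceptual sketch of a post-hoc "re-bottleneck": the pretrained autoencoder is
# frozen, and a small inner bottleneck is trained purely with losses in latent
# space (reconstruction of the original latent plus a user-defined structure
# penalty). All dimensions and the structure term are placeholders.
class ReBottleneck(nn.Module):
    def __init__(self, latent_dim=128, inner_dim=16):
        super().__init__()
        self.down = nn.Linear(latent_dim, inner_dim)
        self.up = nn.Linear(inner_dim, latent_dim)

    def forward(self, z):
        inner = self.down(z)
        return self.up(inner), inner

def rebottleneck_loss(z, z_hat, inner, structure_penalty):
    # Latent reconstruction keeps the frozen decoder usable; the penalty instills
    # whatever structure the user wants in the inner code.
    return nn.functional.mse_loss(z_hat, z) + structure_penalty(inner)
```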
Stanford CS336: Language Modeling from Scratch: Stanford University has released the latest lectures from its CS336 course, “Language Modeling from Scratch,” online. (Source: X)
💼 Business
2025 H1 Education Industry Investment and Financing Analysis: In the first half of 2025, investment and financing in the education industry remained active, with the deep integration of AI and education emerging as the dominant trend. More than 25 domestic financing events were recorded, totaling 1.2 billion yuan, with angel-round projects accounting for over 72%. AI + education, children’s education, and vocational education were among the most popular segments. The overseas market was active at both ends: mature platforms such as Grammarly raised large rounds, while early-stage projects such as Polymath secured seed investment. (Source: 36氪)

Varda Secures $187 Million in Funding for Space Pharmaceuticals: Varda secured $187 million in funding to manufacture pharmaceuticals in space. This marks the rapid development of the space pharmaceutical field and opens up new possibilities for future drug research and development. (Source: X)
Math AI Startup Raises $100 Million: A startup focusing on math AI has raised $100 million, indicating investor confidence in the potential of AI applications in the field of mathematics. (Source: X)
🌟 Community
Grok 4 Consults Elon Musk’s Views Before Answering Questions: Multiple users have discovered that when answering some controversial questions, Grok 4 prioritizes searching for Elon Musk’s views on Twitter and the web, using these views as the basis for its answers. This has raised questions about Grok 4’s ability to “maximize truth-seeking” and concerns about political bias in AI models. (Source: X, Reddit r/artificial, Reddit r/ChatGPT)

Impact of AI Coding Tools on Developer Efficiency: A study suggests that while developers believe AI coding tools can improve efficiency, developers using AI tools actually complete tasks 19% slower than those who don’t. This has sparked discussions about the actual utility of AI coding tools and developers’ cognitive biases towards them. (Source: X, Reddit r/ClaudeAI)

The Future of Open-Source vs. Closed-Source AI Models: With the release of large open-source models like Kimi K2, the community has engaged in heated discussions about the future development of open-source and closed-source AI models. Some believe that open-source models will drive rapid innovation in the AI field, while others are concerned about the security, reliability, and controllability of open-source models. (Source: X, Reddit r/LocalLLaMA)

LLM Performance Discrepancies Across Different Tasks: Some users have found that although Grok 4 performs well in certain benchmarks, in practical applications, especially complex reasoning tasks like SQL generation, its performance is not as good as some models from Gemini and OpenAI. This has triggered discussions about the validity of benchmarks and the generalization ability of LLMs. (Source: Reddit r/ArtificialInteligence)
Claude’s Excellent Performance in Coding Tasks: Many developers believe that Claude outperforms other AI models in coding tasks, especially in terms of code generation speed, accuracy, and usability. Some developers even claim that Claude has become their primary coding tool, significantly improving their productivity. (Source: Reddit r/ClaudeAI)
Discussion on LLM Scaling and RL: Research from xAI suggests that simply increasing the computational resources for RL does not significantly improve model performance, leading to discussions on how to effectively scale LLMs and RL. Some believe that pretraining is more important than RL, while others argue that new RL methods need to be explored. (Source: X)
💡 Other
Manus AI Lays Off Staff and Relocates to Singapore: The parent company of the AI agent product Manus has laid off 70% of its domestic team and relocated its core technical staff to Singapore. This move is believed to be related to restrictions imposed by the US “Outbound Investment Security Program,” which prohibits US capital investment in projects that could enhance China’s AI technology. (Source: 36氪, 量子位)

Meta Internally Using Claude Sonnet for Code Writing: It is reported that Meta has internally replaced Llama with Claude Sonnet for code writing, suggesting that Llama’s performance in code generation may be inferior to Claude’s. (Source: 量子位)
2025 World Artificial Intelligence Conference to Open on July 26: The 2025 World Artificial Intelligence Conference will be held in Shanghai from July 26 to 28 under the theme “Intelligent Era: Global Harmony.” The conference emphasizes internationalization, a high-end profile, youth participation, and professionalism, and features five sections: forums, exhibitions, competitions and awards, application experiences, and innovation incubation, comprehensively showcasing the latest practices in frontier AI technology, industry trends, and global governance. (Source: 量子位)