GLM-4.7-REAP-218B-A32B
218B parameter MoE model with 32B active params, vLLM compatible, REAP methodology for enhanced reasoning.
Top AI News Weekly
Speech AI dominates with Pocket-TTS, VibeVoice-7B, and PersonaPlex-7B. ShowUI-Aloha/π advance GUI agents, Google launches Universal Commerce Protocol, and GLM-Image brings hybrid image generation.
36 launches and research drops that matter for enterprise AI builders—curated, tagged, and ready for your next roadmap sync.
New drops: 36
Unique sources: 31
Key themes: Frontier models · Agents · Infra

New reasoning systems, world models, and alignment papers that move the state of the art.
218B parameter MoE model with 32B active params, vLLM compatible, REAP methodology for enhanced reasoning.
Hybrid autoregressive + diffusion image model with text-to-image, editing, style transfer, and identity-preserving generation.
Open translation model supporting 55 languages; the 12B model outperforms the Gemma 3 27B baseline, with multimodal capabilities.
Training-free prompting strategy to mitigate mode collapse in LLMs, achieving 2-3x diversity improvement.
Small orchestrators managing models and tools for complex reasoning on the Humanity's Last Exam benchmark.
Ultra-compact 80M TTS model with <1GB memory, crystal-clear 32kHz audio, WebUI and OpenAI-compatible endpoint.
66M-parameter multilingual TTS, 167× faster than real-time, optimized for on-device deployment.
Full-duplex speech model with consistent persona, based on Moshi architecture for natural conversations.
Frontier open-source TTS for podcasts and multi-speaker audio with 7.5Hz continuous speech tokenizers.
Lightweight CPU-only TTS that fits in your pocket, pip install and go with 1.6B delayed streams model.
Pre-trained enterprise-grade STT/TTS models with multi-language support via PyTorch hub.
Tiny 52KB audio upsampler, 16kHz→48kHz at 3500x realtime for TTS enhancement.
Top open-source diffusion model with realistic people, rich textures, and accurate text rendering via ComfyUI.

Video, audio, and physics-native generation techniques shaping spatial computing.
4D-aware video world model with unified control over camera and multi-object motion via GeoAdapter.
Visual instruction-based image editor, powerful open-source framework for text-guided editing.
Simple and efficient zero-shot monocular depth estimation with reduced parameters and computational cost.
Robust conditional 3D shape generation from casual captures using Aria glasses pipeline.
Unified rig and motion learning from mesh sequences with Gaussian bones and skinning weights.
Heart-related medical AI visualization and reconstruction research.
Unified scene and human reconstruction in a single feed-forward pass.
AI video generation platform with latest V5 model for high-quality video creation.

Embodied agents learning to act in complex virtual and hybrid worlds.
Human-taught GUI agent that learns workflows from screen recordings with recorder, learner, planner, and actor.
450M flow-based VLA model for continuous GUI actions, generating smooth clicks and drags from screen observations.
Open framework for real-time video AI with Stream's edge network for ultra-low latency.
Research on small language models as the future of agentic AI for specialized applications.
Interface library for RL post-training with environments including Echo, Code Sandbox, and Oumi integration.
Minimal GPU design in Verilog to learn GPU architecture from the ground up, 10k+ stars educational project.

Frameworks, playbooks, and OSS repos that level up AI engineering velocity.
Lightweight CLI for MCP servers with JSON output, agent-optimized for Gemini CLI and Claude Code.
Claude Code plugin for persistent memory across sessions, captures tool usage and injects context.
Open protocol for agentic commerce, co-developed with Shopify, Stripe, Walmart, and 20+ partners.
Comprehensive MLOps + LLMOps tutorial series covering foundations to production deployment.
Battle-tested prompting strategies for RAG-based search engines like Perplexity AI.
Python library for reading and writing PDFs, powered by QPDF with Pythonic interface.
Terminal UI for Instagram in TypeScript/Python, the ultimate weapon against brainrot.
1-click installers with UV for 100x faster installs, Torch 2.9, CUDA 13, FaceID, and IP-Adapter.
Curated list of video production tools, AI generators, teleprompters, and editing software.