Metastax Feed - Curated Data Engineering, ML/AI & Analytics Articles

Why Powerful ML Is Deceptively Easy — Part 2

This article, part two of a series, explores different forms of data leakage beyond temporal leakage, including spatial, structural, and coverage-related issues. It discusses how these subtle leakages can deceptively inflate model performance.

towardsdatascience.com ml

3h
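One common guard against the spatial and structural leakage this summary mentions is to split by group rather than by row, so that records sharing a location or entity never sit on both sides of the train/test boundary. The sketch below is illustrative only and is not taken from the article; the `region` field, the sample records, and the `group_split` helper are all hypothetical.

```python
import random
from collections import defaultdict


def group_split(records, group_key, test_fraction=0.25, seed=0):
    """Split records so every group lands entirely in train or entirely in test."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_key]].append(rec)

    # Shuffle group ids (not rows) so the split is decided per group.
    group_ids = sorted(groups)
    random.Random(seed).shuffle(group_ids)

    n_test = max(1, round(len(group_ids) * test_fraction))
    test_ids = set(group_ids[:n_test])

    train = [r for g in group_ids if g not in test_ids for r in groups[g]]
    test = [r for g in group_ids if g in test_ids for r in groups[g]]
    return train, test


# Hypothetical data: 8 regions with 5 readings each.
records = [{"region": f"r{i % 8}", "value": i} for i in range(40)]
train, test = group_split(records, "region")

train_regions = {r["region"] for r in train}
test_regions = {r["region"] for r in test}
```

A plain row-level shuffle would put readings from the same region on both sides of the split, which is the kind of overlap that can inflate a score without any timestamp being involved.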

Meta’s AI Storage Blueprint at Scale

This article from Meta Engineering outlines the architectural blueprint for their AI storage systems, designed to handle the exponential growth of model capabilities and training dataset sizes. It details the importance of reliable and fast storage access for rapid AI development and computation.

engineering.fb.com architecture

3h

How OpenAI Delivers Low-Latency Voice AI for 900M Users

This article details the architectural journey and engineering challenges faced by OpenAI in delivering low-latency voice AI services to 900 million users. It explores the system design choices and optimizations necessary to achieve performance at such a massive scale.

blog.bytebytego.com architecture

4h

Persistent Latent Memory for Multi-Hop LLM Agents: How a 6G Handover Paper Closes the Agent Cold-Start

This article introduces Inductive Latent Context Persistence (ILCP) as a method to optimize multi-hop LLM agent pipelines. It explains how ILCP reduces the cost of hand-offs by transferring a compressed hidden state, allowing downstream agents to avoid re-creating context and mitigating the agent co

towardsdatascience.com agents

4h

1BRC on a Threadripper 9980X

This article details benchmarks of the One Billion Row Challenge (1BRC) conducted on a Threadripper 9980X processor. It compares these new results to the original benchmarks run on an EPYC 7502P, focusing on the performance of processing large datasets.

jack-vanlightly.com data-engineering

5h

chDB-WASM: complete ClickHouse OLAP engine, compiled to WebAssembly

The article announces chDB-WASM, a project that compiles the entire ClickHouse OLAP engine to WebAssembly. This development enables browser-native SQL and embedded analytics applications. It represents a significant technical achievement in making powerful analytical databases accessible in edge env

twitter.com clickhouse

5h

Show HN: SyntheticRows – expand small datasets, with an honest quality score

This 'Show HN' presents SyntheticRows, a project designed to expand small datasets by generating synthetic data. The tool includes a feature for providing an 'honest quality score' alongside the expanded datasets.

syntheticrows.com data-quality

6h

What Can We Do When Memory Becomes the New Bottleneck in Data Engineering?

This article explores strategies for managing memory bottlenecks in data engineering workflows, particularly when scaling compute resources is not feasible. It details how techniques such as Pandas chunking and the use of Dask and Polars can help process millions of records effectively.

towardsdatascience.com data-engineering

6h
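The chunking idea in this summary can be sketched without any third-party library: read a bounded number of rows, reduce them, and discard them before reading more. The example below uses the standard `csv` module as a stand-in for Pandas' `read_csv(..., chunksize=...)`; the `iter_chunks` and `chunked_sum` helpers and the sample data are hypothetical and not from the article.

```python
import csv
import io
from itertools import islice


def iter_chunks(reader, chunk_size):
    """Yield lists of at most chunk_size rows from a row iterator."""
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


def chunked_sum(csv_file, column, chunk_size=1000):
    """Sum one numeric column while holding only one chunk in memory."""
    reader = csv.DictReader(csv_file)
    total = 0.0
    for chunk in iter_chunks(reader, chunk_size):
        total += sum(float(row[column]) for row in chunk)
    return total


# Hypothetical in-memory CSV standing in for a large file on disk.
data = io.StringIO("id,amount\n" + "\n".join(f"{i},{i * 0.5}" for i in range(10)))
total = chunked_sum(data, "amount", chunk_size=3)
```

Peak memory is bounded by `chunk_size` rather than by file size, at the cost of restricting the computation to reductions that can be combined across chunks.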

Where AI Agents Belong in Data Engineering: The Correctness Layer

The article explores the strategic placement of AI agents within data engineering pipelines, proposing their role in a dedicated "correctness layer." It discusses how these agents can enhance data quality and reliability. The post outlines conceptual frameworks for integrating agentic architectures

altimate.ai agents

6h

Designing an MCP Server for Unstructured Data

The article details the architectural considerations and design principles for building a Model Context Protocol (MCP) server to manage unstructured data. It explores challenges specific to unstructured data and proposes solutions for agent communication and data handling. The post provi

mkikta.com agents

7h

DA-Studio: An Agentic System for End-to-End Data Analysis

This paper proposes DA-Studio, an agentic system aimed at automating multi-step data analysis workflows from heterogeneous inputs. The system focuses on autonomously organizing tasks and executing generated code within a controlled environment. Its design addresses the complexity of real-world data

arxiv.org agents

15h

Clean Me If You Can: A Large Collection of Real-World Addresses for Data Cleaning Benchmarking

This research introduces a substantial dataset of real-world addresses intended for benchmarking data cleaning processes. It aims to overcome limitations of existing controlled-environment benchmarks by providing a more representative collection for evaluating error detection and correction methods

arxiv.org data-quality

15h

Test-Time Verification for Text-to-SQL via Outcome Reward Models

This research focuses on improving the reliability of large language models for structured reasoning tasks like Text-to-SQL at inference time. It proposes a new approach called Outcome Reward Models for test-time verification, departing from traditional methods like Best-of-N sampling or Majority Vo

arxiv.org llm

15h

Large Databases Need Small, Open-Weight Language Models

This paper highlights the prohibitive costs associated with using proprietary large language model APIs for operations on massive databases. It contends that LM-enhanced relational operators can incur significant expenses, potentially exceeding $10,000 for a single query. The authors advocate for th

arxiv.org llm

15h

Knowledge Graphs as the Missing Data Layer for LLM-Based Industrial Asset Operations

This paper investigates the accuracy challenges faced by LLM-based agents in industrial asset operations when reasoning over flat document stores. It references AssetOpsBench, which shows GPT-4 agents achieving only 65% accuracy in certain scenarios. The authors argue for integrating knowledge graph

arxiv.org knowledge-graphs

15h

Explaining Rankings with Hidden Group Bonuses

This paper tackles the fundamental challenge of identifying linear utility functions that align with observed candidate rankings in various applications. The research has implications for fields such as admissions, hiring processes, and recommendation systems. It builds upon previous work concerning

arxiv.org ml

15h

How we scale PgBouncer in ClickHouse Managed Postgres

The article details how ClickHouse Managed Postgres scales PgBouncer beyond its single-threaded limitation. It describes running a peered fleet of PgBouncer processes utilizing so_reuseport to distribute connection pooling across multiple CPU cores. Benchmarks demonstrate the effectiveness of this a

clickhouse.com clickhouse

19h
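The socket option behind this approach can be shown in isolation. With `SO_REUSEPORT` set, several listeners can bind the same address and port, and on Linux the kernel spreads incoming connections across them. This is a minimal sketch of the option itself, assuming a Linux or BSD host; it says nothing about how PgBouncer or ClickHouse configure it, and the `reuseport_listener` helper is hypothetical.

```python
import socket


def reuseport_listener(host, port):
    """Create a TCP listener that shares its port via SO_REUSEPORT."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Must be set before bind() on every socket that shares the port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen()
    return sock


# First listener picks a free port; the second binds to the same one.
first = reuseport_listener("127.0.0.1", 0)
port = first.getsockname()[1]
second = reuseport_listener("127.0.0.1", port)

same_port = second.getsockname()[1] == port
first.close()
second.close()
```

In a multi-process setup each process would own one such listener, which is what lets a single-threaded server use more than one core.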

Multi-token Residual Prediction

This article explores multi-token residual prediction, a specialized topic within machine learning. It likely delves into the mathematical or algorithmic details of this prediction method, potentially for optimizing model performance or efficiency.

modal.com ml

19h

Why Agent Loops Fail in Production (and the Database Patterns That Fix Them)

The article investigates why AI agent loops frequently fail in production environments, attributing these failures primarily to issues with managing agent state across iterations rather than model performance. It then proposes specific database patterns designed to address these state management cha

cockroachlabs.com agents

19h

New SQL features from the latest standards meeting

A Postgres contributor and SQL standards committee member reports on the outcomes of the latest SQL standardization meeting. The article details new SQL standard features, including QUALIFY and INSERT ... BY NAME, and discusses their specific implications for the Postgres database.

postgresweekly.com postgres

19h
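In engines that already offer it, such as DuckDB and Snowflake, `QUALIFY` filters rows on the result of a window function, much as `HAVING` filters on aggregates. SQLite does not support it, so the runnable sketch below shows the subquery form it replaces, with the `QUALIFY` spelling in a comment. The table and data are hypothetical, the standard's final syntax may differ from the comment, and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 10), ("ann", 30), ("bob", 20), ("bob", 5)],
)

# With QUALIFY (not supported by SQLite) this could be written as:
#   SELECT customer, amount FROM orders
#   QUALIFY ROW_NUMBER() OVER (PARTITION BY customer ORDER BY amount DESC) = 1
# Without it, the window function has to be wrapped in a subquery.
rows = conn.execute(
    """
    SELECT customer, amount
    FROM (
        SELECT customer, amount,
               ROW_NUMBER() OVER (PARTITION BY customer ORDER BY amount DESC) AS rn
        FROM orders
    )
    WHERE rn = 1
    ORDER BY customer
    """
).fetchall()
conn.close()
```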

How Artemis Security runs 69x faster detection queries with ClickHouse Cloud

Artemis Security made detection queries 69x faster and investigative lookups up to 60x faster by optimizing its ClickHouse deployment. The article highlights the use of ClickHouse query coalescing, materialized extraction, and AI-powered debugging with Claude for these performance ga

clickhouse.com clickhouse

19h

ScarfBench: Benchmarking AI Agents for Enterprise Java Framework Migration

This article introduces ScarfBench, a benchmark specifically designed to evaluate AI agents tasked with migrating enterprise Java frameworks. It details the methodology for assessing agent performance in this complex domain.

huggingface.co ml

1d

From monolith to Lakebase to LTAP: rethinking the database from storage up

This article discusses the evolution of database architectures, moving from traditional monoliths to the Lakebase concept and ultimately to LTAP. It explores a re-evaluation of database design starting from the underlying storage layer, providing insights into future database paradigms.

databricks.com databricks

1d

Have your agent record video demos of its work with shot-scraper video

This article describes how to enable AI agents to record video demonstrations of their operations and output using the `shot-scraper video` tool, offering a method for evaluating or showcasing agent performance.

simonwillison.net agents

1d

Context Engineering for RAG: The Four Typed Inputs Behind Every RAG Answer

The article details the concept of context engineering within RAG pipelines, identifying four distinct typed inputs that feed into an LLM call to generate a RAG answer. It also mentions follow-up work on corpus, conversation, and tool extensions.

towardsdatascience.com llm

1d

Inside Thinking Machines’ Interaction Models

This article examines a research preview, specifically focusing on the concept of an interaction model as proposed by Thinking Machines. It delves into the details of what this model entails and its implications for system design.

blog.bytebytego.com architecture

1d

Benchmarking Hardwood 1.0 on a Threadripper 9980X

This article presents a detailed benchmark of Hardwood 1.0, a Java library for reading Parquet files, executed on a Threadripper 9980X. It compares Hardwood's row and columnar reader APIs against initial benchmarks published by the author, Gunnar Morling, in the v1.0 announcement.

jack-vanlightly.com arrow

1d

Stop Choosing Between Local and Cloud LLMs: A Field Guide to Hybrid Patterns

The article presents a field guide to hybrid local-cloud LLM patterns, offering a hands-on walkthrough of a workflow combining Gemma 4 and GPT-5.4. It covers aspects like reasoning and structured outputs within this hybrid approach.

towardsdatascience.com llm

1d

Agentic Coding on Supabase with OpenCode

OpenCode integrates with Supabase, allowing an agent to connect to databases, Edge Functions, and logs. The article explains how MCP setup is configured automatically for this integration.

supabase.com postgres

1d

Database Context Compression for Text-to-SQL on Real-World Large Databases

This research explores a new method for database context compression, specifically designed to improve Text-to-SQL performance on large, complex enterprise databases. The paper argues that current Text-to-SQL models struggle with real-world benchmarks like Spider 2.0 and BIRD, proposing a solution t

arxiv.org semantic-layer

1d

Algebraic Subgraph Counting

This paper investigates algebraic subgraph counting, a core problem within graph analytics that involves determining the number of subgraph isomorphisms for a query graph within a larger data graph. It builds upon the candidate tree-based framework, offering insights into its application for efficie

arxiv.org knowledge-graphs

1d

CADENZA: Compiling Natural-Language Intent into Task-Specific Operator DAGs for Semantic Query Processing

This research presents CADENZA, a system designed to transform natural language intent into task-specific operator Directed Acyclic Graphs (DAGs) for semantic query processing engines. It details how these engines extend relational query processing with semantic operators executed via model inferenc

arxiv.org semantic-layer

1d

Enterprise Data Modelling Methodologies: A Comparative Analysis of Inmon, Kimball, and Data Vault

This paper offers a comprehensive comparative analysis of established enterprise data modeling methodologies, including Inmon, Kimball, and Data Vault. It discusses their design principles and long-term impacts on analytical capabilities, operational agility, and regulatory compliance for data-drive

arxiv.org data-engineering

1d

SemJoin: Semantic Join Optimization

This research introduces SemJoin, a method for optimizing semantic joins to integrate unstructured data into relational database systems. It focuses on evaluating joins under natural-language predicates, leveraging large language models to enhance natural language querying and analysis capabilities.

arxiv.org semantic-layer

1d

Mandol: An Agglomerative Agent Memory System for Long-Term Conversations

This paper describes Mandol, an agglomerative agent memory system developed to support long-term conversations in AI agents. It tackles the complexities of remembering and querying cross-session, multi-typed information with intricate correlations, contrasting with existing systems that often rely o

arxiv.org agents

1d

Experience Graphs: The Data Foundation for Self-Improving Agents

This research argues for a new class of database system architectures to support emerging long-horizon agentic tasks such as code generation and scientific discovery. It proposes Experience Graphs as a foundational data structure for building self-improving agents, recognizing the need for systems t

arxiv.org agents

1d

CLIP: Lightweight Cosine-Law-Based Inverted-List Pruning for IVF-Based Vector Search

This paper introduces CLIP, a lightweight pruning method for inverted file (IVF)-based vector search, a key component in modern multimodal retrieval systems. It leverages cosine-law principles to enhance the scalability, update efficiency, and hardware friendliness of these widely adopted vector sea

arxiv.org vector-db

1d

MaDI-Bench: An End-to-End Data Integration Benchmark

This paper presents MaDI-Bench, an end-to-end benchmark designed for evaluating comprehensive data integration processes. It covers a sequence of interdependent tasks including schema matching, value normalization, entity blocking, entity matching, and data fusion, aiming to provide a coherent repre

arxiv.org data-engineering

1d

Latent Bridges for Multi-Table Question Answering

The article presents GRAB, a constructor-encoder-bridge pipeline for multi-table question answering. This method converts relational data into a heterogeneous graph, encodes it using message passing, and then transfers these signals to a Large Language Model via a set of query-contextualized latent

arxiv.org llm

1d

Statistically Indistinguishable, Operationally Distinct: A Formal Barrier for Tabular Foundation Models

The article posits that tabular foundation models cannot effectively reason about data produced by running systems without explicit access to the rules governing those systems. It formalizes this limitation and introduces the Operational Turing Test, which constructs pairs of legal and rule-violatin

arxiv.org llm

1d

SAKE: Software Architectural Knowledge Evaluation Benchmark for Large Language Models

The article introduces SAKE, a Software Architectural Knowledge Evaluation benchmark, to measure Large Language Models' reasoning abilities in software architecture. It highlights that while LLMs assist in software development, their competence in architectural decision-making, which relies on under

arxiv.org llm

1d

How Far Do On-Prem Open LLMs Get on Text-to-SQL? A Cross-Family Size x Technique Frontier on BIRD

The article investigates the performance of on-premises, open-weight Large Language Models for Text-to-SQL tasks. It offers a fully reproducible benchmark on the BIRD dataset, assessing how well these models perform and which popular accuracy-improving techniques provide worthwhile compute returns,

arxiv.org llm

1d

Elastic Scheduling of Intermittent Query Processing in a Cluster Environment

The article proposes an elastic scheduling mechanism for intermittent query processing in a cluster environment. It addresses applications that process tuple streams over a window, requiring results by a deadline, by processing tuples in batches rather than continuously to balance resource usage and

arxiv.org streaming

1d

How Redpanda Cloud Topics rethinks Kafka compaction

This article details how Redpanda's Cloud Topics architecture rethinks Kafka compaction to overcome common issues like disk saturation and CPU overload in traditional Kafka clusters. It explains the redesign's approach to reducing redundant work, lowering cloud storage expenses, and preserving Kafka

redpanda.com streaming

1d

Core dump epidemiology: fixing an 18-year-old bug

OpenAI engineers conducted large-scale core dump analysis to diagnose and resolve rare infrastructure crashes. This investigation led to the discovery and rectification of both a hardware fault and a software bug that had persisted for 18 years.

openai.com engineering

1d

Inside Genebench-Pro

This article details the architecture and methodology behind Genebench-Pro, a new benchmark designed to evaluate advanced AI agent capabilities in complex reasoning tasks. It covers the system's components, evaluation metrics, and initial findings regarding agent performance.

openai.com llm

1d

How Jua delivers the world’s most accurate physics simulations 3x faster with ClickHouse Cloud

Jua replaced a file-based forecast pipeline with ClickHouse Cloud, reducing data delivery time from one hour to 20 minutes and historical query times from hours to seconds. This improved speed provides an advantage for energy traders.

clickhouse.com clickhouse

1d

A Quadrillion Rows across three Clouds: scaling LogHouse

The ClickHouse team scaled their internal logging platform, LogHouse, from 19 PiB to 431 PiB and 1.59 quadrillion rows across three cloud providers. The article details how they rearchitected the system to manage 80 GiB/s of writes while maintaining fast queries and minimizing underlying complexity.

clickhouse.com clickhouse

1d

Ray Data 2.56: Improving Reliability for AI Data Pipelines

This article details the enhancements introduced in Ray Data version 2.56, with a specific focus on improving reliability for AI data pipelines. It covers features designed to create more robust and fault-tolerant data processing workflows for machine learning applications.

anyscale.com ml

1d

Cost Attribution in Discord’s API

Discord's API spans over 1700 endpoints across hundreds of Kubernetes deployments. The article describes the challenge of accurately tracking per-feature hosting costs without requiring extensive system restructuring. Jim Benton explains the methodologies and approaches Discord adopted to address th

discord.com architecture

1d

HTML table extractor

This article explores methods and considerations for extracting structured data from HTML tables using AI agents. It likely details challenges in parsing varied HTML structures and how agents can be configured or prompted to accurately identify and extract tabular information.

simonwillison.net llm

1d

Why the Data Platform Determines Legal AI Outcomes

This article argues that a data-platform-centric approach, rather than solely focusing on smarter models, is crucial for developing legal AI that is governed, context-aware, and contributes to institutional intelligence. It highlights the importance of the data platform in shaping AI outcomes.

snowflake.com snowflake

2d

Count the number of Safari tabs

This article describes a specific application of AI agents to interact with a user's operating system, detailing how an agent can be configured to count the number of open tabs in the Safari browser. It likely delves into the mechanics of agent-system integration and tool invocation.

simonwillison.net llm

2d

DiScoFormer: One transformer for density and score, across distributions

This article introduces DiScoFormer, a novel transformer architecture designed to jointly model density and score functions across various data distributions. It details the design and capabilities of this unified model.

huggingface.co ml

2d

Ornith-1.0: Self-Scaffolding LLMs for Agentic Coding

The article introduces Ornith-1.0, a framework that enables Large Language Models to self-scaffold for agentic coding tasks. It details how LLMs can generate and refine their own execution plans and tools to improve coding performance and reliability.

simonwillison.net llm

2d

How AI Agents Manage Memory and Avoid Forgetfulness

This article explains the architectural patterns behind how AI agents manage memory and prevent forgetfulness. It explores the foundational constraints driving these designs, examines the resulting system architectures, and discusses the associated tradeoffs.

blog.bytebytego.com llm

2d

Prompt Engineering Fails Quietly — Prompt Regression Is Why

This article highlights how minor prompt changes can silently disrupt critical LLM behavior in production, introducing a practical framework designed to detect these hidden prompt regressions before they impact users.

towardsdatascience.com llm

2d

GenPage: Towards End-to-End Generative Homepage Construction at Netflix

The article explores GenPage, a system developed at Netflix for end-to-end generative homepage construction. It likely delves into the architecture and technical considerations behind building such a system for a large-scale platform.

netflixtechblog.com mlops

2d

How to Choose Between Small and Frontier Models

The article explores the rising trend of small language models and provides guidance on how to choose between these smaller models and larger frontier models for various applications.

towardsdatascience.com llm

2d

Benchmarks and Obscurantism: A “red” line that should not be crossed

This article argues that benchmarks are only meaningful if they are inspectable, challengeable, and reproducible. It details the findings from testing Databricks' claims regarding ClickHouse performance and highlights discrepancies in methodology.

clickhouse.com clickhouse

2d

Agents hate friction: early thoughts on building for agents

This article explores the paradigm shift required in software and hardware design when the primary user becomes an LLM rather than a human. It discusses early considerations for building experiences optimized for AI agents.

clickhouse.com agents

2d

PostgreSQL-Compatible Databases for AI at Scale: What to Evaluate from Day One

This article discusses key architectural considerations for choosing PostgreSQL-compatible databases to support AI at scale. It covers various evaluation criteria that impact long-term system viability and operational costs for projects needing robust database foundations.

cockroachlabs.com postgres

2d

Search Is How Agents See the World

The article explains how AI agents rely on search to understand the world before they act. It details how Materialize helps keep computed search documents and vector embeddings current and synchronized with changes in underlying source systems.

materialize.com agents

2d

Tail Control: The Counterintuitive Engineering of Reliable Agentic Workflows

The article explores the engineering challenges of building reliable agentic workflows for customer-facing APIs. It highlights that consistent delivery is a problem of variance, not just speed, and proposes counterintuitive solutions to ensure timely and usable high-quality answers.

towardsdatascience.com agents

3d

EP220: RAG vs Graph RAG vs Agentic RAG

This article provides a comparison of Retrieval Augmented Generation (RAG) approaches, specifically contrasting standard RAG with Graph RAG and Agentic RAG. It outlines the three distinct methods for connecting Large Language Models to data.

blog.bytebytego.com llm

4d

We Built a Routing Layer to Cut Our AI Costs. It Broke the Product.

This article describes a team's experience building an AI inference routing layer to reduce costs, which initially halved their AI bill but led to a decline in customer satisfaction. It identifies cost-optimization routing layers as a Pareto trap and presents a methodology for detecting such quality

towardsdatascience.com mlops

4d

Framesmith 1.7 – a quality gate that tells an AI agent when a UI is done

The article presents Framesmith 1.7, an open-source project that functions as a quality gate for AI agents. This system informs an AI agent when a user interface has reached a 'done' state, providing a mechanism for agents to assess UI completion.

github.com agents

5d

MySQL's New Governance Model: Two steps forward and one step backwards

This article provides a critical assessment of MySQL's recently introduced governance model. It discusses the perceived advancements and setbacks of these changes, offering insights into the implications for the database's future development and community involvement.

villagesql.com data-governance

5d

What happened after 2,000 people tried to hack my AI assistant

This article recounts the outcomes after 2,000 individuals attempted to exploit vulnerabilities in an AI assistant. It likely covers security challenges, adversarial interactions, and insights gained from real-world testing of AI agent robustness.

simonwillison.net agents

5d

Incident Report: CVE-2026-LGTM

This article presents an incident report detailing a security vulnerability, identified as CVE-2026-LGTM. It likely provides a technical analysis of the issue and lessons learned from its resolution.

simonwillison.net agents

5d

Quoting OpenAI

This article explores methods and considerations for reliably extracting and attributing specific information from OpenAI models. It focuses on techniques to ensure agents can accurately quote or reference model outputs, addressing challenges in building dependable AI agent workflows.

simonwillison.net llm

5d

DuckDB SQLite Extension

The article links to the GitHub repository for the DuckDB SQLite Extension. This extension allows users to integrate DuckDB's analytical capabilities with existing SQLite databases.

github.com duckdb

5d

Just Use Postgres for Task Queues

The article advocates for using PostgreSQL as a task queue solution. It then details strategies and methods for scaling Postgres queues to handle increased loads and maintain performance in production environments.

dbos.dev postgres

5d

From Local LLM to Tool-Using Agent

This article details the process of constructing a lightweight research agent. It demonstrates using Gemma 4, Ollama for local LLM inference, the OpenAI Agents SDK, and Tavily MCP for tool integration.

towardsdatascience.com agents

5d

Oracle promises to open up MySQL governance, but the community wants guarantees

This article reports on Oracle's commitments to open up MySQL's governance model and the community's demand for concrete guarantees regarding these changes. It covers the ongoing dialogue between Oracle and the MySQL community concerning future development and control.

theregister.com data-governance

5d

Water Cooler Small Talk, Ep. 11: Overfitting in RAG evaluation

This article explores the concept of overfitting within the context of RAG evaluation. It discusses why achieving high scores on evaluation benchmarks does not necessarily equate to a comprehensive understanding of the subject matter by the RAG system.

towardsdatascience.com llm

5d

What One Year in AI Security and Governance Changed About How I See AI

The article details the author's changed perspective on AI after a year focused on security and governance. It likely explores practical challenges, risk management, and ethical considerations encountered when deploying AI in real-world scenarios.

codebynight.dev governance

5d

Fixing Failures in Browser-Use Models: Why More Data Isn't Enough

This article from Fig.inc discusses challenges in improving browser-use models beyond simply increasing training data. It explores inherent issues and limitations in current data collection and model design for complex user interactions. The post provides insights into diagnosing and addressing pers

fig.inc ml

5d

Amplify the Expert: A Philosophy for Building Enterprise RAG

This article presents an architectural philosophy for building Retrieval-Augmented Generation (RAG) systems in enterprise environments. It discusses the strategic choices and design principles necessary for integrating expert knowledge into RAG pipelines. The post provides a framework for robust and

towardsdatascience.com llm

5d

The Shape of the System - Engineering for Bounded Cognition

This post explores the concept of 'bounded cognition' in engineering, advocating for system designs that account for human cognitive limits. It discusses strategies to reduce complexity and improve comprehensibility in software and data systems. The article offers a framework for building more manag

shapeofthesystem.com engineering

5d

Context engineering: shifting from "tokenmaxxing" to deliberate curation

This article explores the evolution of context engineering in AI, moving from simply maximizing token input to deliberate data curation for LLMs. It discusses the limitations of large context windows and the need for structured, relevant information to improve AI model performance. The post outlines

corti.com context-engineering

5d

Show HN: Loomabase – Column-level CRDT sync for SQLite + Postgres

This project introduces Loomabase, an open-source solution enabling column-level Conflict-free Replicated Data Type (CRDT) synchronization for SQLite and PostgreSQL databases. It allows for robust, eventually consistent replication of individual column changes across distributed environments. The re

github.com postgres

5d
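Column-level CRDT sync can be pictured as one register per column, merged independently, so edits to different columns of the same row never conflict. The sketch below uses a last-writer-wins register, one common CRDT. The summary does not say which merge rule Loomabase actually uses, so the rule, the `merge_rows` helper, and the data here are all assumptions for illustration.

```python
def merge_rows(local, remote):
    """Merge two replicas of one row, column by column, last writer wins.

    Each replica maps column name -> (timestamp, replica_id, value).
    Ties on timestamp are broken by replica_id so every replica picks
    the same winner regardless of merge order.
    """
    merged = dict(local)
    for column, cell in remote.items():
        if column not in merged or cell[:2] > merged[column][:2]:
            merged[column] = cell
    return merged


# Hypothetical replicas that edited different columns of the same row.
replica_a = {"name": (5, "a", "Ada"), "email": (1, "a", "old@example.com")}
replica_b = {"name": (2, "b", "A."), "email": (7, "b", "new@example.com")}

merged_ab = merge_rows(replica_a, replica_b)
merged_ba = merge_rows(replica_b, replica_a)
```

Because the merge is commutative and idempotent, replicas can exchange rows in any order and still converge. A row-level rule would instead have discarded one replica's edit entirely.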

Monedula Apache Kafka Simulator

This resource introduces Monedula, a simulator designed for Apache Kafka environments. It allows users to model and test Kafka cluster behavior under various load conditions and configurations. The simulator aids in understanding Kafka internals and optimizing streaming data architectures without de

monedula.dev kafka

5d

Fintech Engineering Handbook - Patterns for building software that handles money

This article presents an engineering handbook focused on patterns for constructing software systems that handle money. It aims to provide architectural insights and practical guidance relevant to fintech development.

w.pitula.me engineering

5d

Context loss is the real reason AI coding slows down engineering teams

This article argues that context loss is the primary factor slowing down engineering teams utilizing AI for coding tasks. It explores how the inability of AI tools to maintain sufficient context impacts developer efficiency.

brunelly.com llm

5d

Query Cost Model Calibration in Confidential Virtual Machines

This article examines query cost model calibration for databases deployed within confidential virtual machines, specifically noting AMD SEV-SNP. It addresses the protection of sensitive cloud data while minimizing changes to legacy database management systems.

arxiv.org architecture

5d

3D Spatial Pattern Matching

This article introduces 3D spatial pattern matching, defining it as the process of aligning query entities and constraints with database entities and relations. It covers various applications, including similar region search and road network matching.

arxiv.org data-engineering

5d

EcoTable: Cost-effective Table Integration in Data Lakes for Natural Language Queries

This article proposes EcoTable, a method for cost-effective table integration within data lakes, designed to overcome challenges posed by diverse formats like CSV and Parquet. It also facilitates data access through natural language queries, aiming to simplify traditional ETL.

arxiv.org lakehouse

5d

BtrLog: Low-Latency Logging for Cloud Database Systems

This article introduces BtrLog, a system designed to provide low-latency write-ahead logging for cloud database systems. It addresses the challenges of achieving WAL durability with remote storage, specifically mentioning the latency issues associated with options like EBS.

arxiv.org architecture

5d

Understanding Domain-Aware Distribution Alignment in Budgeted Entity Matching

Entity Matching (EM) is a fundamental operation in data integration pipelines, focused on comparing records from different sources to determine if they refer to the same real-world entity. This paper introduces a method that incorporates domain information and addresses distribution alignment within

arxiv.org data-engineering

5d

Trino's summer of grammar

This article discusses the importance of SQL grammar in a query engine like Trino. It explains how SQL is defined by its grammar, including predicates, operators, and forms. The post likely covers Trino's adherence to the ISO 9075 standard and ongoing work on its SQL dialect.

trino.io trino

5d

How I hunt for vulnerabilities with AI

An experienced software engineer details their method for identifying vulnerabilities within the ClickHouse codebase using large language models such as GitHub Copilot, Claude Opus, and Gemini. The process involves generating hypotheses and accelerating the validation of potential security flaws wit

clickhouse.com clickhouse

5d

Run a vLLM Server on HF Jobs in One Command

The article details how to deploy a vLLM server on Hugging Face Jobs using a single command. It covers the setup and operational steps necessary for running high-performance LLM inference in a hosted environment.

huggingface.co llm

5d

Privacy-Aware Infrastructure in the AI-Native Era: An Asset Classification Case Study

The article from Meta Engineering discusses the necessity of robust data understanding for effective privacy controls in AI-native environments. It presents a case study on asset classification, detailing how systems must precisely identify data to enforce policies like retention, access, and anonym

engineering.fb.com governance

5d

Hardwood 1.0: A Fast, Lightweight Apache Parquet Reader for the JVM

The article introduces Hardwood 1.0, a new open-source Apache Parquet reader for the JVM optimized for speed and minimal dependencies. It discusses the design decisions made to enhance performance and reduce the memory footprint for processing Parquet files.

morling.dev parquet

5d

Data Benchmarks and Limitations [video]

The video discusses various data benchmarks and their inherent limitations. It explores the methodologies used in data benchmarking and highlights common pitfalls and considerations when interpreting performance metrics.

youtube.com engineering

6d

Show HN: Topos – Structural code quality metrics for agent-written programs

The article introduces Topos, a system addressing the challenge of reviewing code generated rapidly by AI agents. Topos parses programs into various graph representations, such as AST and CFG, to score them across structural quality pillars including simplicity, composability, and security.

krv.ai agents

6d

Parquet: More than just "Turbo CSV"

The article explores the technical advantages of Parquet, moving beyond its common perception as a simple CSV alternative. It details how Parquet's columnar format, compression, and schema evolution capabilities offer significant performance benefits for data storage and processing.

csvbase.com arrow

6d

Vector RAG Isn’t Enough — I Built a Context Graph Layer for Multi-Agent Memory

The article introduces a context graph layer built to augment multi-agent memory, moving beyond limitations found in traditional vector RAG approaches. It includes a benchmark comparing raw chat history, vector-only RAG, and this new context graph layer, revealing insights into relational retrieval

towardsdatascience.com agents

6d

Testing a Kafka Proxy: Taming Millions of Permutations

The article details the complex challenges involved in testing a Kafka proxy, specifically addressing the need to manage millions of permutation test cases. It describes the engineering approaches and methodologies developed to ensure robust and reliable proxy behavior in a streaming environment.

conduktor.io kafka

6d

The Hot Path Belongs to GBDTs, Agents Own the Cold Path: A Payment-Fraud Benchmark

The article presents a reproducible benchmark for payment fraud detection, evaluating the performance of Gradient Boosted Decision Trees (GBDTs) versus AI agents. It analyzes their efficacy across metrics like latency, cost, and reproducibility, delineating scenarios where each technology proves mor

towardsdatascience.com agents

6d

Autodata: An agentic data scientist to create high quality synthetic data

The article introduces "Autodata," an agentic data scientist designed for generating high-quality synthetic data. It describes the agent's architecture and methodologies for creating synthetic datasets that maintain fidelity and utility for various analytical tasks.

arxiv.org agents

6d

We Rewrote WAL-G for Postgres Backups in Rust: Meet WAL-RUS

The article introduces WAL-RUS, a rewrite of the WAL-G tool for Postgres backups, now implemented in Rust. It covers the technical rationale behind the rewrite and the features of this new open-source project aimed at improving backup reliability and performance.

clickhouse.com postgres

6d

Show HN: BrainAPI's event-centric graph from any data with four-stage pipeline

This project, BrainAPI, constructs event-centric knowledge graphs from various data sources through a four-stage pipeline. The GitHub repository provides details on its architecture and implementation for transforming raw data into structured graph representations.

github.com knowledge-graphs

6d

Which tokens does a hybrid model predict better?

This article explores the performance of hybrid language models by analyzing which tokens they predict more accurately. It likely provides insights into the operational characteristics and architectural advantages of these models in various prediction scenarios.

huggingface.co ml

6d

Achieving Near-Linear Training Scalability for Pinterest’s Foundation Models

Pinterest details the architectural and engineering strategies they implemented to achieve near-linear training scalability for their foundation models. The article describes the specific optimizations and distributed systems design choices made to efficiently train large-scale ML models within thei

medium.com ml

6d

How to Provision Data Access for Natural Language Data Queries at Scale

This article addresses the complexities of provisioning data access for natural language data queries within large-scale systems. It explores methods and architectural patterns for managing access control and ensuring data security in such environments.

immuta.com governance

6d

3 Agents. 3 LLMs. 1 Aging GPU: Engineering Parallel Inference on Bare Metal

The article details methods for running three distinct LLMs concurrently on a single 8GB GPU, addressing common VRAM limitations. It explains the use of C++ layer multiplexing and admission control to manage resources and achieve parallel inference on bare metal hardware.

towardsdatascience.com llm

6d

A Tiny Compiler for Data-Parallel Kernels

This article describes the process of creating a small, custom compiler designed to process data-parallel kernels. It delves into the architectural considerations and implementation specifics required to achieve optimized execution for data-intensive tasks.

healeycodes.com architecture

6d

Letting an LLM Pick the Right RAG Page: The Arbiter Pattern at the End of Retrieval

This article introduces the Arbiter Pattern for RAG systems, where an LLM is employed to rank and select the most appropriate document from a set of candidates. It outlines how a single LLM call generates a reasoned choice, producing a structured output for auditing.

towardsdatascience.com llm

6d

How we built saga rollbacks for Cloudflare Workflows

Cloudflare details the development of saga-style rollbacks for its Workflows durable execution engine, which handles multi-step applications. The article explains how developers can now define compensating actions for each step within a workflow to ensure transactional consistency.

blog.cloudflare.com architecture

6d
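The saga pattern named in the entry above can be sketched in a few lines. This is a generic illustration only, not Cloudflare's Workflows API: `run_saga` and the step names in the usage note are hypothetical, and the sketch assumes each step is a pair of plain callables (an action and its compensating action).

```python
def run_saga(steps):
    """Run (action, compensate) pairs in order.

    If any action raises, run the compensations of the steps that
    already completed, in reverse order, then re-raise the error.
    Hypothetical helper for illustration; not a real library API.
    """
    completed = []  # compensations for steps that finished successfully
    try:
        for action, compensate in steps:
            action()
            completed.append(compensate)
    except Exception:
        # Undo in reverse so later steps are rolled back first.
        for compensate in reversed(completed):
            compensate()
        raise
```

For example, if a hypothetical three-step order flow (reserve stock, charge card, ship) fails at the shipping step, the sketch runs the refund first and the stock release second before surfacing the original error. The failed step's own compensation is not run, because that step never completed.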

How to Build 1-Minute OHLC Bars from Non-Uniform Market Snapshot Data

This article details methodologies for constructing one-minute Open-High-Low-Close (OHLC) bars from market snapshot data that is not uniformly sampled. It outlines the technical steps involved in time-series aggregation from irregular inputs.

medium.com streaming

6d
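The aggregation described in the entry above can be sketched as follows. This is a minimal illustration under stated assumptions, not the linked article's method: `build_ohlc_bars` is a hypothetical name, snapshots are assumed to be `(datetime, price)` pairs already sorted by time, and minutes without any snapshot simply produce no bar (no forward fill).

```python
from datetime import datetime, timezone


def build_ohlc_bars(snapshots, bar_seconds=60):
    """Aggregate time-sorted (datetime, price) snapshots into OHLC bars.

    Returns a dict keyed by the bar's start time as a Unix epoch second.
    Buckets with no snapshots are absent from the result.
    """
    bars = {}
    for ts, price in snapshots:
        epoch = int(ts.timestamp())
        bucket = epoch - epoch % bar_seconds  # start of the containing bar
        bar = bars.get(bucket)
        if bar is None:
            # First snapshot in the bucket sets all four fields.
            bars[bucket] = {"open": price, "high": price,
                            "low": price, "close": price}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # input is sorted, so the last seen price closes
    return bars


# Irregularly spaced sample snapshots (made-up values).
sample = [
    (datetime(2024, 1, 2, 9, 30, 5, tzinfo=timezone.utc), 100.0),
    (datetime(2024, 1, 2, 9, 30, 40, tzinfo=timezone.utc), 102.0),
    (datetime(2024, 1, 2, 9, 30, 59, tzinfo=timezone.utc), 99.0),
    (datetime(2024, 1, 2, 9, 31, 10, tzinfo=timezone.utc), 101.0),
]
bars = build_ohlc_bars(sample)
```

Flooring the epoch second to a multiple of `bar_seconds` keeps the bucketing independent of how unevenly the snapshots arrive. Whether empty minutes should be forward-filled from the previous close is a separate design choice that this sketch leaves out.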

Show HN: MAVS-GC – An Open-Source Governance Architecture for AI Systems

This article introduces MAVS-GC (Multi Adaptive Vetting Systems-Governance Core), an open-source project proposing a specific governance architecture for AI systems. The project investigates the impact of an explicit governance layer placed atop multiple specialist components on overall system behav

docs.google.com governance

6d

Treat the Context Window as a Data Assembly Problem

This article proposes treating the LLM context window as a data assembly problem, focusing on structuring and optimizing data input for large language models. It likely explores strategies for efficiently preparing machine-readable metadata and contextual memory to improve LLM performance.

klr-pattern.github.io llm

6d

TabClean: Reusable LLM-Synthesized Programs for Tabular Data Cleaning

This article presents TabClean, a method that employs LLM-synthesized programs to address common data cleaning challenges in tabular data. It focuses on resolving issues such as missing values, inconsistent formats, and violated dependencies frequently encountered in production analytics and machine

arxiv.org data-quality

6d

CV-Rules: Serializability Verification of Concurrency Control Protocols via Explicit Transaction Ordering

This article introduces CV-rules, a new method for characterizing and verifying serializability in concurrency control protocols. It proposes that a transaction order must satisfy specific per-read conditions, the C-rule (Causality) and V-rule (View Consistency), to constrain reads-from relationship

arxiv.org architecture

6d

VADAOrchestra: Neurosymbolic Orchestration of Adaptive Reasoning Workflows

This arXiv paper introduces VADAOrchestra, a framework for neurosymbolic orchestration of adaptive reasoning workflows. It addresses how real-world decision-making dynamically evolves with new context and data, moving beyond traditional fixed business processes.

arxiv.org agents

6d

Kafka's log compaction corrupts data. Here's how we fixed it

The article details a specific problem found in Apache Kafka's log compaction process that can lead to data corruption. It explains how to reproduce this issue and outlines the method Redpanda used to resolve it within their platform.

redpanda.com streaming

6d

Routing for serverless servers with Pingora, Envoy, and Spanner

This article details the routing mechanisms employed for serverless servers, focusing on the integration and functionality of Pingora, Envoy, and Spanner. It explores how these components work together to manage traffic in a serverless environment.

modal.com architecture

6d

Weaviate 1.38 Release

This Weaviate 1.38 release introduces several key features, including the general availability of the HFresh disk-based vector index and the built-in MCP Server. It also details the re-engineered cluster-wide asynchronous replication, which now operates from a single scheduler, and previews the Boos

weaviate.io vector-db

6d

Announcing Silk: a silky smooth fiber runtime for ClickHouse

ClickHouse announces Silk, a new open-source C++ fiber runtime designed for its database, featuring a NUMA-aware work-stealing scheduler and io_uring I/O. The article highlights its zero heap allocation in the steady state, achieving nanosecond-level fiber yields and significantly reducing tail late

clickhouse.com clickhouse

6d

How Vibe.co handles billions of ad impressions with ClickHouse Cloud

This article details how Vibe.co managed to scale its Connected TV ad impression data from 100 GB to 2 TB without requiring architectural changes. It specifically covers their migration process from Postgres to ClickHouse Cloud as the solution for handling billions of ad impressions.

clickhouse.com clickhouse

6d

simonw/browser-compat-db

This entry points to the `simonw/browser-compat-db` GitHub repository, a project by Simon Willison. The repository likely involves a database focused on browser compatibility, potentially integrating advanced LLM or agent technologies as indicated by its associated tags.

simonwillison.net llm

6d

How to Tell If Your Kafka Self-Service Is Working?

The article explores methods to assess the success of Kafka self-service platforms, focusing on key metrics and indicators. It details how organizations can determine if their self-service initiatives are truly empowering developers and streamlining operations.

medium.com kafka

7d

Automated Schema Evolution in Pinterest’s Next-Generation DB Ingestion Framework

This article details Pinterest's implementation of automated schema evolution within their next-generation database ingestion framework. It describes the system designed to manage and adapt to schema changes automatically at scale.

medium.com data-engineering

7d

Vibe Coding to Agentic Engineering with Claude Code

The article describes the move from vibe coding to 'agentic engineering' using AI tools like Claude Code. It discusses how AI agents can assist in code generation, debugging, and overall software development workflows.

apimatic.io agents

7d

Looking Ahead to Postgres 19

The article outlines expected features and improvements for the upcoming PostgreSQL 19 release, currently in beta. It details advancements in areas like performance, new SQL functionalities, and potential architectural changes for the database.

snowflake.com postgres

7d

The emergence of the web data infrastructure layer for AI

The article discusses the development of a dedicated data infrastructure layer tailored for AI, focusing on how web-scale data can be effectively organized and served to AI models. It examines the architectural components and challenges involved in building these new data pipelines for AI consumptio

technologyreview.com architecture

7d

Show HN: DBOSify – Drop-in Temporal replacement built on Postgres

DBOSify is presented as an open-source project designed to replace Temporal-style workflow orchestration, leveraging PostgreSQL for durable state management. It aims to provide a reliable, ACID-compliant platform for building complex, long-running applications.

github.com postgres

7d

Medical diagnosis AIs can be tricked into telling whose data trained them

The article reports that medical diagnosis AI systems can be manipulated to reveal specific data points from their training datasets. This vulnerability allows for the identification of individuals whose medical information was used to train these models. The finding highlights a significant privacy

theregister.com ml

7d

Faster VLM Fine-Tuning With Materialized Model Features in LanceDB

This article describes a technique to accelerate VLM (Vision Language Model) fine-tuning. It explains how LanceDB, Lance format, and Geneva are used to materialize expensive multimodal features once, allowing subsequent training directly from these pre-computed columns.

lancedb.com vector-db

7d

Your First Task as a Data Engineer in a New Company? Make the ETL Pipeline Testable

The article outlines a practical onboarding workflow for a new data engineer, emphasizing the immediate task of making ETL pipelines testable. It details steps for setting up environments, implementing automated testing protocols, and leveraging AI for development assistance. The guide focuses on es

towardsdatascience.com data-engineering

7d

The state of agentic analytics, from 50 real data teams

The article presents findings on the current landscape of agentic analytics, synthesizing experiences from 50 real data teams. It covers the adoption, challenges, and evolving patterns in using AI agents for analytical tasks. The report offers a snapshot of how organizations are integrating agentic

blog.getcassis.com agents

7d

Accelerating Transformers Fine-Tuning with NVIDIA NeMo AutoModel

The article details methods for accelerating the fine-tuning process of transformer models using NVIDIA NeMo AutoModel. It explores how this framework can optimize computational efficiency and reduce training times for large language models. The content focuses on practical techniques to enhance per

huggingface.co ml

7d

Large Language Models vs Small Language Models

This article examines the constraints and tradeoffs of large versus small language models. It delves into three layers of model design and investigates production systems that combine both types of models.

blog.bytebytego.com ml

7d

A Three-Phase Factual Recall Circuit in Gemma-2B and Gemma-12B-IT

The article investigates the internal mechanisms of factual recall within Gemma-2B and Gemma-12B-IT transformer models. It employs activation patching to reveal a three-phase circuit for how facts are stored, routed, and retrieved across different layers. The analysis highlights the significant role

towardsdatascience.com llm

7d

Zero-Copy Data Movement from NIC to GPU at 100s of Gbps

The article details a novel system for high-throughput, zero-copy data movement between network interface cards and GPUs, achieving speeds of hundreds of gigabits per second. It explores the architecture and implementation strategies for enhancing data pipeline performance in high-performance comput

nvidia.github.io mlops

7d

Why I Stopped Using One Agent and Built a Multi-Agent Pipeline Instead

The article explains the decision to transition from a single AI agent to a multi-agent pipeline architecture, using text-to-SQL as a practical use case. It details the reasoning behind this shift and describes the construction of the multi-agent system.

towardsdatascience.com agents

7d

Show HN: OpenModeling Core – multidimensional modeling for financial models

The article presents OpenModeling Core, an open-source project centered on multidimensional modeling tailored for financial applications. It highlights the framework's capabilities in structuring complex financial data for analysis.

github.com data-quality

7d

Kafka Share Groups - Pathological fetch waits with record_limit

The article examines performance issues within Kafka share groups, focusing on pathological fetch waits. It explains how using `share.acquire.mode=record_limit` combined with fewer consumers than partitions and various forms of partition skew can lead to subpar performance. The post details the diagno

jack-vanlightly.com kafka

7d

When Does Data Help Automated Agent Engineering?

This article investigates the circumstances under which data significantly contributes to the engineering and improvement of automated AI agents. It likely discusses how data can be leveraged for agent training, evaluation, and overall system robustness.

andrewjesson.com agents

7d

Anchor Detection for RAG: Parallel Detectors, Then One LLM Call at the End

The article describes an architectural approach for Retrieval-Augmented Generation (RAG) pipelines, focusing on anchor detection. It outlines a strategy involving parallel detectors for information retrieval, followed by a consolidated LLM call. The retrieval method prioritizes keywords, then table

towardsdatascience.com llm

7d

Autoops: Multi-region data and service mesh operated by a Makefile

This project, Autoops, describes a system for managing multi-region data infrastructure and service meshes, with its operations driven by a Makefile. It details the architecture and the pragmatic approach to orchestrating complex distributed systems.

github.com orchestration

7d

9 Weeks to AGI: Accountability and Governance Infrastructure for the EU AI Act

This article explores the infrastructure required for accountability and governance under the EU AI Act. It delves into the technical considerations for building systems that ensure compliance and provide oversight for advanced AI models.

assimilatedhuman.github.io governance

7d

Why AI Agents Need a CLI, Not Just an MCP Server

The article discusses the architectural requirements for AI agents, arguing for the necessity of a command-line interface in addition to a Model Context Protocol (MCP) server. It examines how MCP enables agents to interact with data systems but highlights a broader need for CLI-based interaction p

dremio.com agents

7d

Show HN: Clai – Context engineering for terminal powerusers

The article introduces Clai, an open-source project designed for context engineering, aiming to enhance the terminal experience for power users. It outlines how the system manages and utilizes contextual information within a command-line interface.

github.com agents

7d

Unlocking the Cloudflare app ecosystem with OAuth for all

The article announces the general availability of Self-Managed OAuth for developers on Cloudflare, but primarily focuses on the technical process behind this rollout. It describes how Cloudflare executed a zero-downtime migration of its core OAuth engine to enable this new feature.

blog.cloudflare.com engineering

7d

On the Semantics of Generative SPARQL

This paper proposes an extension to SPARQL by introducing a generative query construct called `GenOp`. This new operation allows SPARQL queries to invoke a language model and generate typed solution mappings, while preserving the fixed-dataset assumption for query semantics.

arxiv.org knowledge-graphs

7d

Unified Dominance Graph for Interval-Predicate Approximate Nearest Neighbor Search

The paper introduces a Unified Dominance Graph method for Interval-Predicate Approximate Nearest Neighbor Search (ANNS). This technique addresses the need for hybrid queries in applications such as temporal databases, financial data analysis, and retrieval-augmented generation.

arxiv.org vector-db

7d

Entity Resolution via Batched Oracle Queries

This paper considers an oracle that processes a limited batch of records to cluster entities referring to the same real-world object. It studies methods to interrogate such an oracle for resolving entities in datasets significantly larger than a single batch.

arxiv.org knowledge-graphs

7d

Accelerating Presto with GPUs

This paper describes how Presto was extended to be GPU-aware, focusing on critical challenges such as efficient data transfer from storage to GPU operators. It also addresses enabling data exchange between operators without leaving GPU memory, even in a distributed query environment.

arxiv.org trino

7d

One Index for Subsumption and Roll-up across Time, Geography, and Ontology

This paper observes that time-series, geospatial, and ontology systems all maintain hierarchies, such as day <= month <= year, zip <= city, and is-a / part-of relationships, and typically index them separately. It proposes a unified index for these subsumption posets, focusing on order testing workl

arxiv.org knowledge-graphs

7d

Can Aggregate Invariants Accelerate Continuous Subgraph Matching? Limits, Laws, and a Dynamic Spectral Index

This research investigates whether aggregate structural tests, specifically Laplacian interlacing for rejecting candidate subgraphs, can accelerate continuous subgraph matching. It builds on spectral filtering's success in pruning for static subgraph matching.

arxiv.org knowledge-graphs

7d

Abstractions of Queries in Ontology-Based Data Access

This paper examines query abstraction in an ontology-based data access (OBDA) setting, where multiple data sources are integrated through mappings to an ontology. It specifically considers an OBDA framework based on existential rules and the certain answer semantics.

arxiv.org ontology

7d

Are We Ready For An Agent-Native Memory System?

This paper discusses the rapid evolution of memory systems for large language model (LLM) agents, which have expanded beyond simple retrieval-augmented mechanisms. These systems now support persistent information storage, retrieval, update, consolidation, and dynamic lifecycle governance.

arxiv.org agents

7d

LGTD: Local-Global Trend Decomposition for Season-Length-Free Time Series Analysis

This paper presents LGTD, a novel method for decomposing time series into trend, seasonal, and residual components without requiring a predefined season length. This technique is fundamental for applications such as anomaly detection, change-point analysis, and forecasting.

arxiv.org analytics

7d

TSseek: Regular Expression-Based Similarity Search for Distributed Time Series Datasets

This paper introduces TSseek, a technique for regular expression-based similarity search in distributed time series datasets. It addresses the limitation of existing methods that typically require a precise sequence of values as the query input.

arxiv.org analytics

7d

ORQ: Complex Analytics on Private Data with Strong Security Guarantees

This paper presents ORQ, a system designed for collaborative analysis of large private datasets using cryptographically secure multi-party computation (MPC). ORQ offers strong protection against semi-honest or malicious parties and efficiently evaluates relational queries.

arxiv.org data-governance

7d

The $\mathbf{P}$-Completeness of Inverted Index Traversal: On the Complexity of Evaluating Boolean Query DAGs

This paper investigates the computational complexity of evaluating Boolean query Directed Acyclic Graphs (DAGs) over text fields, which are increasingly employed by modern AI agents for neuro-symbolic reasoning workflows. It specifically focuses on the P-completeness of inverted index traversal.

arxiv.org agents

7d

ErrorLLM: Modeling SQL Errors for Text-to-SQL Refinement

This arXiv paper introduces ErrorLLM, a framework designed to improve the accuracy of text-to-SQL generation by addressing common errors. It focuses on the SQL refinement task, detailing how to model and correct erroneous SQL queries produced by large language models. The work aims to enhance the re

arxiv.org llm

7d

Knowledge-Graph Grounding Helps LLMs Only for Out-of-Training Knowledge: A Controlled Study on Clinical Question Answering

This arXiv paper presents a controlled study exploring the impact of knowledge-graph grounding on large language models, specifically in clinical question answering. It investigates whether structured knowledge benefits LLMs only for information outside their training data, following up on reports t

arxiv.org llm

7d

Show HN: BitVanes – A zero-trust RAG pipeline engine in Rust, WASM, and Arrow

This article introduces BitVanes, a zero-trust, local-first ETL engine for RAG pipelines built with Rust, WASM, and Apache Arrow. It processes sensitive documents locally for parsing, PII scrubbing, chunking, and vectorization, addressing concerns about shipping raw data to cloud services.

bitvanes.com llm

7d

Measuring Search Ranking Quality with LLM Judged NDCG

The article explores a method for measuring search ranking quality by employing LLMs to judge Normalized Discounted Cumulative Gain (NDCG). It presents an approach to leverage large language models for evaluating the effectiveness of search algorithms.

corvi.careers llm

7d
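For readers unfamiliar with the metric in the entry above, NDCG can be computed as in the sketch below. It assumes graded relevance labels are already available (in the article's setting they would come from an LLM judge; here they are made-up numbers) and uses the linear-gain form of DCG. The function names are illustrative, not taken from the article.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with linear gain: sum(rel_i / log2(i + 1)),
    where i is the 1-based rank."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(relevances, k=None):
    """NDCG of a ranked list of relevance grades, optionally cut off at k.

    The ideal ordering is the same grades sorted in descending order.
    Returns 0.0 when every grade is zero.
    """
    rels = list(relevances)
    ranked = rels[:k] if k is not None else rels
    ideal = sorted(rels, reverse=True)[:len(ranked)]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical judge grades for one query, in the order the system ranked them.
judged = [2, 3, 0, 1]
score = ndcg(judged, k=3)
```

A ranking that already lists the grades in descending order scores exactly 1.0, and any other ordering of distinct grades scores lower. That makes the value comparable across queries with different numbers of relevant results.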

Why AI Agents Need Real-Time Analytics and Hybrid Search: The Data Infra for Production Agents

This article argues that AI agents require real-time analytics in addition to vector search for effective operation. It proposes that Apache Doris addresses this need by providing a unified real-time engine capable of native hybrid search tailored for agent workloads.

doris.apache.org agents

7d

Introducing the FFASR Leaderboard: Benchmarking ASR in the Real World

The article announces the launch of the FFASR Leaderboard, designed to benchmark Automatic Speech Recognition (ASR) models against real-world audio data. It outlines the methodology for evaluating ASR performance in practical conditions, moving beyond controlled datasets. The leaderboard aims to pro

huggingface.co ml

7d

What's coming in Postgres 19 (and what's still missing)

The article looks ahead to the upcoming Postgres 19 release, detailing the new features and improvements anticipated in this version. It covers various quality-of-life enhancements and potential advancements that day-to-day Postgres users will find beneficial. The discussion also touches upon areas

postgresweekly.com postgres

7d

How Visa went from multi-day reporting to conversational analytics agents with ClickHouse Cloud and LibreChat

This article details how Visa implemented conversational BI agents using LibreChat and ClickHouse Cloud for Authorize.net payments data. The solution transformed multi-day reporting into sub-second queries, saving users 8-10 hours weekly.

clickhouse.com clickhouse

7d

Achieve state-of-the-art inference latencies with speculative decoding

This article explores methods to achieve state-of-the-art inference latencies, specifically focusing on the application of speculative decoding. It aims to provide insights into optimizing performance for machine learning models.

modal.com ml

7d

LLM-CTF benchmark – 2,639 real data points from NeurIPS and original runs

The article introduces the LLM-CTF benchmark dataset, comprising 2,639 real data points derived from NeurIPS research and new experimental runs. This benchmark aims to provide resources for evaluating LLMs in Capture The Flag scenarios.

kaggle.com llm

7d

ATProto Permissioned Data Proposal Draft

This GitHub pull request introduces a draft proposal for adding permissioned data capabilities to ATProto, outlining design considerations and mechanisms for controlling data access.

github.com governance

7d

Expert-aware quantisation: near-Q4 quality at near-Q2 size?

The post investigates an 'expert-aware' quantization method designed to compress machine learning models to nearly Q2 size while retaining quality comparable to Q4 quantization. It discusses the technical approach and potential benefits for model deployment.

martinalderson.com llm

7d

Pg_graphwright: A Postgres knowledge-graph index that inherits row-level-SEC

This article introduces `pg_graphwright`, a new Postgres extension designed to function as a knowledge graph index. A key feature is its ability to inherit row-level security from the underlying Postgres tables, simplifying access control for graph data.

github.com postgres

7d

Operationalizing Data Orchestration: Best Practices for DevOps, Infra, and Code Locations

This article outlines best practices for operationalizing data orchestration, specifically addressing considerations for DevOps teams, infrastructure setup, and managing code locations within an orchestration framework.

dagster.io orchestration

7d

Real-Time Hyper-Personalization in 2026: Architecture Guide

This architecture guide outlines approaches for real-time hyper-personalization systems. It details potential system designs for delivering highly customized experiences at scale, expected by 2026.

confluent.io kafka

8d

How to Eliminate Training-Serving Skew With a Unified Real-Time Streaming ML Pipeline (2026 Guide)

This guide explains how to eliminate training-serving skew in ML pipelines, a common issue degrading accuracy and increasing costs. It proposes a unified kappa architecture leveraging Apache Flink and Iceberg for real-time streaming ML.

confluent.io mlops

8d

OPFS + Pyodide test harness

The article explores combining the Origin Private File System (OPFS) with Pyodide to create a test harness for in-browser execution. It details the technical challenges and solutions for running Python environments directly within the browser, leveraging local storage capabilities for data processin

simonwillison.net embedded-analytics

8d

A Brief Rant About the New Product Development Lifecycle - WarpStream

This article from WarpStream details their application of AI coding agents within their product development lifecycle. It explains how these agents are utilized effectively without compromising the system's uptime and reliability.

warpstream.com streaming

8d

Reliability fail: No automated zone failover for Coinbase’s global trading service

The article analyzes a major reliability incident at Coinbase where the global trading service lacked automated zone failover. It delves into the architectural implications of this failure and discusses the lessons learned regarding system resilience.

blog.pragmaticengineer.com engineering

8d

How Meta Engineered Ultra-Narrow Batteries for AI Glasses

Meta details the engineering challenges involved in designing ultra-narrow batteries for AI-powered smart glasses, such as the Ray-Ban Meta. The article discusses how to provide sufficient energy to support features like cameras, speakers, displays, and AI workloads within the compact form factor of

engineering.fb.com engineering

8d

Retrieval Is Filtering, Not Search: A Mental Model for Enterprise RAG

The article proposes a mental model for enterprise Retrieval Augmented Generation (RAG) systems, suggesting that retrieval should be viewed as a filtering process rather than string search. It details strategies like filtering line_df and toc_df and expanding context from small anchors for more effe

towardsdatascience.com llm

8d

What Are Lakehouse Catalogs? The Role of Catalogs in Apache Iceberg

This article, part of an Apache Iceberg Masterclass, explains what lakehouse catalogs are, their importance, and how to select among various options. It builds on previous discussions of the write process and atomic commits facilitated by catalogs.

dremio.com iceberg

8d

Build real agentic apps using CUGA: two dozen working examples on a lightweight harness

This article introduces CUGA, a lightweight harness designed for building agentic applications. It provides two dozen practical examples demonstrating how to construct and implement real-world AI agents.

huggingface.co agents

8d

SQLBuild - Skip Unnecessary Rebuilds for Your Existing dbt Project, Free & OSS (No Per-Skip Bill)

This post introduces SQLBuild, an open-source tool designed to optimize existing dbt projects. It functions by identifying and building only models that have changed, thereby skipping unnecessary rebuilds and reusing existing production tables to save compute resources. The project is fully open sou

reddit.com dbt

8d

Overcoming deserialization bottlenecks in data pipelines: an alternative zero-copy backend for Protobuf

The article addresses significant CPU costs incurred by Protobuf parsing overhead in data pipelines handling large volumes of structured data. It proposes an alternative zero-copy backend for Protobuf to mitigate deserialization bottlenecks, particularly when only partial data access or passthrough

reddit.com data-engineering

8d

ReSequel: Robust LLM-assisted Query Rewriting and Optimization using Templatization and Sampling

This arXiv paper introduces ReSequel, a method for LLM-assisted query rewriting and optimization. It explores using templatization and sampling to transform SQL queries into semantically equivalent, more efficient forms, building on heuristic and cost-based optimization.

arxiv.org llm

8d

RAIDS: Rethinking Data Systems as Responsible Intelligent Infrastructure

This arXiv paper introduces RAIDS, a framework for rethinking data systems as responsible intelligent infrastructure. It addresses the gap in responsibility mechanisms as data systems evolve into decision-making tools, discussing the need for sufficient support, satisfied constraints, and actionable

arxiv.org governance

8d

Cache-Aware I/O Cost Modeling for Disk-Based Learned Indexes

This arXiv paper addresses the absence of a principled I/O cost model for disk-resident learned indexes. It proposes a cache-aware I/O cost model, which is essential for effective index tuning and query optimization in database management systems.

arxiv.org architecture

8d

When Is a Columnar Scan Bandwidth-Bound? A Decode-Throughput Law and Its Cross-Hardware Validation

This arXiv paper investigates when columnar scans, which involve decompression, filtering, and aggregation, are limited by memory bandwidth rather than compute. It presents a predictive decode-throughput law and validates it across different hardware configurations to identify performance bottleneck

arxiv.org architecture

8d

Disk-Based Interval Indexes Under the Increasing Ending Time Assumption

This arXiv paper examines disk-based interval indexes, which are crucial for managing lifespan or validity intervals in temporal databases. It proposes that various interval indexes can be unified by a fundamental corner structure, especially under the assumption of increasing ending times.

arxiv.org architecture

8d

Graph-Enhanced Large Language Models for Spatial Search

This arXiv paper explores enhancing Large Language Models with graph structures to improve their spatial search and reasoning abilities. It builds upon Retrieval Augmented Generation (RAG) to overcome current LLM limitations in complex, domain-specific spatial tasks.

arxiv.org llm

8d

SemCEB: A Cardinality Estimation Benchmark for Semantic Operators

This arXiv paper introduces SemCEB, a new benchmark for evaluating cardinality estimation within semantic operators that utilize multi-modal large language models. It focuses on SQL operators, like filters and joins, where predicates are defined by natural language instructions, which is crucial for

arxiv.org llm

8d

A Set-Theoretic Approach to Detecting Logic Bugs in DBMS Inner Join Optimizations

This arXiv paper details a set-theoretic approach designed to identify logic bugs in the inner join optimizations performed by database management system query optimizers. It focuses on the crucial role of join optimization in determining efficient query execution strategies.

arxiv.org architecture

8d

A Compositional Language for Property Graphs

This arXiv paper proposes a new compositional language for property graphs. It aims to address the lack of compositionality in standardized graph query languages such as GQL and SQL/PGQ, which is a significant limitation when querying knowledge graphs. The paper presents both theoretical aspects and

arxiv.org knowledge-graphs

8d

SQLConductor: Search-to-Policy Learning for Step-wise Text-to-SQL Orchestration

This arXiv paper introduces SQLConductor, a framework that employs search-to-policy learning for step-wise Text-to-SQL orchestration. It aims to improve natural language access to relational databases, particularly in complex real-world settings where coordinated reasoning is essential. The approach

arxiv.org semantic-layer

8d

The Table Says Otherwise: Testing LLMs with Counterfactual Relational Data

This arXiv paper proposes a method for testing Large Language Models (LLMs) by using counterfactual relational data. The research investigates whether LLMs answer natural-language questions over structured data by interpreting the provided table or by recalling previously learned real-world facts. T

arxiv.org llm

8d

Generative Responsible AI Data Evaluation Schema (GRAIDES) for AI Assurance in Local Government

This arXiv paper presents the Generative Responsible AI Data Evaluation Schema (GRAIDES), designed for AI assurance in local government. The schema aims to establish well-governed, measurable evidence of generative AI performance and safety. It addresses challenges related to fragmented and inconsis

arxiv.org governance

8d

FireDataForge: A Unified Framework for Multi-Source Wildfire Data Retrieval and Integration

This arXiv paper presents FireDataForge, a unified framework for retrieving and integrating multi-source wildfire geospatial data. The framework aims to overcome the significant preprocessing burden associated with diverse data formats, coordinate systems, spatial resolutions, and temporal cadences,

arxiv.org data-engineering

8d

Universal Encoders for Modular Relational Deep Learning

This arXiv paper explores Relational Deep Learning (RDL) models that represent multi-tabular databases as temporal heterogeneous graphs for end-to-end representation learning. It identifies significant generalization obstacles in current RDL approaches and proposes universal encoders to create modul

arxiv.org knowledge-graphs

8d

TACO: Task-Aware Column Description Generation Using LLMs

This arXiv paper presents TACO, a system that leverages Large Language Models to generate accurate and informative column descriptions for tabular data. Such descriptions are vital for various downstream Natural Language Processing tasks, including Natural Language to SQL, table question answering,

arxiv.org semantic-layer

8d

Toward More Controllable AI Video Editing: An Early Research Exploration at Netflix

The article presents Netflix's initial research into achieving more controllable artificial intelligence for video editing workflows. It explores methods and challenges in enabling greater user direction over AI-powered creative processes.

netflixtechblog.com ml

8d

Shipping huggingface_hub every week with AI, open tools, and a human in the loop

This article details the weekly release process for huggingface_hub, emphasizing the integration of AI-driven automation and open-source tools. It highlights the crucial role of human oversight in maintaining quality and consistency in a rapid development cycle.

huggingface.co mlops

8d

Bridge Queries in Redpanda SQL

Redpanda SQL introduces bridge queries, a feature that enables querying both live streaming topics and historical Iceberg tables concurrently. This approach aims to eliminate the typical compaction overhead associated with integrating fresh and historical data.

redpanda.com streaming

8d

The end-to-end cost-performance of real-time analytics: Snowflake vs. ClickHouse Cloud

This article, titled 'CostBench', presents a comparison of Snowflake and ClickHouse Cloud for real-time analytics. It evaluates both platforms across the entire data path, including continuous ingest, query-ready data maintenance, freshness, query latency, and total cost.

clickhouse.com clickhouse

8d

What's New in pg_clickhouse v0.3.2: Postgres 19, TLS, Regex, and Memory

The article details updates in the latest pg_clickhouse releases, including support for Postgres 19, TLS, and regex functionalities. It highlights JSONB, date/time, and array function pushdown, along with HTTP result set streaming for reduced memory consumption.

clickhouse.com clickhouse

8d

Prompt Injection as Role Confusion

The post analyzes prompt injection attacks through the lens of 'role confusion' in large language models. It examines how adversarial prompts manipulate an LLM's perceived identity or function, leading to unintended behavior, and discusses methods to mitigate this vulnerability in agentic systems.

simonwillison.net llm

8d

Porting the Moebius 0.2B image inpainting model to run in the browser with Claude Code

This article details the process of porting the Moebius 0.2B image inpainting model to run directly within a web browser. It explores the technical steps involved in optimizing the model for client-side execution and how an AI assistant like Claude Code aided in the conversion process.

simonwillison.net ml

8d

How Netflix Simplified Batch Compute with Kueue

The article explains how Netflix utilized Kueue to streamline its batch compute infrastructure. It covers the architectural considerations and operational patterns employed to simplify the management of large-scale batch workloads.

netflixtechblog.com orchestration

8d

Snowflake Postgres Powers Low-Latency ML Feature Serving

Snowflake's ML team utilized Snowflake Postgres for their Online Feature Store, achieving 2.5 times lower latency and 7 times higher queries per second compared to Databricks Lakebase in production benchmarks. The article details this performance comparison for powering ML feature serving.

snowflake.com snowflake

8d

How we found a bug in the hyper HTTP library

Cloudflare uncovered a bug affecting multiple major versions of the open-source hyper HTTP library while rearchitecting its Images binding. The article explains how the bug was found and its implications.

blog.cloudflare.com engineering

9d

Stop giving your agents database credentials

This article argues that AI agents fail in production due to a lack of structure rather than insufficient autonomy. It implies a need for robust frameworks around agent operations to ensure trust, particularly concerning sensitive resources like database credentials.

blog.crewai.com agents

9d

Adopting AV1 for Real-Time Communication (RTC) at Scale

Meta details its multi-year effort to adopt the AV1 codec for real-time communication at scale. The article covers technical and operational challenges encountered during deployment, including codec selection, device eligibility, rate control, and error resilience, and how these were addressed.

engineering.fb.com engineering

9d

The semantic debt crisis no one is talking about

The article introduces 'semantic debt' as a situation where different teams derive conflicting numbers for the same metric. It argues that the rise of AI will force organizations to address this inconsistency more urgently.

getdbt.com semantic-layer

9d

When RAG Users Ask Vague Questions: Clarify Once, Learn the Default

The article introduces a strategy for enhancing enterprise RAG systems' ability to handle ambiguous user questions. It advises designing RAG agents to ask a single focused clarifying question, learn from the user's response, and subsequently infer defaults for similar future queries.

towardsdatascience.com llm

9d

Can We Agree on a Storage/Workload Architecture Taxonomy?

This article proposes a taxonomy for categorizing modern data storage and workload architectures, addressing the increasing convergence of transactional, analytical, and hybrid systems. It details how systems, workloads, storage tiers, data visibility, and durable copies interact within these evolvi

jack-vanlightly.com architecture

9d

We got local models to triage the OpenClaw repo for FREE!*

This article explores the use of local machine learning models for triaging issues within the OpenClaw repository. It describes the implementation and effectiveness of using these models to automate parts of the repository management process, highlighting potential cost efficiency.

huggingface.co agents

9d

How Spyne simplified their CDC pipeline with ClickPipes and ClickHouse Cloud

Spyne migrated its Change Data Capture pipeline from a self-managed Debezium and Kafka stack to ClickPipes and ClickHouse Cloud. This transition reduced table onboarding time from over 1.5 hours to minutes and successfully eliminated schema drift issues.

clickhouse.com streaming

9d

Unpacking sandbox startup latency: why started ≠ ready

This article provides a detailed technical examination of sandbox startup latency, distinguishing between when a system has started and when it is truly ready for use. It delves into the underlying factors contributing to delays in application readiness. The content offers insights into performance

modal.com mlops

9d

sqlite-utils 4.0rc1 adds migrations and nested transactions

This release candidate for `sqlite-utils` introduces robust support for database migrations, enabling programmatic management of schema changes. It also adds nested transaction capabilities, allowing for more granular control over complex data operations and error handling within SQLite databases.

simonwillison.net data-engineering

9d

sqlite-utils 4.0rc1

This post announces the release candidate for version 4.0 of `sqlite-utils`, a Python CLI and library for SQLite. Key new features include robust database migration capabilities and support for nested transactions, enhancing complex data management and schema evolution.

simonwillison.net data-engineering

9d

Temporary Cloudflare Accounts for AI agents

The article discusses a method for provisioning temporary Cloudflare accounts, designed to provide isolated environments for AI agents. It explores the architectural patterns and security considerations for enabling agents to interact with external services while minimizing risks and managing access

simonwillison.net agents

9d

Tool Calling, Explained: How AI Agents Decide What to Do Next

The article explains the concept of tool calling in AI agents, detailing how Large Language Models (LLMs) determine subsequent actions to interact with the external environment, whether by retrieving data or executing operations.

towardsdatascience.com llm

10d

Reconstructing the Table of Contents a PDF Forgot to Ship, So RAG Can Scope by Section

The article addresses the challenge of processing PDFs whose tables of contents are visible but unstructured, a common problem for RAG systems. It details two methods for reconstructing document structure and emphasizes a crucial, often-missed page-alignment step that enables section-level scoping.

towardsdatascience.com llm

10d

Patterns for Building Cybersecurity Evals

This article outlines effective patterns for constructing cybersecurity evaluations, focusing on key components such as a sandboxed target environment, inputs designed to modulate task difficulty, integrated tool usage, and a robust grading mechanism.

eugeneyan.com ml

10d

7 Crucial Barriers Between Data Teams and Self-Healing Data Architecture

This article examines seven significant obstacles preventing data teams from implementing self-healing data architectures. It discusses how AI technologies can be leveraged to overcome these barriers and make such autonomous data systems a reality.

towardsdatascience.com ml

11d

Making a PDF’s Images Searchable for RAG, Without Paying to Read Them All

This article describes techniques for making images within PDF documents searchable for RAG applications while minimizing processing expenses. It outlines an approach where image locations are identified, and only relevant images are converted into searchable text to control costs.

towardsdatascience.com llm

11d

VMAF v1: Good Is Not Good Enough

The article discusses the ongoing development of VMAF (Video Multi-method Assessment Fusion) at Netflix, explaining why its initial version was deemed insufficient for evolving quality standards. It details the technical challenges and the improvements made to video quality assessment.

netflixtechblog.com engineering

11d

Speculation Is All You Need

This article details the concept and implementation of speculative decoding, a technique used to accelerate large language model inference. It explains how a smaller, faster model can generate draft tokens that a larger, more accurate model then verifies, significantly reducing latency and compute r

modal.com llm

12d

The Thundering Herd Problem in Agentic AI: Why Traditional Fixes Fall Short

This article examines the classic thundering herd problem as it manifests in agentic AI systems, highlighting how traditional mitigation strategies are insufficient. It discusses the unique characteristics of AI agent behavior that exacerbate this issue and proposes new considerations for designing

cockroachlabs.com agents

12d

Datasette Apps: Host custom HTML applications inside Datasette

This article introduces 'Datasette Apps,' a feature that allows users to embed and host custom HTML applications directly within a Datasette instance. It details how this functionality enables richer data exploration interfaces and custom dashboards alongside the core data publishing capabilities.

simonwillison.net embedded-analytics

12d

datasette-acl 0.6a0

This announcement details the 0.6a0 release of `datasette-acl`, a plugin providing Access Control List capabilities for Datasette. It covers new features for fine-grained permission management and policy enforcement, enhancing data security and compliance for published datasets.

simonwillison.net governance

13d

MosaicLeaks: Can your research agent keep a secret?

This article introduces MosaicLeaks, a framework designed to test the security and privacy capabilities of research agents. It investigates the potential for AI agents to inadvertently reveal sensitive information and discusses strategies for ensuring data confidentiality within agentic systems.

huggingface.co agents

13d

Build your own vulnerability harness

Cloudflare details the technical architecture of its multi-stage vulnerability discovery harness and automated triage loop. The post covers state control management, adversarial review to reduce false positives, and methods for routing around LLM context limits.

blog.cloudflare.com engineering

13d

Case Study: How CodeRabbit Leverages LanceDB for AI-Powered Code Reviews

This case study details CodeRabbit's implementation of LanceDB to power its AI-driven code review system. It describes how leveraging LanceDB for context engineering enhances the quality of code reviews.

lancedb.com vector-db

13d

High Performance Distributed Inference with Ray Serve LLM

This article explores methods for deploying high-performance distributed inference systems for large language models. It focuses on leveraging Ray Serve to manage and scale LLM inference across multiple computing resources efficiently.

anyscale.com ml

13d

Lance Blob V2: Late Materialization for Large Binary Data in Spark

This article explains the concept of late materialization within Lance Spark for handling large binary data. It details how this approach maintains lightweight references throughout query plans, only materializing bytes during the write phase.

lancedb.com spark

13d

Beyond LoRA: Can you beat the most popular fine-tuning technique?

This article investigates fine-tuning techniques for large language models beyond the popular LoRA method. It evaluates alternative approaches and their potential to surpass LoRA's performance and efficiency in various fine-tuning scenarios.

huggingface.co llm

13d

Is it agentic enough? Benchmarking open models on your own tooling

This article discusses methodologies for benchmarking the agentic capabilities of open large language models using custom evaluation tooling. It explores criteria for determining whether a model exhibits sufficient agency for specific tasks and provides guidance on developing relevant benchmarks.

huggingface.co agents

13d

Adaptive write request scheduling in Redpanda's Cloud Topics

This article details how Redpanda's Cloud Topics implement adaptive write request scheduling using the buddy allocator algorithm. The system balances batching efficiency with latency and cost considerations to optimize performance.

redpanda.com streaming

13d

Import & Vectorize Data with Weaviate at Scale

The article discusses strategies for importing and vectorizing data at scale with Weaviate. It covers server-side batching, retries, the blobHash data type, and multimodal ingestion, explaining when and how to use each with code examples.

weaviate.io vector-db

13d

Beyond the warehouse: How METRO Markets built a do-it-all data platform on ClickHouse Cloud

METRO Markets replaced a failing Hadoop-based data stack by building a unified data platform on ClickHouse Cloud. This new platform now supports various functions including data warehousing, real-time seller analytics, credit risk modeling, observability, and company-wide AI initiatives.

clickhouse.com clickhouse

13d

Appcues delivers personalized customer engagement with ClickHouse Cloud

Appcues migrated its real-time segmentation platform from Snowflake and Airflow to ClickHouse Cloud to manage 1.31 PB of data. This migration resulted in a 90% reduction in P95 query times, a 99% decrease in ingestion latency, and a 23% cut in overall analytics spending.

clickhouse.com clickhouse

13d

GLM-5.2 is probably the most powerful text-only open weights LLM

This article presents an evaluation of GLM-5.2, asserting its position as a leading text-only open-weights large language model. It likely includes performance benchmarks, architectural highlights, and a comparison against other prominent open-source LLMs in various text-based tasks.

simonwillison.net llm

13d

Payment Fraud Detection: How Banks and Businesses Stop Fraudulent Transactions

The article describes payment fraud detection as a significant data-intensive challenge within financial services. It explores methods employed by banks and businesses to prevent fraudulent transactions.

databricks.com ml

13d

Enterprise Agentic Analytics Explained

Enterprise agentic analytics involves enabling AI agents to perform multi-step analysis across diverse enterprise data sources, including data warehouses, databases, object stores, and SaaS tools, each with distinct access rules.

dremio.com agents

13d

Bringing more agent harnesses and frameworks to Cloudflare, starting with Flue

Cloudflare's Agents SDK is now a runtime any agent framework can build on. The company is opening up the Agents SDK primitives, with Flue introduced as a first framework targeting the SDK. Agents are also rolling out in the dashboard.

blog.cloudflare.com agents

14d

Snowflake and the Agentic Resource Discovery Specification

Snowflake supports the Agentic Resource Discovery (ARD) specification, an open protocol developed with Microsoft. This protocol aims to standardize how AI agents are cataloged, searched, and discovered across enterprises.

snowflake.com agents

14d

Introducing the Cloudflare One stack: agent-powered deployment

The Cloudflare One stack is a library of agent skills designed to equip AI agents with the knowledge needed for planning, deploying, and managing a Zero Trust environment. This approach eliminates the need for migration calls.

blog.cloudflare.com agents

14d

Semantic Memory for Hermes Agent with LanceDB

This article introduces a new LanceDB-backed memory plugin that provides durable, semantic recall across sessions for the Hermes Agent. It includes benchmarks and a hands-on walkthrough demonstrating remember, recall, and forget functionalities.

lancedb.com vector-db

14d

A Metadata Benchmark of Lance, Delta Lake, and Iceberg on S3

This article presents a Rust-based benchmark comparing the metadata performance of Lance, Delta Lake, and Apache Iceberg when used with S3 and S3 Express storage. It explains why Lance is optimized for object storage metadata operations.

lancedb.com iceberg

14d

From the Hugging Face Hub to robot hardware with Strands Agents and LeRobot

The post details deploying ML models from the Hugging Face Hub onto robot hardware. It discusses using Strands Agents and the LeRobot framework to bridge models to real-world robotic applications.

huggingface.co ml

14d

Transaction Processing in the Data Plane

This article details how writing transaction commit logic as a SQL view can achieve higher throughput compared to control-plane approaches. It explains how incremental view maintenance enables fast resolution, suitable for interactive timescales around 30ms.

materialize.com streaming

14d

Announcing DuckDB 1.4.5 LTS (Andium)

The article announces the release of DuckDB 1.4.5 LTS (Andium), differentiating it from the concurrent 1.5.4 (Variegata) release. It provides an overview of the key updates and fixes introduced in this long-term support version.

duckdb.org duckdb

14d

Announcing DuckDB 1.5.4 (Variegata)

This article announces the release of DuckDB 1.5.4 (Variegata), the latest non-LTS stable version. It highlights important updates and features distinguishing it from the 1.4.5 LTS (Andium) release.

duckdb.org duckdb

14d

Agentic Resource Discovery: Let agents search

This article discusses agentic resource discovery, enabling AI agents to autonomously search for and utilize relevant information. It covers how agents can expand their knowledge base and tool use capabilities through intelligent search.

huggingface.co ml

14d

The only scalable delete is DROP TABLE

This article presents a detailed post-mortem analysis and tactical guide on how a digital photo-frame company scaled its Postgres deployment to 226,000 transactions per second. It describes the issues encountered during a holiday peak and the specific strategies implemented to resolve them.

postgresweekly.com postgres

14d

92x faster queries: How Open Electricity uses ClickHouse to track Australia’s energy transition in real time

Open Electricity migrated approximately 1 billion rows of Australian energy data from Postgres and TimescaleDB to ClickHouse, achieving 92x faster queries on a fraction of the hardware.

clickhouse.com clickhouse

14d

Start fresh, don't lift and shift: a dbt migration guide

The article presents a dbt migration guide, cautioning against merely replicating legacy patterns in new tools. It instead advocates for a 'start fresh' approach to ensure migrations deliver better outcomes.

getdbt.com dbt

14d

How dbt makes agentic data pipelines trustworthy: the transformation layer's role in autonomous data systems

The article explores the role of dbt's transformation layer in establishing trustworthiness for agentic data pipelines. It posits that this layer defines data correctness within autonomous data systems, even when AI agents manage pipeline execution.

getdbt.com agents

14d

Context engineering is the new analytics engineering skill: a practical guide for dbt users

The article positions context engineering as an essential skill for analytics engineers, explaining how existing dbt projects can be structured to provide vital context for AI systems. It offers a practical guide for dbt users.

getdbt.com context-engineering

14d

Building the agentic data stack: A practical dbt guide for the AI era

This article presents a practical dbt guide for constructing an agentic data stack in the AI era. It outlines methods for preparing dbt projects to reliably support AI agents and prevent system instability, even as AI accelerates infrastructure development.

getdbt.com agents

14d

The trust-speed paradox: Governing AI-accelerated data work

The article discusses the 'trust-speed paradox' in data teams, noting that while many use AI for code generation, few adequately verify its output. It aims to provide strategies for bridging this gap and improving governance in AI-accelerated data workflows.

getdbt.com governance

15d

Snowflake Postgres Unifies Your Apps, Analytics and AI

Snowflake Postgres now includes data mirroring and pg_lake integration, enabling a native, pipeline-free method to synchronize OLTP and analytical data in near real time. This unifies applications, analytics, and AI capabilities.

snowflake.com postgres

15d

Securing the future of AI agents

Google DeepMind outlines an AI Control Roadmap for securing internal systems that utilize AI agents. This strategy combines established safeguards with real-time monitoring techniques to enhance the security posture of AI deployments.

deepmind.google agents

15d

Writing to an Apache Iceberg Table: How Commits and ACID Actually Work

This article, Part 6 of an Apache Iceberg Masterclass, details the precise steps an engine takes when writing data to an Iceberg table. It covers when a write becomes visible and how concurrent writers are managed, following a discussion on hidden partitioning.

dremio.com iceberg

15d

Agentic Lakehouse: The Architecture Built for AI-Native Analytics

This article proposes the Agentic Lakehouse as an architecture designed specifically for AI agents, differing from traditional lakehouses optimized for human analysts and predictable SQL. It identifies how current lakehouse designs, tuned by DBAs for known query patterns, are insufficient for AI-dri

dremio.com lakehouse

15d

4 ways we’re using our MCP server at Figma

The article explores four practical applications of Figma's MCP server, demonstrating its expanded role across the platform. It details how the server supports processes ranging from updating dynamic content to shipping designs to production.

figma.com agents

15d

Data Processing is Becoming a GPU Workload

This article argues that data processing is increasingly becoming a GPU-centric workload. It examines the underlying trends and technological advancements that are driving this transition in how data is processed.

anyscale.com data-engineering

15d

Predicting model behavior before release by simulating deployment

OpenAI describes Deployment Simulation, a method designed to predict AI model behavior before release. This technique uses real conversation data to enhance safety assessments and improve the accuracy of model evaluations in a pre-production environment.

openai.com mlops

15d

Build Compliant AI Agents With Stateful Stream Processing

This article details architectural patterns for building audit-ready, EU AI Act-compliant agents with stateful stream processing on Apache Kafka and Flink. It outlines 7 states, 4 patterns, and a phased rollout strategy.

confluent.io kafka

15d

Scalable Feature Engineering on Multimodal Datasets

The article describes how LanceDB utilizes the flexible data evolution features of the Lance format to facilitate scalable feature engineering for multimodal datasets.

lancedb.com vector-db

16d

OpenSearch vs LanceDB for Vector Search: Query Cost and Infrastructure

The post benchmarks OpenSearch and LanceDB for vector search, using COCO 2017 images with SigLIP embeddings. It measures ingestion throughput, query cost, storage layout, and overall infrastructure requirements.

lancedb.com vector-db

16d

A Guide to AI Inference Engineering

The article walks readers through how AI inference works and explains why its optimization techniques were developed.

blog.bytebytego.com mlops

16d

The Orchestration Maturity Model: Why Teams Move from Jobs to Assets

The article presents an Orchestration Maturity Model, explaining the evolution of data orchestration systems from focusing on jobs to managing data assets. It describes why enterprises are adopting systems like Dagster to transition to asset-aware data platforms that provide insights into data rathe

dagster.io orchestration

16d

Running local models is good now

The author shares their experience with running local models, noting that performance has significantly improved. They detail testing various models such as Mistral 7B, Gemma 3, OpenAI OSS-20B, and Qwen 3 MoE on an M2 Mac with 64 GB RAM.

vickiboykis.com ml

16d

Introducing neon.ts: infrastructure as code for your Neon projects

Neon has launched `neon.ts`, an infrastructure-as-code file designed for managing Neon projects. This tool enables users to declare Neon services, access type-safe environment variables, and configure branch settings, facilitating the provisioning of backend primitives for applications and agents.

neon.com postgres

16d

Architectural Decision Guide: When to Use Apache Kafka (And When You Shouldn't)

This architectural guide explores when to use Apache Kafka and when to avoid it. It discusses the core differences between event streaming and task queues to help readers design reliable, scalable systems.

confluent.io kafka

19d

Scaling Security Insights: how we achieved a 10x increase in global scanning capacity

Cloudflare's Security Insights system now processes over 120 scans per second, achieving a 10x increase in global scanning capacity. This was accomplished by optimizing Kafka consumers, Postgres queries, and the API without adding hardware.

blog.cloudflare.com kafka

19d

Shipping psql without psql: a pure-TypeScript Postgres client in neonctl

Neon addresses `psql` availability issues across various operating systems and environments by reimplementing the `psql` client entirely in TypeScript. This new client is embedded directly within the `neonctl` command-line interface, ensuring its functionality even when the native `psql` is not inst

neon.com postgres

19d

Achieving Up to 67% Cost Savings with Prefill-Decode Disaggregation Using Ray + vLLM on AMD MI325X

This article details achieving up to 67% cost savings in LLM inference by implementing prefill-decode disaggregation. It describes how this technique is applied using Ray and vLLM on AMD MI325X hardware to optimize computational resources.

anyscale.com llm

19d

Inside FSDP with PyTorch and Ray: Scaling Model Training with Fully Sharded Data Parallel

This article provides an in-depth look at Fully Sharded Data Parallel (FSDP) for scaling machine learning model training. It explains how FSDP is implemented using PyTorch and Ray to efficiently distribute and manage model parameters across multiple devices.

anyscale.com ml

19d

Improving performance in the layers panel

The article describes the re-architecture of Figma's layers panel, implementing new computation and caching strategies. These changes resulted in a 30–50% improvement in interaction speed for large and complex files.

figma.com architecture

20d

The Pulse: a trend of trying to cut back on AI spend within eng departments?

The article explores a recent trend among engineering departments to reduce spending associated with AI initiatives. It examines the potential reasons behind this shift and its implications for technical teams.

blog.pragmaticengineer.com mlops

20d

Must-Know Deployment Strategies: From Big-Bang to Progressive Delivery

This article details major deployment strategies used in production, including their operational mechanisms, associated costs, and appropriate use cases. It covers methods ranging from big-bang to progressive delivery techniques.

blog.bytebytego.com engineering

20d

Text-to-SQL vs Agentic Analytics: What the Upgrade Requires

The article compares Text-to-SQL and Agentic Analytics, noting that leading large language models achieve 60-70% accuracy on complex SQL queries according to the BIRD benchmark. It discusses accuracy differences between simple and multi-join queries and examines the architectural requirements for ev

dremio.com semantic-layer

20d

Stop reading logs: Debugging Ray on Anyscale with Agent Skills

This article introduces Agent Skills as a method for debugging Ray applications on Anyscale without extensive log analysis. It explores how this agent-based approach streamlines the identification and resolution of issues in distributed systems.

anyscale.com agents

20d

Making FlashAttention-4 faster for inference

This post explores methods to accelerate FlashAttention-4 specifically for inference workloads in large language models. It details technical optimizations and algorithmic adjustments aimed at reducing computation time and memory footprint during the attention mechanism calculation, leading to impro

modal.com llm

20d

ASOF JOIN Benchmark: Apache Doris vs ClickHouse and DuckDB

The article presents a benchmark comparing Apache Doris 4.1's ASOF JOIN performance against ClickHouse and DuckDB. Apache Doris 4.1 demonstrates superior performance across all eleven tested scenarios for this specific join type.

doris.apache.org duckdb

20d

Profiling in PyTorch (Part 2): From nn.Linear to a Fused MLP

This is the second part of a series on profiling PyTorch models, focusing on optimizing performance from nn.Linear layers to fused MLPs. It details techniques for identifying and reducing bottlenecks in neural network execution.

huggingface.co ml

20d

Cloud Topics: the Metastore

This article explains the architecture of Redpanda's metastore for Cloud Topics, detailing how it enables features such as offset lookups, complete cluster restores, and cross-region read replicas. It positions the metastore as a foundational component for future system capabilities.

redpanda.com streaming

20d

Stripe Projects adds new agent integrations, more providers, and custom developer controls

Stripe Projects is expanding with new integrations, additional providers, and customizable developer controls to address challenges agents face beyond code writing. This initiative aims to enhance agents' ability to independently write code and integrate with APIs like Stripe's.

stripe.com agents

20d

How Adyen trains a Transaction Foundation Model (TFM) on 51 trillion tokens and other stories on scaling AI with Ray from Xoople, Criteo, and BMW

The article presents case studies from Adyen, Xoople, Criteo, and BMW detailing their approaches to scaling AI workloads using Ray. It highlights Adyen's experience training a Transaction Foundation Model on 51 trillion tokens. The discussion covers practical aspects of deploying and managing large-

anyscale.com ml

20d

Agentic AI Architecture: How CockroachDB Supports Memory, Context, and Control

The article explores architectural patterns for integrating a database, specifically CockroachDB, to manage the memory, context, and control aspects of autonomous AI agents. It details how the database can serve as a persistent store for agent state, conversational history, and operational parameter

cockroachlabs.com agents

20d

DiffusionGemma: 4x faster text generation

Google DeepMind introduces DiffusionGemma, a new model designed for text generation. The article claims this model achieves a 4x speed improvement compared to previous methods, offering significant advancements in text generation efficiency.

deepmind.google llm

21d

Kafka Share Groups and Parallelizing Consumption - Part 3: Client-local parallelism

This article, part of a series on Kafka consumption, explores client-local parallelism as a method for scaling. It differentiates between broker-visible and client-local approaches, detailing how execution mechanisms such as threads, virtual threads, or async tasks enable local parallelism.

jack-vanlightly.com kafka

21d

Encoding Your Domain Expert: The Context Layer Behind Spotify's Data Assistant

This article from Spotify Engineering outlines the development of a "Context Layer" to encode domain expertise, which powers their internal Data Assistant. It discusses how this system helps address complex data problems by providing contextual understanding and streamlining data access for users.

engineering.atspotify.com llm

21d

Semantic Layer vs Data Catalog: What’s the Difference?

This article clarifies the distinction between a semantic layer and a data catalog, two terms often used interchangeably despite serving different purposes. It explains their unique roles in a data architecture and how each handles metadata to improve data understanding for both human and machine co

dremio.com semantic-layer

21d

Postgres 19 Beta 1 is here

This article announces the release of PostgreSQL 19 Beta 1, highlighting key new features in this major version. These include graph query support, enhancements for faster data inserts, the introduction of pg_plan_advice, and capabilities for parallel-worker autovacuuming and online toggling of data

postgresweekly.com postgres

21d

How Torc hit 90% GPU utilization and other stories on scaling AI with Ray from Discord, Cubist, and Coinbase

The article features how companies like Torc, Discord, Cubist, and Coinbase manage and scale their AI systems using Ray. It specifically highlights Torc's achievement of 90% GPU utilization for their ML workloads. The content explores real-world strategies for optimizing resource use in AI deploymen

anyscale.com ml

21d

The Bill Arrives: How to Manage Agentic AI Costs at Scale

This article focuses on strategies for managing the operational costs associated with deploying and scaling agentic AI systems. It discusses factors contributing to cost blowouts, such as token usage multipliers and context window management, offering practical approaches to optimize spending in pro

cockroachlabs.com agents

21d

Scaling beyond one: How Airbnb evolved its data architecture for a multi-product world

This article details how Airbnb evolved its data architecture to support multiple distinct product lines, moving beyond a monolithic approach. It describes the architectural changes, challenges encountered, and solutions implemented to scale data infrastructure effectively.

medium.com data-engineering

22d

What Salesforce Learned from 20,000 Enterprise Agent Deployments

This article presents insights from John Kucera, Salesforce's CPO of Agentforce, on distinguishing successful enterprise agent deployments from those that fail to deliver sustained business value. It draws from the experience of 20,000 agent deployments within Salesforce.

blog.bytebytego.com agents

22d

Introducing Gemma 4 12B: a unified, encoder-free multimodal model

Google DeepMind presents Gemma 4 12B, a new unified multimodal model. A key architectural detail highlighted is its encoder-free design, representing a novel approach in multimodal AI systems.

deepmind.google llm

22d

Hidden Partitioning: How Iceberg Eliminates Accidental Full Table Scans

This article, part of an Apache Iceberg Masterclass, details the hidden partitioning feature within Iceberg. It explains how this capability eliminates the need for users to understand physical data organization and prevents costly accidental full table scans during queries.

dremio.com iceberg

22d

Semantic Layer Governance: Control What AI Agents Access

This article addresses the governance gap arising from AI agents executing hundreds of queries per minute without human review. It discusses how traditional access controls are insufficient and explains how a semantic layer can provide the necessary control for what AI agents can access.

dremio.com semantic-layer

22d

How an Agent Built a 3D Paris Gallery by Chaining Two Hugging Face Spaces

This article demonstrates an AI agent constructing a 3D gallery by chaining together two distinct Hugging Face Spaces. It illustrates the agent's ability to orchestrate multiple tools to achieve a complex creative task.

huggingface.co ml

22d

Defend against frontier cyber models: Cloudflare's architecture as customer zero

Cloudflare details the architecture behind Project Glasswing, emphasizing its importance in defending against vulnerabilities over rapid patching. The article describes the specific threats this architecture addresses and how Cloudflare implements it as its internal "customer zero."

blog.cloudflare.com engineering

22d

How We Moved Discord Voice to the Edge

The article describes Discord's project to migrate its voice and video services onto Cloudflare's edge network. It covers the technical process of placing servers closer to users and reducing ping times across regions, along with specific bugs encountered during the implementation.

discord.com architecture

22d

The four pillars for AI agent governance at scale

This article outlines four essential pillars for effective AI agent governance at scale: identity, authorization, observability, and accountability. It emphasizes the necessity of robust governance infrastructure beyond just improving agent models for enterprise deployment.

redpanda.com agents

22d

Your AI isn't broken. Your data model is.

The article explores why AI proof-of-concepts often fail in production, attributing the gap to issues within the underlying data model rather than the machine learning model itself. It suggests that a robust data model is crucial for successful production AI deployments.

getdbt.com data-modeling

23d

Token Spend Out of Control? The Case for Smarter Routing

This article explores strategies for managing and optimizing LLM token spend in production environments, specifically focusing on smarter routing techniques. It includes insights from the co-founders of Kilo, an open-source coding agent that frequently encounters these challenges.

blog.bytebytego.com llm

23d

A Human-Augmenting Agentic Workflow for Causal Inference

The article describes a new human-augmenting agentic workflow developed at Netflix to enhance causal inference capabilities. It details the architecture and operational patterns that allow AI agents to collaborate with humans on complex analytical problems.

netflixtechblog.com agents

23d

Thinking Fast & Slow for a Personalized Notification System

The article details the architecture of Netflix's personalized notification system, which leverages the 'Thinking Fast & Slow' framework. It describes how both immediate and deliberative processing contribute to tailoring notifications for individual users.

netflixtechblog.com ml

26d

Per-User Identity Mode: New Security Features with Apache Doris and Polaris

Apache Doris 4.1 implements a per-user identity mode, forwarding individual user identities to Apache Polaris, an Iceberg REST Catalog. This design eliminates the need to route all queries through a single shared service account, enhancing security.

doris.apache.org iceberg

26d

Sitar-agent: Building a reliable dynamic configuration sidecar at scale

This article outlines the development of Sitar-agent, a dynamic configuration sidecar designed for high reliability and scalability at Airbnb. It covers the architecture, challenges, and solutions involved in building such a critical infrastructure component.

medium.com engineering

27d

Broker-Visible vs Client-Local Parallelism

Presented as a side-quest in a series about Kafka share groups and parallel consumption, this article focuses on the fundamental differences between broker-visible and client-local parallelism. It examines how various configurations and behaviors specifically influence parallel consumption within sh

jack-vanlightly.com kafka

27d

Multigres v0.1 Alpha: an operating system for Postgres

This article announces the release of Multigres v0.1 alpha to the open source community. Multigres aims to provide Vitess-grade horizontal scaling, high availability, and operational simplicity for Postgres.

supabase.com postgres

27d

What Breaks When Agentic AI Reaches Production?

The article investigates the typical points of failure and production incidents that arise when deploying agentic AI systems in real-world environments. It outlines common hurdles faced by enterprise AI teams beyond initial impressive prototypes, detailing the complexities of moving agents from deve

cockroachlabs.com agents

27d

Lights Out, Systems On: Validating Instant Power Loss Readiness

Meta is introducing Instantaneous PowerLoss Storm, a new testing paradigm within its infrastructure for handling and mitigating instant or zero-notice power loss in data centers. The article shares how Meta built readiness to tolerate instant failures into existing systems with defense-in-depth stra

engineering.fb.com engineering

28d

Coding Is No Longer the Constraint: Scaling Developer Experience to Teams and Agents at Spotify

Spotify's chief architect discussed strategies for enhancing the effectiveness of both engineering teams and AI agents. The presentation covered methods for scaling developer experience within the organization.

engineering.atspotify.com agents

28d

How OpenAI Built Its Data Agent

This article explores OpenAI's approach to building its data agent, emphasizing that the primary challenge in data analysis lies in discovering relevant tables and understanding their semantic usage, rather than SQL authoring. It delves into how they tackle these complex data discovery and interpret

blog.bytebytego.com agents

28d

I Slop Forked Neon. You Should Too.

The article argues that APIs are critical for AI agents, which interact more effectively with programmatic interfaces than graphical dashboards. It emphasizes that platform functionality visible in user consoles should also be accessible via open API endpoints. This approach supports agentic workflo

neon.com agents

28d

20x Faster Training Data Reads with Alluxio and Ray Data: A Cross-Region Benchmark

This article details a benchmark study on optimizing training data reads, achieving a 20x speedup by integrating Alluxio with Ray Data. It focuses on performance improvements for data access patterns, particularly in cross-region scenarios. The content covers the technical aspects of reducing data r

anyscale.com mlops

28d

Dynamic Repartitioning for Time Series Workloads

The article describes Netflix's approach to dynamic repartitioning specifically tailored for time series data workloads. It details the mechanisms and architectural considerations for optimizing data layout and query performance on time-ordered datasets.

netflixtechblog.com data-engineering

28d

Hybrid Modeling for JSON in Agent Observability: VARIANT and Inverted Indexes in Apache Doris

The article explores a hybrid modeling strategy within Apache Doris for managing dynamic, schema-evolving agent observability logs. It leverages the VARIANT data type and native inverted indexes to achieve high performance for this type of data.

doris.apache.org observability

28d

A chat with the creator of Postgres

Postgres 19 will introduce support for SQL/PGQ, enabling users to declare property graphs over existing tables. This allows for pattern matching with Cypher-like syntax within Postgres.

postgresweekly.com postgres

28d

Helping businesses optimize network costs with the Visa Digital Commerce Authentication Program (DCAP)

Stripe explains how it rapidly implemented support for the Visa Digital Commerce Authentication Program (DCAP). The initiative focused on helping businesses achieve interchange savings while maintaining strong authorization rates.

stripe.com engineering

28d

Reproducible Data Curation In The Multimodal Lakehouse

The article describes how LanceDB processes raw multimodal data to create reproducible, training-ready datasets. It covers features such as search, filtering, deduplication, sampling, and versioned curation workflows.

lancedb.com vector-db

29d

When history fails you, borrow from geography

This article presents a novel problem-solving methodology, drawing insights from geographical concepts to address limitations encountered with traditional historical data approaches. It discusses how applying these alternative frameworks led to effective solutions for complex engineering problems.

medium.com engineering

29d

How OmniNode uses Redpanda to scale AI agent workflows

OmniNode's founder discusses the development of their AI agent workflows, detailing how Redpanda is utilized for scaling these operations. The article specifically highlights how data contracts are employed to manage and prevent topic name drift within their streaming infrastructure.

redpanda.com agents

29d

How Brooklyn Data Uses Compass for Self-Service Analytics in Slack

This article describes how Brooklyn Data implements Compass to facilitate self-service analytics within Slack, aiming to decrease response times for data queries, enhance data discoverability, and establish governed access to operational and financial data across the organization.

dagster.io orchestration

30d

Reinforcement learning is an infrastructure problem

The article argues that scaling reinforcement learning applications in production primarily presents an infrastructure challenge, not just an algorithmic one. It explores the system design considerations and engineering requirements needed to effectively train and deploy RL agents at scale, covering

modal.com ml

30d

Real-time streaming for the agentic era with NVIDIA

NVIDIA Vera is launching with Redpanda as part of the ecosystem, aiming to deliver 5.5x lower latencies for agents running in mission-critical environments.

redpanda.com streaming

30d

Show HN: Streambed – Stream Postgres to Iceberg on S3, Supports Postgres Wire

Streambed is a tool that streams Postgres data to Iceberg on S3 and supports the Postgres wire protocol.

github.com postgres

31d

Show HN: 1B rows (14B cells) in the browser with DuckDB and Glide Data Grid

The author presents a demo of rendering 1 billion rows (14 billion cells) in the browser using DuckDB and Glide Data Grid.

analytics-grid.com duckdb

31d

Embeddings Aren’t Magic: The Predictable Failure Modes of RAG Retrieval

This article discusses the limitations of embeddings in RAG systems, noting that while they handle synonyms and paraphrasing well, they can fail on negations, exact identifiers, and company-specific acronyms. It suggests alternative approaches for when these failures occur.

towardsdatascience.com llm

32d

RAG Is Burning Money — I Built a Cost Control Layer to Fix It

The article presents a cost control layer for RAG systems that combines semantic caching, query routing, token budgeting, and circuit breaking. The approach reportedly achieves an 85% reduction in LLM costs without significantly impacting answer quality.

towardsdatascience.com llm

33d

Apache Iceberg Partition Evolution: Change Your Partitioning Strategy Without Rewriting Data

A 10 TB events table partitioned by month was the right call two years ago. Now your data volume has grown tenfold, your team runs daily SLA dashboards, and every query that touches "last 7 days" is scanning an entire month's worth of files. In a traditional Hive-style warehouse, fixing this means a

dremio.com iceberg

33d

Unifying the AV ML Stack: From Raw Data to Trained Model with LanceDB

The article provides a complete walkthrough for constructing an autonomous vehicle perception model training pipeline. It demonstrates building this pipeline from raw data to a trained model using LanceDB and the Multimodal Lakehouse.

lancedb.com mlops

33d

Make your SQL Workflows Multimodal With LanceDB × DuckDB

The article presents a hands-on walkthrough on integrating LanceDB and DuckDB to facilitate multimodal data querying using SQL. It covers joining data across multiple tables and materializing results back into LanceDB.

lancedb.com duckdb

33d

New DuckDB-Iceberg Features in v1.5.3

This blog post demonstrates the new features available in DuckDB v1.5.3 for the DuckDB-Iceberg extension, even while the team focuses on DuckLake v1.0 and Quack.

duckdb.org duckdb

33d

The Infrastructure Behind Making Local LLM Agents Actually Useful

The article discusses the infrastructure required to build a fast and reliable scientific agent using local open-weight models, vLLM, and long-context infrastructure.

towardsdatascience.com llm

34d

How we built Cloudflare's data platform and an AI agent on top of it

Here’s how we built Town Lake, Cloudflare's unified analytics platform, alongside Skipper, an internal AI agent running on top of it.

blog.cloudflare.com engineering

34d

Agentic Lakehouse vs Data Lakehouse: What Actually Changes

The traditional data lakehouse was designed for human analysts. Every architectural decision, from how performance is tuned to how business context is stored, assumed that a person would be sitting at the end of the pipeline, writing queries, interpreting results, and carrying those results into dec

dremio.com lakehouse

34d

Apache Polaris 1.5.0: Deep-Dive Into the Future of Open Data Catalogs

Catalog governance is the biggest bottleneck in building a multi-engine lakehouse. When you query the same Apache Iceberg tables with Spark, Flink, and Dremio, synchronizing permissions and access credentials across different engines is traditionally a manual, error-prone chore. Apache Polaris solve

dremio.com lakehouse

34d

How We Built Production Vector Search in Apache Doris

Apache Doris 4.1 integrates native Approximate Nearest Neighbor (ANN) vector indexes, including IVF and IVF_ON_DISK, directly into its OLAP engine. This integration achieves 900 queries per second at 97% recall when benchmarked using VectorDBBench.

doris.apache.org vector-db

34d

Solo founding is at an all-time high: Top performers have these traits in common

In 2025, solo founders in the top decile generated 61 times the revenue of the median solo founder in their first six months. We analyzed the data to understand what drives that gap.

stripe.com engineering

34d

Agentic Lakehouse Architecture: The Four Technical Layers

Choosing the right concept is only half the job. Plenty of teams have adopted the lakehouse model, picked open formats, and still built systems that fail when AI agents start querying them at scale. The Agentic Lakehouse architecture solves a specific problem: how do you structure a data platform so

dremio.com lakehouse

35d

Kafka Share Groups and Parallelizing Consumption - Part 2: Producer Batches and share.acquire.mode

The post explains how to set an appropriate value for max.poll.records when using Kafka share groups, revealing the relationship between group.share.partition.max.record.locks and the number of consumers per partition.

jack-vanlightly.com kafka

35d

Using LLMs to Secure Source Code

This article describes a systematic approach to leveraging LLMs for securing source code. The process involves constructing a threat model, identifying potential vulnerabilities, verifying findings, triaging discovered issues, and applying necessary patches.

eugeneyan.com llm

35d

Redpanda SQL is GA: the query engine that skips the pipeline

Redpanda SQL is generally available, allowing direct SQL queries against live Kafka topics and Iceberg history without requiring a separate ETL pipeline.

redpanda.com streaming

35d

How a Leading Fintech Cuts Weekly Compliance Reporting from 2 Days to 2 Hours

A fintech company automated its compliance reporting using agentic AI to extract data from multiple sources and synthesize reports, reducing weekly reporting time from 2 days to 2 hours.

blog.crewai.com agents

35d

How We Scaled Change Management to 30+ Environments Without a DevOps Team - WarpStream

WarpStream details their approach to safe shipping across 24 control plane regions, discussing deploy trains, shadow services, zonal Kubernetes clusters, and async reconciliation loops.

warpstream.com streaming

36d

How a Mid-Tier Enterprise SaaS Provider Automates Cloud Support Triage

A mid-tier SaaS provider automates cloud support triage using a 5-agent workflow, improving ticket validation, routing, and SLA compliance.

blog.crewai.com agents

36d

SilverTorch: Index as Model — A New Retrieval Paradigm for Recommendation Systems

We’re introducing SilverTorch, a reimagining of recommendation systems that unifies all retrieval components for user generated content under a unified architecture. SilverTorch shows up to 23.7x higher throughput compared to the state-of-the-art approaches. It’s also showing 20.9x more compute cos

engineering.fb.com engineering

36d

Performance and Apache Iceberg’s Metadata

This is Part 3 of a 15-part Apache Iceberg Masterclass. Part 2 covered the metadata structures of all five table formats. This article focuses on exactly how query engines use Iceberg's metadata to avoid reading data they don't need. The single biggest performance advantage of Iceberg over raw data

dremio.com lakehouse

36d

Kafka Share Groups and Parallelizing Consumption — Part 1: Tuning max.poll.records

The post examines the impact of tuning max.poll.records in Kafka share groups, using Kafka 4.2.0 and Dimster for testing.

jack-vanlightly.com kafka

37d

EP216: RAGs vs Agents

The article contrasts Retrieval-Augmented Generation (RAG) and agents as solutions for accessing company data with LLMs, highlighting their distinct problem-solving approaches.

blog.bytebytego.com llm

39d

Apache Iceberg V2 vs V3: What Changed and What It Means for Your Tables

Apache Iceberg is not a static format. The spec version number stamped into every table's metadata controls which features that table can use, which engines can read it, and how efficiently row-level changes are handled. The jump from Apache Iceberg V2 to V3 introduces deletion vectors.

dremio.com lakehouse

39d

Apache Iceberg Machine Learning: Solving Data Versioning for AI

Models can lose accuracy after retraining, and reproducing the exact training dataset from months ago can be difficult due to data lake changes. Apache Iceberg solves this by providing data versioning capabilities, allowing you to track and reproduce specific datasets used for training.

dremio.com lakehouse

39d

Benchmarking Apache Kafka Consumer Groups vs Share Groups (overhead test)

The post presents a benchmark comparing the overhead of Kafka Consumer Groups and Share Groups using Dimster, a performance benchmarking tool for Kafka.

jack-vanlightly.com kafka

40d

Finding Bugs using LLMs

This article describes Materialize's success since February 2026 in using LLM-based coding agents, primarily Anthropic’s Opus 4.6 and 4.7, to find bugs in existing code and open pull requests. It also covers system considerations for implementing such an approach.

materialize.com llm

40d

Making User-Sequence Data More Cost-Efficient, Faster, and Easier to Use

This post describes how Pinterest engineers optimized their systems to handle user-sequence data more cost-efficiently, faster, and with improved usability. It likely covers specific architectural changes and engineering techniques implemented to achieve these gains.

medium.com data-engineering

41d

Reimagining ML Operations with Agent Skills: a new maturity model for on-call

This article explores a re-imagined approach to ML operations through the lens of "Agent Skills," proposing a new maturity model for on-call responsibilities. It discusses how AI agents can potentially transform MLOps practices. The content delves into conceptual frameworks for improving operational

anyscale.com agents

41d

Test-Driving the Lance Lakehouse Format in DuckDB

DuckDB users can now query Lance datasets using SQL through the CLI or SDKs, enabling AI and retrieval workload capabilities; this post highlights Lance as a good option for vector storage and querying.

duckdb.org duckdb

41d

Build a Coding Assistant with Weaviate MCP: RAG over Code & Docs

Use Weaviate's built-in MCP server to give Claude Code, Cursor, and VS Code hybrid search over your codebase and docs. No glue code.

weaviate.io vector-db

41d

Training SID-1 to beat GPT-5 at search with 1k+ QPS RL

SID-1 is an agentic search model that is 24x faster than GPT-5.1-high, 374x cheaper than Sonnet 4.5, and achieves 1.9x higher recall than traditional RAG pipelines. The article explains how it was trained using large-scale RL on turbopuffer.

turbopuffer.com agents

42d

DuckDB 1.5.3: Not an Ordinary Patch Release

DuckDB v1.5.3, while a patch release, includes several important new features; the complete release notes are available on GitHub, with installation instructions provided.

duckdb.org duckdb

42d

Scaling Airbnb’s identity graph with a unified knowledge graph infrastructure

This article describes how Airbnb built and scaled a unified knowledge graph infrastructure to manage its complex identity graph. It covers the architectural patterns, challenges, and solutions for entity resolution and connecting disparate identity information at scale.

medium.com knowledge-graphs

43d

Cloud Topics: Level Zero garbage collection

The post details how Redpanda Cloud Topics manages the lifecycle of temporary L0 objects and safely deletes them without data loss or excessive storage costs.

redpanda.com streaming

43d

AI-assisted analytics engineering: Docusign’s framework for scaling dbt unit testing

Docusign reports reducing dbt unit test authoring time from 5 hours to 30 minutes using a structured AI-assisted framework.

getdbt.com dbt

44d

How ChatFeatured migrated from PlanetScale Postgres to Postgres Managed by ClickHouse to power AI brand discovery

How ChatFeatured cut analytics query times from 2.5 minutes to under a second by migrating from PlanetScale Postgres to Postgres managed by ClickHouse — in just 30 minutes.

clickhouse.com clickhouse

44d

Relational Database Data Lineage Ontology

The paper proposes a novel ontology for relational database data lineage to address the challenges of modeling lineage, especially with incomplete or missing dependencies between database objects.

arxiv.org databases

44d

Towards Foundation Models for Relational Databases with Language Models and Graph Neural Networks

The paper explores the potential of language models and graph neural networks to serve as foundation models for relational databases, aiming to enhance deep learning applications on relational data.

arxiv.org databases

44d

Gradient-Based Join Ordering

The paper presents a gradient-based approach for join ordering, which is a computationally complex problem that critically impacts query execution performance in databases.

arxiv.org databases

44d

Every Voice and Video Call on Discord Is Now End-to-End Encrypted

The article announces that end-to-end encryption is now enforced for all voice and video calls on Discord, effective March 2026. Discord's VP of Engineering discusses the importance of this multi-year commitment.

discord.com architecture

44d

Designing Sovereignty in Real-Time Data Streaming

Digital sovereignty in streaming demands architectural guarantees, not policy promises. The post discusses how BYOC, schema controls, and open protocols satisfy global regulations.

confluent.io kafka

47d

How the D. E. Shaw group powers high-cardinality observability at scale with ClickHouse

The D. E. Shaw group replaced its previous observability platform with ClickHouse to handle high-cardinality metrics at scale, achieving 7x better query performance and enabling multi-year capacity planning across millions of compute workloads.

clickhouse.com clickhouse

47d

The Chunking and Embedding Cookbook for Production Context Engineering

This guide covers three critical decisions for production RAG systems: chunk shaping, embedding selection, and ANN index scaling, bridging the gap between demo retrieval and real-scale deployments.

doris.apache.org llm

47d

Event-Native Governance: An Architectural Guide to Secure, Compliant, and Reliable Streaming Systems

Learn how to design governance directly into event streaming systems. The post covers schemas, lineage, security, retention, and compliance patterns for reliable data platforms.

confluent.io kafka

48d

Postgres FDW: Pushdown is a negotiation

A deep dive into how pg_clickhouse's Foreign Data Wrapper decides what SQL to push down to ClickHouse versus execute locally in Postgres.

clickhouse.com clickhouse

48d

The Next AI Bottleneck Isn’t the Model: It’s the Inference System

The article argues that enterprise AI systems are entering a phase where inference design matters as much as model capability itself.

towardsdatascience.com ml

48d

ClickHouse vs Prometheus for High Cardinality, Part 1: Understanding the Problem

Why does high cardinality break Prometheus but not ClickHouse? In Part 1, we explore the architectural tradeoffs of Prometheus and other series-based systems, showing how cardinality impacts memory, ingestion, querying, and operational stability at scale.

clickhouse.com clickhouse

48d

The Counterintuitive Networking Decisions Behind OpenAI’s 131,000-GPU Training Fabric

The post presents a critical analysis of MRC's three counterintuitive design decisions behind OpenAI's 131,000-GPU training fabric, including the networking mathematics that make them work and their implications for the AI infrastructure community.

towardsdatascience.com ml

48d

Our billing pipeline was suddenly slow. The culprit was a hidden bottleneck in ClickHouse

Cloudflare's billing pipeline slowed down due to lock contention in ClickHouse's query planner after a partitioning change. The post details how they identified and fixed the bottleneck.

blog.cloudflare.com engineering

48d

Architecting Data Pipelines for Multimodal Datasets at Scale

This article explores the architectural challenges and solutions for designing data pipelines that can process multimodal datasets efficiently at scale. It delves into strategies for managing diverse data types and large volumes within machine learning workflows. The content covers system design pri

anyscale.com mlops

48d

Viaduct 1.0 and the future of Airbnb’s data mesh

This article introduces Viaduct 1.0 and outlines Airbnb's vision for its data mesh architecture. It details the principles, components, and future direction of their decentralized data management approach, highlighting how Viaduct serves as a key enabler.

medium.com data-engineering

49d

ClickStack SQL Charting and Alerting

Learn how ClickStack’s new SQL-powered charting and alerting unlock anomaly detection, rolling baselines, and advanced observability workflows directly on top of ClickHouse, without relying on external tooling.

clickhouse.com clickhouse

49d

The Metadata Structure of Modern Table Formats

This article breaks down exactly how each format organizes its metadata, which determines how fast queries start planning and how efficiently concurrent writes occur.

dremio.com lakehouse

49d

High Performance Rate Limiting at Databricks

The article explores Databricks' implementation of rate limiting at scale, focusing on shrinking the critical path and on the accuracy tradeoffs that this requires.

blog.bytebytego.com architecture

49d

An Engineer’s Guide to Better AI Skills: Implementing a Testing Process to Optimize Agent…

This article presents an engineer's guide from Pinterest on establishing a testing process to optimize AI agent performance and reliability. It likely details methodologies and best practices for evaluating and improving agent skills.

medium.com agents

50d

Migrating Data Ingestion Systems at Meta Scale

Meta's engineering teams revamped their data ingestion system to enhance reliability at scale, migrating from a legacy system to a new architecture.

engineering.fb.com engineering

50d

ClickHouse Cloud: Fast, Updatable Lookups with the Join Table Engine

Learn how ClickHouse Cloud's Join table engine enables fast, updatable in-memory lookups for dimensional modeling — with automatic upserts, deduplication, and data compaction powered by ReplacingMergeTree under the hood.

clickhouse.com clickhouse

50d

How Figma Upgraded Data Pipeline from Multi-Day Latency to Real-Time

The article describes how Figma's engineering team evolved their data pipeline to handle growth, reducing latency from multiple days to real-time.

blog.bytebytego.com architecture

50d

When "idle" isn't idle: how a Linux kernel optimization became a QUIC bug

Cloudflare investigated a performance issue caused by CUBIC's congestion window getting stuck at its minimum, identifying the root cause as incorrect measurement of idle periods. The fix involved accurately distinguishing RTT wait times from application idleness.

blog.cloudflare.com engineering

50d

Cutting inference cold starts by 40x with LP, FUSE, C/R, and cuda-checkpoint

The post discusses techniques to cut inference cold starts by 40x using LP, FUSE, C/R, and cuda-checkpoint.

modal.com ml

50d

ClickHouse Release 26.4

ClickHouse 26.4 is here! In this release, more features become SQL compatible, COUNT DISTINCT gets faster, EXPLAIN gets even prettier, and more.

clickhouse.com clickhouse

50d

Quack: The DuckDB Client-Server Protocol

This post introduces Quack, the new client-server protocol for DuckDB. It explains the motivation for a client-server architecture and outlines the design considerations for the Quack protocol, including security, efficiency, and extensibility.

duckdb.org duckdb

50d

Data Projects: Managing Data Assets at Netflix Scale

The title suggests a discussion on managing data assets at Netflix's scale.

netflixtechblog.medium.com engineering

50d

Powering self-driving vehicle analytics at Avride with ClickHouse Cloud

Avride replaced Apache Iceberg with ClickHouse Cloud, cutting index lookup latency from 20 seconds to under 100ms and ingestion from hours to seconds.

clickhouse.com clickhouse

50d

How Pinterest Built a Production MCP Ecosystem

The article focuses on the design and implementation of Pinterest's MCP ecosystem, outlining the key elements required for its successful operation.

blog.bytebytego.com architecture

51d

From Data Silos to Context Silos: What Database History Teaches Us About the AI Infrastructure Problem

The database industry is repeating a historical cycle where specialized systems create fragmentation that demands convergence. As AI agents become primary data consumers, organizations face a new challenge: context silos, where information exists but cannot be retrieved fast enough for autonomous sy

doris.apache.org agents

53d

Enhancing Ad Relevance: Integrating Real-Time Context into Sequential Recommender Models

This post details Pinterest's approach to enhancing ad relevance by integrating real-time contextual information into their sequential recommender models. It explains the methods and architectural considerations involved in processing and utilizing real-time data for these systems.

medium.com ml

54d

Scaling ArchUnit with Nebula ArchRules

The title suggests a discussion on scaling ArchUnit using Nebula ArchRules.

netflixtechblog.com engineering

54d

How Discord Automates ScyllaDB Clusters at Scale

The article describes Discord's approach to automating the setup and management of ScyllaDB clusters at scale. It explains the challenges faced when configuring and operating dozens of database nodes and the solutions implemented to streamline this process, significantly reducing deployment time.

discord.com architecture

54d

Apache Doris 4.1 Spill to Disk: Running Memory-Intensive Queries Without OOM

Apache Doris 4.1 introduces mature spill-to-disk capabilities, enabling Hash Join, Aggregation, and Sort operators to write intermediate state to disk when memory pressure rises so that memory-intensive analytical queries complete without OOM errors.

doris.apache.org analytics

54d

Announcing the Program of DuckCon #7 Amsterdam

The program for DuckCon #7 Amsterdam, a DuckDB user conference, has been announced. The event will be held on June 24, 2026, and will run from 15:00 to 20:00 CEST.

duckdb.org duckdb

54d

What Are Table Formats and Why Were They Needed?

This is Part 1 of a 15-part Apache Iceberg Masterclass. This article covers the fundamental question: what problem do table formats solve, and why does the choice between them matter? A data lake without a table format is a collection of files. It has no concept of a transaction, no mechanism to pre

dremio.com iceberg

55d

Container Design Patterns for Distributed Systems

This article presents container design patterns categorized by their coordination scope, providing a structured overview of common practices for distributed systems.

blog.bytebytego.com architecture

55d

Most agent reliability problems are data engineering problems

The article posits that many agent reliability issues stem from underlying data engineering problems.

sderosiaux.substack.com data-quality

55d

Using ClickHouse as a Kafka sink? Async inserts change the equation

The post discusses using ClickHouse as a Kafka sink, focusing on how async insert mode helps with high message rates but has buffering and dedupe behaviors that aren't always obvious.

reddit.com kafka

55d

Parloa builds service agents customers want to talk to

Parloa leverages OpenAI models to power scalable, voice-driven AI customer service agents, enabling enterprises to design, simulate, and deploy reliable, real-time interactions.

openai.com llm

55d

AI agents on Ray Serve: Single to multi-agent architecture

This Anyscale blog post discusses the architecture of AI agents on Ray Serve, covering both single-agent and multi-agent architectures. It explores how to build and deploy AI agents using Ray.

anyscale.com agents

55d

Everyone gets faster writes: Turning off FPW on Neon

Neon's decoupled storage and compute architecture lets it disable full-page writes, delivering up to a 5x performance increase on write-heavy workloads.

neon.com postgres

55d

Delta Grows Up: Writes, Unity Catalog and Time Travel

DuckDB's Delta Lake and Unity Catalog extensions are no longer experimental. The post details the progress of these extensions.

duckdb.org duckdb

55d

Iceberg Default Column Values: Schema Evolution Without the Backfill

Adding a column to a large production table used to require a plan involving migration scripts, maintenance windows, and backfill jobs that rewrite every data file to include the new column. Iceberg default column values eliminate the need for backfills during schema evolution.

dremio.com lakehouse

55d

vLLM V0 to V1: Correctness Before Corrections in RL

The post details improvements to vLLM, focusing on correctness before corrections in reinforcement learning.

huggingface.co ml

56d

When DNSSEC goes wrong: how we responded to the .de TLD outage

On May 5, 2026, DENIC published broken DNSSEC signatures for the .de TLD, making millions of domains unreachable. Here's what 1.1.1.1 saw, how serve stale cushioned the impact, and how we restored resolution.

blog.cloudflare.com engineering

56d

Stop guessing in production: Full fidelity tracing at scale with ClickHouse and Odigos

How ClickStack and Odigos eliminate observability gaps with zero-code eBPF instrumentation and full-fidelity distributed tracing at scale.

clickhouse.com clickhouse

56d

DuckLake in Action: Manage Lakehouses, Run SQL & Build Notebooks with Rosetta DBT Studio

This post links to a video demonstrating the DuckLake workflow using Rosetta DBT Studio. The presentation covers creating and importing lakehouse instances, exploring metadata, running SQL queries, and building reusable SQL Notebooks.

reddit.com duckdb

56d

Agentic analytics starts with query-ready data: the write-side cost of Snowflake vs. ClickHouse

Agentic analytics makes query-readiness a write-side cost problem. This post compares Snowflake and ClickHouse under continuous ingest, showing how ClickHouse obtains query-ready data at 22× lower cost and delivers 31× better write-side cost-performance.

clickhouse.com clickhouse

56d

Singular Bank helps bankers move fast with ChatGPT and Codex

Singular Bank built Singularity, an internal assistant using ChatGPT and Codex to help bankers save 60–90 minutes daily on meeting prep, portfolio analysis, and follow-up.

openai.com llm

56d

Uber uses OpenAI to help people earn smarter and book faster

Uber uses OpenAI to power AI assistants and voice features that help drivers earn smarter and riders book faster across a global real-time marketplace.

openai.com llm

56d

Our AI started a cafe in Stockholm

Simon Willison describes how he used AI agents to launch and run a cafe in Stockholm, detailing the architecture and lessons learned.

simonwillison.net llm

56d

Integrating AI Into Apache Kafka Architectures: Patterns and Best Practices

The article compares three patterns for integrating AI inference with Apache Kafka: external RPC, embedded, and sidecar, focusing on avoiding consumer rebalances, cutting costs, and scaling LLM pipelines.

confluent.io kafka

57d

Monitoring reliably at scale

This article explores the challenges and solutions for establishing reliable monitoring systems in a large-scale production environment. It details architectural considerations and best practices for ensuring consistent and accurate observability data.

medium.com observability

57d

Jikkou 1.0 is out — Iceberg, multi-cluster orchestration, and Confluent Cloud RBAC

Jikkou 1.0 is out, featuring Apache Iceberg integration for declarative management of namespaces, tables, and views. It also includes multi-cluster orchestration and Confluent Cloud RBAC.

reddit.com kafka

57d

Benchmark demonstrates 5-37x improved performance for query on Iceberg tables

StarTree claims a 5-37x performance improvement for queries on Iceberg tables compared to Trino and ClickHouse, based on their benchmark.

startree.ai iceberg

57d

RAG Hallucinates — I Built a Self-Healing Layer That Fixes It in Real Time

This article presents a self-healing layer designed to detect and correct hallucinations in RAG systems before they reach users by addressing issues in reasoning.

towardsdatascience.com llm

57d

Comparing ClickHouse versions with clickhousectl

We use clickhousectl to spin up multiple ClickHouse versions side by side and benchmark two recent performance improvements.

clickhouse.com clickhouse

57d

Unlocking large scale AI training networks with MRC (Multipath Reliable Connection)

OpenAI introduces MRC (Multipath Reliable Connection), a new supercomputer networking protocol released via OCP to improve resilience and performance in large-scale AI training clusters.

openai.com llm

57d

Realtime or Pipelines? How to choose the right tool

This article compares 'Realtime' and 'Pipelines' data processing approaches, both leveraging Postgres logical replication. It explains how these two methods address different problems and guides users on selecting the right solution for their use case.

supabase.com postgres

57d

Little's Law in practice with Cloud Topics

From spinning disks to CPUs to cloud object storage, shifting bottlenecks have shaped Redpanda's architecture. Here’s what Cloud Topics revealed about today’s demand for high-latency storage.

redpanda.com streaming

57d

OpenAI and PwC collaborate to reimagine the office of the CFO

OpenAI and PwC are collaborating to help businesses automate finance workflows using AI agents, with the aim of improving forecasting, strengthening controls, and modernizing the CFO function.

openai.com llm

57d

PGKeeper: Figma's Postgres connection pooler Renaissance era

Figma details the architecture and implementation of PGKeeper, their custom Postgres connection pooler, explaining why existing solutions like PgBouncer didn't meet their needs.

figma.com postgres

58d

Goodbye limitations, hello data: How Qonto is rethinking observability with ClickHouse Cloud

How Qonto uses ClickHouse Cloud to power observability at scale — replacing sampling and hour-capped queries with two-week query windows, 99.84% compression on high-cardinality data, and an AI incident companion built on the ClickHouse MCP server.

clickhouse.com clickhouse

58d

The DuckLake Spec is so Simple, Even a Clanker Can Build One for Dataframes

This blog post discusses the simplicity of the DuckLake specification for dataframes.

duckdb.org duckdb

58d

How OpenAI delivers low-latency voice AI at scale

OpenAI describes how it rebuilt its WebRTC stack to power real-time Voice AI with low latency, global scale, and seamless conversational turn-taking.

openai.com llm

58d

Icestream – enabling efficient streaming writes in Apache Iceberg

Icestream is a project enabling efficient streaming writes in Apache Iceberg.

github.com iceberg

58d

I'm working on a conversational analytics agent builder with dedicated DuckDB support

A developer is building a no-code agent builder that uses DuckDB to create conversational analytics agents. These agents can respond to queries with interactive charts and UI, allowing users to query databases more easily.

reddit.com duckdb

59d

How to Work and Compound with AI

This post proposes a framework for leveraging AI, emphasizing context as infrastructure, taste as configuration, verification for autonomy, scaling through delegation, and closing feedback loops for continuous improvement.

eugeneyan.com ml

59d

[Extension] ducksmiles — community extension for chemistry data (SMILES / InChI / PDB)

A user shares a community extension for DuckDB called `ducksmiles` that enables processing chemistry data, including SMILES, InChI, and PDB formats, directly within DuckDB. The extension supports functions like `mol_formula` and `mol_weight` and integrates with `read_csv_auto`, `read_text`, and `htt

reddit.com duckdb

60d

Code Orange: Fail Small is complete. The result is a stronger Cloudflare network

Cloudflare completed an engineering effort to make its infrastructure more resilient using tools like Snapstone and the Engineering Codex. They implemented safer configuration changes and automated best practices to prevent future incidents.

blog.cloudflare.com engineering

60d

Optimizing ML Workload Network Efficiency (Part I): Feature Trimmer

This post from Pinterest Engineering focuses on optimizing network efficiency for machine learning workloads, presenting the first part of their strategy. It introduces and explains the 'Feature Trimmer,' a component designed to reduce network overhead in ML systems.

medium.com mlops

61d

How we accelerated transpilation by compiling SQLGlot with mypyc

Fivetran sped up SQLGlot by compiling it with mypyc, resulting in faster transpilation between SQL dialects for query engines.

fivetran.com data-engineering

61d

Reliable Answers for Recurring Questions: Boosting Text-to-SQL Accuracy with Template Constrained Decoding

This arXiv paper introduces a method to improve Text-to-SQL accuracy by using template-constrained decoding, particularly for recurring questions. It addresses challenges in real-world deployment of Text-to-SQL models, especially in complex or unseen schemas.

arxiv.org semantic-layer

61d

DuckDB infers NULL-only columns as JSON type — here's how we fixed it with a canonical sample

DuckDB infers NULL-only columns as the generic JSON type, causing staging issues when real values appear later. The solution involves using a synthetic canonical sample to ensure correct type inference from the outset.

reddit.com duckdb

62d

Why AI Engineers Are Moving Beyond LangChain to Native Agent Architectures

This post discusses the shift of AI engineers from using frameworks like LangChain to building native agent architectures for production LLM applications.

towardsdatascience.com agents

62d

How LanceDB Accelerates Vector Search at 10 Billion Scale

The article explains how LanceDB scales vector search to 10 billion vectors and beyond. It covers the application of distributed indexing, distributed query execution, HNSW centroid routing, and fast RaBitQ rotation.

lancedb.com vector-db

62d

Where the goblins came from

The post discusses the timeline, root cause, and fixes behind "goblin outputs," which are personality-driven quirks in GPT-5 behavior.

openai.com llm

62d

A DuckDB extension for vector search indexes with pluggable quantization

This post links to a DuckDB extension for vector search indexes with pluggable quantization.

github.com duckdb

62d

Giving agents the ability to pay

Stripe introduces Link’s wallet for agents, offering programmatic access to generate one-time-use cards or Shared Payment Tokens, built on Stripe’s new Issuing for agents.

stripe.com engineering

63d

I turned recurring Kafka production failures into a practical troubleshooting guide

A user shares a practical troubleshooting guide derived from recurring Kafka production failures, covering issues like consumer lag, producers writing without consumers reading, and duplicate processing after restart due to offset commit problems.

reddit.com kafka

63d

Skipper: Building Airbnb’s embedded workflow engine

This article details the development of Skipper, Airbnb's custom-built embedded workflow engine. It covers the architectural decisions, design principles, and operational experiences involved in creating a specialized orchestration solution for internal use cases.

medium.com orchestration

64d

Lance Blob V2: Making Multimodal Data a First-Class Citizen in the Lakehouse

The article explains the redesign of blob storage within the Lance format to elevate multimodal data to a first-class citizen. It outlines four storage semantics (Inline, Packed, Dedicated, External) that automatically adjust to different workloads.

lancedb.com data-engineering

64d

How Stripe Detects Fraudulent Transactions Within 100 ms

This article explores Stripe's Radar system for detecting fraudulent transactions within 100 ms, detailing the architectural decisions behind its effectiveness.

blog.bytebytego.com architecture

64d

Building A Storage Format For The Next Era of Biology

The article explores how Lance can serve as a foundation for AI systems utilizing single-cell genomics atlases, paving the way for a new generation of biological modeling. It discusses the technical aspects of a storage format designed for these applications.

lancedb.com vector-db

64d

From Clicks to Conversions: Architecting Shopping Conversion Candidate Generation at Pinterest

This post describes the architecture Pinterest implemented for its shopping conversion candidate generation process, tracing the journey from user clicks to identifying potential conversions. It outlines the system design and components involved in this complex data pipeline.

medium.com ml

65d

Iceberg Deletion Vectors: The Better Way to Delete Rows

The post discusses how Iceberg deletion vectors offer a more efficient way to handle row deletions in data lakehouses, where deleting rows can be an expensive operation due to the immutable nature of Parquet files.

dremio.com lakehouse

65d

Mixing numeric attributes into text search for better first-stage relevance

turbopuffer now allows combining attribute values into the scoring function of text queries. Ranking by attribute helps achieve better relevance in the first stage with the same scalability characteristics as BM25.

turbopuffer.com vector-db

65d

Choco automates food distribution with AI agents

Choco used OpenAI APIs to streamline food distribution, boost productivity, and unlock growth, providing a customer story on the real-world impact of AI.

openai.com llm

65d

The Journey from Scattered Data to an Apache Iceberg Lakehouse with Governed Agentic Analytics

The article outlines a strategy for modernizing data platforms by migrating to an Apache Iceberg lakehouse. It suggests avoiding long ETL pipeline builds and focusing on faster time-to-value for analysts.

dremio.com lakehouse

66d

Pgrx: Build Postgres Extensions with Rust

Pgrx is a framework for building PostgreSQL extensions using Rust, enabling developers to leverage Rust's safety and performance features within the Postgres environment.

github.com postgres

67d

What's New in pg_clickhouse - JSONB Support, SQL value functions, Streaming, and more

Recent pg_clickhouse releases introduce JSONB, date/time, and array function pushdown, plus HTTP result set streaming for lower memory usage.

clickhouse.com clickhouse

68d

Use whisper.cpp within DuckDB to translate / transpile speech to text

The article discusses using whisper.cpp within DuckDB to transcribe speech to text.

github.com duckdb

68d

SQLyzr: A Comprehensive Benchmark and Evaluation Platform for Text-to-SQL

The paper introduces SQLyzr, a benchmark for evaluating text-to-SQL models, which have improved due to large language models. The platform addresses shortcomings in existing benchmarks.

arxiv.org semantic-layer

68d

An Alternate Agentic AI Architecture (It's About the Data)

The paper argues that the dominant approach in agentic AI, where large language models orchestrate information access by dynamically selecting tools, is misguided. It proposes an alternative architecture focused on data.

arxiv.org agents

68d

Research on the efficiency of data loading and storage in Data Lakehouse architectures for the formation of analytical data systems

The paper studies the efficiency of loading and storing data in Apache Hudi, Apache Iceberg, and Delta Lake using Apache Spark.

arxiv.org iceberg

68d

DeepSeek-V4: a million-token context that agents can actually use

The post discusses DeepSeek-V4, a model with a million-token context that agents can use.

huggingface.co ml

68d

Do you still need Elasticsearch for log analytics? ClickHouse says no.

ClickHouse positions itself as an alternative to Elasticsearch for log analytics by combining full-text search and large-scale analytics. The post alludes to a performance benchmark.

clickhouse.com clickhouse

69d

We mapped unauthenticated Vector DBs exposing corporate AI data

The article highlights a significant security vulnerability where misconfigured RAG pipelines are exposing vector databases to the public internet. A live map visualizes the scale of the leak, emphasizing the failure of perimeter security in the AI space.

news.ycombinator.com vector-db

69d

Background Coding Agents: Supercharging Downstream Consumer Dataset Migrations (Honk, Part 4)

The article describes how Spotify used Honk, Backstage, and Fleet Management to ease the pain of migrating thousands of datasets.

engineering.atspotify.com engineering

70d

The New GizmoSQL iOS App – DuckDB

The article introduces the new GizmoSQL iOS app, which uses DuckDB.

duckdb.org duckdb

70d

DuckDB 1.5.2 – SQL database that runs on laptop, server, in the browser

DuckDB 1.5.2 is a new release of the SQL database that runs on laptops, servers, and in the browser.

duckdb.org duckdb

70d

Using dbt with Databricks: Architecture decisions that determine success

A solution architect explains the architecture decisions critical for successful dbt implementation on Databricks, focusing on potential pitfalls and solutions.

getdbt.com dbt

70d

Why AI Agents Need Operational Context — Not Just Semantic Definitions

The article discusses the need for both semantic and operational context for AI agents to trust data. It presents how Dagster and Atlan can provide both halves of the context layer.

dagster.io agents

70d

ClickHouse Cloud on Google Cloud Now Powered by Google Axion Processors: 30–55% Faster Queries, ~15% Fewer Compute Credits

ClickHouse reports a 30–55% speedup in ClickBench queries and ~15% reduction in compute costs on Google Cloud by migrating to Axion C4A instances.

clickhouse.com clickhouse

70d

Benchmarking DuckDB From Java: Fast INSERT, UPDATE, and DELETE

The author compares different methods for efficiently modifying DuckDB from Java, highlighting performance improvements with the new UDF feature of the Java Drivers.

reddit.com duckdb

70d

Building a fault-tolerant metrics storage system at Airbnb

This article details the architecture and implementation of Airbnb's fault-tolerant system for storing operational metrics. It discusses the design choices made to ensure data durability, high availability, and scalability for critical observability data.

medium.com observability

71d

Why metric definitions matter for reliable AI agents

The article discusses how using dbt's semantic layer provides a foundation for building reliable and governed agentic analytics workflows.

getdbt.com semantic-layer

71d

Show HN: Transient – CLI Governance layer for AI agents

Transient is a CLI tool to provide a governance layer for AI agents, including permission policies and auditing. It helps answer the question of what an agent did, whether it was authorized, and if it can be proven. The tool wraps the agent process and installs quickly.

github.com agents

71d

Index sharding in ClickHouse Cloud: Petabyte-scale data needs petabyte-scale indexing

ClickHouse Cloud now shards indexes across replicas for petabyte-scale workloads. Distributing the index this way reduces per-replica memory usage and speeds up index analysis.

clickhouse.com clickhouse

71d

Apache Arrow 24.0.0 Release

Apache Arrow version 24.0.0 has been released with 259 resolved issues from 57 contributors. The announcement provides a link to the installation page.

arrow.apache.org arrow

71d

Me and my shadow (link!): Disaster recovery replication made easy

This Redpanda blog post introduces Shadow Linking, a feature designed to simplify disaster recovery through real-time replication.

redpanda.com streaming

71d

Databricks Cross-Workspace Orchestration with Dagster

Lakeflow Jobs cannot see across workspace boundaries. This post explains how Dagster unifies multiple Databricks workspaces into a single asset graph with real dependencies.

dagster.io orchestration

72d

Smarter URL Normalization at Scale: How MIQPS Powers Content Deduplication at Pinterest

This post from Pinterest Engineering details their approach to smarter URL normalization and content deduplication at scale. It explains how their proprietary MIQPS system contributes to efficiently identifying and handling duplicate content.

medium.com data-engineering

72d

The Security Architecture of GitHub Agentic Workflow

This article examines how GitHub built a security architecture that assumes the agent is already compromised.

blog.bytebytego.com architecture

72d

How Nava helped ELO cut infrastructure costs by 87% by migrating from Elasticsearch to ClickHouse

Nava migrated ELO's payments monitoring platform from Elasticsearch to ClickHouse, cutting storage from 12 TB to 2 TB, slashing annual infrastructure costs by 87%, and delivering sub-2-second end-to-end latency across 300 real-time dashboards.

clickhouse.com clickhouse

72d

Iceberg Row Lineage: Giving Every Row a Paper Trail

This article discusses lineage at the row level for Iceberg tables to track how specific rows were affected.

dremio.com lakehouse

72d

EvoRAG: Making Knowledge Graph-based RAG Automatically Evolve through Feedback-driven Backpropagation

The article presents EvoRAG, a Knowledge Graph-based Retrieval-Augmented Generation framework designed to improve LLM reasoning by retrieving multi-hop paths from knowledge graphs. The framework aims to address the underperformance of existing KG-RAG solutions in real-world scenarios through feedbac

arxiv.org knowledge-graphs

72d

Exploring Agentic Visual Analytics: A Co-Evolutionary Framework of Roles and Workflows

The article explores agentic visual analytics systems, where LLM-driven agents autonomously manage the full visual analytics pipeline. This approach seeks to shift users away from low-level tool manipulation towards higher-level, task-oriented interactions.

arxiv.org agents

72d

KV Cache Is Eating Your VRAM. Here’s How Google Fixed It With TurboQuant.

Explore the end-to-end pipeline of TurboQuant, a novel KV cache quantization framework. This overview breaks down how multi-stage compression achieves near-lossless storage through PolarQuant and QJL residuals, enabling massive context windows with minimal memory overhead.

towardsdatascience.com ml

73d

A complete Event-Driven Architecture for Online Machine Learning (Kafka, Flink, and ClickHouse)

Hey folks. I find Online Machine Learning (OML) particularly appealing in data streaming environments, even though it hasn't yet seen widespread application across many domains. I wanted to build a complete Event-Driven Architecture that applies stateful stream processing to a real-world physical pr

reddit.com kafka

73d

Your RAG System Retrieves the Right Data — But Still Produces Wrong Answers. Here’s Why (and How to Fix It).

The article discusses a scenario where a RAG system retrieves the right data but still produces wrong answers and offers suggestions to fix this issue.

towardsdatascience.com ml

74d

AI Agents Need Their Own Desk, and Git Worktrees Give Them One

The article suggests using Git worktrees to provide AI agents with isolated workspaces for parallel coding sessions, discussing the benefits and setup considerations.

towardsdatascience.com agents

74d

How an Autonomous Driving Company Unified Multimodal Search on a Single Analytics Engine

An autonomous driving company consolidated fragmented data platforms by adopting Apache Doris as a unified analytics engine, enabling seamless search across text, vectors, labels, and metadata while reducing query times from minutes to seconds.

doris.apache.org analytics

74d

“A generational leap”: How Trio unified payment analytics and cut storage by 88% with ClickHouse Cloud

Brazilian fintech Trio cut storage by 88% and achieved a "generational leap" in speed by building a unified payment analytics platform on ClickHouse Cloud, handling 243M+ payments and 1B+ daily events with a sliding window approach for late and duplicate

clickhouse.com clickhouse

75d

What is pgvector?

pgvector is an open-source PostgreSQL extension that adds the ability to store, index, and search over vector embeddings, enabling similarity search and other vector-based operations directly within Postgres.

databricks.com postgres

75d

Agents of the Alley – Context Engineering OS for Claude Code Agents

Agents of the Alley is a Context Engineering OS for Claude Code Agents, available on GitHub.

github.com agents

75d

Stateful Kafka Streams on Kubernetes: 5 things that actually reduced our rebalancing

The article details five practices for reducing Kafka Streams rebalancing issues when running stateful processors with RocksDB on Kubernetes, including static membership configuration and session timeout tuning.

reddit.com kafka

75d

Go package to mount fs.FS as virtual file system on DuckDB

A Go package allows mounting fs.FS as a virtual file system in DuckDB.

reddit.com duckdb

75d

Capacity Efficiency at Meta: How Unified AI Agents Optimize Performance at Hyperscale

Meta's Capacity Efficiency Program uses an AI agent platform to automate the identification and resolution of performance issues across its infrastructure. The platform leverages encoded domain expertise and a standardized tool interface to improve efficiency.

engineering.fb.com engineering

76d

Ask Your Cluster Anything: The WarpStream MCP Server - WarpStream

The WarpStream MCP Server allows AI assistants to connect to WarpStream clusters for querying logs, diagnosing issues, and inspecting ACL events directly from the IDE.

warpstream.com streaming

76d

Post-Quantum Cryptography Migration at Meta: Framework, Lessons, and Takeaways

Meta shares lessons learned from their post-quantum cryptography (PQC) migration to assist other organizations in strengthening their resilience during the transition to post-quantum cryptography standards. They propose the idea of PQC Migration Levels to help teams manage the complex migration proc

engineering.fb.com engineering

76d

memweave: Zero-Infra AI Agent Memory with Markdown and SQLite — No Vector Database Required

The article introduces memweave, a system for agent memory that uses Markdown and SQLite, eliminating the need for a vector database.

towardsdatascience.com ml

76d

Artifacts: versioned storage that speaks Git

Cloudflare's Artifacts provides Git-compatible versioned storage for code and data, designed for agents, developers, and automations. It supports creating millions of repos and forking from any remote.

blog.cloudflare.com engineering

76d

Finding zombies in our systems: A real-world story of CPU bottlenecks

This post shares a real-world story from Pinterest about diagnosing and resolving critical CPU bottlenecks discovered within their systems. It describes the investigation process to uncover these 'zombie' processes and the strategies implemented to mitigate them.

medium.com engineering

77d

Prefill Is Compute-Bound. Decode Is Memory-Bound. Why Your GPU Shouldn’t Do Both.

The article proposes disaggregated LLM inference, where prefill (compute-bound) and decode (memory-bound) operations are handled by different GPUs, potentially leading to 2-4x cost reductions.

towardsdatascience.com ml

77d

Index-based pruning in ClickHouse

Learn how ClickHouse uses primary indexes, lightweight projections, and skip indexes to prune data before reading it. Demonstrated on a 243 million row UK property sales dataset.

clickhouse.com clickhouse

77d

Agentic Workloads on Airflow: Observable, Retryable, and Auditable by Design

The article outlines design considerations and patterns for implementing agentic workloads on Apache Airflow. It covers principles to ensure these systems are observable, retryable, and auditable in production environments.

airflow.apache.org orchestration

77d

Ask Your Survey Anything: Building AI Analysis Pipelines with Airflow 3

The article describes a system built with Apache Airflow 3 to analyze large survey datasets using AI. It enables users to pose natural language questions, which the system converts to SQL queries, executes, and returns results.

airflow.apache.org orchestration

77d

RAG Isn’t Enough — I Built the Missing Context Layer That Makes LLM Systems Work

Most RAG tutorials focus on retrieval or prompting, but the real problem starts when context grows. This article outlines a context engineering system built in pure Python that controls memory, compression, re-ranking, and token budgets to keep LLMs stable under realistic constraints.

towardsdatascience.com llm

78d

Privacy-first connections: Empowering social experiences at Airbnb

This article details the architectural and engineering approaches Airbnb uses to build privacy-first data systems that empower social experiences. It covers the design principles and technical implementations ensuring user privacy while fostering connections on the platform.

medium.com governance

78d

SHOW HN: DuckDB / DuckLake Server (With Arrow Flight SQL) for iOS

An iOS app, GizmoSQL, runs DuckDB as a server with Arrow Flight SQL and supports mounting a DuckLake, enabling a "data lake in your pocket." The app runs the TPC-H 1GB benchmark in under 2 seconds on an iPhone 16 Pro Max.

news.ycombinator.com duckdb

78d

Agent Harnesses Are Dead. Long Live Agent Harnesses.

This article discusses the evolving landscape of AI agent frameworks and harnesses, suggesting that while frameworks may be getting cheaper, the underlying need for structured agent orchestration remains.

blog.crewai.com agents

78d

A Guide to Understanding GPUs and Maximizing GPU Utilization

This article explains how to optimize GPU efficiency, covering GPU architecture, performance bottlenecks, and optimization strategies using PyTorch and custom kernels.

towardsdatascience.com ml

78d

Context Graphs And Their Implementation: The Missing Layer Between Human Judgment and Machine Agency

The article discusses context graphs as a crucial layer between human decisions and AI agents.

mlops.community knowledge-graphs

78d

Mintlify boosts NPS 30% and saves 60% with real-time analytics on ClickHouse Cloud

Mintlify replaced PostHog with ClickHouse Cloud, resulting in faster dashboard load times, no rate limit errors, a 30% NPS improvement, and a 60% cost reduction.

clickhouse.com clickhouse

78d

I built a browser-based spreadsheet diff tool powered by DuckDB WASM — 42k rows × 14 cols in ~3 seconds, zero server (MaksPilot.com)

A developer built a browser-based spreadsheet diff tool using DuckDB WASM to compare Excel/CSV files, highlighting differences in 42k rows × 14 cols in approximately 3 seconds without a server. The tool handles date formats, floating point noise, and case inconsistencies.

reddit.com duckdb

78d

How a Global CPG Automates Supply Chain Demand Forecasting with Agentic AI

This blog post details how agentic AI can be used to automate demand forecasting in CPG supply chains, potentially improving accuracy and efficiency while reducing manual work.

blog.crewai.com agents

78d

Ducklake’s architecture makes so much sense, and really highlights the drawbacks of using the object store itself for metadata like Iceberg does. Ducklake+Motherduck seem well positioned to take Snowflake customers. What differentiates motherduck’s technical architecture from Snowflake’s?

This Reddit post discusses DuckLake's architecture, highlighting the drawbacks of using the object store itself for metadata, as Iceberg does. It asks what differentiates MotherDuck's technical architecture from Snowflake's.

reddit.com duckdb

78d

Introducing the Common AI Provider: LLM and AI Agent Support for Apache Airflow

The article introduces the `apache-airflow-providers-common-ai` package, which brings native LLM and AI agent capabilities directly into Apache Airflow. This provider package offers direct integration rather than wrapping external frameworks.

airflow.apache.org orchestration

78d

Scaling Recommendation Systems with Request-Level Deduplication

This post details Pinterest's strategy for scaling their recommendation systems through the implementation of request-level deduplication. It explains the architectural considerations and benefits of this optimization technique for high-throughput ML serving.

medium.com ml

79d

How LinkedIn Feed Uses LLMs to Serve 1.3 Billion Users

In this article, we will look at how the LinkedIn engineering team rebuilt the Feed and the challenges they faced.

blog.bytebytego.com llm

79d

DuckLake v1.0

DuckLake v1.0 has been released.

reddit.com duckdb

79d

Durable Objects in Dynamic Workers: Give each AI-generated app its own database

Cloudflare introduces Durable Object Facets, which allows Dynamic Workers to instantiate Durable Objects with isolated SQLite databases, enabling platforms that run persistent, stateful code generated on-the-fly.

blog.cloudflare.com engineering

79d

Agents have their own computers with Sandboxes GA

Cloudflare Sandboxes provide AI agents with persistent, isolated environments, including a shell, filesystem, and background processes.

blog.cloudflare.com engineering

79d

Dynamic, identity-aware, and secure Sandbox auth

Outbound Workers for Sandboxes provide a programmable, zero-trust egress proxy for AI agents. This allows developers to inject credentials and enforce dynamic security policies without exposing sensitive tokens to untrusted code.

blog.cloudflare.com agents

79d

Announcing DuckDB 1.5.2

DuckDB v1.5.2 is a patch release. DuckLake is released.

duckdb.org duckdb

79d

DuckLake 1.0

The article announces the release of DuckLake 1.0.

duckdb.org duckdb

79d

Follow up: I actually built the Kafka Streams state recovery thing. It works.

This post discusses a custom-built solution for Kafka Streams state store recovery during disaster recovery, addressing a previously unsolved problem in the ecosystem.

reddit.com kafka

80d

Your ReAct Agent Is Wasting 90% of Its Retries — Here’s How to Stop It

Most ReAct-style agents are silently wasting their retry budget on errors that can never succeed. In a 200-task benchmark, 90.8% of retries were spent on hallucinated tool calls — not model mistakes, but architectural flaws. This article shows why prompt tuning won’t fix it, and the three structural

towardsdatascience.com ml

80d

DuckDB Meets Data Lakes [video]

Walkthrough of querying data lake files with DuckDB, covering Parquet, Iceberg, and S3 integration patterns.

youtube.com duckdb

80d

Your harness, your memory

This LangChain blog post discusses the growing importance of agent harnesses in building AI agents and their connection to agent memory. It highlights the potential drawbacks of using closed harnesses, particularly those behind proprietary APIs, which can limit control over the agent.

blog.langchain.com agents

81d

LineageScope – static analyzer for SQL, dbt, Airflow, Spark, and data contracts

LineageScope is a static analyzer for SQL, dbt, Airflow, Spark, and data contracts. It aims to provide data lineage and enforce data contracts.

github.com lineage

81d

Why Every AI Coding Assistant Needs a Memory Layer

The article argues that AI coding assistants require a persistent memory layer to overcome the limitations of stateless LLMs. This memory layer improves code quality by providing systematic context across sessions.

towardsdatascience.com ml

81d

Evaluating Netflix Show Synopses with LLM-as-a-Judge

This Netflix tech blog post is about evaluating show synopses with LLM-as-a-Judge.

netflixtechblog.com llm

82d

Show HN: Formally Verified Leaderless Log Protocol for Kafka

This post announces the open-sourcing of a formally verified TLA+ specification for a leaderless log protocol for Kafka, highlighting the discovery of a design bug through verification. It also mentions using Claude Code to generate a working Rust implementation from the specification, demonstrating

github.com kafka

82d

Design and Implementation of DuckDB Internals

This article from the DuckDB website discusses the design and implementation of DuckDB internals, which is useful for understanding its architecture and performance characteristics.

duckdb.org duckdb

82d

Context Engineering for AI Coding Agents

This article discusses context engineering techniques for AI coding agents, focusing on Claude Code sub-agents. It explores how to structure prompts and context to improve the performance of AI coding assistants.

amux.io agents

83d

The golden rules of agent-first product engineering

This article outlines golden rules for agent-first product engineering. It explores considerations for designing products around AI agents and their capabilities.

newsletter.posthog.com agents

83d

Escaping the Fork: How Meta Modernized WebRTC Across 50+ Use Cases

Meta shares its approach to modernizing WebRTC, the technology powering real-time audio and video across their platforms. The article highlights the challenges of forking a large open-source project and how Meta addressed them to stay aligned with community upgrades.

engineering.fb.com engineering

83d

Deep Agents Deploy: an open alternative to Claude Managed Agents

Deep Agents Deploy is launching in beta as a production-ready way to deploy a model-agnostic, open-source agent harness. It is built on Deep Agents and designed for an open world.

blog.langchain.com llm

83d

Tutorial: How to build a simple text-to-SQL agent that can automatically recover from bad SQL

This Reddit post links to a DuckDB tutorial demonstrating how to build a text-to-SQL agent that can automatically recover from bad SQL queries. The approach leverages a tool-calling agent loop to inspect and correct errors.

reddit.com duckdb

83d

LASER: A Data-Centric Method for Low-Cost and Efficient SQL Rewriting based on SQL-GRPO

This paper presents LASER, a data-centric method for low-cost and efficient SQL rewriting based on SQL-GRPO, aimed at transforming queries into more efficient variants for database optimization.

arxiv.org databases

83d

AV-SQL: Decomposing Complex Text-to-SQL Queries with Agentic Views

This paper introduces AV-SQL, a method for decomposing complex Text-to-SQL queries using agentic views, aimed at improving the accuracy and efficiency of natural language to SQL translation.

arxiv.org databases

83d

SQLStructEval: Structural Evaluation of LLM Text-to-SQL Generation

This paper introduces SQLStructEval, a framework for evaluating the structural reliability of LLM-generated SQL queries, investigating the structural behavior of these queries to determine if they are sound.

arxiv.org databases

83d

What Chipotle Can Teach Us About Real-Time Data Products | Materialize

This article draws parallels from Chipotle's operational model to discuss real-time data product architectures. It explores a third option that balances fresh data and fast queries, applicable to building data products for modern applications and AI agents.

materialize.com streaming

83d

Oracle CDC now available in Redpanda Connect

Redpanda Connect now offers native CDC for Oracle, enabling real-time data access without requiring rearchitecting. The solution eliminates the need for a JVM, middleware, and related operational overhead.

redpanda.com streaming

83d

ClickHouse at FOSDEM 2026

This post recaps ClickHouse's involvement at FOSDEM 2026 in Brussels. It highlights the community's activities during the event.

clickhouse.com clickhouse

84d

Show HN: 500k+ events/sec transformations for ClickHouse ingestion

This post highlights GlassFlow's work on achieving high-throughput (500k+ events/sec) transformations for ClickHouse ingestion, particularly in observability and real-time analytics pipelines. It addresses challenges related to scaling throughput.

github.com clickhouse

84d

From 1 to 1 Million: How Agent Taskflow Built a Scalable AI Future with AWS and Confluent

This Confluent blog post details how Agent Taskflow built a production-grade AI orchestration platform using Confluent and AWS, avoiding self-managed Kafka. The article focuses on scalability and the benefits of using managed services.

confluent.io kafka

84d

Performance for Everyone

This post from Pinterest Engineering explores their initiatives and approaches to improving overall system performance for a wide range of users and services. It likely covers methodologies, tooling, or cultural shifts to foster a performance-first mindset.

medium.com engineering

84d

The three villains to agentic observability: retention, sampling and rollups

This blog post argues that retention limits, sampling, and metric roll-ups hinder AI-driven observability workflows. It suggests that these practices are inadequate for handling the demands of full-fidelity data required by modern AI.

clickhouse.com clickhouse

84d

From bytecode to bytes: automated magic packet generation

Cloudflare's blog post details how they automated the generation of malware trigger packets using symbolic execution on BPF bytecode. By leveraging the Z3 theorem prover, they significantly reduced analysis time, improving their ability to detect and respond to threats.

blog.cloudflare.com engineering

84d

DuckLake's 900x Speed Claim: A Database in Your Catalog Is Worth Two in the Cloud

This blog post analyzes DuckLake's claim of a 900x speed improvement by inlining data into the catalog, a topic of interest for data engineers optimizing query performance.

banandre.com duckdb

84d

Can You Trust the Vectors in Your Vector Database? Black-Hole Attack from Embedding Space Defects

This paper proposes a "Black-Hole Attack" against vector databases, where malicious vectors injected near the geometric center can compromise retrieval. It highlights the need for security considerations in vector database design.

arxiv.org vector-db

84d

Cortex AISQL: A Production SQL Engine for Unstructured Data

This paper introduces Cortex AISQL, a production SQL engine from Snowflake that integrates native semantic operations directly into SQL. This allows users to combine relational operations with semantic reasoning for querying unstructured data.

arxiv.org snowflake

84d

Managing the Context Window | Airbyte

The article discusses effective strategies for managing the context window in AI agents. It emphasizes improving performance, reducing costs, and maintaining relevant outputs, which is valuable for optimizing AI systems.

airbyte.com agents

84d

Run local AI queries on your data lake with DuckDB and Claude | Blog | Fivetran

This post describes how to combine Fivetran, DuckDB, and Claude to enable conversational analytics on a data lake, highlighting the potential for interacting with data through natural language queries using an MCP server.

fivetran.com duckdb

85d

Building a high-volume metrics pipeline with OpenTelemetry and vmagent

This article details the construction of Airbnb's high-volume metrics pipeline, outlining the integration of OpenTelemetry and vmagent. It covers the architectural considerations, implementation specifics, and operational insights for processing vast amounts of observability data.

medium.com observability

85d

Evolution of Multi-Objective Optimization at Pinterest Home feed

This post outlines the evolution of multi-objective optimization techniques implemented for the Pinterest Home feed, tracing how these complex systems have developed over time. It describes the challenges and solutions in balancing multiple optimization goals for user experience.

medium.com ml

85d

PFC-JSONL just merged into the DuckDB Community Hub — block-level timestamp filtering for compressed JSONL logs

The PFC-JSONL extension, which enables block-level timestamp filtering for compressed JSONL logs, has been merged into the DuckDB Community Hub. This allows users to query compressed JSONL files more efficiently using DuckDB.

reddit.com duckdb

85d

From 4 Weeks to 45 Minutes: Designing a Document Extraction System for 4,700+ PDFs

The post describes a hybrid PyMuPDF and GPT-4 Vision pipeline that significantly reduced the time required for document extraction from PDFs, replacing manual effort, and discusses why the latest models weren't always the optimal solution.

towardsdatascience.com ml

85d

Engineering An AI Agent To Navigate Large-scale Event Data – Part 2

This article delves into the design of an AI agent for navigating large-scale event data, focusing on transforming query patterns into intelligent tools and crafting an effective agent architecture, which offers practical insights into building agents for complex data environments.

mlops.community mlops

85d

Context Engineering for AI Agents: A Deep Dive

This article discusses techniques for optimizing context, a finite resource, when designing AI agents, focusing on how to best utilize available information to enhance agent performance.

towardsdatascience.com llm

85d

ClickHouse Release 26.3

ClickHouse version 26.3 introduces async inserts by default, improved JOIN reordering, and materialized CTEs. These features could improve query performance and data management for users.

clickhouse.com clickhouse

85d

Apache Arrow ADBC 23 (Libraries) Release

The Apache Arrow team announced the version 23 release of the Apache Arrow ADBC libraries, which includes 41 resolved issues from 20 contributors. This release focuses on the libraries, which are at version 23, with the API specification versioned separately.

arrow.apache.org arrow

85d

Apache Airflow 3.2.0: Data-Aware Workflows at Scale

The article announces the release of Apache Airflow 3.2.0, focusing on data-aware workflows. Key features include asset partitioning for granular pipeline orchestration and support for multi-team deployments at enterprise scale.

airflow.apache.org orchestration

85d

Stop Answering the Same Question Twice: Interval-Aware Caching for Druid at Netflix Scale

This Netflix tech blog post discusses interval-aware caching for Druid at scale. The article likely contains valuable insights into Druid performance and optimization strategies.

netflixtechblog.com engineering

85d

How Enterprise AI SaaS Closes Adoption Gaps with Multi-Agent Crews

Enterprise AI SaaS automates customer enablement with a 5-agent workflow to close adoption gaps, reduce churn, and scale training across industries.

blog.crewai.com agents

85d

How Meta Used AI to Map Tribal Knowledge in Large-Scale Data Pipelines

Meta shares how they used AI to map tribal knowledge within large-scale data pipelines, highlighting the challenges and limitations of AI coding assistants' understanding of complex codebases.

engineering.fb.com engineering

86d

A Guide to Context Engineering for LLMs

This ByteByteGo article explores context engineering for LLMs, explaining how LLMs process information and outlining strategies to improve context utilization.

blog.bytebytego.com architecture

86d

How Respan is scaling LLM observability with ClickHouse Cloud

Respan AI migrated to ClickHouse Cloud for high-throughput LLM observability after outgrowing Postgres. The company now processes 50 million daily events using incremental materialized views.

clickhouse.com clickhouse

86d

Eight years of wanting, three months of building with AI

This article discusses the author's experience building with AI over three months after eight years of wanting to, likely covering lessons learned and insights gained.

simonwillison.net llm

86d

Continual learning for AI agents

This LangChain blog post discusses continual learning for AI agents, highlighting that learning occurs at the model, harness, and context layers, not just model weight updates. Understanding these distinctions is crucial for building systems that improve over time.

blog.langchain.com llm

86d

Syntaqlite Playground

The article introduces a Syntaqlite Playground, which is related to dbxlite and Metastax. It's a useful tool for Staff+ level data engineers, ML engineers, and analytics practitioners.

simonwillison.net llm

87d

Powering Multimodal Intelligence for Video Search

Netflix details how they are using multimodal intelligence to improve video search capabilities. The article likely covers the engineering challenges and solutions involved in building and deploying such a system at scale.

netflixtechblog.com engineering

88d

Operationalize analytics agents: dbt AI updates + Mammoth’s AE agent in action

This dbt blog post discusses how to operationalize analytics agents by building context for LLM models using dbt and MCP servers. It explores practical ways to integrate LLMs into analytics workflows.

getdbt.com dbt

88d

What Is a Collation, and Why Is My Data Corrupt? – PG Phridays with Shaun Thomas

This article describes how a glibc update in 2018 silently invalidated Postgres text indexes, leading to incorrect query results. It highlights the importance of understanding collation settings for maintaining data integrity in Postgres.

news.ycombinator.com postgres

89d

How My Agents Self-Heal in Production

This post details a self-healing deployment pipeline for a GTM Agent. The system automatically detects regressions after each deploy, determines if the change caused the regression, and uses an agent to create a pull request with a fix, minimizing manual intervention.

blog.langchain.com llm

89d

From Argo to Temporal Migration: How We Rebuilt Atlan’s Workflow Orchestration

Atlan describes their migration of workflow orchestration from Argo to Temporal. The post details the reasons for the migration, the crossover architecture used during the transition, and lessons learned from rebuilding orchestration in production.

blog.atlan.com orchestration

89d

Towards Robustness: A Critique of Current Vector Database Assessments

This paper critiques the use of average recall as the dominant metric for evaluating vector databases, which are crucial in AI systems. It argues that relying solely on average recall can be problematic for users and researchers optimizing these systems.

arxiv.org vector-db

89d
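
A minimal sketch of the concern raised in the entry above, assuming recall is measured per query as the fraction of true neighbors returned. The per-query numbers are made up for illustration and do not come from the paper; the point is only that two systems can share an average recall while one has a much worse worst case.

```python
from statistics import mean


def recall_at_k(retrieved: list[int], relevant: list[int]) -> float:
    """Fraction of the relevant ids that appear in the retrieved ids."""
    if not relevant:
        raise ValueError("relevant must not be empty")
    return len(set(retrieved) & set(relevant)) / len(relevant)


def summarize(per_query_recall: list[float]) -> dict[str, float]:
    """Report the average alongside the worst query, which the average hides."""
    return {"mean": mean(per_query_recall), "min": min(per_query_recall)}


# Hypothetical per-query recall values for two systems (illustrative only).
system_a = [0.9, 0.9, 0.9, 0.9]
system_b = [1.0, 1.0, 1.0, 0.6]

print(summarize(system_a))
print(summarize(system_b))
```

Both systems report the same mean, but the second one leaves a query with far fewer of its true neighbors, which is the kind of gap an average alone does not show.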

Multi-Objective Agentic Rewrites for Unstructured Data Processing

This paper discusses DocETL, a declarative system for LLM-powered data processing that has gained traction across various domains. DocETL allows users to define complex data processing pipelines using LLMs, enabling tasks like information extraction and data transformation from unstructured document

arxiv.org llm

89d

You're building agent security in the wrong order

The agent security market is rapidly developing with companies offering runtime identity enforcement and permission revocation for autonomous agents.

blog.crewai.com agents

89d

Agentic Coding at ClickHouse

ClickHouse describes their work on agentic coding. The article likely covers practical implementations and potential benefits of this approach within the ClickHouse ecosystem.

clickhouse.com clickhouse

89d

Smarter Live Streaming at Scale: Rolling Out VBR for All Netflix Live Events

Netflix details the implementation of Variable Bit Rate (VBR) for all their live events to improve streaming quality.

netflixtechblog.com engineering

89d

How to Orchestrate dbt with Dagster

This article describes how to use Dagster's dbt integration to run and monitor dbt models as part of a larger asset-driven pipeline, focusing on lineage and scheduling improvements.

dagster.io orchestration

89d

Debug Dagster Code with Docker

Learn step-by-step how to debug Dagster pipelines directly inside Docker, bridging development and deployment environments with practical tools.

dagster.io orchestration

89d

Dagster Cloud: 5X Faster Deployments

Warm Docker containers reduce cold starts. Learn how Dagster Cloud deploys 5x faster using PEX-based builds.

dagster.io orchestration

89d

High-Performance Python for Pipelines

Use proven tips to make your Python code faster and more efficient, especially for data engineering and pipeline-heavy workloads.

dagster.io orchestration

89d

KernelEvolve: How Meta’s Ranking Engineer Agent Optimizes AI Infrastructure

Meta details KernelEvolve, a Ranking Engineer Agent used to optimize AI infrastructure for ads ranking, focusing on autonomous design, execution, and analysis of ranking model experiments.

engineering.fb.com engineering

89d

Open Models have crossed a threshold

LangChain reports that open models like GLM-5 and MiniMax M2.7 are now comparable to closed frontier models on agent tasks like file operations and tool use. The article presents evaluation results and instructions for using these open models.

blog.langchain.com llm

90d

Why we're rethinking cache for the AI era

Cloudflare discusses the challenges and opportunities in cache design presented by the explosion of AI-bot traffic, detailing the differences between AI bot traffic and human traffic and providing some early ideas for system design.

blog.cloudflare.com engineering

90d

The Missing Interface in Data Platform Engineering

This article discusses how data leaders should design the interface between data platforms and the teams that rely on them. It emphasizes the importance of clear boundaries and well-defined responsibilities in data platform engineering.

dataengineeringweekly.com data-engineering

90d

Supercharging Redpanda Streaming with profile-guided optimization

Redpanda details how they used profile-guided optimization to improve performance and reduce latency in Redpanda 26.1. The article offers a behind-the-scenes look at their optimization process.

redpanda.com streaming

90d

Data Inlining in DuckLake: Unlocking Streaming for Data Lakes

This blog post from the DuckDB team introduces data inlining in DuckLake to enable streaming for data lakes. It details the motivation, implementation, and benefits of this approach, including improved performance and reduced latency.

duckdb.org duckdb

90d

Rill and ClickHouse: real-time operational BI for a metered world

Rill leverages ClickHouse to deliver real-time operational BI for over 100 billion daily events. The integration offers instant data exploration and conversational analytics via a declarative, BI-as-code workflow.

clickhouse.com clickhouse

91d

Redshift Files: The Hunt for Big Data

MotherDuck explores the performance and challenges of working with large datasets in Redshift.

motherduck.com duckdb

91d

Dagster 1.12: Refinement and Acceleration

Dagster 1.12 introduces a redesigned UI, Components GA, streamlined deployment workflows, and major orchestration upgrades. These enhancements aim to make data orchestration faster, simpler, and more reliable for users.

dagster.io orchestration

91d

Gradient Labs gives every bank customer an AI account manager

Gradient Labs is deploying AI account managers for banks using GPT-4.1 and GPT-5.4 models. These agents aim to automate banking support with low latency and high reliability.

openai.com llm

91d

Multimodal Embeddings and RAG: A Practical Guide

This blog post explains multimodal embeddings for searching across different data types (text, images, audio, video) in RAG systems. It provides practical implementations using Weaviate and Gemini.

weaviate.io vector-db

91d

DuckDB Now Speaks Dutch!

This DuckDB blog post humorously explores an alternate reality where Dutch, not English, became the dominant language for SQL. It poses the question of how this linguistic shift might have shaped the development and standardization of SQL.

duckdb.org duckdb

91d

The Missing Context Layer: Why Your LLM Agent Can't Do More Than Text-to-SQL | Airbyte

The Airbyte blog discusses why LLM agents often struggle to move beyond basic text-to-SQL functionality. It argues that a missing context layer is crucial for enabling truly intelligent, action-driven AI systems.

airbyte.com data-engineering

91d

Show HN: Dux, distributed DuckDB-backed dataframes on the Beam

The article introduces Dux, distributed DuckDB-backed dataframes on the Beam.

github.com duckdb

91d

ClickHouse BYOC on Google Cloud now Generally Available

ClickHouse has announced the general availability of its Bring Your Own Cloud (BYOC) offering on Google Cloud. This allows users to run ClickHouse within their own Google Cloud account while maintaining full data sovereignty and zero-trust networking.

clickhouse.com clickhouse

92d

Meta Adaptive Ranking Model: Bending the Inference Scaling Curve to Serve LLM-Scale Models for Ads

Meta is scaling its Ads Recommender runtime models to LLM-scale and complexity using the Meta Adaptive Ranking Model. This article discusses how they are bending the inference scaling curve to deliver better experiences for people and better results for advertisers using AI recommendation systems.

engineering.fb.com engineering

92d

Secure cross-VPC and cross-account access to Amazon MSK Serverless — walkthrough on the AWS Big Data Blog

This blog post from AWS covers how to give Kafka clients in different VPCs and AWS accounts secure private access to MSK Serverless clusters. The solution addresses the limitation of MSK Serverless supporting PrivateLink connectivity for up to 5 VPCs in the same account.

reddit.com kafka

92d

Engineering the Memory Layer For An AI Agent To Navigate Large-scale Event Data

This article details the data schema, embeddings, and graph design for an agentic query engine on ApertureDB, focusing on managing mixed data types.

mlops.community agents

92d

Exqutor: Extended Query Optimizer for Vector-augmented Analytical Queries

This paper introduces Exqutor, an extended query optimizer designed for vector-augmented analytical queries, particularly in Retrieval-Augmented Generation (RAG) pipelines. It aims to improve the efficiency of retrieving relevant external knowledge for large language model inference.

arxiv.org databases

92d

How Padlet uses ClickHouse Cloud to power real-time classroom analytics

Padlet leverages ClickHouse Cloud to provide real-time analytics for classrooms, processing billions of events per month with sub-second query performance without requiring a dedicated data team, showcasing ClickHouse's scalability and ease of use.

clickhouse.com clickhouse

93d

kernel-anvil: 2x decode speedup on AMD by auto-tuning llama.cpp kernels per model shape

This post introduces kernel-anvil, a tool for auto-tuning llama.cpp kernels on AMD GPUs, resulting in a 2x decode speedup by optimizing kernel configurations per model shape. The tool profiles GGUF model layer shapes and generates optimal kernel configs loaded at runtime, without recompilation.

reddit.com llm

93d

Under the hood: Redpanda Cloud Topics architecture

This article describes the architecture of Redpanda Cloud Topics, a new replication mechanism that uses object storage to reduce costs. The discussion of internals is valuable for engineers working with streaming data.

redpanda.com streaming

93d

Making HNSW Work with JOINs and WHERE Clauses on DuckDB

This article explains how to use HNSW indexes effectively with JOINs and WHERE clauses in DuckDB, demonstrating how to combine approximate nearest neighbor search with standard SQL operations for efficient data retrieval.

cigrainger.com duckdb

93d

Self-Healing Neural Networks in PyTorch: Fix Model Drift in Real Time Without Retraining

The article describes a technique for creating self-healing neural networks using PyTorch. It demonstrates how to detect model drift and adapt in real time without retraining.

towardsdatascience.com ml

94d

Kafka Transactions Explained, Including Two Protocol Hacks - WarpStream

This WarpStream blog post explains Kafka transactions and contrasts them with WarpStream's implementation, which eliminates stateful brokers. It highlights how WarpStream achieves the same protocol guarantees with different internal architecture.

warpstream.com streaming

95d

How WarpStream Reduces Kafka Infrastructure Costs: A TCO Breakdown - WarpStream

WarpStream demonstrates a 4x reduction in cloud infrastructure costs compared to self-hosted Kafka in their benchmarks. The savings are attributed to eliminating inter-AZ fees and replacing EBS with object storage.

warpstream.com streaming

95d

Zero-Downtime Patching Part 1: Prewarming

This Neon blog post discusses their approach to zero-downtime patching using prewarming techniques to ensure continuous availability of customer databases. It details their system's redundancy and failover mechanisms.

neon.com postgres

96d

Agent Evaluation Readiness Checklist

The LangChain blog post offers a checklist for evaluating AI agents, covering error analysis, dataset construction, grader design, and offline/online evaluation. The checklist is intended to help ensure production readiness.

blog.langchain.com llm

96d

Zero-Downtime Patching in Lakebase Part 1: Prewarming

This Databricks blog post discusses techniques for ensuring database availability during patching in Lakebase. It focuses on prewarming as a method to minimize downtime during updates, which is crucial for maintaining service reliability in data platforms.

databricks.com databricks

96d

Qwen 3.5 27B at 1.1M tok/s on B200s, all configs on GitHub

This post shares the configurations used to push Qwen 3.5 27B to 1,103,941 tok/s on 12 nodes with 96 B200 GPUs using vLLM. The improvements came from changes to DP, context window, FP8 KV cache, and MTP-1 speculative decoding.

reddit.com llm

97d

Agentic Context Engineering: Evolving Contexts for Self-Improving Language Model

This paper introduces the concept of Agentic Context Engineering for self-improving language models. It is focused on improving the performance of LMs through dynamic context management, which is relevant to AI agents and context-aware data systems.

arxiv.org agents

97d

Show HN: Vizier – A physical design advisor for DuckDB

Vizier is a physical design advisor for DuckDB. It analyzes queries and recommends changes to the database layout, such as sort orders and indexes, to improve performance.

news.ycombinator.com duckdb

97d

Top 10 best practices tips for ClickHouse

This article presents ten best practices for ClickHouse, covering topics like primary key design, data types, materialized views, and join optimization. Benchmarks on a 150M row dataset illustrate the impact of these practices.

clickhouse.com clickhouse

97d

A one-line Kubernetes fix that saved 600 hours a year

Cloudflare describes a Kubernetes fix involving fsGroupChangePolicy that reduced Atlantis instance restart times from 30 minutes to 30 seconds by addressing a bottleneck in volume permission handling.

blog.cloudflare.com engineering

97d

ClickHouse is data lake ready

ClickHouse now supports direct querying of Iceberg and Delta Lake formats across major cloud catalogs. This feature eliminates the need for data migration, improving data lake accessibility.

clickhouse.com clickhouse

97d

A physical design advisor for DuckDB

A physical design advisor called Vizier has been developed for DuckDB. It analyzes queries and suggests changes to the database's physical layout, such as sort orders and indexes, to improve query performance.

reddit.com duckdb

97d

ByteHouse: ByteDance's Cloud-Native Data Warehouse for Real-Time Multimodal Data Analytics

This paper details ByteHouse, ByteDance's cloud-native data warehouse designed for real-time multimodal data analytics. It addresses the need for efficient and cost-effective data analytics infrastructures in handling the demands of intelligent data services within ByteDance's production environment

arxiv.org architecture

97d

Agent Engineering Patterns: Dealing with large tool results

This blog post from Firetiger explores strategies for handling large tool results within AI agent workflows. It discusses approaches like summarization, pagination, and streaming to manage the volume of data returned by tools used by agents.

blog.firetiger.com agents

97d

Show HN: Pgsemantic – Point at your Postgres DB, get vector search instantly

Pgsemantic is a new open-source project that enables instant vector search capabilities within a Postgres database. The tool aims to simplify the process of integrating vector embeddings for semantic search applications directly within Postgres.

github.com postgres

98d

The Case for Shared Storage - WarpStream

Shared-nothing made sense when storage was slow, but shared storage flips that tradeoff. The architectural case for building Kafka directly on object storage.

warpstream.com streaming

98d

Announcing Bento, the open source fork of the project formerly known as Benthos - WarpStream

Bento is the MIT-licensed open source fork of Benthos — stateless stream processing, no feature gating, no license traps, full connector ecosystem intact.

warpstream.com streaming

98d

Hacking the Kafka PRoTocOL - WarpStream

Kafka assumes stateful, partition-owning brokers. How WarpStream reverse-engineered it for stateless Agents. A deep dive into diskless Kafka load balancing.

warpstream.com streaming

98d

Taking out the Trash: Garbage Collection of Object Storage at Massive Scale - WarpStream

Object storage does not GC itself; compaction and retention create orphaned data at scale. Five GC strategies for S3-native distributed systems compared.

warpstream.com streaming

98d

A Trip Down Memory Lane: How We Resolved a Memory Leak When pprof Failed Us - WarpStream

When pprof hit its limits, WarpStream used gcore and viewcore to trace a goroutine leak in our control plane. A Go debugging guide for distributed systems.

warpstream.com streaming

98d

How WarpStream Powers Grafana Labs’ Redesigned Architecture - WarpStream

Grafana Labs needed tens of GiB/s with zero inter-AZ costs for Cloud Metrics. How WarpStream diskless architecture met their scale without Kafka ops overhead.

warpstream.com streaming

98d

Real-Time Website Security Monitoring with WarpStream, RisingWave, and Grafana - WarpStream

Build real-time security threat monitoring with WarpStream, RisingWave, and Grafana: one materialized view per metric, no complex pipelines, no extra infra.

warpstream.com streaming

98d

Cloud Disks are (Really!) Expensive - WarpStream

Cloud block storage costs up to 24x more per GiB than S3. The exact numbers behind why disk-based Kafka clusters drain your budget and what diskless changes.

warpstream.com streaming

98d

Deterministic Simulation Testing for Our Entire SaaS - WarpStream

WarpStream uses Antithesis to deterministically simulate its full SaaS, from signup to Kafka workloads, surfacing correctness bugs random testing misses.

warpstream.com streaming

98d

Kafka Replication Without the (Offset) Gaps - WarpStream

WarpStream Orbit replicates any Kafka-compatible cluster offset-for-offset, preserving consumer groups, ACLs, and configs for zero-gap migrations and DR.

warpstream.com streaming

98d

Secure by default: How WarpStream’s BYOC deployment model secures the most sensitive workloads - WarpStream

WarpStream BYOC needs zero cross-account IAM access. Raw data stays in your VPC, only metadata touches WarpStream. Secure by design for sensitive workloads.

warpstream.com streaming

98d

Getting Rid of (Kafka) Noisy Neighbors Without Having to Buy a Mansion - WarpStream

Cluster quotas and mirroring do not fix Kafka noisy neighbors; they shift the cost. How WarpStream Agent Groups isolate workloads without dedicated clusters.

warpstream.com streaming

98d

Announcing Schema Validation with AWS Glue Schema Registry - WarpStream

WarpStream Agents validate records against AWS Glue Schema Registry, blocking malformed data at ingest without dead-letter queues or extra infrastructure.

warpstream.com streaming

98d

Fancy Stream Processing Made (even more) Operationally Mundane - WarpStream

WarpStream natively embeds Bento inside Agents — Kafka Connect-style integrations and stream processing with zero additional infrastructure inside your VPC.

warpstream.com streaming

98d

Minimizing S3 API Costs with Distributed mmap - WarpStream

S3 API call costs can silently inflate your streaming bill. How WarpStream uses distributed memory-mapped caching across Agents to slash GET request volume.

warpstream.com streaming

98d

Anatomy of a serverless usage based billing system - WarpStream

How WarpStream built usage-based billing for its core product, separating events from metrics to keep pricing logic auditable and updatable post-facto.

warpstream.com streaming

98d

Introducing WarpStream Managed Data Pipelines for BYOC clusters - WarpStream

WarpStream Managed Data Pipelines: fully-managed Bento inside your VPC — SaaS UX, data stays in your account, YAML-driven pipelines with rollback support.

warpstream.com streaming

98d

Kafka as a KV Store: deduplicating millions of keys with just 128 MiB of RAM - WarpStream

How WarpStream implemented Kafka compacted topics with only 128 MiB RAM — tracking millions of dedup keys without traditional broker memory overhead.

warpstream.com streaming

98d

The Road to 100PiBs and Hundreds of Thousands of Partitions: Goldsky Case Study - WarpStream

How Goldsky scaled blockchain data to 100 PiB and 100K+ partitions on WarpStream for 10x cheaper than their previous Kafka vendor with zero bottlenecks. This production war story shows the benefits of WarpStream for high-volume streaming data.

warpstream.com streaming

98d

Getting started with WarpStream on Tigris - WarpStream

Run WarpStream on Tigris for globally distributed, durable Kafka streaming. This setup eliminates region-specific bucket planning and hidden data transfer fees, offering a streamlined approach to managing streaming infrastructure.

warpstream.com streaming

98d

No record left behind: How WarpStream can withstand cloud provider regional outages - WarpStream

WarpStream Multi-Region Clusters deliver RPO=0, ensuring that every acknowledged write survives full regional cloud outages without manual failover or tuning. This post highlights the robustness of WarpStream for mission-critical streaming applications.

warpstream.com streaming

98d

Structured Logging in .NET with Serilog and ClickHouse

Learn how to send structured .NET logs directly to ClickHouse using Serilog — with full schema control, full-text search, and SQL queries over your log data. This post provides a step-by-step guide for setting up and using the integration.

clickhouse.com clickhouse

98d

Inside our approach to the Model Spec

Learn how OpenAI’s Model Spec serves as a public framework for model behavior, balancing safety, user freedom, and accountability as AI systems advance. This post details the considerations and mechanisms used to ensure responsible AI deployment.

openai.com llm

98d

I Used AI to Do Real Science. It Hallucinated the Data

This article details an experience using AI for scientific research where the AI hallucinated data. It underscores the importance of verifying AI outputs, especially in data-driven fields.

ryan.endacott.me ml

98d

flexvec: SQL Vector Retrieval with Programmatic Embedding Modulation

This paper presents flexvec, a retrieval kernel that exposes the embedding matrix and score array as a programmable surface. It is designed for AI agents and offers opportunities to expose more of the retrieval pipeline to the caller.

arxiv.org vector-db

98d

Show HN: DuckDB community extension for prefiltered HNSW using ACORN-1

A developer implemented ACORN for prefiltered approximate nearest neighbors on a fork of the DuckDB VSS extension, showing significant speedups over brute-force filtering on high-dimensional vector workloads.

github.com duckdb

98d

No Classification without Representation

This article explains how Materialize enhances query performance by compiling SQL's complex type system into simpler representation types. This process helps reduce unnecessary casts, enables more effective query optimizations, and generally increases processing efficiency.

materialize.com streaming

98d

Introducing the OpenAI Safety Bug Bounty program

OpenAI launches a Safety Bug Bounty program to identify AI abuse and safety risks, including agentic vulnerabilities, prompt injection, and data exfiltration. This program encourages community participation in enhancing the security and robustness of AI models.

openai.com llm

98d

Introducing the NUMBER data type

Trino is adding support for the NUMBER data type to handle high-precision numeric types beyond the existing DECIMAL limit. This will allow Trino to query data from sources that use these types without loss of precision, improving interoperability.

trino.io trino

98d

Smarter Auto-Scaling for ClickHouse: The Two-Window Approach

ClickHouse Cloud's two-window recommender and target-tracking CPU algorithm cut scale-down latency from 30 hours to 3 hours while eliminating oscillations and reducing infrastructure costs. The post details the algorithm and its impact on autoscaling performance.

clickhouse.com clickhouse

98d

Zero Ops Schema Migration: WarpStream Schema Linking - WarpStream

WarpStream Schema Linking mirrors Confluent-compatible schema registries into a BYOC instance, preserving schema IDs and compatibility rules for disaster recovery.

warpstream.com streaming

98d

Zero Disks is Better (for Kafka) - Diskless Kafka - WarpStream

WarpStream's diskless, S3-native Kafka implementation aims to replace traditional broker-based streaming by eliminating disks, brokers, and inter-AZ costs.

warpstream.com streaming

98d

Your WarpStream Questions, Answered - WarpStream

This article answers questions about WarpStream's architecture, BYOC vs. Serverless options, pricing, Kafka compatibility, performance trade-offs, and zero-disk streaming.

warpstream.com streaming

98d

WarpStream S3 Express One Zone Benchmark and Total Cost of Ownership - WarpStream

After S3EOZ's 85% price drop, WarpStream benchmarks show 3x better latency vs. standard S3 at just 15% higher TCO. Full methodology and real workload numbers are presented, offering a practical comparison for those considering cloud-based streaming solutions.

warpstream.com streaming

99d

WarpStream Diagnostics: Keep Your Data Stream Clean and Cost-Effective - WarpStream

WarpStream Diagnostics continuously scans clusters for cost inefficiencies and health issues, surfacing actionable fixes before they become incidents. This proactive approach helps data engineers maintain optimal performance and prevent costly disruptions in their streaming data pipelines.

warpstream.com streaming

99d

WarpStream + Materialize: Simpler Streaming for Operational Data Products - WarpStream

WarpStream + Materialize centralize streaming business logic in SQL with dbt-style version control, enabling operational data products without brittle ETL pipelines. This combination offers a streamlined approach for building and managing real-time data applications.

warpstream.com streaming

99d

Unlocking Idempotency with Retroactive Tombstones - WarpStream

Kafka idempotent producers without stateful brokers require rethinking deduplication. WarpStream uses retroactive tombstones to separate data from metadata, providing a technical solution for ensuring data integrity in streaming applications.

warpstream.com streaming

99d

Tiered Storage Won’t Fix Kafka - WarpStream

Tiered storage still runs stateful brokers with expensive disks and inter-AZ replication. It does not solve the real cost problem at the heart of Kafka, offering a critical analysis of a common architectural pattern.

warpstream.com streaming

99d

The Original Sin of Cloud Infrastructure - WarpStream

OSS big data tools like Kafka were built for hyper-scalers, then given to everyone. The article discusses why on-prem assumptions in open source infra cause pain in the cloud, offering a high-level perspective on cloud infrastructure design.

warpstream.com streaming

99d

The Kafka Metric You're Not Using: Stop Counting Messages, Start Measuring Time - WarpStream

Kafka offset-based consumer lag is misleading when message sizes vary. The post shows how to instrument time-based lag metrics for an accurate view of consumer group health, offering a practical solution for monitoring streaming applications.

warpstream.com streaming

99d
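
A minimal sketch of the idea in the entry above, assuming time-based lag is defined as the gap between the timestamp of the newest record in a partition and the timestamp of the last record the consumer group has processed. The function names and sample numbers are hypothetical, and the post itself may define or instrument the metric differently.

```python
def offset_lag(log_end_offset: int, committed_offset: int) -> int:
    """Classic lag: how many records the consumer is behind."""
    return max(0, log_end_offset - committed_offset)


def time_lag_seconds(newest_record_ts_ms: int, last_consumed_ts_ms: int) -> float:
    """Time-based lag: how far behind in wall-clock terms the consumer is."""
    return max(0, newest_record_ts_ms - last_consumed_ts_ms) / 1000.0


# Two hypothetical partitions with identical offset lag but different time lag.
busy = {"end": 101_000, "committed": 100_000,
        "newest_ts": 1_700_000_010_000, "consumed_ts": 1_700_000_008_000}
quiet = {"end": 2_000, "committed": 1_000,
         "newest_ts": 1_700_000_010_000, "consumed_ts": 1_699_999_410_000}

for name, p in (("busy", busy), ("quiet", quiet)):
    print(name,
          offset_lag(p["end"], p["committed"]),
          time_lag_seconds(p["newest_ts"], p["consumed_ts"]))
```

Both partitions are 1,000 records behind, yet one is 2 seconds behind and the other 10 minutes, which is why an offset count alone can mislead.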

The Hitchhiker's Guide to Disaster Recovery and Multi-Region Kafka - WarpStream

The article provides a practical guide to Kafka disaster recovery and multi-region data sharing. It discusses how WarpStream Active-Active clusters achieve RPO=0 with no tuning required, showcasing an approach to building highly available streaming infrastructure.

warpstream.com streaming

99d

How Moda Builds Production-Grade AI Design Agents with Deep Agents

Moda leverages a multi-agent system built on Deep Agents and LangSmith to enable non-designers to create professional-grade visuals. The article highlights a specific use case of AI agents in a design context.

blog.langchain.com llm

99d

How Netflix Live Streams to 100 Million Devices in 60 Seconds

This article outlines the architecture that allows Netflix to live stream to 100 million devices in 60 seconds. It focuses on the challenges and solutions involved in building a large-scale live streaming system.

blog.bytebytego.com architecture

99d

Production-Ready LLM Agents: A Comprehensive Framework for Offline Evaluation

This article focuses on offline evaluation frameworks for production-ready LLM agents. It tackles the challenge of proving that these complex systems work reliably before deployment.

towardsdatascience.com ml

99d

Sandboxing AI agents, 100x faster

Cloudflare introduces Dynamic Workers for executing AI-generated code in secure, lightweight isolates. This technique achieves millisecond startup times, significantly faster than traditional container-based sandboxing for AI agents.

blog.cloudflare.com engineering

99d

Building high-performance full-text search for object storage

The ClickHouse blog details the design of their new text index for high-performance full-text search, especially when data is stored in object storage. The post explains how the design maintains speed at scale.

clickhouse.com clickhouse

99d

Intelligent security at ClickHouse speed: How Cogent Security built an AI-native vulnerability management platform

Cogent Security explains how they built an AI-native vulnerability management platform using ClickHouse. They emphasize ClickHouse's speed as crucial for countering AI-enabled attacks.

clickhouse.com clickhouse

99d

BubbleRAG: Evidence-Driven Retrieval-Augmented Generation for Black-Box Knowledge Graphs

This paper introduces BubbleRAG, an evidence-driven retrieval-augmented generation approach for black-box knowledge graphs. It aims to address the limitations of existing graph-based RAG approaches related to recall and precision.

arxiv.org knowledge-graphs

99d

How Stripe Radar helps prevent free trial abuse

Stripe Engineering details how Radar uses machine learning to prevent free trial abuse. The system predicts abusive behavior with 90% accuracy, based on common trial terms violations.

stripe.com engineering

99d

How Agentic RAG Works?

In this article, we will look at how agentic RAG works, how it improves upon standard RAG, and the trade-offs that should be considered.

blog.bytebytego.com architecture

100d

Inside Gen 13: how we built our most powerful server yet

Cloudflare's Gen 13 servers introduce AMD EPYC™ Turin 9965 processors and a transition to 100 GbE networking to meet growing traffic demands. In this technical deep dive, we explain the engineering rationale behind each major component selection.

blog.cloudflare.com engineering

100d

Launching Cloudflare’s Gen 13 servers: trading cache for cores for 2x edge compute performance

Cloudflare’s Gen 13 servers double our compute throughput by rethinking the balance between cache and cores. Moving to high-core-count AMD EPYC™ Turin CPUs, we traded large L3 cache for raw compute density. By running our new Rust-based FL2 stack, we completely mitigated the latency penalty to unlo

blog.cloudflare.com engineering

100d

Process Faster, Pay Less: Functional Isolation for Stream Processing

This arXiv paper presents a novel approach to stream processing by exploring functional isolation to reduce infrastructure costs. It discusses how concurrent workloads can extract insights from real-time data streams while optimizing resource utilization.

arxiv.org streaming

100d

ReViSQL: Achieving Human-Level Text-to-SQL

The paper introduces ReViSQL, an approach to translating natural language to SQL, aiming to achieve human-level performance. The research focuses on enhancing SQL reasoning by utilizing large language models and AI agents to decompose complex queries.

arxiv.org semantic-layer

100d

A Comprehensive Survey on Vector Database: Storage and Retrieval Technique, Challenge

This paper provides a comprehensive survey on vector databases, covering storage and retrieval techniques, as well as challenges in the field. Vector databases are increasingly important due to their integration with large language models and applications in machine learning.

arxiv.org vector-db

100d

Announcing DuckDB 1.5.1

DuckDB 1.5.1 is released, including fixes and Lance support. The release notes are available on GitHub, and the new version can be installed from the installation page.

duckdb.org duckdb

100d

How we give every user SQL access to a shared ClickHouse cluster | Trigger.dev

Trigger.dev describes their architecture for giving every user SQL access to a shared ClickHouse cluster. This pattern can be useful for connecting agents to platforms and building ETL systems.

reddit.com clickhouse

102d

The Math That’s Killing Your AI Agent

This article uses compound probability to illustrate how seemingly accurate AI agents can fail in multi-step tasks. It also proposes a pre-deployment framework to mitigate such failures in production.

towardsdatascience.com ml

103d
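
The compound-probability argument in the entry above can be sketched with a few lines of arithmetic. This is a minimal illustration, not the article's own framework: the function name `chain_success`, the 95% per-step accuracy, and the assumption that steps succeed independently are all hypothetical choices made here for the example.

```python
def chain_success(step_accuracy: float, steps: int) -> float:
    """Probability that every step of a multi-step task succeeds.

    Assumes each step succeeds independently with the same probability,
    which is a simplification chosen for this illustration.
    """
    if not 0.0 <= step_accuracy <= 1.0:
        raise ValueError("step_accuracy must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return step_accuracy ** steps


if __name__ == "__main__":
    # Hypothetical agent that is right 95% of the time on any single step.
    for steps in (1, 5, 10, 20):
        print(f"{steps:>2} steps: {chain_success(0.95, steps):.1%} end-to-end")
```

Under these assumptions, a 95%-accurate step repeated ten times completes the whole chain only about 60% of the time, which is the kind of gap between per-step and end-to-end reliability the entry describes.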

Kafka Streams with 300M+ keys in RocksDB - DR rebuild takes 45+ mins to 2hrs even from changelog. Anyone solved this?

A Kafka Streams application with a large RocksDB state store (300M+ keys) experiences slow DR rebuild times (45+ minutes to 2 hours). The author is looking for solutions to improve rebuild performance, which is a common challenge in Kafka Streams.

reddit.com kafka

103d

Kafka Isn’t a Database, But We Gave It a Query Engine Anyway

WarpStream has added a built-in observability layer that stores structured operational events directly as Kafka topics in object storage, allowing users to query those topics.

reddit.com kafka

103d

Kafka Isn’t a Database, But We Gave It a Query Engine Anyway - WarpStream

This blog post describes WarpStream Events, an observability layer that captures Agent logs, ACL decisions, and pipeline execution logs, and allows users to search and visualize them with zero ops.

warpstream.com streaming

103d

Agentic RAG Failure Modes: Retrieval Thrash, Tool Storms, and Context Bloat (and How to Spot Them Early)

This article discusses potential failure modes in agentic RAG systems, including retrieval thrash, tool storms, and context bloat, and suggests methods for early detection to avoid high cloud costs.

towardsdatascience.com ml

103d

Speeding up Timely Dataflow by 100x

This article presents a detailed example of how timely dataflow's approach to progress tracking can achieve orders of magnitude more efficiency than other stream processors. It explains the mechanisms behind achieving a 100x speedup.

materialize.com streaming

103d

From ClickHouse + Elasticsearch to Apache Doris: How Kwai Unified Trillion-Scale Ad Analytics

Kwai, a short-video platform, unified its advertising analytics by migrating from ClickHouse and Elasticsearch to Apache Doris. This resulted in up to 90% latency reduction and a 3x increase in write throughput.

doris.apache.org analytics

103d

DuckDB.ExtensionKit: Building DuckDB Extensions in C#

DuckDB has a flexible extension mechanism that allows extensions to be loaded dynamically at runtime, and this post shows how to build them in C#. This extension mechanism can add support for new file formats, introduce custom types, or provide specialized analytical functions.

duckdb.org duckdb

103d

How we monitor internal coding agents for misalignment

OpenAI details their approach to monitoring internal coding agents using chain-of-thought analysis to detect and mitigate risks of misalignment, focusing on AI safety.

openai.com llm

104d

PondDB – Self-hosted agent memory database built on DuckDB

PondDB is presented as a self-hosted agent memory database built on DuckDB.

github.com duckdb

104d

Show HN: Blobsearch – Object storage and DuckDB based Elasticsearch alternative

The article introduces Blobsearch, an Elasticsearch alternative based on object storage (like S3) and DuckDB for querying logs rapidly. It focuses on using a durable storage solution (S3 with Parquet) combined with the analytical capabilities of DuckDB for cost-effective log analysis and monitoring

github.com duckdb

104d

Ten years late to the dbt party (DuckDB edition)

A late-adopter walkthrough of using dbt with DuckDB for local development, covering the setup, tradeoffs, and workflow differences compared to cloud warehouses.

rmoff.net duckdb

104d

Friend Bubbles: Enhancing Social Discovery on Facebook Reels

Friend bubbles in Facebook Reels highlight Reels your friends have liked or reacted to, helping you discover new content and making it easier to connect over shared interests. This article explains the technical architecture behind friend bubbles, including how machine learning estimates relationshi

engineering.fb.com engineering

105d

Beam Metrics in ClickHouse

This article explores using Apache Beam to ingest metrics into ClickHouse; this provides insights into how to leverage a data processing framework for efficient metric storage and analysis in a columnar database.

andrealeopardi.com clickhouse

105d

Kafka Is Dead, Long Live Kafka: The Case for Diskless, S3-Native Streaming - WarpStream

WarpStream eliminates disks, brokers, and inter-AZ costs by using an S3-native architecture. This article makes the case for this new approach to streaming.

warpstream.com streaming

105d

Show HN: Parsing hostile industrial data in 64MB WASM sandboxes

Ingelt is a Rust/Axum gateway that parses 33 legacy industrial protocols inside 64MB WebAssembly sandboxes. The WASM isolation approach prevents malformed input from crashing the host process.

ingelt.com community

105d

How ClickStack makes ClickHouse faster for observability

This post details how ClickStack integrates with ClickHouse to optimize queries for observability workloads. It covers techniques like progressive time window pagination, chunked charts, and automated use of materialized views, offering insights into performance tuning.

clickhouse.com clickhouse

105d

How one query ate 2 TB of RAM

This Postgres Weekly article discusses how a badly written query triggered the OOM (out-of-memory) killer despite ample RAM. The culprit was `work_mem` exceeding expectations, making this a cautionary tale about resource allocation and query optimization in Postgres.

postgresweekly.com postgres

105d

Introducing Redpanda AI SDK for Go

Redpanda is open-sourcing their AI SDK for Go, designed for observable, resilient, and production-grade AI tooling.

redpanda.com streaming

105d

Nemotron 3 Nano 4B: A Compact Hybrid Model for Efficient Local AI

Nemotron 3 Nano 4B is presented as a compact LLM suitable for local AI, offering an efficient option for running inference on resource-constrained devices. Staff+ ML engineers working on edge deployment or low-latency applications should investigate this model's architecture and performance characte

huggingface.co ml

105d

How (some) good corporate engineering blogs are written (2020)

This article analyzes the characteristics of well-written corporate engineering blogs. It provides valuable insights for creating engaging and informative technical content within organizations.

danluu.com community

105d

Show HN: System that rediscovers physics laws from raw data autonomously

ProtoScience is a deterministic pipeline designed to autonomously discover governing equations from raw numerical data. The system uses sparse regression and statistical validation without relying on LLMs.

protoscience.ai llm

105d

Get Shit Done: A Meta-Prompting, Context Engineering and Spec-Driven Dev System

The Get Shit Done framework uses meta-prompting and spec-driven development to structure LLM-powered system builds. It emphasizes generating detailed specifications before code, reducing iteration cycles.

github.com agents

105d

How a semantic layer prevents AI hallucinations in analytics

This dbt blog post discusses how a semantic layer provides a consistent and governed foundation for AI systems used in analytics.

getdbt.com semantic-layer

105d

Ranking Engineer Agent (REA): The Autonomous AI Agent Accelerating Meta’s Ads Ranking Innovation

Meta’s Ranking Engineer Agent (REA) autonomously executes key steps across the end-to-end machine learning (ML) lifecycle for ads ranking models. This post covers REA’s ML experimentation capabilities: autonomously generating hypotheses, launching training jobs, debugging failures, and iterating on

engineering.fb.com engineering

105d

Orchestrating Self-Evolving Agents with CrewAI and NVIDIA NemoClaw

The post discusses a shift towards autonomous, continuously evolving AI agents and integrations with tools like NVIDIA NemoClaw.

blog.crewai.com agents

106d

Show HN: I replaced Postgres and ClickHouse with one binary for web analytics

The author implemented a single binary solution to replace both Postgres and ClickHouse for web analytics.

github.com clickhouse

106d

Context Engineering from the Inside Out

This article explores context engineering, a topic critical for building AI-ready data systems. The post discusses designing data systems for AI consumption, machine-readable metadata, and contextual memory, providing insights into creating effective data pipelines for AI applications.

blog.yellowday.day community

106d

Introduction to Data-Centric Query Compilation

An introduction to data-centric query compilation, covering how modern engines like HyPer and Umbra generate machine code from query plans by pushing data through tight loops rather than pulling through iterator trees.

duckul.us duckdb

106d

Underrated Postgres: Create (Extended) Statistics

This article highlights the importance of extended statistics in Postgres for query optimization. It likely covers how to create and use extended statistics to improve query performance, especially for complex queries or datasets with skewed data distributions.

vela.simplyblock.io postgres

106d

How Reddit Migrated Petabyte-Scale Kafka from EC2 to Kubernetes

Reddit's engineering team migrated their petabyte-scale Kafka deployment from EC2 to Kubernetes. The article details the challenges they faced and the solutions implemented for a successful migration.

blog.bytebytego.com kafka

106d

DataOps Best Practices with Dagster: CI/CD, Monitoring & Data Quality

This Dagster blog post details CI/CD workflows using branch deployments, automatic retries, and backfill strategies; it also covers data quality via asset checks and monitoring with Dagster Insights, offering actionable advice for managing production data pipelines.

dagster.io orchestration

106d

Lower your warehouse costs via DuckDB transpilation

This article explores using DuckDB transpilation to reduce warehouse costs. It could involve techniques for rewriting SQL queries to leverage DuckDB's efficient execution or using DuckDB as a local processing layer before data warehousing, offering a practical method for cost optimization.

maxhalford.github.io duckdb

106d

Subagents

Covers subagent patterns for building composable AI agents that delegate tasks to specialized sub-agents, with practical implementation details.

simonwillison.net llm

106d

How Socialpruf built a faster, more reliable data stack by replacing Neon with Postgres managed by ClickHouse

Socialpruf replaced Neon with Postgres managed by ClickHouse, resulting in up to 5x faster query performance. The migration eliminated network transfer costs and improved the speed of real-time social analytics dashboards.

clickhouse.com clickhouse

106d

How a Neural Network Learned Its Own Fraud Rules: A Neuro-Symbolic AI Experiment

This article explores a neuro-symbolic system where a neural network learns fraud rules automatically, extracting IF-THEN rules during training. The experiment uses a hybrid neural network with a differentiable rule-learning module on a Kaggle dataset.

towardsdatascience.com ml

106d

Building a product analytics warehouse on vanilla Postgres

This article discusses building a product analytics warehouse directly on Postgres. The article likely details schema design choices, performance optimization strategies (indexing, partitioning), and extension usage (like pgvector) relevant for those using Postgres beyond traditional transactional w

xata.io postgres

106d

How 5 Databases Scale Across Concurrency, Data, and Nodes

The article compares Exasol, ClickHouse, StarRocks, Trino, and DuckDB across concurrency, data volume, and node scaling. The comparison could highlight architectural differences, performance trade-offs, and suitability for different analytical workloads across these popular SQL engines.

exasol.com duckdb

106d

We just released Flock v0.7.0: A native DuckDB extension to run RAG, Claude, and LLM metrics directly in SQL

Flock v0.7.0 is a DuckDB extension that allows users to run RAG, Claude, and LLM metrics directly in SQL. The extension aims to eliminate the need to move data into Python scripts for semantic tasks.

reddit.com duckdb

106d

Show HN: Avalon - Synthetic FHIR R4 patient data as OMOP CDM 5.4 views

Avalon is a synthetic clinical data pipeline: generate realistic FHIR R4 patient data, normalize it through Forge, and query it as OMOP CDM 5.4 views. It is an end-to-end pipeline that turns Synthea-generated FHIR bundles into clean, documented, queryable tables in BigQuery, then la

github.com community

106d

materialize.com llm

114d

Object storage-native database for search

This article details the architectural design of a vector database built natively on object storage, focusing on how this approach enables efficient search capabilities. It explores the underlying principles and engineering trade-offs of such a design.

turbopuffer.com vector-db

114d

Ulysses Sequence Parallelism: Training with Million-Token Contexts

huggingface.co ml

How AI improves data lineage at scale

Discover how AI accelerates data lineage with automated docs, testing, and scalable governance.

getdbt.com dbt

119d

The Decline of RAG in Agentic AI | Airbyte

Explore the decline of traditional RAG in the era of agentic AI, and how autonomous agents are reshaping retrieval, reasoning, and knowledge workflows.

airbyte.com data-engineering

119d

This article covers agent development with CockroachDB using the LangChain framework, highlighting the integration's support for building production-ready agentic AI applications.

cockroachlabs.com agents

125d

Netflix discusses scaling LLM post-training. The article explains their approach and challenges.

netflixtechblog.com engineering

138d

Supabase incident on February 12, 2026

Supabase provides a detailed account of the February 12 outage in us-east-2, explaining the root cause and the steps taken to prevent it from happening again. The article provides insight into the incident and the measures implemented to improve system reliability.

supabase.com postgres

138d

How dbt Labs reduced dbt-related compute costs by 64% with Fusion and state-aware orchestration

With intelligent orchestration and optimization, we achieved a 64% reduction in compute costs and simplified our job architecture.

getdbt.com dbt

138d

Automating RDS Postgres to Aurora Postgres Migration

This Netflix Tech Blog post discusses automating the migration of RDS Postgres to Aurora Postgres. The article likely details the challenges, solutions, and lessons learned during this process, offering insights for others undertaking similar migrations.

netflixtechblog.com postgres

139d

How to build a distributed queue in a single JSON file on object storage

The article outlines a method for constructing a global distributed queue using a single JSON file stored on object storage. It describes the evolution of this system, starting with basic file usage and progressing to incorporate write batching, a stateless broker component, and high-availability.

turbopuffer.com architecture

139d

Apache Doris + Paimon: A Faster Lakehouse for Web3 On-Chain Analytics

This Apache Doris blog post covers how Apache Doris and Apache Paimon can be used to build a unified lakehouse for Web3 on-chain analytics, claiming 5x faster ETL than Spark and 2x faster data lake queries than Trino.

doris.apache.org analytics

139d

Stop Guessing: A Systematic Guide to Fixing CUDA Out of Memory Errors in GRPO Training

A practical guide to diagnosing GPU memory issues instead of randomly changing hyperparameters until something works. Last week, I was building a reinforcement learning model for a customer using GRPO...

mlops.community mlops

141d

What We Learned Building OAuth Flows with MCP Apps | Airbyte

Discover practical lessons from building OAuth flows with MCP apps, including OAuth 2.0 patterns, security issues, and implementation tips.

airbyte.com data-engineering

141d

High-Throughput Graph Abstraction at Netflix: Part I

This Netflix Tech Blog post covers high-throughput graph abstraction. The article likely describes the architecture, implementation, and performance considerations of their graph abstraction system, offering practical insights for building similar systems.

netflixtechblog.com knowledge-graphs

141d

Building Prometheus: How Backend Aggregation Enables Gigawatt-Scale AI Clusters

This article shares details of the role backend aggregation (BAG) plays in building Meta’s gigawatt-scale AI clusters like Prometheus. BAG allows Meta to seamlessly connect thousands of GPUs across multiple data centers and regions. Their BAG implementation is connecting two different network fabric

engineering.fb.com engineering

142d

The Data Canary: How Netflix Validates Catalog Metadata

This Netflix Tech Blog post details how Netflix validates catalog metadata using a 'Data Canary' system. The article likely explains the architecture, implementation, and benefits of this system for ensuring data quality and reliability.

netflixtechblog.medium.com data-engineering

145d

How to Build a Real-Time Web3 Analysis Infrastructure with Apache Doris and Flink

This Apache Doris blog post explains how to build a real-time Web3 analytics platform using Apache Flink and Apache Doris, enabling sub-second queries on billions of blockchain transactions.

doris.apache.org analytics

146d

The Missing Layer in Your AI Stack: Context, Not Just State

This article discusses how context graphs can improve AI agent performance, emphasizing the shift from simple state management to incorporating semantic understanding of the data.

dataengineeringweekly.com data-engineering

151d

Data Bridge: How Netflix simplifies data movement

This Netflix Tech Blog post discusses 'Data Bridge', a system Netflix uses to simplify data movement. The article likely explains the architecture, implementation, and benefits of this system for improving data pipeline efficiency and reducing complexity.

netflixtechblog.com data-engineering

151d

I replaced a $120/year micro-SaaS in 20 minutes with LLM-generated code

This post explores how an individual replaced a paid SaaS subscription with LLM-generated code in just 20 minutes; this highlights the potential for LLMs to disrupt simple SaaS business models, especially for products that are not actively maintained.

blog.pragmaticengineer.com engineering

153d

The AI Evolution of Graph Search at Netflix

This Netflix Tech Blog post covers the AI evolution of graph search at Netflix. The article likely describes how they're using AI to improve graph search capabilities, offering insights into building intelligent search systems.

netflixtechblog.com knowledge-graphs

156d

Lessons From 2 Billion Agentic Workflows

The article shares lessons learned from observing billions of agentic workflows, focusing on the challenges of moving from a working demo to a production system.

blog.crewai.com agents

158d

Announcing Vortex Support in DuckDB

I think it is worth starting this intro by talking a little bit about the established format for columnar data. Parquet has done some amazing things for analytics. If you go back to the times where CSV was the better alternative, then you know how important Parquet is. However, even if the specific

duckdb.org duckdb

159d

A Missing Layer in Agentic Systems?

The article argues that human-in-the-loop is a missing layer in agentic systems.

blog.crewai.com agents

160d

Inside StarRocks: Why Joins Are Faster Than You’d Expect

This StarRocks blog post dives into the details of join optimization within the StarRocks database, explaining why joins can perform faster than expected. The author is a StarRocks committer and engineer at Celerdata.

starrocks.io clickhouse

161d

ANN v3: 200ms p99 query latency over 100 billion vectors

This article introduces the latest version of an Approximate Nearest Neighbor (ANN) system, highlighting its capability to handle over 100 billion vectors within a single search index. It reports achieving a p99 query latency of 200ms at 1,000 queries per second (QPS) while maintaining 92% recall.

turbopuffer.com vector-db

161d

How I Reduced AI Token Costs by 91% with Semantic Tool Selection and Redis

Building Semantic Tool Selection with Multi-Component Embeddings. Last quarter, our enterprise AI platform hit a wall. We had built an impressive suite of 70+ automated tools covering everything from database...

mlops.community mlops

162d

Apache Doris 4.0: Native Hybrid Search for AI Workloads

Apache Doris now supports native hybrid search for AI workloads. The new functionality allows vector search, full-text search, and structured analytics within a single SQL engine, enabling AI-powered applications to leverage a unified data platform.

doris.apache.org analytics

162d

Implement dbt Data Quality Checks with dbt-expectations

Deep technical guide to dbt-expectations covering regex validation, freshness/SLA checks, completeness validation within time windows, JSON schema validation, statistical distribution checks, and cross-column logic. Shows integration with production monitoring.

datadoghq.com data-quality

162d

The 2026 Data Mandate: Is Your Governance Architecture a Fortress or a Liability?

Examines how the EU AI Act, Cyber Resilience Act, and Data Act turn messy data from a performance tax into a legal liability. Covers the August 2026 deadline for High-Risk AI system compliance and argues governance must shift from reactive cleanup to embedded-by-design architecture.

towardsdatascience.com governance

167d

Designing inverted indexes in a KV-store on object storage

The article describes the redesign of an inverted index structure, detailing the adoption of fixed-sized posting blocks within a key-value store built on object storage. This architectural change resulted in a tenfold reduction in index size and a dramatic increase in system throughput.

turbopuffer.com vector-db

168d

Architecting the AI Agent Platform: A Definitive Guide

The velocity of Generative AI has been nothing short of relentless. In the span of just 24 months, the industry has shifted paradigms three times. We started with the raw...

mlops.community mlops

169d

A Critique of Iceberg REST Catalog: A Classic Case of Why Semantic Spec Fails

This article presents a critique of the Iceberg REST catalog, arguing that a semantically correct API can become operationally unreliable at scale.

dataengineeringweekly.com data-engineering

173d

Why We Use Separate Tech Stacks for Personalization and Experimentation

This Spotify Engineering blog post explains the technical and practical rationale for using separate tech stacks for personalization and experimentation. The article likely details the benefits of this separation, such as improved agility and scalability.

engineering.atspotify.com engineering

175d

Why BM25 queries with more terms can be faster (and other scaling surprises)

This article presents an analysis of how BM25 query latencies vary with document count and the top_k parameter. It explores surprising scaling characteristics, noting that longer queries may scale less efficiently and that the presence of essential terms can impact performance in unexpected ways.

turbopuffer.com vector-db

175d

Build a real-time lakehouse architecture with Redpanda and Databricks

This post outlines building a real-time lakehouse architecture using Redpanda's Iceberg Topics and Databricks Unity Catalog for analytics-ready tables, eliminating the need for batch processing and orchestration.

redpanda.com streaming

176d

Weaviate 1.35 Release

Weaviate 1.35 introduces Object Time-to-Live (TTL), zstd compression support, flat index RQ quantization, multimodal support with Weaviate Embeddings, and runtime configurable OIDC certificates.

weaviate.io vector-db

184d

Fast JSON Analytics in Apache Doris: 100x Faster Than PostgreSQL and MongoDB

Apache Doris introduces the VARIANT data type for high-performance JSON analytics. Optimizations such as dynamic subcolumns, sparse columns, schema templates, lazy materialization, and path-based indexing allow it to outperform PostgreSQL and MongoDB in JSON handling.

doris.apache.org analytics

187d

10 Things You Need to Know to Optimize OLAP Query Performance

This StarRocks blog post provides 10 tips for optimizing OLAP query performance. The author is a StarRocks TSC member and query engine team lead at CelerData.

starrocks.io clickhouse

197d

Build production‑grade agentic AI with Redpanda Connect and broker auditing

Learn how to build secure, observable agentic AI on Redpanda using Connect for AI tools and data access, plus broker audit logs to capture every agent action.

redpanda.com streaming

197d

How ByteDance Solved Billion-Scale Vector Search Problem with Apache Doris 4.0

ByteDance uses Apache Doris 4.0 to solve billion-scale vector search problems. The company leverages Doris's hybrid search capabilities to build a system that balances accuracy, low latency, and cost-efficiency when handling over 1 billion vectors.

doris.apache.org analytics

197d

Iceberg in the Browser

In this post, we describe the current patterns for interacting with Iceberg Catalogs, and pose the question: could it be done from a browser? After elaborating on the DuckDB ecosystem changes required to unlock this capability, we demonstrate our approach to interacting with an Iceberg REST Catalog.

duckdb.org duckdb

197d

How to build Agentic Systems: The Missing Architecture for Production AI Agents

The article identifies architecture as the key challenge in building production multi-agent systems, based on insights from 1.7 billion workflows.

blog.crewai.com agents

198d

Show HN: Dbxlite -- Query 100M+ rows in a browser tab, no install

Browser-native SQL workbench built on DuckDB WASM. Query CSV, Parquet, Excel, JSON files locally with zero install. Supports AI SQL assistant, BigQuery connector, and shareable URLs. Your data never leaves your machine.

news.ycombinator.com duckdb

201d

The Three Durable Function Forms

This article proposes a model extending generic durable functions into three forms: stateless functions, stateful function objects, and linear function chains. It aims to standardize terminology in durable execution engines by linking concepts like 'workflows' and 'activities' to underlying executio

jack-vanlightly.com architecture

203d

Vectorized MAXSCORE over WAND, especially for long LLM-generated queries

The article describes how text search performance has been improved by up to 20x through the adoption of a vectorized variant of the block-max MAXSCORE algorithm, a technique also employed by Apache Lucene. This enhancement is particularly relevant for handling long queries generated by large langua

turbopuffer.com vector-db

204d

Context Engineering - LLM Memory and Retrieval for AI Agents

This article discusses context engineering, focusing on how AI agents manage LLM memory by selecting, retrieving, and organizing context from short-term and long-term memory. Context engineering is important for improving the reliability of AI agents in production.

weaviate.io vector-db

204d

The Durable Function Tree - Part 2

This post delves into the architecture of durable function trees, exploring their integration within larger systems and the advantages they offer for durable execution.

jack-vanlightly.com architecture

209d

The Durable Function Tree - Part 1

This article explores constructing workflows using durable function calls arranged in trees, built on durable promises and continuations.

jack-vanlightly.com architecture

209d

FTS v2: up to 20x faster full-text search

This article announces a substantial upgrade to a full-text search engine, promising up to a 20x improvement in search performance. The upgrade reflects significant enhancements made to the underlying search architecture.

turbopuffer.com vector-db

209d

Writes in DuckDB-Iceberg

Over the past several months, the DuckDB Labs team has been hard at work on the DuckDB-Iceberg extension, with full read support and initial write support released in v1.4.0. Today, we are happy to announce delete and update support for Iceberg v2 tables is available in v1.4.2! The Iceberg open tabl

duckdb.org duckdb

215d

Demystifying Determinism in Durable Execution

This article explains the concept of determinism within durable execution frameworks, focusing on identifying code sections that must be deterministic.

jack-vanlightly.com architecture

219d

Bringing RAG to Life with Dify and Weaviate

This article explains how to leverage the Dify and Weaviate integration for building Retrieval Augmented Generation (RAG) applications. This integration can be valuable for enhancing LLM applications with external knowledge.

weaviate.io vector-db

223d

The Growing Apache Polaris Ecosystem: The Iceberg Catalog Standard

Technical overview of Apache Polaris as the emerging open catalog standard for Iceberg. Covers multi-engine interoperability (Spark, Flink, Trino, StarRocks), built-in RBAC with table-level security, short-lived credential vending via cloud provider integrations, and Snowflake's managed Polaris offe

dremio.com governance

223d

Have your Iceberg Cubed, Not Sorted: Meet Qbeast, the OTree Spatial Index

This article introduces Qbeast, an OTree spatial index designed for data lakehouses, and discusses its integration with Apache Iceberg and Delta Lake.

jack-vanlightly.com iceberg

224d

Weaviate 1.34 Release

Weaviate 1.34 introduces flat index support with RQ quantization, server-side batching improvements, new client libraries, and Contextual AI integration. These features offer potential performance and functionality improvements for the vector database.

weaviate.io vector-db

232d

Apache Doris Tops JSONBench in Cold Queries and Data Quality

Apache Doris achieves top performance in the JSONBench benchmark, particularly in cold query performance and data quality. The benchmark measures query performance and data handling capabilities when processing JSON data.

doris.apache.org analytics

237d

How Would You Like Your Iceberg Sir? Stream or Batch Ordered?

This article discusses how stream and batch analytics can be built on Apache Iceberg, highlighting potential conflicts due to their differing requirements.

jack-vanlightly.com iceberg

238d

Billion-scale vector storage for RAG

This article explores the architectural considerations and engineering approaches necessary for building vector storage systems capable of scaling to billions of vectors. It specifically addresses these challenges within the context of Retrieval Augmented Generation (RAG) applications.

turbopuffer.com vector-db

239d

Apache Doris Achieves 70% Better Price-Performance on ARM-based AWS Graviton

Apache Doris achieves 70% better price-performance on ARM-based AWS Graviton instances compared to x86. The benchmark results, gathered from standard OLAP tests such as ClickBench, SSB, TPC-H, and TPC-DS, demonstrate the efficiency of Doris on ARM architecture.

doris.apache.org analytics

243d

New trend: programming by kicking off parallel AI agents

This article highlights the emerging trend of developers utilizing multiple AI agents in parallel to generate code. It explores the potential benefits and challenges of this approach to programming.

blog.pragmaticengineer.com engineering

244d

He built a new database in his bedroom

The article describes the process of building a new vector database, detailing the architectural choices, design philosophy, and implementation challenges encountered. It outlines how specific technical hurdles were addressed during its development.

turbopuffer.com vector-db

244d

A Fork in the Road: Deciding Kafka’s Diskless Future

This article examines the Kafka community's efforts to lower replication costs by discussing KIP-1150, KIP-1176, and KIP-1183.

jack-vanlightly.com kafka

252d

Apache Airflow CTL aka airflowctl 0.1.0

The article announces the initial release of `airflowctl`, version 0.1.0, a new secure, API-driven command-line interface for Apache Airflow. This CLI is designed to align with modern API communication and auditability standards.

airflow.apache.org orchestration

259d

The 2026 Open-Source Data Quality and Data Observability Landscape

Comprehensive landscape of open-source data quality tools including Soda Core, Elementary Data, dbt Tests, and DataKitchen TestGen. Explores how the community is democratizing observability capabilities previously locked behind expensive platforms, and how AI is being used to automate test generatio

datakitchen.io data-quality

259d

Beyond Indexes: How Open Table Formats Optimize Query Performance

This article examines query performance optimization techniques in open table formats such as Apache Iceberg, highlighting methods beyond standard indexing.

jack-vanlightly.com iceberg

266d

Apache Doris Up to 34x Faster Than ClickHouse in Real-Time Updates

Apache Doris is shown to be significantly faster than ClickHouse in real-time updates, according to benchmark results. Using ClickBench and SSB (Star Schema Benchmark), Apache Doris outperforms ClickHouse by 18-34x in SSB and 2.5-4.6x in ClickBench.

doris.apache.org analytics

273d

Your Data Contracts Are in the Wrong Spot

Argues that most organizations place data contracts in the wrong part of the lifecycle, causing enforcement gaps. Makes the case for contracts closer to the producer, not the consumer, with practical guidance on where they should sit architecturally.

dataproducts.substack.com data-quality

273d

Apache Airflow 3.1.0: Human-Centered Workflows

The article announces the release of Apache Airflow 3.1.0, an update that integrates human decision-making into automated processes. It also introduces comprehensive internationalization support and substantial developer experience enhancements.

airflow.apache.org orchestration

279d

Search Mode Benchmarking

Learn how Search Mode compares against Hybrid Search on the BEIR, LoTTe, BRIGHT, EnronQA, and WixQA Information Retrieval benchmarks.

weaviate.io vector-db

281d

Training an LLM-RecSys Hybrid for Steerable Recs with Semantic IDs

This article describes training an LLM that can converse in English and item IDs, making recommendations without retrieval or tools.

eugeneyan.com ml

290d

Deep Dive: Data Pruning in Apache Doris

Apache Doris utilizes various data pruning techniques to optimize query performance by skipping unnecessary data processing. This article dives into the implementation and strategies behind these data pruning techniques within the Doris architecture.

doris.apache.org analytics

296d

Apache Doris Up To 40x Faster Than ClickHouse | OLAP Showdown Part 2

Apache Doris demonstrates superior performance over ClickHouse in various benchmarks including CoffeeBench, TPC-H, and TPC-DS. The benchmarks show that Doris consistently outperforms ClickHouse, showcasing its efficiency and speed in OLAP workloads.

doris.apache.org analytics

297d

Column-Level Lineage in Fabric Spark with OpenLineage, Stashed in Delta Lake

Production-oriented guide showing how to capture column-level lineage in Microsoft Fabric Spark (which ships with OpenLineage pre-installed). Describes a Spark Plugin architecture where a REST API collects lineage events from an OpenLineage Listener, buffering them into Delta Tables for queryable li

rakirahman.me lineage

299d

Chunking Strategies to Improve LLM RAG Pipeline Performance

Learn how chunking strategies improve LLM RAG pipelines, retrieval quality, and agent memory performance across production AI systems.

weaviate.io vector-db

300d

Understanding Apache Fluss

This post delves into the internal workings of Apache Fluss, offering a detailed exploration for those interested in data system internals.

jack-vanlightly.com architecture

302d

8-bit Rotational Quantization: How to Compress Vectors by 4x and Improve the Speed-Quality Tradeoff of Vector Search

Get spun around by our new vector quantization algorithm that utilizes the power of random rotations to improve the speed-quality tradeoff of vector search with Weaviate.

weaviate.io vector-db

309d

A Conceptual Model for Storage Unification

This article introduces a conceptual model for storage unification, designed to present diverse storage systems and formats as a unified resource.

jack-vanlightly.com architecture

314d

Iceberg Catalogs 2025: Exploring Emerging Metadata Solutions

Compares next-generation Iceberg catalogs: Nessie (Git-style branching for data), Apache Polaris, Apache Gravitino, Lakekeeper, and Unity Catalog. Explains how these move beyond simple table-name resolution to provide version control, federated views, fine-grained policies, and multi-engine freedom.

e6data.com governance

320d

Data Quality Frameworks Comparison: Great Expectations, Soda Core, dbt, Deequ

Side-by-side technical comparison of Great Expectations, Soda Core, dbt tests, and Deequ across expressiveness, scalability, integration patterns, and ease of adoption. Provides a decision framework for which tool fits which use case, and discusses layering multiple tools across pipeline stages.

nurbolsakenov.com data-quality

325d

Transcribe speech 100x faster and 100x cheaper with open models

Modal claims that open models can transcribe speech 100 times faster and 100 times cheaper than previous methods.

modal.com ml

343d

Data Pipeline Troubleshooting: Root Cause Analysis Through Lineage Metadata

Builds a complete order processing pipeline with Debezium CDC, Apache Flink transformations, and OpenLineage/Marquez for lineage tracking. Demonstrates how lineage metadata enables root cause analysis when pipeline failures occur, showing practical troubleshooting patterns with end-to-end visibility

debezium.io lineage

345d

Apache Iceberg and the Catalog Layer

Features Russell Spitzer (Apache Iceberg/Polaris PMC) discussing the distinction between business catalogs (discovery/listing) and system catalogs (governing access by understanding table layout). Covers how Polaris vends short-lived credentials scoped to exact table directories.

getdbt.com governance

358d

How Constella Uses Weaviate for Vector Search (RAG) and Cross-Platform Syncing across Devices for Consumer Apps

Constella built a cross-platform thinking tool using Weaviate, RAG, and a multi-tenant architecture. The post details how they implemented vector search and syncing across devices.

weaviate.io vector-db

363d

Evaluating Long-Context Question & Answer Systems

This article covers evaluation metrics, how to build eval datasets, evaluation methodology, and a review of several benchmarks for long-context question and answer systems.

eugeneyan.com ml

374d

Data Contracts and Data Observability: Whatnot's Full Circle Journey to Data Trust

Production case study from Whatnot (live shopping marketplace) on combining data contracts with Monte Carlo observability. Their stack uses Snowflake, dbt, and Dagster. Shows how enforcing contracts while layering automated observability kept data incidents flat despite exponential data growth.

montecarlodata.com data-quality

381d

Operating Flink Is Hard: What does this really mean? And how to go about it?

This article delves into the difficulties of operating Apache Flink in production environments. It explores the reasons why Flink is considered challenging and provides insights into how to address these operational complexities.

decodable.co flink

383d

Native Data Lineage in Debezium with OpenLineage

Technical walkthrough of Debezium's built-in OpenLineage integration for automatic CDC lineage tracking. Explains how Debezium Server emits OpenLineage events natively using the Java SDK, modeling run/job/dataset entities without manual instrumentation, with Marquez as a lineage backend.

debezium.io lineage

383d

What's New with Databricks Unity Catalog at Data + AI Summit 2025

Covers Unity Catalog announcements: Iceberg catalog federation for governing tables in AWS Glue/Hive/Snowflake without copying data, Unity Catalog Metrics as first-class governed assets, column-level permissions for PII, and the new Discover experience for certified data products with AI-driven reco

databricks.com governance

386d

More efficient multi-vector embeddings with MUVERA

Weaviate version 1.31 introduces the MUVERA encoding algorithm for multi-vector embeddings. The post explains the algorithm's details, including its functionality and use cases.

weaviate.io vector-db

391d

AI Engineer 2025 - Improving RecSys & Search with LLM techniques

This article discusses how RecSys and search are converging with LLMs via semantic IDs, data augmentation, and unified foundation models.

eugeneyan.com ml

392d

DuckLake: A Metadata Store for Data Lakes

DuckLake stores data lake metadata in a SQL database instead of files. A 22-table schema replaces manifest files, enabling instant snapshot queries and ACID transactions without file listing overhead.

duckdb.org duckdb

420d

Building News Agents for Daily News Recaps with MCP, Q, and tmux

This article details automating agentic workflows using Amazon Q CLI, Anthropic MCP, and tmux to build news agents for daily news recaps.

eugeneyan.com agents

423d

The Current State of Column-level Lineage

Explains the columnLineage dataset facet introduced in OpenLineage 0.9.0 for Spark integration. Covers how column-level lineage tracks which input fields produce each output field, its applications for GDPR/HIPAA/CCPA compliance, and the roadmap for extending support beyond Spark.

openlineage.io lineage

426d

Apache Airflow® 3 is Generally Available!

The article announces the general availability of Apache Airflow 3.0, the largest release in the project's history. This milestone release is the culmination of four years of development and introduces substantial changes to the Airflow platform.

airflow.apache.org orchestration

435d

Testing Custom Flink Jobs on Decodable

This article provides guidance on testing custom Flink jobs on Decodable, focusing on modular implementations to improve testability when dealing with external service dependencies. It addresses a common challenge in Flink development and offers practical solutions.

decodable.co flink

447d

Integrate Qdrant and Neo4j to Enhance Your RAG Pipeline

This article demonstrates integrating Neo4j with Qdrant to enhance RAG pipelines by enabling external vector searches; it guides users through a local setup with preloaded data, illustrating the practical aspects of this integration.

neo4j.com databases

520d

Building Knowledge Graph Agents With LlamaIndex Workflows

The article explains how to build knowledge graph agents using LlamaIndex workflows, offering a blueprint for constructing Text2Cypher agentic interfaces; this integration provides practical insights into developing agentic data pipelines.

neo4j.com databases

530d

Common pitfalls when building generative AI applications

This article identifies common pitfalls encountered when building generative AI applications and provides examples.

huyenchip.com ml

531d

Claude Converses With Neo4j Via MCP

This Neo4j Developer Blog post explains how to use Anthropic's Model Context Protocol (MCP) to give LLMs like Claude access to knowledge graphs in Neo4j.

neo4j.com knowledge-graphs

558d

Building Effective Agents

Anthropic's guide to building reliable AI agents: tool use patterns, prompt chaining, evaluation frameworks, error recovery, and when NOT to use agents.

anthropic.com ml

559d

LangChain-Neo4j Partner Package: Officially Supported GraphRAG

Integrate Neo4j knowledge graphs with LangChain for powerful GraphRAG applications that deliver deeper, more insightful answers.

neo4j.com knowledge-graphs

561d

GraphRAG in Action: From Commercial Contracts to a Dynamic Q&A Agent

Explore how GraphRAG can be used to streamline the process of ingesting commercial contract data and building a Q&A Agent.

neo4j.com knowledge-graphs

583d

Model Context Protocol: Open Standard for AI Tool Use

MCP standardizes how AI models connect to data sources and tools. Client-server architecture with typed resources, tool definitions, and prompts that any LLM application can implement.

modelcontextprotocol.io ml

583d

Introducing the Fine-Tuned Neo4j Text2Cypher (2024) Model

Dive into the impact of fine-tuning models for the Text2Cypher task of transforming natural language questions to Cypher queries.

neo4j.com knowledge-graphs

593d

GenAI Stack Walkthrough: Behind the Scenes With Neo4j, LangChain, and Ollama in Docker

Learn how to build a support agent that relies on information from Stack Overflow using the GenAI Stack – Neo4j, LangChain & Ollama in Docker.

neo4j.com knowledge-graphs

595d

Benchmarking Using the Neo4j Text2Cypher (2024) Dataset

Explore performance benchmarks of LLMs on the Neo4j Text2Cypher (2024) Dataset, comparing foundational vs. fine-tuned models for Cypher query translation.

neo4j.com knowledge-graphs

596d

Effortless RAG With Text2CypherRetriever

The Text2CypherRetriever allows users to retrieve data from Neo4j using natural language, simplifying query generation for GenAI applications.

neo4j.com knowledge-graphs

607d

The DuckDB Local UI

DuckDB ships a built-in web UI for interactive SQL exploration, schema browsing, and result visualization -- no install needed beyond the CLI.

duckdb.org duckdb

607d

Efficiently Monitor Neo4j and Identify Problematic Queries

The post describes how to monitor Neo4j in a clustered environment using tools like Dynatrace and Kibana, along with best practices.

neo4j.com knowledge-graphs

621d

Why Do I Need CDC?

This technical blog post explores the importance of Change Data Capture (CDC) for developers. It covers the fundamentals of CDC, its common use cases, and the advantages of log-based CDC compared to other approaches. Understand how CDC can improve operational performance, enable real-time analytics,

decodable.co streaming

625d

New GraphAcademy Course: Building Knowledge Graphs With LLMs

The new GraphAcademy course teaches how to convert unstructured data into graphs using GenAI, LLMs, and Python.

neo4j.com knowledge-graphs

629d

Detecting Bank Fraud With Neo4j: The Power of Graph Databases

Neo4j’s graph database enables real-time analysis to uncover hidden fraud rings and protect financial assets, aiding in bank fraud detection.

neo4j.com knowledge-graphs

635d

Turn Your CSVs Into Graphs Using LLMs

The post details how to turn CSV files into graph models using LLMs, simplifying data relationships and enhancing insights.

neo4j.com knowledge-graphs

635d

Neo4j Graphs, Acceleration Frameworks, and Recommendations: A Winning Trio

Learn to build accurate, explainable recommendation systems with minimal code using the Neo4j graph database and the Keymaker framework.

neo4j.com knowledge-graphs

639d

Building a GraphRAG Agent With Neo4j and Milvus

Learn how to build a GraphRAG agent using Neo4j and Milvus, combining graph and vector search for enhanced retrieval, better context, and accurate answers.

neo4j.com knowledge-graphs

642d

The Limitations of Text Embeddings in RAG Applications

Learn how knowledge graphs help overcome the limitations of text embeddings for structured data operations in RAG applications.

neo4j.com knowledge-graphs

643d

Enhancing Hybrid Retrieval With Graph Traversal Using the GraphRAG Python Package

Enhance GraphRAG applications by combining hybrid search and graph traversal with Neo4j’s HybridCypherRetriever, improving retrieval for complex queries.

neo4j.com knowledge-graphs

645d

Understanding CDC with Debezium Server and Debezium Engine

Learn how Debezium, the de-facto standard for open-source change data capture (CDC), has evolved to support deployments without the need for Kafka-related infrastructure.

decodable.co streaming

646d

Building Enterprise AI with Knowledge Graphs and LLMs

How enterprises combine knowledge graphs with LLMs: grounding responses in structured facts, reducing hallucinations, enabling explainable AI, and the architectural patterns for graph-augmented generation.

thenewstack.io ml

651d

GraphRAG Field Guide: Navigating the World of Advanced RAG Patterns

Explore advanced GraphRAG retrieval patterns and how graph structures enhance RAG systems. Learn actionable strategies to implement and optimize GraphRAG.

neo4j.com knowledge-graphs

652d

Prefect 3.0: Workflow Orchestration Without the DAG

Prefect 3.0 drops DAGs entirely: Python-native flows with dynamic task creation, automatic retries, event-driven triggers, and a hosted platform that eliminates scheduler management.

prefect.io orchestration

654d

Building a Movie Recommendation System With Neo4j

Recommend movies to users based on their viewing histories and ratings. Learn the setup of Neo4j, mapping data into Java with Neo4j Object Graph Mapper (Neo4j-OGM), and crafting Cypher queries for recommendations.

neo4j.com knowledge-graphs

656d

Why Every AI Application Needs a Semantic Layer

LLMs generating SQL without a semantic layer produce inconsistent, wrong metrics. How the dbt Semantic Layer provides guardrails: metric definitions, entity relationships, and governed access for AI agents.

getdbt.com dbt

664d

Direct Preference Optimization with Synthetic Data on Anyscale

The article details the application of Direct Preference Optimization (DPO) techniques utilizing synthetic data. It explores how these methods are implemented and leveraged within the Anyscale platform for machine learning model refinement.

anyscale.com ml

679d

Why Polars is Faster Than Pandas

Architecture-level comparison: Polars' Rust-based columnar engine with lazy evaluation, query optimization, and Apache Arrow memory vs Pandas' eager, NumPy-backed execution. Benchmarks on real workloads.

blog.jetbrains.com data-engineering

680d

Adventures with Apache Flink and Delta Lake

A comprehensive guide on troubleshooting and configuring Flink SQL to write to Delta Lake on S3 or MinIO.

decodable.co streaming

680d

How Snowflake Builds Its Query Optimizer

Inside Snowflake's Cascades-style query optimizer: join reordering, pruning with micro-partition statistics, adaptive execution, and how they test optimizer correctness at scale.

snowflake.com snowflake

685d

Ontologies for AI: Why Structure Still Matters

Ontologies provide the structured backbone that LLMs lack: taxonomies, controlled vocabularies, entity disambiguation, and how combining ontological reasoning with neural approaches produces more reliable AI systems.

poolparty.biz ml

688d

Apache Arrow DataFusion: A Fast Query Engine in Rust

DataFusion as a modular query engine: how it powers InfluxDB 3.0, Comet Spark accelerator, and Ballista distributed queries. Extensible optimizer, custom table providers, and user-defined functions in Rust.

arrow.apache.org arrow

690d

Declarative Resource Management for Real-time ETL with Decodable

This article explores declarative resource management in Decodable, highlighting its benefits for SDLC best practices, resource management, environment migration, and resource cleanup.

decodable.co streaming

690d

Why We Switched from Airflow to Dagster

The asset-centric paradigm shift: why defining what data should exist (Dagster assets) is better than defining how to compute it (Airflow tasks). Software-defined assets, IO managers, and testability.

dagster.io orchestration

695d

Why Iceberg Won the Table Format War

Analysis of how Iceberg's catalog-agnostic design, hidden partitioning, and multi-engine support gave it an architectural advantage over Delta Lake and Hudi.

blog.det.life iceberg

711d

Data Governance Without the Bureaucracy

Practical data governance: automated PII detection, column-level lineage, data contracts between teams, freshness SLAs, and how to implement governance incrementally without blocking teams.

montecarlodata.com data-engineering

716d

Real-time Feature Engineering with Flink and Feature Stores

Building real-time ML feature pipelines with Flink: window aggregations, CDC ingestion, point-in-time joins, and integration with Feast and Tecton feature stores.

ververica.com flink

721d

DuckDB Extensions: Building Your Own

How DuckDB's community extension system works: writing C++ extensions, the extension repository, signed distribution, and examples of spatial, httpfs, and Iceberg extensions.

duckdb.org duckdb

726d

GraphRAG: Knowledge Graph-Enhanced Retrieval for LLMs

Microsoft's GraphRAG approach: automatically building knowledge graphs from document corpora, community detection for topic summarization, and how graph-based retrieval answers global questions that vector search cannot.

microsoft.github.io ml

729d

Building an LLM Router for High-Quality and Cost-Effective Responses

This post describes a method for building an LLM router that dynamically selects the optimal LLM for a given request based on configurable criteria. It covers techniques for evaluating LLM performance, implementing routing logic, and optimizing for cost-effectiveness.

anyscale.com ml

730d

Building a Knowledge Graph from Unstructured Data with LLMs

Using LLMs to extract entities and relationships from documents, resolve coreferences, and populate a Neo4j knowledge graph. Includes schema design, prompt engineering for extraction, and evaluation metrics.

neo4j.com ml

739d

Dynamic Tables in Snowflake: Declarative Data Pipelines

Snowflake Dynamic Tables: define a pipeline as a SQL query and let Snowflake handle scheduling, incremental refresh, and dependency management. Replaces streams + tasks for most use cases.

docs.snowflake.com snowflake

741d

How Stripe Builds Reliable Data Pipelines

Stripe's ledger system for financial data: immutable event log, double-entry accounting in the data warehouse, reconciliation pipelines, and how they ensure every cent is accounted for.

stripe.com engineering

743d

Replacing Pandas with DuckDB for Data Engineering

DuckDB's relational API as a replacement for Pandas in ETL pipelines, with benchmarks showing 10-100x performance improvements on larger-than-memory datasets.

duckdb.org duckdb

746d

Ray Spotlight Series: Multitenant Serve Applications with Runtime Envs as Containers

The article explores the architecture of multitenant serving applications built on Ray, emphasizing the use of containerized runtime environments. It likely covers strategies for isolating different workloads, managing dependencies, and ensuring efficient resource utilization in production ML deploy

anyscale.com mlops

748d

Unity Catalog: The Open Source Data Catalog from Databricks

Databricks open-sources Unity Catalog, providing a universal governance layer across Delta, Iceberg, and Hudi tables with fine-grained access control and lineage tracking.

databricks.com databricks

749d

ClickHouse vs Snowflake: A Practitioner's Perspective

Honest comparison of ClickHouse and Snowflake architectures for real-time analytics workloads, covering query latency, ingestion throughput, cost models, and operational complexity.

clickhouse.com clickhouse

751d

What We Learned from a Year of Building with LLMs

Hard-won lessons from practitioners: prompt engineering diminishing returns, when to fine-tune vs RAG, evaluation beyond vibes, cost optimization, and the reliability gap between demo and production.

oreilly.com ml

756d

Data Quality at Scale: Lessons from Airbnb

Airbnb's Midas data quality framework: automated anomaly detection, lineage-based impact analysis, SLA tracking, and self-healing pipelines at petabyte scale.

medium.com data-engineering

760d

LinkedIn's Real-Time Data Infrastructure

How LinkedIn processes 7 trillion events per day: Kafka for event transport, Samza for stream processing, Venice for derived data serving, and Brooklin for cross-DC replication.

engineering.linkedin.com streaming

770d

Dagster vs Airflow: An Honest Comparison

Asset-centric vs task-centric orchestration: how Dagster's software-defined assets, type system, and built-in IO managers compare to Airflow's DAG paradigm.

dagster.io orchestration

772d

ClickHouse vs PostgreSQL for Analytics: When to Switch

When Postgres analytics hits a wall: column compression, vectorized execution, and approximate query processing in ClickHouse vs row-oriented scans in Postgres. Migration patterns and hybrid architectures.

clickhouse.com clickhouse

774d

Kimball is Dead, Long Live Kimball

Why dimensional modeling still matters even though the ELT era made star schemas seem obsolete. The semantic layer as the modern replacement for physical dimension tables.

benn.substack.com data-engineering

777d

Feature Stores Explained: Building ML Features at Scale

Why feature stores exist: the training-serving skew problem, online vs offline stores, feature computation patterns, and how Feast, Tecton, and Hopsworks compare architecturally.

hopsworks.ai ml

780d

How Netflix Migrated from Hive to Iceberg

Netflix's migration from Hive to Iceberg at exabyte scale, including incremental processing patterns with Maestro orchestrator and Spark.

netflixtechblog.com data-engineering

784d

Knowledge Graphs for RAG: Beyond Vector Search

Why vector similarity alone fails for complex reasoning. Using Neo4j knowledge graphs alongside embeddings: entity extraction, relationship mapping, graph traversal for multi-hop queries, and hybrid retrieval.

blog.langchain.dev llm

784d

How Figma Scaled to Multiple Databases

Figma's horizontal sharding journey: from a single Postgres instance to 100+ shards using PgBouncer, application-level routing, and their custom migration tooling for zero-downtime resharding.

figma.com postgres

794d

Text-to-SQL is Harder Than You Think

Why LLM-generated SQL fails in production: schema ambiguity, implicit business logic, multi-table joins, aggregate semantics, and why a semantic layer is the real solution instead of better prompting.

numbersstation.ai ml

797d

Data Contracts: The Missing Link in Data Mesh

How data contracts formalize the interface between producers and consumers, with practical schema enforcement patterns using protobuf, JSON Schema, and dbt tests.

dataproducts.substack.com data-engineering

800d

Retrieval Augmented Generation: Beyond the Basics

Advanced RAG patterns: multi-query retrieval, recursive summarization, parent-child chunk linking, self-RAG with reflection, and corrective RAG that verifies its own retrievals.

blog.langchain.dev llm

802d

Cube.js: The Headless BI Semantic Layer

How Cube's semantic layer sits between databases and consumers: pre-aggregations, access control, caching, and serving consistent metrics to dashboards, notebooks, and LLMs via API.

cube.dev analytics

804d

Practical Guide to RAG Pipeline Evaluation

End-to-end guide for building production RAG systems: chunking strategies, embedding model selection, retrieval metrics (MRR, NDCG), reranking, and hallucination detection.

anyscale.com ml

807d

Kafka Without ZooKeeper: KRaft Production Guide

Practical guide to running Kafka with KRaft consensus, covering migration from ZooKeeper, operational considerations, and performance characteristics.

confluent.io kafka

810d

The dbt Semantic Layer: Metrics as Code

How the dbt Semantic Layer works: MetricFlow engine, semantic models, dimension/measure definitions, and querying metrics from any BI tool via the JDBC/GraphQL API.

getdbt.com dbt

812d

The Modern Data Stack is Dead, Long Live the Modern Data Stack

The original modern data stack (Fivetran + Snowflake + dbt + Looker) matured into a commodity. What comes next: embedded analytics, semantic layers, and AI-native data tools.

benn.substack.com data-engineering

814d

pgvector: Embeddings and Vector Search in Postgres

Production-grade vector search with pgvector: HNSW vs IVFFlat index tradeoffs, optimal dimensionality, bulk loading strategies, and benchmarks against dedicated vector databases.

supabase.com postgres

820d

Spark Connect: Decoupling Spark Applications from the Cluster

Spark Connect introduces a thin client protocol that separates Spark applications from cluster infrastructure, enabling remote execution without shipping JARs or managing classpaths.

spark.apache.org spark

821d

LLM Inference at Scale: Techniques That Actually Matter

Production LLM serving: continuous batching vs static batching, PagedAttention (vLLM), speculative decoding, KV cache optimization, and how to maximize GPU utilization.

anyscale.com ml

825d

RDF, SPARQL, and the Semantic Web in 2024: Still Relevant?

The semantic web stack (RDF, OWL, SPARQL) is quietly powering enterprise knowledge management. How knowledge graphs, linked data, and ontologies are being integrated with LLMs and modern data architectures.

stardog.com databases

825d

How We Built a Cost-Effective Data Lake with S3, Iceberg, and Trino

Spotify's migration from proprietary data warehouse to an open lakehouse stack: partition pruning strategies, compaction scheduling, and how they cut compute costs by 40%.

engineering.atspotify.com lakehouse

828d

DuckDB as the New jq

Using DuckDB as a command-line JSON processor, replacing jq for complex data transformations with SQL syntax.

pgrs.net duckdb

832d

Airflow 2.x: Best Practices for Production Deployments

Production Airflow patterns: KubernetesExecutor vs CeleryExecutor, DAG serialization, connection management, XCom anti-patterns, and monitoring with StatsD and Prometheus.

airflow.apache.org orchestration

833d

Flink 2.0: Unified Batch and Stream Processing

Flink's roadmap toward true batch-stream unification, materialized tables, and the new ProcessFunction API for stateful event processing.

flink.apache.org flink

835d

Postgres Performance Tuning: The Definitive 2024 Guide

Deep dive into Postgres internals: shared_buffers vs OS cache, parallel query tuning, JIT compilation tradeoffs, connection pooling with PgBouncer, and VACUUM strategies for write-heavy workloads.

crunchydata.com postgres

838d

Postgres is Enough

The case for using Postgres as your only database: JSONB for documents, pg_cron for scheduling, pgvector for embeddings, logical replication for CDC, and extensions for everything else.

amazingcto.com postgres

841d

The Semantic Layer: A New Foundation for Data and AI

What a semantic layer actually is beyond marketing: universal metric definitions, entity relationships, access policies, and why it matters more in the age of LLM-generated SQL.

atscale.com data-engineering

843d

How Discord Stores Trillions of Messages with ScyllaDB

Discord's migration from Cassandra to ScyllaDB for their message store: hot partition detection, consistent hashing, compaction tuning, and achieving P99 reads under 1ms at 2T messages.

discord.com engineering

847d

Delta Lake vs Iceberg vs Hudi: A Hands-On Comparison

Side-by-side comparison of the three major open table formats covering ACID semantics, schema evolution, time travel, compaction, and ecosystem support.

onehouse.ai lakehouse

848d

dbt Mesh: Multi-Project Architectures at Scale

How dbt Mesh enables cross-project dependencies, model contracts, and versioning for organizations running hundreds of dbt projects across teams.

getdbt.com dbt

852d

ClickHouse MergeTree Internals

How ClickHouse's MergeTree engine works: LSM-tree-inspired sorted parts, sparse primary index, data skipping indexes, background merges, and why it achieves sub-second queries on billions of rows.

clickhouse.com clickhouse

852d

Fine-tuning LLMs Is Not As Hard As You Think

Practical fine-tuning guide using TRL, QLoRA, and Flash Attention 2. Covers dataset preparation, hyperparameter selection, evaluation, and deployment with real cost breakdowns.

philschmid.de ml

854d

Photon: The Next Generation Spark Engine at Databricks

Photon is a C++ vectorized execution engine that replaces Spark's JVM-based execution engine for scan-heavy workloads, achieving 3-8x speedups through SIMD, memory-mapped I/O, and adaptive execution.

databricks.com databricks

862d

The Rise of the Analytics Engineer

How the analytics engineer role evolved from a dbt power user to a critical bridge between data engineering and business intelligence, with practical career guidance.

getdbt.com dbt

867d

WarpStream: Kafka Without the Disks

WarpStream's architecture: a Kafka-compatible broker that writes directly to S3 instead of local disks. No inter-broker replication, no partition reassignment, and 80% cheaper than self-hosted Kafka.

warpstream.com kafka

867d

How Uber Serves Over 40 Million Reads Per Second from Online Storage

Uber's integrated caching layer that combines Docstore with an in-process cache, handling 40M reads/sec with sub-millisecond P99 latency.

uber.com engineering

868d

Apache Iceberg: The Definitive Guide (O'Reilly)

Companion post to the O'Reilly book covering Iceberg's hidden partitioning, schema evolution, time travel, and compaction strategies for production lakehouses.

dremio.com iceberg

872d

Data Vault 2.0 in Practice: When and Why

Practical guide to Data Vault 2.0: hub-link-satellite patterns, hash keys for parallelism, point-in-time tables, and when Data Vault makes sense vs One Big Table or dimensional modeling.

scalefree.com data-engineering

874d

Apache Arrow: The Universal Columnar Format

How Arrow's in-memory columnar format enables zero-copy data exchange between Spark, DuckDB, Pandas, Polars, and databases via ADBC and Flight SQL.

arrow.apache.org duckdb

877d

Exactly-Once Semantics in Kafka: How It Actually Works

The mechanics behind Kafka's exactly-once guarantees: idempotent producers, transactional messaging, consumer group coordination, and the performance cost of EOS vs at-least-once.

confluent.io kafka

881d

Designing Data-Intensive Applications in 2024

Martin Kleppmann's reflections on how the landscape has changed since DDIA: new consensus protocols, CRDTs in production, the shift to event streaming, and what he'd write differently today.

martin.kleppmann.com engineering

883d

Querying Parquet Files on S3 with DuckDB

DuckDB's multi-database support: attach Postgres, MySQL, and SQLite databases alongside local files, and query across them with standard SQL joins.

duckdb.org duckdb

887d

Emerging Architectures for LLM Applications

Reference architecture for LLM applications covering RAG pipelines, embedding models, vector databases, orchestration frameworks, and evaluation patterns.

a16z.com ml

893d

RAG at Scale: 10x Cheaper Embedding Computations with Anyscale and Pinecone

Anyscale's blog post describes a technique to reduce the cost of embedding computations in retrieval-augmented generation (RAG) pipelines by 10x using Anyscale and Pinecone.

anyscale.com ml

897d

The GPU Poor: How to Run LLMs Without Breaking the Bank

Quantitative analysis of GPU options for LLM inference and fine-tuning: comparing H100 vs A100 vs consumer GPUs, quantization tradeoffs (GPTQ, AWQ, GGUF), and cost-per-token calculations.

timdettmers.com ml

897d

Friendly SQL in DuckDB

DuckDB's SQL dialect extensions that make queries more readable: GROUP BY ALL, SELECT * EXCLUDE, implicit column aliases, and string slicing.

duckdb.org duckdb

898d

Apache Kafka vs Apache Pulsar: A Technical Comparison

Deep comparison of Kafka and Pulsar replication models: ISR vs BookKeeper quorum writes, tail latency characteristics, data loss scenarios, and how each handles broker failures.

jack-vanlightly.com kafka

899d

The Illustrated Stable Diffusion

Visual walkthrough of how Stable Diffusion works: the latent space, the denoising U-Net, CLIP text encoder, classifier-free guidance, and how LoRA fine-tuning adapts the model.

jalammar.github.io ml

901d

Snowflake's Architecture: A Deep Dive

Inside Snowflake's multi-cluster shared data architecture: the storage layer, virtual warehouses, metadata store, and query optimization.

snowflake.com snowflake

903d

Mixture of Experts: How Sparse Models Scale

The MoE architecture behind Mixtral and Switch Transformer: expert routing, load balancing, training instability, and why sparse models achieve better performance per FLOP than dense models.

huggingface.co ml

905d

Attention Is All You Need (Explained)

The definitive visual explanation of the Transformer architecture: self-attention, multi-head attention, positional encoding, and how information flows through encoder-decoder layers.

jalammar.github.io ml

908d

Data Contracts and Data Observability: Whatnot’s Full Circle Journey to Data Trust

Whatnot went from no modern data stack to processing tens of millions of events across hundreds of event types each day. Zack Klein explains how Whatnot leverages data contracts and data observability to achieve high quality data at scale for stakeholders, focusing on a small team's approach to data

montecarlodata.com data-engineering

909d

One Billion Row Challenge in SQL

Solving the viral 1 Billion Row Challenge using DuckDB SQL instead of Java -- demonstrating that a single SQL query on a laptop can process 1B rows in under 4 seconds.

rmoff.net duckdb

910d

The Log: What every software engineer should know about real-time data

Jay Kreps' foundational essay on the unified log abstraction, how it connects databases, distributed systems, and real-time data -- the intellectual basis for Kafka.

engineering.linkedin.com kafka

912d

Credit Karma’s Journey to Reliable Generative AI Models with Data Observability

Vishnu Ram from Credit Karma discusses data reliability for LLMs and offers best practices related to data observability in AI.

montecarlodata.com ml

939d

How Pie Insurance Created a Self-Serve Incident Triaging & Resolution Workflow with Monte Carlo and Slack

Ed Presz from Pie Insurance explains how his team built an incident detection and notification workflow to drive ownership of data quality for business stakeholders.

montecarlodata.com data-engineering

944d

Data Observability: Reliability In The AI Era

The article emphasizes that for GenAI, data observability must prioritize resolution, pipeline efficiency, and streaming/vector infrastructures.

montecarlodata.com ml

947d

LLM-based summarization: A case study of human, Llama 2 70b and GPT-4 summarization quality

This article provides a case study focused on LLM-based summarization, comparing the quality of summaries generated by humans, Llama 2 70b, and GPT-4. It evaluates their respective performances in summarization tasks.

anyscale.com ml

965d

Evaluation & Hallucination Detection for Abstractive Summaries

Reference, context, and preference-based metrics, self-consistency, and catching hallucinations.

eugeneyan.com ml

1032d

Llama 2 is about as factually accurate as GPT-4 for summaries and is 30X cheaper

The article presents findings indicating that Llama 2 achieves comparable factual accuracy to GPT-4 in summarization tasks. It further highlights that Llama 2 is approximately 30 times more cost-effective for these operations.

anyscale.com ml

1043d

Patterns for Building LLM-based Systems & Products

Evals, RAG, fine-tuning, caching, guardrails, defensive UX, and collecting user feedback.

eugeneyan.com ml

1067d

Exploring PyTorch and Open-Source Communities: Interview with Soumith Chintala

Discover PyTorch's journey in an episode with Soumith Chintala, its Co-Creator and Meta's VP/Fellow. Learn about TensorFlow's impact, community-guided innovation, and the open vs. closed-source debate.

wandb.ai ml

1083d

Fine tuning is for form, not facts

This article explores the purpose and impact of fine-tuning in large language models. It posits that fine-tuning primarily influences the stylistic and structural "form" of an LLM's output rather than imparting new factual information.

anyscale.com ml

1092d

Announcing Aviary: Open Source Multi-LLM Serving

Describes Aviary, an open-source solution designed for multi-LLM serving. The post introduces the project's capabilities and its role in managing diverse large language models within a single serving infrastructure.

anyscale.com mlops

1127d

Numbers every LLM Developer should know

Presents essential numerical data that LLM developers should be aware of, covering performance metrics, operational costs, or scaling factors relevant to deploying and managing large language models. The article provides practical insights for optimizing LLM systems.

anyscale.com llm

1141d

Building a Q&A Bot for Weights & Biases' Gradient Dissent Podcast

In this article, we explore how to utilize OpenAI's ChatGPT and LangChain to build a Question-Answering bot for Weights & Biases' podcast series, Gradient Dissent.

wandb.ai ml

1161d

How to fine tune and serve LLMs simply, quickly and cost effectively using Ray + DeepSpeed + HuggingFace

Explores methods for fine-tuning and serving LLMs efficiently and cost-effectively, leveraging technologies such as Ray, DeepSpeed, and HuggingFace. The article is the fourth in a series, building upon previous discussions of Ray's capabilities for generative AI infrastructure and performance optimi

anyscale.com llm

1178d

Data Contracts for the Warehouse

This article focuses on data contracts for data warehouses, emphasizing programmatic accountability in batch data processing. It outlines the importance of defining and enforcing data contracts to improve data quality and reliability.

dataproducts.substack.com data-quality

1253d

Deep Dive: Data Ingest in a Third Generation ML Architecture

This Anyscale blog post dives into data ingest in a third-generation ML architecture, specifically using Ray Data. It provides code samples to illustrate how distributed libraries can improve performance by exploiting distributed memory bandwidth.

anyscale.com ml

1674d

The Third Generation of Production ML Architectures

Discusses the evolution of production machine learning architectures, categorizing them into generations. The article describes the shift from fixed-function pipelines to programmable pipelines, and then speculates on the characteristics of the emerging third generation of ML architectures.

anyscale.com architecture

1750d

Introducing Distributed XGBoost Training with Ray

XGBoost-Ray is a new backend for distributed XGBoost training that supports multi-node and multi-GPU setups. It includes distributed data loading, fault tolerance with elastic training, and integrates with the Ray Tune hyperparameter optimization framework.

anyscale.com ml

1841d

Introducing Collective Communication Primitive APIs in Ray

Ray 1.2.0 introduces a new library of collective communication primitives designed to streamline information exchange across numerous distributed processes. These primitives aim to simplify distributed operations within Ray programs and provide substantial speedups, potentially by an order of magnit

anyscale.com mlops

1860d