Orchestra release state-aware orchestration, SAP acquires Dremio+Prior Labs, announces $1bn Frontier AI Lab #132 w/e 1 May 2026
Join the 6,800-strong data herd getting all you need to know about Data for your Friday roundup
The market continues to move so fast we’ve had to combine multiple round-ups into one!
Firstly, we released state-aware orchestration for dbt at Orchestra. This is a huge feature that makes running dbt even easier, cheaper and faster than before.
SAP have also acquired Dremio and Prior Labs. They are not going down without a fight. We have written a lot about this pattern before with Salesforce and Informatica. SAP appears to be going gangbusters and muscling their way into the data business too.
This is likely a great acquisition for them. In Dremio there is an incredible team, a bunch of Iceberg expertise, a lakehouse and a query engine, and in Prior Labs something very shiny indeed.
Nuts times!
Orchestra state-aware orchestration
SAO is an Orchestra-managed Python package built for dbt Core™ that allows users to store and use the state of models. This helps to orchestrate dbt Core™ more intelligently and to run models only when certain conditions are met.
With Orchestra SAO, Orchestra users can:
Save cost and speed up runs: the state of dbt models is stored using source_freshness checks, which means models are skipped if there is no new data and no code has changed
Stop wasting time tagging models: adding build_after configurations allows for declarative scheduling. There is no need for manual tagging
Move to real-time: dbt state-aware can run every 5 or 10 minutes, enabling close-to-real-time patterns with leading ELT tools like Estuary
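The skip rule described in the list above can be sketched in a few lines of Python. This is a minimal illustration, not Orchestra's implementation: the ModelState class, the should_run function and all of their fields are hypothetical names, and treating build_after as a minimum interval between builds is an assumption made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ModelState:
    """State remembered from a model's last successful build (hypothetical shape)."""
    code_hash: str              # hash of the model's SQL at the last build
    source_loaded_at: datetime  # newest source timestamp seen at the last build
    last_built_at: datetime     # when the model was last built


def should_run(
    stored: Optional[ModelState],
    current_code_hash: str,
    current_source_loaded_at: datetime,
    now: datetime,
    build_after: timedelta = timedelta(0),
) -> bool:
    """Return True if the model should be rebuilt on this run."""
    if stored is None:
        return True  # never built before
    if current_code_hash != stored.code_hash:
        return True  # code changed: always rebuild
    if current_source_loaded_at <= stored.source_loaded_at:
        return False  # no new data and no code change: skip
    # New data has arrived: rebuild only once the build_after window has passed
    # (assumed meaning of build_after for this sketch).
    return now - stored.last_built_at >= build_after
```

With a state store like this, a pipeline scheduled every 5 or 10 minutes would only pay for the models where should_run returns True; everything else is skipped.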
Users of dbt do not need to upgrade to a proprietary engine. Orchestra leverages the Sao Paolo repo. Simply enable Orchestra SAO with one line of code:
tasks:
  execute_dbt:
    integration: DBT_CORE
    integration_job: DBT_CORE_EXECUTE
    parameters:
      commands: dbt build
      package_manager: PIP
      python_version: '3.12'
      use_state_orchestration: true # <-- this line
.
.
.
Some of the biggest and most advanced data teams are already leveraging State-Aware Orchestration in Orchestra. Experian use Orchestra State-Aware Orchestration to power dozens of analysts across multiple domains, making it easy to build the SQL models that power marketing analytics.
Orchestra Product Features
Scaling sequential Tasks in the MetaEngine - one config, infinite possibilities. Metadata-driven frameworks in ADF are a thing of the past!
Large Task Outputs - you can now pass many megabytes of data between tasks
Force Cancel - you can force cancel any task, pushing it into a cancelled state
Omni Integration
Pipeline run concurrency is live! Read more here
Monitor your Estuary Syncs from Orchestra (read more here)
MCP Video is live. This is the only MCP or API you need to reliably fix your data pipelines automatically with AI.
You can now run Claude agents in Orchestra. These can automate anything, especially things you can design as code submission tasks.
AI Assistant is in Orchestra too - ask our assistant questions without having to leave for the docs!
Released our EKS operator - you can now trigger and monitor flows on Kubernetes in AWS
UI RESKIN! DARK MODE AND LIGHT MODE! TELL YOUR FRIENDS!
Improved Docs Site — CHECK OUT OUR DOCS AND ASK IT QUESTIONS
Workspaces - say goodbye to multiple Airflow instances
Improved Lineage Filtering
Medium 🧠
🧠 Announcing State-Aware Orchestration (“SAO”) in Orchestra (link)
🧠 Snowflake Cortex AI — Complete Developer Guide 2026 (link)
🧠 Getting Data Into Snowflake Shouldn’t Be the Hard Part (link)
🧠 Ghost: A Database for Our Times? (link)
🧠 How to Get Hired in the AI Era (link)
🧠 State of Routing in Model Serving (link)
🧠 dbt State-Aware Orchestration: dbt Source Freshness (link)
🧠 Your Data Products Need a Product Manager. Here Is What That Actually Means (link)
🧠 Migrating Your Shopify Data Pipeline from REST to GraphQL with Openflow and Snowflake (link)
🧠 Enterprise-Ready ML — Modular Repos, One Snowflake Platform (link)
🧠 You Don’t Need Many Labels to Learn (link)
🧠 Claude is my new favourite intern | Tips for getting Claude Promoted (link)
🧠 The Human Infrastructure: How Netflix Built the Operations Layer Behind Live at Scale (link)
🧠 Dynamic Tables + dbt: A powerful combination (link)
🧠 Incremental Data Loading in Snowflake Done Right: Streams, Tasks, and Real-Time Pipelines (link)
🧠 Predicting Risk in Content Launches: How Data-Driven Insights can Transform Launch Planning (link)
🧠 From Messy Data to Clean Categories: Snowflake Batch Cortex Search for High-Throughput Entity Resolution (link)
LinkedIn 🕴
🕴 When ‘Serverless’ Isn’t Enough: Compute Control and Efficiency Compared (link)
🕴 Snowflake vs Databricks: Faster Spark with Just Two Lines of Code? (link)
🕴 Beyond External Tables: How BigLake Elevates BigQuery Data Access (link)
🕴 Mixed Model Arts (Book 1): Final Edits Underway (link)
🕴 Small World Moments: Running Into Dr. Joe Perez in Tokyo (link)
🕴 Messy Data, Real Problems: The Case for State-Aware Orchestration (link)
🕴 Introducing State-Aware Orchestration: A Smarter Way to Run dbt Core (link)
🕴 Live in Utah: Rare Talk on Data Modeling, AI & My New Book (link)
🕴 Why AI Won’t Deliver 10x Gains (link)
🕴 Marketing vs Reality: Who’s Actually Winning the Open Data Race? (link)
🕴 Running Multi-Tenant dbt Projects Without the Headaches (link)
🕴 Snowflake Adaptive Warehouses: The End of Cluster Sizing Guesswork (link)
🕴 From Open Source to Paywalls: The Pricing Shift No One Talks About (link)
🕴 Stop Fighting Dynamic DAGs: A Simpler Way to Build Scalable Pipelines (link)
News 📰
📰 Does Snowflake’s New AI Workflow Upgrades Reshape The Bull Case For Snowflake (SNOW)?… Read More
📰 Databricks Is On An M&A Roll With $1B Neon Acquisition…Read More
📰 Databricks Looks To Exceed $100B Valuation With New Funding Round…Read More
📰 Data: The Seed Funding Boom Is Concentrating Capital In The San Francisco Bay Area…Read More
📰 Databricks announces UK expansion and investment in data and AI operations… Read More
📰 Tiger Global-backed Upscale AI eyes $200M raise at $2B valuation: report…Read More
📰 The Week’s 10 Biggest Funding Rounds: Transportation And Biotech Take The Lead…Read More
📰 SAP acquires Dremio
📰 SAP acquires Prior Labs
YouTube and Podcast 🎥
Editor’s Pick
🎥 Most Microsoft Fabric Workspaces Are Organized Wrong (link)
🎥 Intro to State Aware Orchestration For Analytics Engineers (link)
🎥 Setting Up dbt build_after in dbt | Orchestra State Aware Orchestration (link)
🎥 Configuring dbt Source Freshness | Orchestra State Aware Orchestration (link)
🎥 Meaning and Metrics - The Evolution of the Semantic Layer (link)
Special 💫
💫 Data Engineering Weekly #267 (link)
Jobs 💼
💼 Senior Data Engineer at Fusion Risk Management (link)
💼 Analytics Lead at Create Wellness (link)
💼 Senior Consultant, AI & Data Engineering at Castleton Tower Consulting (link)
💼 Analytic Engineer / Modeler at Bose (link)
💼 Technical Architect, Analytics Engineering (AI-Augmented) at Mammoth Growth (link)
💼 Analytics Engineer at Amplify (link)
Run dbt models cheaply and easily?
If you’re looking for an easy way to run your dbt Core models, look no further than Orchestra.
dbt, dbt Core and dbt Labs are all trademarks of dbt Labs, Inc.


