Snowflake/Databricks @ it again, Fivetran dbt core no longer free, DQ is still low ⚡#58
Join the 4,400-strong data herd getting all you need to know about Data for your Friday roundup
The latest and greatest for w/e 10 Nov 2024. As always, if this was helpful, please do subscribe. If not, please let us know why not. We hate generic-ness. Decent week this week, with people focussing on quality and processes, and a bit of moving and shaking from Databricks / Fivetran.
Want to keep an eye on other news? Hit subscribe below.
Or read on Medium https://medium.com/@hugolu87
Orchestra Product Updates - More Azure More Microsoft
Azure Fabric Data Quality Testing → execute T-SQL or stored procs on Fabric and run data quality tests
dbt-fabric improvements (your data catalog for Fabric will now be auto-populated by dbt-fabric)
Reconciling data across BigQuery, Databricks, SQL Server, Snowflake or Fabric? These are all now supported (try it here).
Parameters - if you work with a METADATA FRAMEWORK you can set global parameters that are retained when tasks are retried - check out the docs.
Still use Shipyard? You won’t for much longer
Shipyard will be turned off on 15 November. Check out our migration guide here.
Winter Data Conference
Excited to share that anyone using our special code HUGO50 can get 50% off the Winter Data Conference in Zell am See - check it out here.
Meme Drop
For all my Airflow users:
Medium 🧠
**Editor's Pick** - so it looks like people are really cottoning on to the “Gold Rush Paradox” and starting to demand much higher levels of data quality. Will Data Leaders follow? See:
- The Quality Paradox
- From Ben Rogojan: why you need to understand the business metrics
- The Gold Rush Paradox
🧠 So I have a Data product - now what? (link)
🧠 Should you build a custom orchestration system? (link)
**Editor's Pick**
🧠 Reflections on Palantir (link)
🧠 Tutorial: setting up dbt with Microsoft Fabric (link)
🧠 Databricks vs. (Optimised) Snowflake by the numbers (link). Seriously though, why so much focus on benchmarking? They are obviously priced similarly (link).
🧠 Snowflake Horizon — solving data governance challenges for Snowflake AI & Data cloud (link)
LinkedIn🕴
🕴People really need better dbt documentation (link)
🕴Improved flexibility in Orchestra vs. dbt Cloud RE Git integration (link)
🕴Why Airflow still beats Dagster (link)
News 📰
**Editor's Pick**
📰 Fivetran dbt core is no longer free (link)
📰 Databricks have launched SaaS (link)
YouTube and Podcast 🎥
**Editor's Pick**
🎥 Ingesting Data in Apache Iceberg format for Dremio (link)
🎥 How to Use Publish-Audit-Merge Workflow in Apache Iceberg: A Beginner’s Guide (link)
Jobs 💼
💼 Analytics Engineer at Loop Returns (link)
💼 Senior Product Analyst at ProductBoard
💼 Very exciting data engineering role at Landmarc (recommended) (link)
💼 Interested in Data Engineering for one of the best charitable orgs in the UK? Follow Enthuse or get in touch (link)
💼 Interested in building the future of Data in VC at Dawn? Get in touch to learn more about this one.
💼 Some great data roles around platform and architecture at Lundbeck Pharma (link)
💼 Senior BI Developer at SERB Pharma (link)
Other 💫
**Editor's Pick**
💫 So you wanna run dbt on Databricks (link) ← Yes you could do that, or you could just run it in Orchestra.
Running dbt Core™️? You’ll love this
💡 Read more about the Orchestra dbt Core™️ integration here
dbt Core obviously needs to run in an orchestrator - if you’re not doing this already, what are you doing? Many data teams are realising that 99% uptime isn’t actually enough to get stakeholders to trust the data in BI use cases. Uptime needs to be much higher, and that’s why you need orchestration and visibility of pipelines.
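To put a rough number on why 99% falls short, here is a back-of-the-envelope sketch in Python. It is illustrative only and rests on two assumptions of ours: that "uptime" means the share of daily pipeline runs that land on time, and that task failures are independent. The function names are hypothetical and not part of dbt or Orchestra.

```python
def expected_bad_days(success_rate: float, days: int = 365) -> float:
    """Expected number of days per year on which the daily run fails or is late.

    Assumes one run per day and reads 'uptime' as the per-run success rate.
    """
    return (1 - success_rate) * days


def pipeline_success_rate(task_success_rate: float, n_tasks: int) -> float:
    """Chance that a whole run succeeds when every task must pass.

    Assumes task failures are independent, which real pipelines rarely are.
    """
    return task_success_rate ** n_tasks


if __name__ == "__main__":
    for rate in (0.99, 0.999):
        bad_days = expected_bad_days(rate)
        print(f"{rate:.1%} daily success: about {bad_days:.1f} bad days a year")

    # Twenty chained tasks (models, tests, loads) that are each 99% reliable.
    whole_run = pipeline_success_rate(0.99, 20)
    print(f"20 tasks at 99% each: {whole_run:.1%} chance the full run succeeds")
```

Under those assumptions, 99% works out to three or four mornings a year where a dashboard is stale or wrong, and a chain of twenty 99%-reliable tasks only completes cleanly about 82% of the time. That is usually enough for stakeholders to notice.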
Orchestra supports running dbt, with some great features:
Enhanced debugging! Identify dbt model/test cost bottlenecks easily
Simplification! One less platform to manage; let Orchestra be your dbt™️ HQ
Price! Simple and lightweight usage-based pricing where the unit costs decrease as your models increase
Worth talking? Chat here.