Meta joins Databricks funding, $500bn Stargate, Orchestra releases Git #68 w/e 24 Jan 2025
Join the 4,900-strong data herd getting all you need to know about Data for your Friday roundup
Swiftly closing in on 5,000 subscribers. This was a huge week in the Data and AI space, with OpenAI announcing the launch of the $500bn Stargate AI fund to help the US build AI dominance.
Meta were announced as investors in Databricks' December $10bn Series J funding round. The choice quotation for us: "Thousands of customers are using Llama on Databricks and we have been working closely with Meta on how to best serve those enterprise customers with Llama," Databricks co-founder and CEO Ali Ghodsi said. If Llama is the IP, then Databricks is the bestower of power.
We have seen time and time again that no matter how many competing open source frameworks there are, the one that gets adoption wins. Meta clearly believe Databricks is the way people are going to start using Llama. So can they keep it up?
Popular GTM tool Clay has raised a $40m Series B and is absolutely flying - congrats guys
Want to get more news like this delivered into your inbox? Subscribe now
Announcing Git Control
Today we’re excited to announce one of Orchestra’s biggest ever features: GitBridge.
GitBridge is a bi-directional sync between the Orchestra UI and your Git provider. With it, Data Teams get all the benefits of Git while still using the Orchestra UI to rapidly build data pipelines.
Data Teams can now:
Directly edit the code generated by the Orchestra UI
Create different branches using Git that are synchronised with the Builder View in Orchestra
Collaborate on the same pipelines
Store an audit trail of user-level commits
Review scheduling and orchestration changes in PR Review
Implement CI/CD checks as part of pipeline deployment
This is a huge development for us as a team. We have always wanted to provide data teams with a best-in-class developer experience for building data pipelines. Under the hood, the core pipeline definition has always been code-first, stored as .yml files, but until now users could not view or edit those files.
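To give a flavour of the kind of CI/CD check a Git-backed pipeline makes possible, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `tasks` / `depends_on` layout is a hypothetical stand-in for a parsed pipeline file and is not Orchestra's actual .yml schema, and `validate_pipeline` is a made-up helper name. It simply checks that every dependency refers to a known task and that the dependencies contain no cycles.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline shape, as it might look after parsing a pipeline file.
# This is NOT Orchestra's real schema; it is only here to illustrate the check.
PIPELINE = {
    "tasks": {
        "ingest": {"depends_on": []},
        "transform": {"depends_on": ["ingest"]},
        "publish": {"depends_on": ["transform"]},
    }
}


def validate_pipeline(pipeline: dict) -> list:
    """Return a list of human-readable errors; an empty list means the check passed."""
    tasks = pipeline.get("tasks", {})
    errors = []

    # 1. Every dependency must point at a task that exists in the pipeline.
    for name, spec in tasks.items():
        for dep in spec.get("depends_on", []):
            if dep not in tasks:
                errors.append(f"{name}: unknown dependency '{dep}'")
    if errors:
        return errors

    # 2. The dependency graph must be acyclic so the tasks can be ordered.
    graph = {name: set(spec.get("depends_on", [])) for name, spec in tasks.items()}
    try:
        TopologicalSorter(graph).prepare()
    except CycleError as exc:
        # The second element of exc.args holds the nodes that form the cycle.
        errors.append(f"dependency cycle: {exc.args[1]}")
    return errors


if __name__ == "__main__":
    problems = validate_pipeline(PIPELINE)
    print("OK" if not problems else "\n".join(problems))
```

A check like this could run on every pull request, failing the build whenever the returned list is non-empty, so broken pipeline definitions never reach the main branch.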
Winter Data Conference
Excited to share that anyone using our special code HUGO50 can get a 50% discount on the Winter Data Conference in Zell am See - check it out here.
Meme Drop
When you try to build your own orchestration tool
Medium 🧠
🧠 A Declarative Orchestration Framework: GitBridge in Orchestra (link)
🧠 Behind the Scenes of a Successful Data Analytics Project (link)
🧠 How to implement thousands of Data Quality Tests using AI (link)
🧠 Accessing the Unity Catalog in Databricks using Duck DB (link)
🧠 Metrics for Data Engineering: Why, How, and Which Metrics to Track (link)
🧠 How to Build an MFA Audit System with Streamlit in Snowflake Notebooks (link)
LinkedIn🕴
🕴 TPCH Snowflake vs. Databricks vs. Fabric (link)
🕴 AI Engineering with Chip Huyen: Podcast Conversation (link)
🕴 Snowflake Query performance on Iceberg reaches 95% vs. Snowflake (link)
🕴 SQL Flag Check with UNNEST and LOGICAL_OR (link)
🕴 Chris Tabb Heads to Austin for Data Day Texas – See You There! (link)
🕴 Snowflake Accelerates GenAI: Cheaper, Faster Models Now Available! (link)
🕴 Snowflake estimates the TAM for Data is $170bn (link)
News 📰
📰 EY-Databricks Alliance | EY - Netherlands (link)
📰 Meta backs Databricks in a historic $15 billion funding round (link)
📰 Popular Lead Generation SAAS start-up Clay raise $40m (link)
YouTube and Podcast 🎥
Editor’s Pick
🎥 How to build your first GIT BACKED DATA PIPELINE using Orchestra and Github (link)
🎥 Where Data Science Meets Shrek: How BuzzFeed uses AI (link)
Special 💫
Keeping these two below from last week in case they were missed
💫 This incredible podcast series on data and AI innovation in the manufacturing and industrial sectors in Europe - HIGHLY RECOMMENDED (link)
Jobs 💼
💼 Associate Data Engineer at Aimpoint Digital (link)
💼 Lead Data Engineer at Radicle Health (link)
💼 Senior Analytics Engineer at Grafana Labs (link)
💼 Principal Analytics Engineer at Bluecore (link)
💼 Senior Data Engineer at Swapcard (link)
Want to save on your ingestion bills? You’ll love this
You can leverage Python for lightweight ELT integrations. Here you’re only paying for compute and not being penalised by row-based pricing models. Pretty neat, right? Check it out below / head to Orchestra and start today.
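For a rough idea of what "lightweight ELT in Python" can look like, here is a minimal sketch using only the standard library. It is an illustration under stated assumptions, not Orchestra's implementation: `RAW_PAYLOAD` is made-up sample data standing in for an API response, the `raw_orders` table and the function names are hypothetical, and an in-memory SQLite database stands in for your warehouse. The pattern is extract, load the raw records as-is, then transform with SQL.

```python
import json
import sqlite3

# Made-up sample payload standing in for an API response (assumption for illustration).
RAW_PAYLOAD = json.dumps([
    {"id": 1, "customer": "acme", "amount": "19.99"},
    {"id": 2, "customer": "acme", "amount": "5.01"},
    {"id": 3, "customer": "globex", "amount": "42.00"},
])


def extract(payload: str) -> list:
    """Extract: parse the raw JSON payload into a list of records."""
    return json.loads(payload)


def load(conn: sqlite3.Connection, records: list) -> int:
    """Load: land the records untouched in a raw table (amount stays as text)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_orders "
        "(id INTEGER PRIMARY KEY, customer TEXT, amount TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO raw_orders (id, customer, amount) "
        "VALUES (:id, :customer, :amount)",
        records,
    )
    conn.commit()
    return len(records)


def transform(conn: sqlite3.Connection) -> list:
    """Transform: cast and aggregate inside the database using SQL."""
    return conn.execute(
        "SELECT customer, ROUND(SUM(CAST(amount AS REAL)), 2) "
        "FROM raw_orders GROUP BY customer ORDER BY customer"
    ).fetchall()


def run_pipeline(payload: str = RAW_PAYLOAD) -> list:
    """Run extract, load and transform against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    try:
        load(conn, extract(payload))
        return transform(conn)
    finally:
        conn.close()


if __name__ == "__main__":
    for customer, total in run_pipeline():
        print(customer, total)
```

Because the script only runs for as long as it takes to move the data, the cost scales with compute time rather than with the number of rows synced.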
The best place to run dbt?
Don’t believe us? Watch the video below.