DataHub Community Update

Project Updates

Open Source

Metadata

DataHub

Data Engineering

Maggie Hays

Dec 13, 2021

November 2021 Edition

Hello, DataHub Enthusiasts!

Welcome to another round of DataHub Project updates. Let’s get you up to speed on what happened in November! Curious about what’s happened in prior months? Head over to the Project Updates section to read all about it.

Community Updates

Every week, we’re welcoming 50–60 new members to the DataHub Slack Community. Folks are joining from all over the world, eager to learn from one another and to make the most of DataHub within their organizations. Things are moving fast, so we’ve introduced a couple of ways to help you stay up to speed.

DataHub Community Newsletter

We recently announced a new DataHub Community Newsletter to keep you informed about:

📥 DataHub Project Updates, summarizing the most important news from the OSS Project and Community

Feature Highlights & Tutorials, helping you make the most of DataHub

🗓️ Community Event Invitations, keeping you informed of upcoming opportunities to hear from Community Members directly

Excited? Subscribe here!

DataHub Feature Request Portal

During the November Town Hall, I announced our new Feature Request Portal. Our goal is to provide transparency into what the community is looking for and what we are working on.

See it in action here!

Project Updates

During November, we saw 130 commits to the DataHub Project from 20+ people across 16 companies, surpassing our trend of 100 commits per month.

The v0.8.17 release was jam-packed with features and functionality, including:

  • Recommendations are now available on the Landing Page (watch the demo here!)
  • Apple M1 support for Docker images
  • Improvements to ingestion sources, including Kafka Connect, Trino, dbt, and OpenAPI

Read the full release notes to learn more!

We are also starting to see more & more folks in the community contribute to, and take the lead on, discovery & design initiatives, including:

  • Tableau Integration
  • Prefect Integration
  • Data Quality Design

DataHub Basics — Lineage 101

During the November Town Hall, John Joyce and Surya Lanka (Acryl Data) presented the basics of Lineage as well as net-new functionality to auto-extract lineage from Snowflake.

The goal of Lineage in DataHub is to provide full visibility into end-to-end data flow, spanning across data platforms, tools, transformation steps, presentation layers, etc. This enables:

  • proactive impact analysis — understanding who will be affected when you make a change to a particular data asset
  • reactive debugging once something has gone wrong

DataHub provides end-to-end data lineage to enable proactive impact analysis & reactive data debugging
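
To make those two use cases concrete, here is a minimal sketch that treats lineage as a directed graph and walks it in both directions. It does not use the DataHub API; the asset names and the DOWNSTREAM mapping are hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the assets listed as its value.
DOWNSTREAM = {
    "snowflake.raw_orders": ["dbt.stg_orders"],
    "dbt.stg_orders": ["dbt.fct_orders", "dbt.dim_customers"],
    "dbt.fct_orders": ["looker.revenue_dashboard"],
    "dbt.dim_customers": ["looker.revenue_dashboard"],
}


def invert(edges):
    """Flip downstream edges into upstream edges."""
    upstream = {}
    for source, targets in edges.items():
        for target in targets:
            upstream.setdefault(target, []).append(source)
    return upstream


def reachable(start, edges):
    """Breadth-first walk: every asset reachable from start, in visit order."""
    found = []
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in edges.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                found.append(neighbor)
                queue.append(neighbor)
    return found


# Proactive impact analysis: what is affected if dbt.stg_orders changes?
impacted = reachable("dbt.stg_orders", DOWNSTREAM)

# Reactive debugging: which upstream assets could have broken the dashboard?
suspects = reachable("looker.revenue_dashboard", invert(DOWNSTREAM))
```

Impact analysis follows edges downstream from the asset you plan to change, while debugging follows the same edges in reverse from the asset that broke.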

Check out the presentation below to learn all about how Lineage works in DataHub and how you can easily get started on your own.

Sneak Peek! No-Code UI Extension

One of the most compelling reasons that people adopt open-source software like DataHub is the ability to customize it for their organization’s specific use cases. The downside is that it can require writing complex code & maintaining a fork of the Project, which can grow burdensome over time.

We know that many teams within the Community are eager for ways to fine-tune the DataHub UI to address their end users’ use cases, so we’re taking steps to make UI Extensions easy to manage.

Soon you will be able to extend the DataHub metadata model to fit your organization’s needs & surface those details in the UI without writing a single line of code or maintaining a forked repo.

Sounds magical, right? Learn what to expect from Shirshanka Das and Gabe Lyons (Acryl Data), including a live demo that had us all on the edge of our seats 😱


DataHub API Authentication

As you may have seen in our Blog, we rolled out Authentication in the Metadata Service.

Prior to this rollout, there was no formal support for making authenticated requests to the APIs exposed by the Metadata Service, including to the GraphQL API and the Rest.li Ingestion APIs. This meant that anyone with network access to the DataHub services could successfully make unauthenticated requests.
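
For a rough idea of what an authenticated call can look like, here is a minimal sketch that builds (but does not send) a request to the GraphQL API. It assumes the Metadata Service accepts an access token in an "Authorization: Bearer" header and serves GraphQL at /api/graphql; the host, port, and token are placeholders, and build_graphql_request is a hypothetical helper rather than part of any DataHub client. Refer to the tech deep dive for the authoritative details.

```python
import json
import urllib.request


def build_graphql_request(base_url, token, query):
    """Build (without sending) an authenticated POST to the GraphQL endpoint."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        # Assumption: GraphQL is served under /api/graphql on the Metadata Service.
        url=base_url.rstrip("/") + "/api/graphql",
        data=body,
        headers={
            # Assumption: the access token travels as a Bearer token.
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_graphql_request(
    "http://localhost:8080",  # placeholder host and port
    "<your-access-token>",    # placeholder token
    "{ __typename }",         # trivial query that is valid against any GraphQL schema
)

# To actually send it against a running instance:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Without the Authorization header, a service that enforces authentication would be expected to reject the same request.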

Want to learn more? Read John’s tech deep dive into DataHub Metadata Service Authentication and watch his presentation from the November Town Hall below.

Community Case Study: DataHub at LinkedIn

Curious about how the LinkedIn Team continues to engage with the open-source DataHub Community?

Check out this update from Joshua Shinavier and Aikepaer Abuduweili to hear how they are leveraging UI components from the open-source project within their internal DataHub instance.

That’s it for this round! Questions? Comments? Post them below — I can’t wait to hear from you!

Join us on Slack, follow us on Twitter, subscribe to our YouTube channel, and subscribe to the DataHub Community Newsletter!
