Redshift Lineage, Incubating Mode Integration, and More!

Release Notes

Metadata

Open Source

Community

Maggie Hays

Dec 10, 2021

Happy Friday, DataHub Enthusiasts! We just made v0.8.18 available — let’s get you up to speed on what’s in store.

Vulnerability Alert!

We learned today of a vulnerability in log4j2; big thanks to @frsann for quickly pushing a fix. We encourage all DataHub users to update to the latest version as soon as possible.

Automatic Redshift Lineage

We now support automatically ingesting Dataset → Dataset lineage for Redshift Datasets! This includes Tables, Views, Late-binding Views, External Tables (i.e., Redshift Spectrum), and COPY statements from S3.

Check out this great walkthrough from Tamas Nemeth.

Metadata Service Authentication

You can now make authenticated requests to the Metadata Service APIs (GraphQL + Rest.li). Interested in a technical deep dive? Check out this post from John Joyce or give his demo a watch.
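To make this concrete, here is a minimal sketch of an authenticated GraphQL request. It rests on several assumptions not stated in this post: that the Metadata Service listens at http://localhost:8080, that its GraphQL endpoint is /api/graphql, and that the token is passed as a Bearer token in the Authorization header. The helper name build_graphql_request and the sample query are purely illustrative, so check the authentication docs for your deployment before relying on any of it.

```python
import json
import urllib.request


def build_graphql_request(base_url: str, token: str, query: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GraphQL POST request.

    Hypothetical helper: the /api/graphql path and the Bearer scheme are
    assumptions, not details taken from this post.
    """
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/graphql",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_graphql_request(
    "http://localhost:8080",        # assumed local Metadata Service address
    "<your-access-token>",          # placeholder; substitute a real token
    "{ me { corpUser { urn } } }",  # illustrative query only
)

# Sending is left to the caller, for example:
#   with urllib.request.urlopen(request) as response:
#       print(json.load(response))
```

Building the request separately from sending it makes it easy to inspect the headers before any network call happens.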

Incubating Ingestion Sources

This release contains two new ingestion sources. They are still incubating, but we encourage you to take them for a spin and let us know if you encounter any issues!

Apache NiFi: Now available for extracting metadata about DataJobs and DataFlows. Find the source docs here.

Mode Analytics: Use this to extract reports, charts, and more from your Mode Analytics instance. Read the source docs here.

Add New Aspects without a Fork

You can now add new aspects to the metadata model without maintaining a fork of DataHub. This is a major milestone towards No-Code UI, and we’re super excited for you to start digging in! ICYMI, we demoed this upcoming functionality during the November Town Hall.

Glossary Term Transformer

This transformer allows users to add tags or glossary terms to entities based on a regex match filter. Shoutout to @ecooklin for the first-time contribution to DataHub!
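The idea of a regex match filter can be sketched in a few lines. This is not the transformer's actual implementation or configuration format; it is a standalone illustration in which the rule table TERM_RULES, the function terms_for, and the glossary term URNs are all hypothetical.

```python
import re
from typing import Dict, List

# Hypothetical rules: a regex tested against the entity URN, mapped to the
# glossary term URNs to attach when it matches.
TERM_RULES: Dict[str, List[str]] = {
    r".*customer.*": ["urn:li:glossaryTerm:CustomerData"],
    r".*email.*": ["urn:li:glossaryTerm:Email"],
}


def terms_for(entity_urn: str, rules: Dict[str, List[str]]) -> List[str]:
    """Return the terms of every rule whose pattern matches the URN, without duplicates."""
    matched: List[str] = []
    for pattern, terms in rules.items():
        if re.match(pattern, entity_urn):
            for term in terms:
                if term not in matched:
                    matched.append(term)
    return matched


urn = "urn:li:dataset:(urn:li:dataPlatform:redshift,dev.public.customer_emails,PROD)"
print(terms_for(urn, TERM_RULES))
```

An entity whose URN matches several patterns collects the terms from each matching rule, and one that matches none is left unchanged.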

Bug Fixes

Metadata Service
  • Fix empty search query failing to resolve
  • Address Log4j vulnerability
  • Improve search & recommendations performance by ~50% and homepage load by ~50%
  • Fix invalid Tag creation policy
  • Fix Spring injection of Entity Client inside datahub-upgrade
Metadata Ingestion
  • BigQuery: Fix handling of partitioned & snapshotted tables for lineage, usage, and basic table indexing
  • Recommendations: Fix issue where recently viewed and most popular recommendations were not showing up when the user urn contains special characters
  • Add config to specify CA certificate path for the datahub-rest sink
  • Snowflake: Handle special characters in Snowflake databases and schemas; map “geo” type to NullType to prevent errors
UI
  • Fix Groups page not showing asset ownership correctly
  • Fix issue where markdown links were not clickable
  • Fix issue where delete-by-search could not accept an auth token

Backward Incompatible Changes

The standalone Spring GraphQL Service has been removed; it is replaced by the Metadata Service GraphQL API.

Community Contributions

Congrats on first-time contributions! @adriangb @anshbansal @bartlomiejolma @robscriva @ecooklin

Big thanks for your ongoing support @arunvasudevan @aseembansal @claudio @dexter-mh-lee @EnricoMi @frsann @gabe @hsheth2 @jeffmerrick @jjoyce0510 @kevinhu @maggiehays @mayurinehate @pedro93 @rslanka @serefacet @shirshanka @swaroopjagadish @treff7es @varunbharill
