Humans of DataHub: Liu Xianglong

Humans of DataHub

Community

Open Source

Data Engineering

Elizabeth Cohen

Mar 25, 2022

We are excited to share our fourth installment of Humans of DataHub. This week we are joined by Liu Xianglong, a Data Platform Engineer at the Centre for Strategic Infocomm Technologies in Singapore.

Liu Xianglong (@xl on DataHub Slack)

How did you first learn about DataHub?

“We were evaluating open-source data discovery solutions and DataHub was one of the projects we found.”

What do you enjoy most about the DataHub Community?

“Compared to some of the other open-source projects that I follow, I like the fact that the community is very active and I can get help and advice from the developers and other users very quickly.”

What has DataHub enabled within your organization?

“We plan to use DataHub to consolidate data definitions since it has the ability to accept metadata from a wide range of data sources as well as allow developers to do custom implementations.”

What are you most excited to see happen with DataHub in 2022?

“View ACL would be very much welcomed as we have requirements to control access to certain datasets. Also, I hope that version 1.0 of DataHub can be launched this year.”

What’s your favorite DataHub feature/use case?

“BrowsePaths, where you can specify where a dataset is located for browsing, is a very handy tool to cater to different user groups as we can put soft-links to common datasets at different points in the catalogue.”

Thank you, Liu, for speaking with the team and for all of your contributions to the DataHub Community.


If you are new to DataHub, just beginning to understand what “metadata” and “modern data stack” mean, or you’ve just read these words for the first time (welcome aboard! 🚀), let us take a moment to introduce ourselves and share a little history:

DataHub is an extensible metadata platform, enabling data discovery, data observability, and federated governance to tame the complexity of increasingly diverse data ecosystems. Originally built at LinkedIn, DataHub was open-sourced under the Apache 2.0 License in 2020. It now has a thriving community with over 2.3k members and 100+ code contributors, and many companies are actively using DataHub in production.

We believe that data-driven organizations need a reimagined developer-friendly data catalog to tackle the diversity and scale of the modern data stack. Our goal is to provide the most reliable and trusted enterprise data graph to empower data teams with best-in-class search and discovery and enable continuous data quality based on DataOps practices. This allows central data teams to scale their effectiveness and companies to maximize the value they derive from data.

Want to learn more about DataHub and how to join our community? Visit https://datahubproject.io and say hello on Slack. 👋

