Provectus, a Silicon Valley artificial intelligence (AI) consultancy, is debuting v0.2 of its Open-Source Data Discovery (ODD) and Observability Platform. The release upgrades the platform’s data discovery and observability capabilities, adds new features for data quality assurance, and supports new third-party service adapters, including Amazon Athena, Amazon SageMaker Feature Store, Feast, and Great Expectations (GE).
Previously announced in August, the ODD Platform is an open-source data discovery and observability tool for data-driven enterprises that are looking to democratize their data by making it more discoverable, manageable, observable, reliable, and secure. The platform is designed with the needs of data teams in mind.
“Today, data scientists spend 30 per cent of their time on the discovery and validation of datasets. Data and ML engineers have to invest too many resources to ensure that their data is clean and reliable. Tasks such as fine-tuning, debugging, and maintaining data pipelines, as well as cataloging and curating datasets, create data silos that keep engineers away from ML models, analytical dashboards, and other business-critical tasks,” says German Osin, chief product owner of ODD Platform.
“I believe that the Open Data Discovery Platform can become the ultimate, open-source, open-standard ecosystem for solving most data problems, to dramatically reduce the costs of building and maintaining data products for enterprises of all sizes.”
The ODD Platform closes the gaps that conventional data catalogs cannot cover, such as a lack of standardised data collection, incompatibility between different catalogs, limited data lineage, and inefficient data quality and observability practices.
ODD Platform v0.2 adds new features to the mix:
- OpenTelemetry integration for better data observability
- Data Entity alert page
- Data quality entities and Data QA reports in Data Entities
- Integration with Great Expectations for Data QA tests
- Search suggestions
- Ability to count the number of views for each data entity and rank entities in search by popularity
- Namespaces to group data sources
- Data quality alerts
- Dataset schema change alerts
- Counters on tags for popularity ranking
The ODD Platform is a work in progress. Provectus and the ODD team are inviting the data community to contribute to the project.