Snorkel AI Accelerates Foundation Model Adoption With Data-centric AI


Introduces Data-centric Foundation Model Development to bridge the adaptation and deployment gaps between foundation models and enterprise AI

The data-centric AI platform company Snorkel AI introduced Data-centric Foundation Model Development for enterprises to unlock complex, performance-critical use cases with GPT-3, RoBERTa, T5, and other foundation models. With this launch, enterprise data science and machine learning teams can overcome adaptation and deployment challenges by creating large, domain-specific datasets to fine-tune foundation models and using them to build smaller, specialised models deployable within governance and cost constraints. New capabilities for Data-centric Foundation Model Development are available within Snorkel Flow, the company’s flagship platform, in preview.

Foundation models such as GPT-3, DALL-E-2, Stable Diffusion, and more offer a lot of promise for generative, creative, and exploratory tasks. But enterprises are still a long way from deploying foundation models in production for complex, performance-critical NLP and other automation use cases. Enterprises need large volumes of domain- and task-specific labelled training data to adapt foundation models for domain-specific use. Creating these high-quality training datasets with traditional, manual data labelling approaches is painfully slow and expensive. Moreover, foundation models are incredibly costly to develop and maintain, and they pose governance constraints when deployed in production.

These challenges must be addressed before enterprises can reap the benefits of foundation models. Snorkel Flow’s Data-centric Foundation Model Development is a new paradigm for enterprise AI/ML teams to overcome the adaptation and deployment challenges currently blocking them from using foundation models to accelerate AI development.

Using early versions of Data-centric Foundation Model Development, AI/ML teams have built and deployed highly accurate NLP applications in days:

  • A top US bank improved accuracy from 25.5 per cent to 91.5 per cent when extracting information from complex, multi-hundred-page contracts.
  • A global home goods e-commerce company improved accuracy by 7-22 per cent when classifying products from descriptions and reduced development time from four weeks to one day.
  • Pixability distilled knowledge from foundation models and built smaller classification models with more than 90 per cent accuracy in days.
  • The Snorkel AI research team and partners from Stanford University and Brown University matched the quality of a fine-tuned GPT-3 model with a model more than 1,000x smaller on LEDGAR, a complex 100-class legal benchmark task.

“With over 500 hours of content created on YouTube every minute, we need to constantly and accurately categorize billions of videos to make sure we fully understand the context of videos so that advertisers can be sure they are running their ads on brand suitable content,” said Jackie Swansburg Paulino, Chief Product Officer at Pixability. “With Snorkel Flow, we can apply data-centric workflows to distill knowledge from foundation models and build high-cardinality classification models with more than 90% accuracy in days.”

Enterprise Foundation Model Management Suite features include:

  • Foundation Model Fine-tuning to create large, domain-specific training datasets to fine-tune and adapt foundation models for enterprise use cases with production-grade accuracy.
  • Foundation Model Warm Start to auto-label training data at the push of a button, using foundation models and state-of-the-art zero- and few-shot learning, so teams can train deployable models.
  • Foundation Model Prompt Builder to develop, evaluate, and combine prompts to tune and correct the output of foundation models to precisely label datasets and train deployable models.
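
For readers who want a concrete picture of the auto-labelling idea behind these features, here is a minimal Python sketch. It is not Snorkel Flow's API or method. It assumes three hypothetical keyword rules standing in for prompts sent to a foundation model, and it combines their outputs with a simple majority vote, a technique chosen here purely for illustration. All function names and labels are invented for the example.

```python
from collections import Counter

ABSTAIN = None  # a labeller returns this when it has no opinion

# Hypothetical stand-ins for prompts sent to a foundation model.
# Each takes a document and returns a label or ABSTAIN.
def prompt_mentions_payment(text):
    return "finance" if "payment" in text.lower() else ABSTAIN

def prompt_mentions_lease(text):
    return "real_estate" if "lease" in text.lower() else ABSTAIN

def prompt_mentions_invoice(text):
    return "finance" if "invoice" in text.lower() else ABSTAIN

PROMPTS = [prompt_mentions_payment, prompt_mentions_lease, prompt_mentions_invoice]

def combine_votes(votes):
    """Majority vote over non-abstaining labellers; ties count as ABSTAIN."""
    counted = Counter(v for v in votes if v is not ABSTAIN)
    if not counted:
        return ABSTAIN
    ranked = counted.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # no clear winner, so leave the document unlabelled
    return ranked[0][0]

def auto_label(texts):
    """Return (text, label) pairs for documents that received a clear label."""
    labelled = []
    for text in texts:
        label = combine_votes([prompt(text) for prompt in PROMPTS])
        if label is not ABSTAIN:
            labelled.append((text, label))
    return labelled

if __name__ == "__main__":
    docs = ["Invoice and payment terms", "Lease payment schedule", "Hello"]
    for text, label in auto_label(docs):
        print(label, "<-", text)
```

The resulting (text, label) pairs could then serve as training data for a smaller model. In this sketch, documents with tied or missing votes are simply left unlabelled rather than guessed.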

“Enterprises have struggled to harness the power of foundation models like GPT-3 and DALL-E due to fundamental adaptation and deployment challenges. To work in real enterprise use cases, foundation models need to be adapted using task-specific training data and need to clear major deployment challenges around cost and governance,” said Alex Ratner, CEO and co-founder at Snorkel AI. “Snorkel Flow’s unique data-centric approach provides the necessary bridge between foundation models and enterprise AI, solving the adaptation and deployment challenges so enterprises can achieve real value from foundation models.”