Governance
Discovery & Classification
Novatria Discovery & Classification continuously scans your data estate — warehouses, lakes, APIs, and SaaS platforms — to discover sensitive data, classify it by type (PII, PHI, PCI, secrets), and maintain a live inventory that feeds governance policies and security controls.
Continuous Discovery
Agentless scanners map your data estate continuously, so there is no inventory to maintain by hand. New tables, columns, and APIs are discovered automatically within minutes of appearing.
AI-Powered Classification
ML models detect PII, PHI, PCI data, secrets, and custom sensitive-data patterns with 99.2% precision — across structured, semi-structured, and unstructured data.
Live Sensitivity Map
A real-time map of where sensitive data lives, who can access it, and what policies protect it — updated continuously as your data estate evolves.
Capabilities
Why customers choose Novatria Discovery & Classification
50+ Connectors
Pre-built connectors for Snowflake, Databricks, BigQuery, Redshift, S3, Azure Blob, Postgres, MySQL, Kafka, and 40+ more. Add custom connectors via the SDK.
Column-Level Classification
Classify data at the column level with confidence scores. Tag columns as PII, PHI, PCI, or custom taxonomies — and feed classifications directly into policy rules.
Custom Classifiers
Define custom classification patterns using regex, keyword lists, or trained ML models for industry-specific sensitive data types (e.g., loan numbers, patient IDs).
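To make the regex and keyword-list idea concrete, here is a minimal sketch of a custom classifier for loan numbers. It is not Novatria's SDK: the `CustomClassifier` class, the `LN-` number format, and the 0.8/0.2 weighting are all hypothetical choices made for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import List, Pattern


@dataclass
class CustomClassifier:
    """Hypothetical classifier: a regex for values plus keywords for column names."""
    label: str
    value_pattern: Pattern[str]
    name_keywords: List[str] = field(default_factory=list)

    def score(self, column_name: str, samples: List[str]) -> float:
        """Return a 0..1 confidence that the column holds this data type."""
        if not samples:
            return 0.0
        # Fraction of sampled values that fully match the pattern.
        matches = sum(1 for value in samples if self.value_pattern.fullmatch(value))
        value_score = matches / len(samples)
        # A keyword in the column name counts as supporting evidence.
        name_hit = any(k in column_name.lower() for k in self.name_keywords)
        # Assumed weighting: values dominate, a name hit adds a small boost.
        return min(1.0, value_score * 0.8 + (0.2 if name_hit else 0.0))


# Assumed format for illustration only: "LN-" followed by eight digits.
loan_number = CustomClassifier(
    label="LOAN_NUMBER",
    value_pattern=re.compile(r"LN-\d{8}"),
    name_keywords=["loan"],
)

confidence = loan_number.score(
    "loan_id", ["LN-12345678", "LN-87654321", "n/a", "LN-00000001"]
)
print(f"{loan_number.label}: {confidence:.2f}")
```

Combining value matches with a column-name hint is one common way to keep a narrow regex from over-triggering on unrelated columns; a trained model could replace the `score` method without changing how the result is consumed.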
Change Detection
Detect schema changes, new data sources, and classification drift. Get alerted when a new column containing potential PII appears in an unprotected table.
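As a rough illustration of the alerting scenario above, the sketch below diffs two schema snapshots and flags new columns whose names suggest PII in tables that have no policy attached. The snapshot shape, the `PII_HINTS` list, and both function names are assumptions for this example, not Novatria's detection logic, which would also inspect sampled values.

```python
from typing import Dict, List, Set, Tuple

Schema = Dict[str, Set[str]]  # table name -> set of column names

# Assumed name fragments that hint at PII; a real classifier would look at values too.
PII_HINTS = ("email", "ssn", "phone", "dob")


def detect_new_columns(previous: Schema, current: Schema) -> Schema:
    """Return columns present in `current` but absent from `previous`, per table."""
    added: Schema = {}
    for table, columns in current.items():
        new_columns = columns - previous.get(table, set())
        if new_columns:
            added[table] = new_columns
    return added


def pii_alerts(added: Schema, protected_tables: Set[str]) -> List[Tuple[str, str]]:
    """List (table, column) pairs that look like PII and sit in unprotected tables."""
    alerts = []
    for table in sorted(added):
        if table in protected_tables:
            continue  # a policy already covers this table
        for column in sorted(added[table]):
            if any(hint in column.lower() for hint in PII_HINTS):
                alerts.append((table, column))
    return alerts


previous = {"orders": {"id", "total"}}
current = {
    "orders": {"id", "total", "customer_email"},
    "users": {"id", "ssn"},  # new table, already protected in this example
}

alerts = pii_alerts(detect_new_columns(previous, current), protected_tables={"users"})
print(alerts)  # prints [('orders', 'customer_email')]
```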
Data Sampling
Configurable sampling strategies balance thoroughness with performance. Use full-scan mode for regulated data and statistical sampling for high-volume streams.
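The trade-off between the two modes can be sketched as follows. This example uses reservoir sampling (Algorithm R), a standard technique for drawing a fixed-size uniform sample from a stream of unknown length; whether Novatria uses it is not stated, and the `select_rows` function and its `mode` values are hypothetical.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(rows: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Draw a uniform sample of up to k rows in one pass, using O(k) memory."""
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, row in enumerate(rows):
        if i < k:
            reservoir.append(row)  # fill the reservoir first
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = row  # replace with probability k / (i + 1)
    return reservoir


def select_rows(rows: Iterable[T], mode: str, k: int = 1000, seed: Optional[int] = None) -> List[T]:
    """Hypothetical strategy switch: 'full' reads every row, 'sample' reads a subset."""
    if mode == "full":
        return list(rows)
    if mode == "sample":
        return reservoir_sample(rows, k, seed)
    raise ValueError(f"unknown mode: {mode!r}")


sampled = select_rows(range(10_000), mode="sample", k=100, seed=1)
print(len(sampled))  # prints 100
```

A full scan guarantees that no sensitive value is missed, at the cost of reading every row. A fixed-size sample bounds the cost but can miss values that occur rarely.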
Lineage Integration
Classification results feed directly into the lineage graph. Trace sensitive data from source column through transformations to downstream consumers.
Common Questions
How accurate is the AI classification?
Our production models achieve 99.2% precision and 97.8% recall for standard PII/PHI types. Custom classifiers can be trained on your own data to improve precision and recall on domain-specific patterns.
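For readers less familiar with the two metrics: precision is the share of flagged items that are truly sensitive, and recall is the share of truly sensitive items that were flagged. The short sketch below computes both from confusion counts. The counts are made up for illustration and are not Novatria evaluation data.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Compute precision and recall from true positives, false positives, false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Illustrative counts only: 90 correct flags, 10 false alarms, 30 missed columns.
precision, recall = precision_recall(tp=90, fp=10, fn=30)
print(f"precision={precision:.2f} recall={recall:.2f}")  # prints "precision=0.90 recall=0.75"
```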
Does scanning impact my production database performance?
No. Novatria uses agentless, read-only connectors with configurable rate limits. For data warehouses, we query metadata and sampled rows, and we never run full table scans unless you explicitly configure them.
Can I classify data in real-time streams?
Yes. Novatria integrates with Kafka, Kinesis, and Pub/Sub to classify streaming data in near real-time, enabling immediate policy enforcement on data in motion.
Get started
See Novatria in action
Start with Governance or Security, then expand across the full trust platform.