
Governance

Discovery & Classification

Novatria Discovery & Classification continuously scans your data estate — warehouses, lakes, APIs, and SaaS platforms — to discover sensitive data, classify it by type (PII, PHI, PCI, secrets), and maintain a live inventory that feeds governance policies and security controls.

Continuous Discovery

Agentless scanners continuously map your data estate — no manual inventory maintenance. New tables, columns, and APIs are discovered automatically within minutes.

AI-Powered Classification

ML models detect PII, PHI, PCI data, secrets, and custom sensitive-data patterns across structured, semi-structured, and unstructured data, with 99.2% precision on standard PII/PHI types.

Live Sensitivity Map

A real-time map of where sensitive data lives, who can access it, and what policies protect it — updated continuously as your data estate evolves.

Capabilities

Why customers choose Novatria Discovery & Classification

50+ Connectors

Pre-built connectors for Snowflake, Databricks, BigQuery, Redshift, S3, Azure Blob, Postgres, MySQL, Kafka, and 40+ more. Add custom connectors via the SDK.

Column-Level Classification

Classify data at the column level with confidence scores. Tag columns as PII, PHI, PCI, or custom taxonomies — and feed classifications directly into policy rules.
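As a rough illustration of how column-level classifications with confidence scores could drive a policy rule, here is a minimal Python sketch. The `Classification` record, the `columns_to_mask` helper, and the 0.9 threshold are all hypothetical and are not part of any Novatria API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Classification:
    """Hypothetical record for one classified column."""
    table: str
    column: str
    tag: str           # e.g. "PII", "PHI", "PCI", or a custom taxonomy label
    confidence: float  # 0.0 to 1.0

def columns_to_mask(classifications, tags=frozenset({"PII", "PHI", "PCI"}),
                    min_confidence=0.9):
    """Select columns a masking rule would cover: sensitive tag and high confidence."""
    return [
        (c.table, c.column)
        for c in classifications
        if c.tag in tags and c.confidence >= min_confidence
    ]

results = [
    Classification("customers", "email", "PII", 0.98),
    Classification("customers", "nickname", "PII", 0.55),   # below threshold
    Classification("orders", "total", "NONE", 0.99),        # not a sensitive tag
]
masked = columns_to_mask(results)  # [("customers", "email")]
```

Low-confidence matches are left out here on purpose; in practice they would more likely be routed to human review than ignored.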

Custom Classifiers

Define custom classification patterns using regex, keyword lists, or trained ML models for industry-specific sensitive data types (e.g., loan numbers, patient IDs).
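To make the regex and keyword-list options concrete, the sketch below shows one way such a classifier might work in plain Python. The loan-number format (`LN-` followed by 8 digits), the keyword list, the 0.8 match threshold, and every function name are assumptions chosen for illustration; they do not reflect the Novatria SDK.

```python
import re
from dataclasses import dataclass

@dataclass
class ColumnMatch:
    label: str
    confidence: float  # share of sampled values that matched (1.0 for name matches)

# Hypothetical loan-number format: "LN-" followed by exactly 8 digits.
LOAN_NUMBER = re.compile(r"LN-\d{8}")
# Hypothetical keywords checked against the column name.
PATIENT_ID_KEYWORDS = ("patient_id", "mrn", "medical_record")

def classify_column(name, values, threshold=0.8):
    """Return a ColumnMatch for the column, or None if nothing matched."""
    lowered = name.lower()
    if any(keyword in lowered for keyword in PATIENT_ID_KEYWORDS):
        return ColumnMatch("patient_id", 1.0)

    non_empty = [v for v in values if v]
    if not non_empty:
        return None
    hits = sum(1 for v in non_empty if LOAN_NUMBER.fullmatch(v))
    ratio = hits / len(non_empty)
    if ratio >= threshold:
        return ColumnMatch("loan_number", ratio)
    return None

match = classify_column("acct_ref", ["LN-12345678", "LN-87654321", "", "LN-00000001"])
```

Matching on a sample of values rather than a single value keeps one stray string from tagging an entire column.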

Change Detection

Detect schema changes, new data sources, and classification drift. Get alerted when a new column containing potential PII appears in an unprotected table.
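The idea behind schema change detection can be sketched as a diff between two schema snapshots. This Python example is illustrative only: the snapshot shape (table name mapped to a list of column names), the name-based PII hints, and both function names are assumptions, and a real classifier would inspect values and not just column names.

```python
def new_columns(previous, current):
    """Return {table: sorted list of columns added since the previous snapshot}."""
    added = {}
    for table, columns in current.items():
        diff = set(columns) - set(previous.get(table, ()))
        if diff:
            added[table] = sorted(diff)
    return added

# Hypothetical name fragments that suggest a column may hold PII.
PII_HINTS = ("email", "ssn", "phone", "dob")

def pii_alerts(previous, current, protected_tables):
    """List (table, column) pairs for new, PII-looking columns in unprotected tables."""
    alerts = []
    for table, columns in new_columns(previous, current).items():
        if table in protected_tables:
            continue
        for column in columns:
            if any(hint in column.lower() for hint in PII_HINTS):
                alerts.append((table, column))
    return alerts

before = {"orders": ["id", "total"]}
after = {"orders": ["id", "total", "customer_email"], "users": ["id", "ssn"]}
alerts = pii_alerts(before, after, protected_tables={"users"})
# alerts == [("orders", "customer_email")]
```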

Data Sampling

Configurable sampling strategies balance thoroughness with performance. Use full-scan mode for regulated data and statistical sampling for high-volume streams.
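One well-known way to sample a high-volume stream of unknown length is reservoir sampling (Algorithm R), sketched below in Python. It is shown only to illustrate what statistical sampling can look like; the source does not say which sampling method Novatria uses, and the function name and parameters are hypothetical.

```python
import random

def reservoir_sample(stream, k, seed=None):
    """Keep a uniform random sample of up to k items from an iterable of any length."""
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = item
    return sample

rows = reservoir_sample(range(1_000_000), k=100, seed=42)
```

The sample uses constant memory regardless of stream size, which is the main reason this technique suits streams; a full scan, by contrast, would simply classify every row.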

Lineage Integration

Classification results feed directly into the lineage graph. Trace sensitive data from source column through transformations to downstream consumers.
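Tracing a sensitive column to its downstream consumers amounts to a graph traversal. The Python sketch below assumes a hypothetical lineage graph stored as a dictionary from each column to the columns derived from it; the column names and the `downstream` helper are illustrative, not Novatria's data model.

```python
from collections import deque

def downstream(graph, source):
    """Return every node reachable from source via breadth-first search."""
    seen = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen and nxt != source:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Hypothetical lineage: an email column is hashed, then copied to two consumers.
lineage = {
    "crm.users.email": ["staging.users.email_hash"],
    "staging.users.email_hash": ["marts.customers.contact", "exports.weekly.contact"],
    "marts.customers.contact": [],
}
affected = downstream(lineage, "crm.users.email")
```

If `crm.users.email` is tagged as PII, every column in `affected` is a candidate to inherit that tag or to be checked against policy.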

Common Questions

How accurate is the AI classification?

Our production models achieve 99.2% precision and 97.8% recall for standard PII/PHI types. Custom classifiers can be trained on your data for even higher accuracy on domain-specific patterns.
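For readers less familiar with these metrics, precision is the share of flagged values that are truly sensitive, and recall is the share of truly sensitive values that get flagged. The short Python example below works through the arithmetic. The counts are hypothetical, chosen only so the results round to the figures quoted above; they are not measured evaluation data.

```python
def precision(true_positives, false_positives):
    """Of everything flagged as sensitive, how much really was."""
    return true_positives / (true_positives + false_positives)

def recall(true_positives, false_negatives):
    """Of everything that really was sensitive, how much was flagged."""
    return true_positives / (true_positives + false_negatives)

# Hypothetical counts: 1,000 truly sensitive values, 978 of them flagged,
# plus 8 non-sensitive values flagged by mistake.
tp, fp, fn = 978, 8, 22

p = round(precision(tp, fp), 3)  # 978 / 986
r = round(recall(tp, fn), 3)     # 978 / 1000
```

With these counts, `p` is 0.992 and `r` is 0.978. The gap between the two shows the usual trade-off: a model tuned to raise few false alarms will tend to miss slightly more true cases.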

Does scanning impact my production database performance?

No. Novatria uses agentless, read-only connectors with configurable rate limits. For data warehouses, we query metadata and sample data, and run full table scans only when explicitly configured.

Can I classify data in real-time streams?

Yes. Novatria integrates with Kafka, Kinesis, and Pub/Sub to classify streaming data in near real-time, enabling immediate policy enforcement on data in motion.

Get started

See Novatria in action

Land through Governance or Security, then expand across the full trust platform.