
Data Classification Policy

Effective Date: 2026-03-02
Last Review: 2026-03-02
Next Review: 2026-09-02
Owner: Greg Felice, Project Lead

1. Purpose

This policy defines how data within the tomo ecosystem is classified, handled, stored, transmitted, and disposed of based on its sensitivity. It ensures that appropriate protections are applied to data at every stage of its lifecycle.

2. Scope

This policy applies to all data created, collected, processed, stored, or transmitted by tomo systems, including:

  • SDK source code and build artifacts
  • Docker image contents and metadata
  • Hosted service tenant data (current and future)
  • Infrastructure configuration and secrets
  • Monitoring and audit data
  • User and contributor information

3. Classification Levels

| Level | Label | Description | Unauthorized Disclosure Impact |
| --- | --- | --- | --- |
| Level 1 | Public | Intended for unrestricted access | None |
| Level 2 | Internal | Not public, but low sensitivity; for project team use | Minor inconvenience, no material harm |
| Level 3 | Confidential | Sensitive data requiring access controls | Significant harm: security compromise, competitive disadvantage |
| Level 4 | Restricted | Highest sensitivity; strict access and handling | Severe harm: data breach, regulatory violation, reputational damage |
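For tooling that needs to reason about these levels, the ordering can be captured in code. The following is a minimal sketch, not part of the policy: the names `Classification` and `most_restrictive` are hypothetical, and the rule that mixed data is handled at its highest level is an assumption that this policy does not state explicitly.

```python
from enum import IntEnum


class Classification(IntEnum):
    """The four levels defined above; a higher value means more sensitive."""

    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


def most_restrictive(first: Classification, *rest: Classification) -> Classification:
    """Return the highest level among the arguments.

    ASSUMPTION: a bundle of mixed data is handled at its most
    sensitive level. The policy text does not spell this rule out.
    """
    return max((first, *rest))
```

Because `IntEnum` members compare as integers, a check such as `level >= Classification.CONFIDENTIAL` can gate stricter handling without a lookup table.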

4. Classification Examples

Level 1 — Public

| Data Type | Examples |
| --- | --- |
| Source code | SDK source in src/tomo/, tests, documentation |
| Published artifacts | PyPI package, Docker image, tomo.rizlabs.com docs |
| Project metadata | License, README, CONTRIBUTING, CHANGELOG |
| Public communications | Blog posts, release announcements, conference talks |
| Benchmark results | Published performance reports |

Level 2 — Internal

| Data Type | Examples |
| --- | --- |
| Engineering backlog | docs/BACKLOG.md, issue trackers (non-security) |
| Architecture decisions | ADRs, internal design docs, research notes |
| CI/CD configurations | Woodpecker pipeline files (without secrets) |
| Monitoring dashboards | Grafana dashboard configurations |
| Meeting notes | Internal project discussions |
| Development database content | Test data, benchmark datasets |

Level 3 — Confidential

| Data Type | Examples |
| --- | --- |
| API keys and tokens | PyPI tokens, Docker Hub tokens, B2 application keys |
| Service account credentials | Database passwords, CI service account tokens |
| Infrastructure configuration | Ansible vault contents, nginx configs with internal IPs |
| Security audit findings | Vulnerability scan results, penetration test reports |
| Hosted service tenant metadata | Tenant list, subscription details, usage metrics |
| Incident records | Security incident details, forensic evidence |

Level 4 — Restricted

| Data Type | Examples |
| --- | --- |
| TLS private keys | Certificate private keys for *.rizlabs.com |
| Database master passwords | PostgreSQL superuser credentials |
| Encryption keys | LUKS keys, backup encryption keys, sealed recovery envelope contents |
| Hosted service tenant data | Customer database contents, graph data, vector embeddings |
| Personally identifiable information (PII) | Customer names, email addresses, billing information |
| Payment data | Stripe customer IDs, subscription records (future) |

5. Handling Requirements

5.1 Storage

| Level | At Rest Encryption | Storage Location | Access Control |
| --- | --- | --- | --- |
| Public | Not required | Any (Git, CDN, public web) | None |
| Internal | Recommended (LUKS) | Access-controlled repositories, internal systems | Role-based (project team) |
| Confidential | Required (LUKS + application-level) | Secrets manager, encrypted volumes, Woodpecker secrets | Named individuals only, audit logged |
| Restricted | Required (LUKS + application-level + separate key management) | Encrypted volumes, hardware security module (future), sealed envelope | Project Lead only, dual-control for recovery keys |
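The storage requirements lend themselves to a simple lookup that an audit script could consult. The sketch below is illustrative only: `StorageRule`, `STORAGE_RULES`, and `storage_violations` are hypothetical names, and the check covers just the at-rest encryption column, not storage location or access control.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageRule:
    encryption_at_rest: str  # "not required", "recommended", or "required"
    access_control: str      # copied from the Access Control column


# Values transcribed from the storage table above.
STORAGE_RULES = {
    "Public": StorageRule("not required", "None"),
    "Internal": StorageRule("recommended", "Role-based (project team)"),
    "Confidential": StorageRule("required", "Named individuals only, audit logged"),
    "Restricted": StorageRule(
        "required", "Project Lead only, dual-control for recovery keys"
    ),
}


def storage_violations(level: str, encrypted: bool) -> list[str]:
    """Return a list of problems for data at `level` stored with or without encryption."""
    rule = STORAGE_RULES[level]  # raises KeyError for an unknown level
    problems = []
    if rule.encryption_at_rest == "required" and not encrypted:
        problems.append(f"{level} data must be encrypted at rest")
    return problems
```

An unencrypted Internal volume produces no violation here because encryption is only recommended at that level; a stricter deployment could treat "recommended" as a warning.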

5.2 Transmission

| Level | In Transit Encryption | Permitted Channels |
| --- | --- | --- |
| Public | Not required (HTTPS recommended) | Any |
| Internal | TLS 1.2+ required | Git (SSH/HTTPS), internal APIs, encrypted email |
| Confidential | TLS 1.2+ required | SSH, HTTPS, encrypted messaging (Signal) |
| Restricted | TLS 1.2+ required, end-to-end encryption recommended | SSH, HTTPS (pinned certificates for critical transfers), Signal |
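For clients written in Python, the TLS 1.2+ floor can be enforced with the standard library `ssl` module. This is a minimal sketch under the assumption that the client builds its own context; the function name `make_client_context` is hypothetical, and certificate pinning for Restricted transfers is not shown.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context that refuses anything below TLS 1.2."""
    # create_default_context() enables certificate verification and
    # hostname checking with the system trust store.
    context = ssl.create_default_context()
    # Explicitly pin the protocol floor instead of relying on library defaults.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The returned context can be passed to standard-library clients, for example `urllib.request.urlopen(url, context=make_client_context())`.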

5.3 Processing

| Level | Environment Requirements |
| --- | --- |
| Public | No restrictions |
| Internal | Project-controlled systems only |
| Confidential | Hardened systems, audit logging enabled, no processing on personal devices without LUKS |
| Restricted | Dedicated hardened systems, full audit logging, no cloud processing without DPA and encryption |

5.4 Sharing

| Level | Internal Sharing | External Sharing |
| --- | --- | --- |
| Public | Unrestricted | Unrestricted |
| Internal | Project team, no approval required | Project Lead approval required |
| Confidential | Need-to-know basis, documented access | Not permitted without Project Lead approval and NDA |
| Restricted | Project Lead only, logged access | Not permitted except under legal obligation or DPA |
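The external sharing column reduces to a small decision function. The sketch below is one possible encoding, not an official implementation: the function name and its keyword flags are hypothetical, and it treats each condition as a simple boolean that the caller has already verified.

```python
def external_sharing_allowed(
    level: str,
    *,
    lead_approval: bool = False,
    nda: bool = False,
    legal_obligation: bool = False,
    dpa: bool = False,
) -> bool:
    """Decide whether data at `level` may leave the project, per the table above."""
    if level == "Public":
        return True
    if level == "Internal":
        return lead_approval
    if level == "Confidential":
        # Both conditions are needed: Project Lead approval AND an NDA.
        return lead_approval and nda
    if level == "Restricted":
        # Either condition suffices: a legal obligation OR a DPA.
        return legal_obligation or dpa
    raise ValueError(f"unknown classification level: {level!r}")
```

Note that Project Lead approval alone does not unlock Restricted data in this encoding, which mirrors the table: only a legal obligation or a DPA does.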

6. Labeling

| Level | Labeling Requirement |
| --- | --- |
| Public | No label required (default assumption for SDK/docs repos) |
| Internal | No label required in access-controlled repos; label if stored alongside public data |
| Confidential | Files and documents marked CONFIDENTIAL in filename or header |
| Restricted | Files and documents marked RESTRICTED in filename or header; stored in dedicated encrypted directories |
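A label check of this kind can be automated, for example in a pre-commit hook. The following is a rough sketch: `detect_label` is a hypothetical helper, it assumes labels appear in upper case exactly as written above, and it treats the first line of a file as its header.

```python
from typing import Optional

# Checked in order, so the stricter label wins if both appear.
LABELS = ("RESTRICTED", "CONFIDENTIAL")


def detect_label(filename: str, header: str = "") -> Optional[str]:
    """Return the classification label found in the filename or header, if any.

    ASSUMPTION: matching is case-sensitive, so a file named
    "confidentiality-notes.md" is not mistaken for a labeled file.
    """
    for label in LABELS:
        if label in filename or label in header:
            return label
    return None
```

A hook could then reject a commit when `detect_label` returns a label for a file staged in a public repository.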

7. Retention and Deletion

7.1 Retention Schedule

| Data Type | Retention Period | Justification |
| --- | --- | --- |
| Source code and Git history | Indefinite | Open source project record |
| Published artifacts (PyPI, Docker Hub) | Indefinite (yanked versions retained by platform) | User dependency |
| CI build logs | 90 days | Operational reference |
| Audit logs (auth, database, access) | 12 months | Compliance and incident investigation |
| Security incident records | 3 years | Legal and compliance requirements |
| Hosted service tenant data | Duration of service + 30 days after account deletion | Contractual obligation |
| Backup data | Per backup policy (7-90 days depending on tier) | Recovery capability |
| PII (customer contact info) | Duration of relationship + 12 months | Business need + regulatory requirement |
| Payment records | 7 years | Tax and financial compliance |
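The fixed-length entries in the schedule can be turned into purge dates with plain date arithmetic. This sketch covers only the rows with a fixed period; the dictionary keys and the function name `purge_date` are hypothetical, and months and years are approximated as 365-day years, which is an assumption rather than something the policy defines.

```python
from datetime import date, timedelta

# Fixed retention periods from the schedule above, in days.
# ASSUMPTION: 12 months = 365 days and 1 year = 365 days.
RETENTION_DAYS = {
    "ci_build_logs": 90,
    "audit_logs": 365,
    "security_incident_records": 3 * 365,
    "payment_records": 7 * 365,
}


def purge_date(data_type: str, created: date) -> date:
    """Return the first date on which a record of `data_type` may be purged."""
    return created + timedelta(days=RETENTION_DAYS[data_type])


print(purge_date("ci_build_logs", date(2026, 3, 2)))  # 2026-05-31
```

Rows tied to an event, such as tenant data or PII, need the end of the service or relationship as the starting date and are left out of this sketch.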

7.2 Deletion Procedures

| Level | Deletion Method |
| --- | --- |
| Public | Standard deletion (no special requirements) |
| Internal | Standard deletion; verify removal from backups within retention window |
| Confidential | Secure deletion (shred or equivalent); verify removal from all copies including backups |
| Restricted | Cryptographic erasure (delete encryption keys) or secure wipe; documented destruction certificate; verify removal from all copies including offsite backups |

7.3 Right to Deletion (Hosted Service)

When a hosted service tenant requests data deletion:

  1. Delete tenant database within 5 business days
  2. Remove tenant data from active backups within 30 days (via backup rotation)
  3. Confirm deletion in writing to the tenant
  4. Retain audit logs (anonymized) per retention schedule
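The deadlines in steps 1 and 2 can be computed from the request date. The sketch below is an illustration under stated assumptions: business days are taken to be Monday through Friday with no holiday calendar, the 30 days are counted as calendar days, and the function names are hypothetical.

```python
from datetime import date, timedelta


def add_business_days(start: date, days: int) -> date:
    """Advance `start` by `days` weekdays (Mon-Fri). Holidays are ignored."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0 = Monday ... 4 = Friday
            remaining -= 1
    return current


def deletion_deadlines(requested: date) -> dict[str, date]:
    """Return the latest dates for steps 1 and 2 of the procedure above."""
    return {
        "tenant_database_deleted_by": add_business_days(requested, 5),
        "backups_purged_by": requested + timedelta(days=30),
    }


# A request received on Monday 2026-03-02:
print(deletion_deadlines(date(2026, 3, 2)))
# tenant_database_deleted_by = 2026-03-09, backups_purged_by = 2026-04-01
```

Recording both dates when the request is logged gives the written confirmation in step 3 something concrete to reference.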

8. Data in the tomo SDK

The SDK itself does not collect, store, or transmit user data. Specific considerations:

| Concern | Policy |
| --- | --- |
| Telemetry | No telemetry is collected. The SDK does not phone home. |
| Error reporting | Exceptions are raised locally; no automatic error reporting to tomo servers |
| Connection strings | Users provide connection strings; the SDK does not log or persist them |
| Query results | Results are returned to the caller; the SDK does not cache or store query results |

9. Data in the Docker Image

| Concern | Policy |
| --- | --- |
| Default credentials | No default passwords shipped; setup requires explicit credential configuration |
| Sample data | No production or customer data included in the image |
| Telemetry | No telemetry; standard PostgreSQL logging only (configurable by operator) |
| Volume data | Data volumes are owned by the operator; tomo project has no access |

10. Compliance Mapping

| SOC 2 Criteria | Control |
| --- | --- |
| CC6.1 | Data classification and protection controls |
| CC6.5 | Data disposal and destruction |
| CC6.7 | Restriction of data transmission |
| P6.1 | Collection and use of personal information (privacy) |
| P6.5 | Disposal of personal information |
| C1.1 | Confidentiality commitments and system requirements |