HeyDonto Data Transition Policy
This policy describes how HeyDonto securely and efficiently transitions dental practice data—whether from on-premises or cloud-based EHR systems—into our cloud infrastructure. It covers our event-driven approach using Kafka, best practices for encryption and access control, and how we observe and monitor the process using Kubernetes, Google Cloud Logging, and (optionally) Prometheus + Grafana for metrics/dashboards.
Note: This policy does not address any AI/knowledge-graph–related data usage. Please refer to our separate AI Data Privacy & Training Policy for details on how AI models interact with standardized data.
1. Data Flow Overview
1. Data Sources (On-Prem or Cloud-Based EHRs)
   - On-Prem: A secure HeyDonto synchronizer connects to SQL databases on the local network.
   - Cloud-Based: We connect via a secure API or data export endpoint provided by the remote EHR platform.
   - Transport encryption (TLS) is required in both cases to ensure no plaintext data traverses untrusted networks.
2. Kafka Ingestion
   - The synchronizer or cloud connector converts extracted data into event messages and publishes them to Kafka topics in our cloud.
   - Kafka serves as a decoupled, real-time messaging layer for subsequent microservices.
3. Transformation & Routing
   - Our microservices (deployed on Kubernetes) subscribe to Kafka topics.
   - Data is validated, mapped to the FHIR standard (details in separate documentation), and prepared for storage.
4. Storage in GCP Healthcare API (FHIR)
   - Finalized data is stored in Google Cloud’s Healthcare API (FHIR datastore).
   - External applications and user-facing dashboards retrieve data as needed from this secure datastore.
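As a concrete illustration of steps 2–3, the sketch below wraps an extracted record in an event envelope and maps it onto a minimal FHIR R4 Patient resource. This is illustrative only: the field names (`patient_id`, `last_name`, etc.) and identifier system are assumptions, not our production schema.

```python
import json
import uuid
from datetime import datetime, timezone

def build_event(source_system: str, event_type: str, record: dict) -> bytes:
    """Wrap an extracted EHR record in a small envelope before publishing to Kafka."""
    envelope = {
        "event_id": str(uuid.uuid4()),                         # lets consumers deduplicate
        "event_type": event_type,                              # e.g. "patient.updated"
        "source": source_system,                               # which EHR produced the record
        "occurred_at": datetime.now(timezone.utc).isoformat(),
        "payload": record,
    }
    return json.dumps(envelope).encode("utf-8")

def to_fhir_patient(row: dict) -> dict:
    """Map a raw patient row onto a minimal FHIR R4 Patient resource."""
    return {
        "resourceType": "Patient",
        "identifier": [{"system": "urn:example:source-id", "value": str(row["patient_id"])}],
        "name": [{"family": row["last_name"], "given": [row["first_name"]]}],
        "birthDate": row["dob"],  # ISO 8601 date, e.g. "1990-04-01"
    }
```

A producer would publish the bytes from `build_event` to the relevant topic, and a consuming microservice would validate the payload and call `to_fhir_patient` before writing to the FHIR datastore.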
2. Security & Compliance
2.1 Encryption in Transit
- TLS/SSL for Kafka:
  - All Kafka traffic (producers and consumers) is encrypted using TLS.
  - Mutual TLS (mTLS) may be used to authenticate both client and broker in higher-security deployments.
- Intra-Service Communication:
  - Within our Kubernetes environment, microservices communicate via TLS where feasible.
  - All calls to GCP APIs (including the Healthcare API) are secured over HTTPS.
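Using Python's standard `ssl` module, a client-side context along these lines enforces server verification and TLS 1.2+, and optionally presents a client certificate for mTLS. This is a minimal sketch; certificate paths are deployment-specific.

```python
import ssl

def make_tls_context(ca_file=None, cert_file=None, key_file=None) -> ssl.SSLContext:
    """Client-side TLS context: verifies the server, refuses pre-TLS-1.2
    protocol versions, and, when a cert/key pair is supplied, presents a
    client identity for mutual TLS."""
    ctx = ssl.create_default_context(cafile=ca_file)  # CERT_REQUIRED + hostname checks
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    if cert_file and key_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)  # mTLS client identity
    return ctx
```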
2.2 Authentication & Authorization
- Role-Based Access Control (RBAC):
  - Kubernetes uses RBAC to ensure microservices can access only the resources they need.
  - Kafka ACLs further restrict which microservices (identified by service accounts) can produce to or consume from specific topics.
- Least Privilege Principle:
  - Each microservice and synchronizer is assigned minimal privileges to reduce the risk of lateral movement if compromised.
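The deny-by-default semantics of these ACLs can be modeled in a few lines. This is an in-memory sketch for illustration only: the principal names are hypothetical, and production enforcement happens in Kafka's own ACLs, not application code.

```python
# Grants: (principal, topic) -> set of allowed operations. Anything absent is denied.
ACLS = {
    ("svc-synchronizer", "patient-appointments"): {"WRITE"},
    ("svc-fhir-mapper", "patient-appointments"): {"READ"},
}

def is_allowed(principal: str, topic: str, operation: str) -> bool:
    """Least privilege: permit only operations explicitly granted to this principal."""
    return operation in ACLS.get((principal, topic), set())
```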
2.3 Network Segmentation & Isolation
- Kubernetes Namespaces:
  - We isolate production and dev/test workloads into distinct namespaces and/or clusters.
- Dedicated VPC:
  - Our Kafka infrastructure resides in a dedicated virtual network. Traffic is restricted by firewall rules or private endpoints.
- On-Prem Connectivity:
  - For on-prem data sources, we may use VPN or mTLS over the public internet, depending on site capabilities.
2.4 Compliance Alignment
- HIPAA:
  - Because the data may contain PHI, HeyDonto enforces HIPAA-compliant encryption in transit and at rest, and signs BAAs as needed.
- GDPR:
  - If EU data is processed, relevant GDPR data protection measures (e.g., data subject rights, data minimization) are observed.
3. Kafka Topic Management & Data Retention
3.1 Topic Naming & Partitioning
- Descriptive Topic Names:
  - We use clear conventions (e.g., patient-appointments, record-updates) to identify data domains.
- Tenant Partitioning:
  - In multi-tenant scenarios, we may create separate partitions or distinct topics per dental office to streamline data isolation.
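When tenants share a topic, a stable hash keeps each office's records on a consistent partition. A sketch (the tenant-id format is an assumption):

```python
import hashlib

def tenant_partition(tenant_id: str, num_partitions: int) -> int:
    """Deterministically map a tenant (dental office) id to a partition number.
    sha256 is used instead of Python's built-in hash(), whose value changes
    between interpreter runs and so cannot give a stable mapping."""
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions
```

Because the mapping is deterministic, all events for one office land on the same partition, preserving per-tenant ordering.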
3.2 Retention & Purging
- Minimal Retention Windows:
  - Kafka typically retains messages for 24–72 hours, long enough for reprocessing if a consumer fails.
  - Retaining PHI in Kafka for extended periods is avoided.
- Automated Purging:
  - Once data is confirmed in the FHIR datastore, older Kafka messages are automatically removed per retention policy.
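Kafka expresses retention in milliseconds via the `retention.ms` topic setting; a small guard like the following (illustrative, with bounds mirroring the 24–72 hour window above) can keep topic configurations inside policy:

```python
POLICY_MIN_HOURS, POLICY_MAX_HOURS = 24, 72

def retention_ms(hours: int) -> int:
    """Convert a retention window in hours to Kafka's retention.ms value,
    rejecting windows outside the policy's 24-72 hour range."""
    if not POLICY_MIN_HOURS <= hours <= POLICY_MAX_HOURS:
        raise ValueError(f"{hours}h is outside the 24-72h retention policy")
    return hours * 60 * 60 * 1000
```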
4. Observability & Monitoring
4.1 Logging
- Google Cloud Logging:
  - All Kubernetes, Kafka, and microservice logs are collected and aggregated via GCP Logging.
  - Standard logging levels (INFO, WARN, ERROR) separate routine events from anomalies.
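Emitting each log record as a single JSON line makes the severity level a machine-readable field for Cloud Logging. A minimal formatter sketch using Python's standard `logging` module:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON line so the log aggregator can
    parse severity, logger name, and message as structured fields."""
    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "severity": record.levelname,   # INFO / WARNING / ERROR
            "logger": record.name,
            "message": record.getMessage(),
        })
```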
4.2 Metrics & Dashboards
- Prometheus (Optional):
  - If deployed, Prometheus scrapes metrics from microservices and Kafka exporters, storing them for real-time analysis.
- Grafana Dashboards:
  - Grafana can be installed in Kubernetes to visualize metrics (from Prometheus or GCP Monitoring).
- User-Facing Dashboards:
  - Our React-based admin portal can embed or link to Grafana panels, allowing clinics and API owners to see high-level integration metrics (e.g., last sync time, record counts, error rates).
4.3 Alerts & Incident Response
- Alerts:
  - We configure thresholds for Kafka consumer lag, microservice errors, and similar signals in Prometheus (or GCP Monitoring). Alerts can route to PagerDuty, Slack, or email.
- Incident Response Plan:
  - In case of major outages or security events, we follow our standardized plan (escalation paths, forensic logging, timely notifications).
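Consumer lag, one of the signals alerted on above, is simply the per-partition gap between the log-end offset and the consumer group's committed offset. A sketch (the 10,000-message threshold is an example, not our production value):

```python
def consumer_lag(log_end: dict, committed: dict) -> int:
    """Total lag across partitions: sum of (log-end offset - committed offset)."""
    return sum(end - committed.get(partition, 0) for partition, end in log_end.items())

def should_alert(lag: int, threshold: int = 10_000) -> bool:
    """Fire when the consumer group falls more than `threshold` messages behind."""
    return lag > threshold
```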
5. Operational Best Practices
5.1 Kubernetes Deployment
- Immutable Deployments:
  - We use container images built through CI/CD pipelines. Configuration is stored in version control, ensuring reproducibility.
- Namespace & Secret Management:
  - Sensitive credentials (e.g., Kafka SASL keys, TLS certificates) are kept in Kubernetes Secrets with restricted RBAC access.
5.2 High Availability & Disaster Recovery
- Redundant Kafka Instances:
  - We run Kafka brokers across multiple availability zones for failover.
- Backups:
  - Kafka’s cluster configuration and ZooKeeper state (if applicable) are backed up regularly.
- Periodic Restoration Testing:
  - We test the restore process to ensure data can be recovered after a cluster-level failure.
5.3 Versioning & Updates
- Kafka & Microservices:
  - Updates are tested in staging before production rollout.
  - Zero-downtime deployments are achieved via Kubernetes rolling updates or blue-green strategies.
- Continuous Improvement:
  - This Data Transition Policy is reviewed semi-annually (or upon major architecture changes) to reflect the latest security, compliance, and performance best practices.
6. Summary & Key Takeaways
- Event-Driven & Scalable:
  - Kafka underpins our architecture, ensuring near-real-time data flow from on-prem or cloud-based EHRs into our cloud environment.
- Security & Compliance First:
  - TLS, RBAC, minimal data retention, and alignment with HIPAA/GDPR are cornerstones of our strategy.
- Kubernetes for Orchestration:
  - All microservices run on Kubernetes, with cloud-native practices (immutable deployments, secrets management) enhancing reliability.
- Observability:
  - Google Cloud Logging provides centralized logs; Prometheus (optional) and Grafana deliver real-time dashboards and alerting.
  - Clinics and external API owners can view sync metrics via embedded Grafana panels in our React admin portal.
- Ongoing Review:
  - We regularly revisit this policy to adapt to new regulations, technologies, and user needs.
By following this Data Transition Policy, HeyDonto ensures that patient and appointment data is transferred securely, efficiently, and in a compliant manner—laying the groundwork for reliable synchronization and standardized FHIR storage, while preserving privacy and integrity at each step.