AI API Proxy Data Retention
Implement intelligent data retention policies for AI systems. Define storage periods, automate cleanup, manage data lifecycles, and ensure compliance with regulatory requirements.
Retention Policy Types
Different data types require different retention approaches.
Request Logs
API request metadata for debugging and analytics. Typically retained 30-90 days with optional extended compliance storage.
Conversation History
Chat messages and AI responses. User-configurable retention with default 90-day standard storage period.
Security Logs
Authentication and access logs for security auditing. Mandatory 1-year retention for compliance requirements.
Analytics Data
Aggregated metrics and usage statistics. Retained indefinitely in anonymized form for business intelligence.
Cache Data
Response caches for performance optimization. Short-term retention with automatic TTL expiration.
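The automatic-TTL behaviour can be illustrated with a minimal in-memory sketch (a real deployment would typically use a cache with native expiry, such as Redis; the class and TTL value here are illustrative):

```python
import time

class TTLCache:
    """Minimal in-memory cache with per-entry TTL, illustrating the
    automatic-expiration behaviour described above (not a production cache)."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}

    def set(self, key, value):
        # Record the value together with its absolute expiry time.
        self._store[key] = (value, time.monotonic() + self.ttl)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # lazy expiry on read
            return default
        return value
```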
Configuration Data
System settings and user preferences. Retained until account deletion or explicit change.
Retention Strategies
Best practices for implementing data retention.
Tiered Storage
Move data through storage tiers based on age and access patterns to optimize costs.
- Hot: SSD for frequent access
- Warm: HDD for occasional queries
- Cold: Archive storage for compliance
- Automated migration policies
Automated Cleanup
Scheduled jobs that enforce retention policies without manual intervention.
- Daily TTL enforcement
- Batch deletion for efficiency
- Cascade to related data
- Audit log preservation
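Batch deletion keeps each delete statement small so cleanup jobs do not lock tables or blow transaction limits. A sketch of the batching step (the batch size is an illustrative choice):

```python
def plan_batches(record_ids: list, batch_size: int = 1000) -> list[list]:
    """Split expired record ids into fixed-size batches so each delete
    operation stays small; cascading deletes and audit logging would
    then run per batch."""
    return [
        record_ids[i:i + batch_size]
        for i in range(0, len(record_ids), batch_size)
    ]
```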
User-Controlled Retention
Allow users to customize retention periods within policy boundaries.
- Configurable TTL settings
- Manual deletion triggers
- Export before deletion
- Retention notifications
Compliance Mapping
Align retention policies with regulatory requirements automatically.
- GDPR: 30-day DSAR response
- HIPAA: 6-year medical records
- SOX: 7-year financial data
- Custom policy templates
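Compliance mapping can be expressed as policy templates keyed by regulation, with the effective retention being the strictest (longest) mandate that applies. A sketch using the periods listed above (template names and structure are illustrative):

```python
# Hypothetical policy templates; periods reflect the mandates listed above.
COMPLIANCE_TEMPLATES = {
    "gdpr":  {"dsar_response_days": 30},   # respond to data-subject requests in 30 days
    "hipaa": {"retention_days": 6 * 365},  # medical records: 6 years
    "sox":   {"retention_days": 7 * 365},  # financial data: 7 years
}

def required_retention_days(regulations: list[str], default_days: int) -> int:
    """Return the longest retention among the applicable regulations,
    falling back to the service default when none imposes a minimum."""
    mandated = [
        COMPLIANCE_TEMPLATES[r].get("retention_days", 0) for r in regulations
    ]
    return max([default_days] + mandated)
```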
Implementation Guide
Build automated data retention systems.
```python
import logging
from datetime import datetime, timedelta

from apscheduler.schedulers.asyncio import AsyncIOScheduler

logger = logging.getLogger(__name__)


class RetentionManager:
    """Automated data retention management"""

    def __init__(self, db, storage_tiers):
        self.db = db
        self.tiers = storage_tiers
        self.policies = self.load_policies()

    def load_policies(self) -> dict:
        """Load retention policies configuration"""
        return {
            'request_logs': {
                'hot_days': 30, 'warm_days': 60,
                'cold_days': 90, 'delete_after_days': 90
            },
            'conversations': {
                'hot_days': 7, 'warm_days': 30,
                'cold_days': 90, 'delete_after_days': 90
            },
            'security_logs': {
                'hot_days': 90, 'warm_days': 180,
                'cold_days': 365, 'delete_after_days': 365
            },
            'cache': {
                'hot_days': 7, 'delete_after_days': 7
            }
        }

    async def enforce_retention(self):
        """Run retention policy enforcement"""
        for data_type, policy in self.policies.items():
            # Delete expired data
            if 'delete_after_days' in policy:
                deleted = await self.delete_expired(
                    data_type, policy['delete_after_days']
                )
                logger.info(f"Deleted {deleted} {data_type} records")

            # Move to cold storage
            if 'cold_days' in policy:
                archived = await self.archive_data(
                    data_type, policy['cold_days'], target_tier='cold'
                )
                logger.info(f"Archived {archived} {data_type} records")

    async def delete_expired(self, data_type: str, days: int) -> int:
        """Delete data older than retention period"""
        cutoff = datetime.utcnow() - timedelta(days=days)

        # Find records to delete
        query = {
            'data_type': data_type,
            'created_at': {'$lt': cutoff}
        }

        # Exclude records under legal hold from deletion
        legal_hold_ids = await self.get_legal_holds()
        if legal_hold_ids:
            query['_id'] = {'$nin': legal_hold_ids}

        # Count first so the audit trail records how many rows were removed
        count = await self.db.count(data_type, query)

        # Perform deletion
        await self.db.delete_many(data_type, query)

        # Log deletion for audit trail
        await self.audit_log({
            'action': 'retention_deletion',
            'data_type': data_type,
            'records_deleted': count,
            'older_than': cutoff.isoformat(),
            'timestamp': datetime.utcnow().isoformat()
        })

        return count

    async def archive_data(
        self, data_type: str, days: int, target_tier: str
    ) -> int:
        """Move data to archive storage tier"""
        cutoff = datetime.utcnow() - timedelta(days=days)

        # Find records old enough to archive that are not already there
        query = {
            'data_type': data_type,
            'created_at': {'$lt': cutoff},
            'storage_tier': {'$ne': target_tier}
        }
        records = await self.db.find(data_type, query)

        for record in records:
            # Compress before archiving to reduce cold-storage cost
            compressed = await self.compress(record)

            # Move to archive tier
            await self.tiers[target_tier].store(compressed)

            # Update metadata so the record is not re-archived next run
            await self.db.update(
                data_type,
                {'_id': record['_id']},
                {'$set': {
                    'storage_tier': target_tier,
                    'archived_at': datetime.utcnow()
                }}
            )

        return len(records)

    async def schedule_enforcement(self):
        """Schedule daily retention enforcement"""
        scheduler = AsyncIOScheduler()
        scheduler.add_job(
            self.enforce_retention,
            trigger='cron',
            hour=3,  # Run at 3 AM, outside peak traffic
            minute=0
        )
        scheduler.start()
```

Note that `get_legal_holds`, `audit_log`, and `compress` are helpers the surrounding system is expected to provide, and `db` is assumed to expose an async MongoDB-style interface (`find`, `count`, `delete_many`, `update`).