# SqrtSpace SpaceTime Best Practices

This project demonstrates best practices for building production-ready applications using the SqrtSpace SpaceTime library. It showcases advanced patterns and configurations for optimal memory efficiency and performance.

## Key Concepts Demonstrated

### 1. **Comprehensive Service Configuration**

The application demonstrates proper configuration of all SpaceTime services:

```csharp
// Environment-aware memory configuration
builder.Services.Configure<SpaceTimeConfiguration>(options =>
{
    options.Memory.MaxMemory = environment.IsDevelopment()
        ? 256 * 1024 * 1024   // 256MB for dev
        : 1024 * 1024 * 1024; // 1GB for production

    // Respect container limits
    var memoryLimit = Environment.GetEnvironmentVariable("MEMORY_LIMIT");
    if (long.TryParse(memoryLimit, out var limit))
    {
        options.Memory.MaxMemory = (long)(limit * 0.8); // Use 80% of container limit
    }
});
```

### 2. **Layered Caching Strategy**

Implements hot/cold tiered caching with automatic spill-to-disk:

```csharp
builder.Services.AddSpaceTimeCaching(options =>
{
    options.MaxHotMemory = 50 * 1024 * 1024; // 50MB hot cache
    options.EnableColdStorage = true;
    options.ColdStoragePath = Path.Combine(Path.GetTempPath(), "spacetime-cache");
});
```

### 3. **Production-Ready Diagnostics**

Comprehensive monitoring with OpenTelemetry integration:

```csharp
builder.Services.AddSpaceTimeDiagnostics(options =>
{
    options.EnableMetrics = true;
    options.EnableTracing = true;
    options.SamplingRate = builder.Environment.IsDevelopment() ? 1.0 : 0.1;
});
```

### 4. **Entity Framework Integration**

Shows how to configure EF Core with SpaceTime optimizations:

```csharp
options.UseSqlServer(connectionString)
    .UseSpaceTimeOptimizer(opt =>
    {
        opt.EnableSqrtNChangeTracking = true;
        opt.BufferPoolStrategy = BufferPoolStrategy.SqrtN;
    });
```

### 5. **Memory-Aware Background Processing**

Background services that respond to memory pressure:

```csharp
_memoryMonitor.PressureEvents
    .Where(e => e.CurrentLevel >= MemoryPressureLevel.High)
    .Subscribe(e =>
    {
        _logger.LogWarning("High memory pressure detected, pausing processing");
        // Implement backpressure
    });
```

### 6. **Pipeline Pattern for Complex Processing**

Multi-stage processing with checkpointing:

```csharp
var pipeline = _pipelineFactory.CreatePipeline<Order, ProcessedOrder>("OrderProcessing")
    .Configure(config =>
    {
        config.ExpectedItemCount = orders.Count();
        config.EnableCheckpointing = true;
    })
    .AddTransform("Validate", ValidateOrder)
    .AddBatch("EnrichCustomerData", EnrichWithCustomerData)
    .AddParallel("CalculateTax", CalculateTax, maxConcurrency: 4)
    .AddCheckpoint("SaveProgress")
    .Build();
```

### 7. **Distributed Processing Coordination**

Shows how to partition work across multiple nodes:

```csharp
var partition = await _coordinator.RequestPartitionAsync(
    request.WorkloadId,
    request.EstimatedSize);

// Process only this node's portion
var filter = new OrderFilter
{
    StartDate = partition.StartRange,
    EndDate = partition.EndRange
};
```

### 8. **Streaming API Endpoints**

Demonstrates memory-efficient streaming with automatic chunking:

```csharp
[HttpGet("export")]
[SpaceTimeStreaming(ChunkStrategy = ChunkStrategy.SqrtN)]
public async IAsyncEnumerable<OrderExportDto> ExportOrders([FromQuery] OrderFilter filter)
{
    // 'orders' is the filtered query; its construction is omitted in this excerpt
    await foreach (var batch in orders.BatchBySqrtNAsync())
    {
        foreach (var order in batch)
        {
            yield return MapToDto(order);
        }
    }
}
```

## Architecture Patterns

### Service Layer Pattern

The `OrderService` demonstrates:
- Dependency injection of SpaceTime services
- Operation tracking with diagnostics
- External sorting for large datasets
- Proper error handling and logging

### Memory-Aware Queries

```csharp
// Automatically switches to external sorting for large results
var orders = await query
    .OrderByExternal(o => o.CreatedDate)
    .ToListWithSqrtNMemoryAsync();
```

### Batch Processing

```csharp
// Process data in memory-efficient batches
await foreach (var batch in context.Orders
    .Where(o => o.Status == "Pending")
    .BatchBySqrtNAsync())
{
    // Process batch
}
```
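
For intuition, the √n batching rule can be sketched without the library: splitting n items into batches of roughly √n keeps at most √n items in memory at once while requiring only about √n passes. The helper below is an illustrative sketch of that idea, not the library's actual `BatchBySqrtNAsync` implementation:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative sketch (not the library's code): batch size ≈ √n bounds
// peak memory at O(√n) items while the number of batches stays at O(√n).
static IEnumerable<List<T>> BatchBySqrtN<T>(IReadOnlyList<T> items)
{
    int batchSize = Math.Max(1, (int)Math.Ceiling(Math.Sqrt(items.Count)));
    for (int i = 0; i < items.Count; i += batchSize)
    {
        yield return items.Skip(i).Take(batchSize).ToList();
    }
}

// 10,000 items → batches of 100, so 100 batches held one at a time.
var batches = BatchBySqrtN(Enumerable.Range(0, 10_000).ToList()).ToList();
Console.WriteLine($"{batches.Count} batches of {batches[0].Count}"); // 100 batches of 100
```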

### Task Scheduling

```csharp
// Schedule work based on memory availability
await _scheduler.ScheduleAsync(
    async () => await ProcessNextBatchAsync(stoppingToken),
    estimatedMemory: 50 * 1024 * 1024, // 50MB
    priority: TaskPriority.Low);
```

## Configuration Best Practices

### 1. **Environment-Based Configuration**

- Development: Lower memory limits, full diagnostics
- Production: Higher limits, sampled diagnostics
- Container: Respect container memory limits

### 2. **Conditional Service Registration**

```csharp
// Only add distributed coordination if Redis is available
var redisConnection = builder.Configuration.GetConnectionString("Redis");
if (!string.IsNullOrEmpty(redisConnection))
{
    builder.Services.AddSpaceTimeDistributed(options =>
    {
        options.NodeId = Environment.MachineName;
        options.CoordinationEndpoint = redisConnection;
    });
}
```

### 3. **Health Monitoring**

```csharp
app.MapGet("/health", (IMemoryPressureMonitor monitor) =>
{
    var stats = monitor.CurrentStatistics;
    return Results.Ok(new
    {
        Status = "Healthy",
        MemoryPressure = monitor.CurrentPressureLevel.ToString(),
        MemoryUsage = new
        {
            ManagedMemoryMB = stats.ManagedMemory / (1024.0 * 1024.0),
            WorkingSetMB = stats.WorkingSet / (1024.0 * 1024.0),
            AvailablePhysicalMemoryMB = stats.AvailablePhysicalMemory / (1024.0 * 1024.0)
        }
    });
});
```

## Production Considerations

### 1. **Memory Limits**

Always configure memory limits based on your deployment environment:
- Container deployments: Use 80% of container limit
- VMs: Consider other processes running
- Serverless: Respect function memory limits

### 2. **Checkpointing Strategy**

Enable checkpointing for:
- Long-running operations
- Operations that process large datasets
- Critical business processes that must be resumable

### 3. **Monitoring and Alerting**

Monitor these key metrics:
- Memory pressure levels
- External sort operations
- Checkpoint frequency
- Cache hit rates
- Pipeline processing times

### 4. **Error Handling**

Implement proper error handling:
- Use diagnostics to track operations
- Log errors with context
- Implement retry logic for transient failures
- Clean up resources on failure
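
The retry guidance above can be sketched with plain .NET, independent of any SpaceTime API. The helper name, backoff values, and exception filter below are illustrative assumptions:

```csharp
using System;
using System.Threading.Tasks;

// Illustrative retry helper for transient failures (names and backoff
// values are assumptions, not part of the SpaceTime API).
static async Task<T> RetryAsync<T>(
    Func<Task<T>> operation,
    int maxAttempts = 3,
    TimeSpan? initialDelay = null)
{
    var delay = initialDelay ?? TimeSpan.FromMilliseconds(200);
    for (int attempt = 1; ; attempt++)
    {
        try
        {
            return await operation();
        }
        catch (Exception ex) when (attempt < maxAttempts)
        {
            // Log with context before backing off
            Console.WriteLine($"Attempt {attempt} failed: {ex.Message}; retrying in {delay.TotalMilliseconds}ms");
            await Task.Delay(delay);
            delay *= 2; // exponential backoff
        }
    }
}

// Usage: a transient operation that succeeds on the third attempt.
int calls = 0;
var result = await RetryAsync(() =>
{
    calls++;
    if (calls < 3) throw new TimeoutException("transient");
    return Task.FromResult(42);
});
Console.WriteLine($"result={result} after {calls} attempts");
```

In production, prefer catching only exception types you know to be transient rather than `Exception`, and make sure any resources acquired inside `operation` are released on each failed attempt.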

### 5. **Performance Tuning**

- Adjust batch sizes based on workload
- Configure parallelism based on CPU cores
- Set appropriate cache sizes
- Monitor and adjust memory thresholds
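
As a starting point for the parallelism and batch-size knobs above, defaults can be derived from the machine rather than hard-coded. This is a sketch under the √n rule the library uses elsewhere; the right values for your workload come from measurement:

```csharp
using System;

// Illustrative defaults: leave one core for the runtime, and size batches
// by the √n rule. Treat both as starting points to tune, not constants.
int totalItems = 250_000; // example workload size
int maxConcurrency = Math.Max(1, Environment.ProcessorCount - 1);
int batchSize = Math.Max(1, (int)Math.Ceiling(Math.Sqrt(totalItems)));
Console.WriteLine($"maxConcurrency={maxConcurrency}, batchSize={batchSize}");
```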

## Testing Recommendations

### 1. **Load Testing**

Test with datasets that exceed memory limits to ensure:
- External processing activates correctly
- Memory pressure is handled gracefully
- Checkpointing works under load

### 2. **Failure Testing**

Test recovery scenarios:
- Process crashes during batch processing
- Memory pressure during operations
- Network failures in distributed scenarios

### 3. **Performance Testing**

Measure:
- Response times under various memory conditions
- Throughput with different batch sizes
- Resource utilization patterns

## Deployment Checklist

- [ ] Configure memory limits based on deployment environment
- [ ] Set up monitoring and alerting
- [ ] Configure persistent storage for checkpoints and cold cache
- [ ] Test failover and recovery procedures
- [ ] Document memory requirements and scaling limits
- [ ] Configure appropriate logging levels
- [ ] Set up distributed coordination (if using multiple nodes)
- [ ] Verify health check endpoints
- [ ] Test under expected production load

## Advanced Scenarios

### Multi-Node Deployment

For distributed deployments:
1. Configure Redis for coordination
2. Set unique node IDs
3. Implement partition-aware processing
4. Monitor cross-node communication

### High-Availability Setup

1. Use persistent checkpoint storage
2. Implement automatic failover
3. Configure redundant cache storage
4. Monitor node health

### Performance Optimization

1. Profile memory usage patterns
2. Adjust algorithm selection thresholds
3. Optimize batch sizes for your workload
4. Configure appropriate parallelism levels

## Summary

This best practices project demonstrates how to build robust, memory-efficient applications using SqrtSpace SpaceTime. By following these patterns, you can build applications that:

- Scale gracefully under memory pressure
- Process large datasets efficiently
- Recover from failures automatically
- Provide predictable performance
- Optimize resource utilization

The key is to embrace the √n space-time tradeoff philosophy throughout your application architecture, letting the library handle the complexity of memory management while you focus on business logic.