2025-07-20 04:08:08 -04:00
# SqrtSpace SpaceTime for PHP
[](https://packagist.org/packages/sqrtspace/spacetime)
[](https://packagist.org/packages/sqrtspace/spacetime)
[](https://packagist.org/packages/sqrtspace/spacetime)
[](https://packagist.org/packages/sqrtspace/spacetime)
Memory-efficient algorithms and data structures for PHP using Williams' √n space-time tradeoffs.
2025-12-28 17:25:50 +00:00
**Paper Repository ** : [git.marketally.com/sqrtspace/sqrtspace-paper ](https://git.marketally.com/sqrtspace/sqrtspace-paper )
2025-07-20 15:10:35 -04:00
2025-07-20 04:08:08 -04:00
## Installation
``` bash
composer require sqrtspace/spacetime
```
## Core Concepts
SpaceTime implements theoretical computer science results showing that many algorithms can achieve better memory usage by accepting slightly slower runtime. The key insight is using √n memory instead of n memory, where n is the input size.
### Key Features
- **External Sorting**: Sort large datasets that don't fit in memory
- **External Grouping**: Group and aggregate data with minimal memory usage
- **Streaming Operations**: Process files and data streams efficiently
- **Memory Pressure Handling**: Automatic response to low memory conditions
- **Checkpoint/Resume**: Save progress and resume long-running operations
- **Laravel Integration**: Deep integration with Laravel collections and queries
## Quick Start
``` php
use SqrtSpace\SpaceTime\Collections\SpaceTimeArray ;
use SqrtSpace\SpaceTime\Algorithms\ExternalSort ;
// Handle large arrays with automatic memory management
$array = new SpaceTimeArray ();
for ( $i = 0 ; $i < 10000000 ; $i ++ ) {
$array [] = random_int ( 1 , 1000000 );
}
// Sort large datasets using only √n memory
$sorted = ExternalSort :: sort ( $array );
// Process in optimal chunks
foreach ( $array -> chunkBySqrtN () as $chunk ) {
processChunk ( $chunk );
}
```
## Examples
### Basic Examples
See [`examples/comprehensive_example.php` ](examples/comprehensive_example.php ) for a complete demonstration of all features including:
- Memory-efficient arrays and dictionaries
- External sorting and grouping
- Stream processing
- CSV import/export
- Batch processing with checkpoints
- Memory pressure monitoring
### Laravel Application
Check out [`examples/laravel-app/` ](examples/laravel-app/ ) for a complete Laravel application demonstrating:
- Streaming API endpoints
- Memory-efficient CSV exports
- Background job processing with checkpoints
- Real-time analytics with SSE
- Production-ready configurations
See the [Laravel example README ](examples/laravel-app/README.md ) for setup instructions and detailed usage.
## Features
### 1. Memory-Efficient Collections
``` php
use SqrtSpace\SpaceTime\Collections\SpaceTimeArray ;
use SqrtSpace\SpaceTime\Collections\AdaptiveDictionary ;
// Adaptive array - automatically switches between memory and disk
$array = new SpaceTimeArray ();
$array -> setThreshold ( 10000 ); // Switch to external storage after 10k items
// Adaptive dictionary with optimal memory usage
$dict = new AdaptiveDictionary ();
for ( $i = 0 ; $i < 1000000 ; $i ++ ) {
$dict [ " key_ $i " ] = " value_ $i " ;
}
```
### 2. External Algorithms
``` php
use SqrtSpace\SpaceTime\Algorithms\ExternalSort ;
use SqrtSpace\SpaceTime\Algorithms\ExternalGroupBy ;
// Sort millions of records using minimal memory
$data = getData (); // Large dataset
$sorted = ExternalSort :: sort ( $data , fn ( $a , $b ) => $a [ 'date' ] <=> $b [ 'date' ]);
// Group by with external storage
$grouped = ExternalGroupBy :: groupBy ( $data , fn ( $item ) => $item [ 'category' ]);
```
### 3. Streaming Operations
``` php
use SqrtSpace\SpaceTime\Streams\SpaceTimeStream ;
// Process large files with bounded memory
$stream = SpaceTimeStream :: fromFile ( 'large_file.csv' )
-> map ( fn ( $line ) => str_getcsv ( $line ))
-> filter ( fn ( $row ) => $row [ 2 ] > 100 )
-> chunkBySqrtN ()
-> each ( function ( $chunk ) {
processBatch ( $chunk );
});
```
### 4. Database Integration
``` php
use SqrtSpace\SpaceTime\Database\SpaceTimeQueryBuilder ;
// Process large result sets efficiently
$query = new SpaceTimeQueryBuilder ( $pdo );
$query -> from ( 'orders' )
-> where ( 'status' , '=' , 'pending' )
-> orderByExternal ( 'created_at' , 'desc' )
-> chunkBySqrtN ( function ( $orders ) {
foreach ( $orders as $order ) {
processOrder ( $order );
}
});
// Stream results for minimal memory usage
$stream = $query -> from ( 'logs' )
-> where ( 'level' , '=' , 'error' )
-> stream ();
$stream -> filter ( fn ( $log ) => strpos ( $log [ 'message' ], 'critical' ) !== false )
-> each ( fn ( $log ) => alertAdmin ( $log ));
```
### 5. Laravel Integration
``` php
// In AppServiceProvider
use SqrtSpace\SpaceTime\Laravel\SpaceTimeServiceProvider ;
public function register ()
{
$this -> app -> register ( SpaceTimeServiceProvider :: class );
}
// Collection macros
$collection = collect ( $largeArray );
// Sort using external memory
$sorted = $collection -> sortByExternal ( 'price' );
// Group by with external storage
$grouped = $collection -> groupByExternal ( 'category' );
// Process in √n chunks
$collection -> chunkBySqrtN () -> each ( function ( $chunk ) {
processBatch ( $chunk );
});
// Query builder extensions
DB :: table ( 'orders' )
-> chunkBySqrtN ( function ( $orders ) {
foreach ( $orders as $order ) {
processOrder ( $order );
}
});
```
### 6. Memory Pressure Handling
``` php
use SqrtSpace\SpaceTime\Memory\MemoryPressureMonitor ;
use SqrtSpace\SpaceTime\Memory\Handlers\LoggingHandler ;
use SqrtSpace\SpaceTime\Memory\Handlers\CacheEvictionHandler ;
use SqrtSpace\SpaceTime\Memory\Handlers\GarbageCollectionHandler ;
$monitor = new MemoryPressureMonitor ( '512M' );
// Add handlers
$monitor -> registerHandler ( new LoggingHandler ( $logger ));
$monitor -> registerHandler ( new CacheEvictionHandler ());
$monitor -> registerHandler ( new GarbageCollectionHandler ());
// Check pressure in your operations
if ( $monitor -> check () === MemoryPressureLevel :: HIGH ) {
// Switch to more aggressive memory saving
$processor -> useExternalStorage ();
}
```
### 7. Checkpointing for Fault Tolerance
``` php
use SqrtSpace\SpaceTime\Checkpoint\CheckpointManager ;
$checkpoint = new CheckpointManager ( 'import_job_123' );
foreach ( $largeDataset -> chunkBySqrtN () as $chunk ) {
processChunk ( $chunk );
// Save progress every √n items
if ( $checkpoint -> shouldCheckpoint ()) {
$checkpoint -> save ([
'processed' => $processedCount ,
'last_id' => $lastId
]);
}
}
```
## Real-World Examples
### Processing Large CSV Files
``` php
use SqrtSpace\SpaceTime\File\CsvReader ;
use SqrtSpace\SpaceTime\Algorithms\ExternalGroupBy ;
$reader = new CsvReader ( 'sales_data.csv' );
// Get column statistics
$stats = $reader -> getColumnStats ( 'amount' );
echo " Average order: $ " . $stats [ 'avg' ];
// Process with type conversion
$totals = $reader -> readWithTypes ([
'amount' => 'float' ,
'quantity' => 'int' ,
'date' => 'date'
]) -> reduce ( function ( $totals , $row ) {
$month = $row [ 'date' ] -> format ( 'Y-m' );
$totals [ $month ] = ( $totals [ $month ] ? ? 0 ) + $row [ 'amount' ];
return $totals ;
}, []);
```
### Large Data Export
``` php
use SqrtSpace\SpaceTime\File\CsvExporter ;
use SqrtSpace\SpaceTime\Database\SpaceTimeQueryBuilder ;
$exporter = new CsvExporter ( 'users_export.csv' );
$query = new SpaceTimeQueryBuilder ( $pdo );
// Export with headers
$exporter -> writeHeaders ([ 'ID' , 'Name' , 'Email' , 'Created At' ]);
// Stream data directly to CSV
$query -> from ( 'users' )
-> orderBy ( 'created_at' , 'desc' )
-> chunkBySqrtN ( function ( $users ) use ( $exporter ) {
$exporter -> writeRows ( array_map ( function ( $user ) {
return [
$user [ 'id' ],
$user [ 'name' ],
$user [ 'email' ],
$user [ 'created_at' ]
];
}, $users ));
});
echo " Exported " . number_format ( $exporter -> getBytesWritten ()) . " bytes \n " ;
```
### Batch Processing with Memory Limits
``` php
use SqrtSpace\SpaceTime\Batch\BatchProcessor ;
$processor = new BatchProcessor ([
'memory_threshold' => 0.8 ,
'checkpoint_enabled' => true ,
'progress_callback' => function ( $batch , $size , $result ) {
echo " Processed batch $batch ( $size items) \n " ;
}
]);
$result = $processor -> process ( $millionItems , function ( $batch ) {
$processed = [];
foreach ( $batch as $key => $item ) {
$processed [ $key ] = expensiveOperation ( $item );
}
return $processed ;
}, 'job_123' );
echo " Success: " . $result -> getSuccessCount () . " \n " ;
echo " Errors: " . $result -> getErrorCount () . " \n " ;
echo " Time: " . $result -> getExecutionTime () . " s \n " ;
```
## Configuration
``` php
use SqrtSpace\SpaceTime\SpaceTimeConfig ;
// Global configuration
SpaceTimeConfig :: configure ([
'memory_limit' => '512M' ,
'external_storage_path' => '/tmp/spacetime' ,
'chunk_strategy' => 'sqrt_n' , // or 'memory_based', 'fixed'
'enable_checkpointing' => true ,
'compression' => true ,
'compression_level' => 6
]);
// Per-operation configuration
$array = new SpaceTimeArray ( 10000 ); // threshold
// Check configuration
echo " Chunk size for 1M items: " . SpaceTimeConfig :: calculateSqrtN ( 1000000 ) . " \n " ;
echo " Storage path: " . SpaceTimeConfig :: getStoragePath () . " \n " ;
```
## Advanced Usage
### JSON Lines Processing
``` php
use SqrtSpace\SpaceTime\File\JsonLinesProcessor ;
// Process large JSONL files
JsonLinesProcessor :: processInChunks ( 'events.jsonl' , function ( $events ) {
foreach ( $events as $event ) {
if ( $event [ 'type' ] === 'error' ) {
logError ( $event );
}
}
});
// Split large file
$files = JsonLinesProcessor :: split ( 'huge.jsonl' , 100000 , 'output/chunk' );
echo " Split into " . count ( $files ) . " files \n " ;
// Merge multiple files
$count = JsonLinesProcessor :: merge ( $files , 'merged.jsonl' );
echo " Merged $count records \n " ;
```
### Streaming Operations
``` php
use SqrtSpace\SpaceTime\Streams\SpaceTimeStream ;
// Chain operations efficiently
SpaceTimeStream :: fromCsv ( 'sales.csv' )
-> filter ( fn ( $row ) => $row [ 'region' ] === 'US' )
-> map ( fn ( $row ) => [
'product' => $row [ 'product' ],
'revenue' => $row [ 'quantity' ] * $row [ 'price' ]
])
-> chunkBySqrtN ()
-> each ( function ( $chunk ) {
$total = array_sum ( array_column ( $chunk , 'revenue' ));
echo " Chunk revenue: \$ $total\n " ;
});
```
### Custom Batch Jobs
``` php
use SqrtSpace\SpaceTime\Batch\BatchJob ;
class ImportJob extends BatchJob
{
private string $filename ;
public function __construct ( string $filename )
{
parent :: __construct ();
$this -> filename = $filename ;
}
protected function getItems () : iterable
{
return SpaceTimeStream :: fromCsv ( $this -> filename );
}
public function processItem ( array $batch ) : array
{
$results = [];
foreach ( $batch as $key => $row ) {
$user = User :: create ([
'name' => $row [ 'name' ],
'email' => $row [ 'email' ]
]);
$results [ $key ] = $user -> id ;
}
return $results ;
}
protected function getUniqueId () : string
{
return md5 ( $this -> filename );
}
}
// Run job with automatic checkpointing
$job = new ImportJob ( 'users.csv' );
$result = $job -> execute ();
// Or resume if interrupted
if ( $job -> canResume ()) {
$result = $job -> resume ();
}
```
## Testing
``` bash
# Run all tests
vendor/bin/phpunit
# Run specific test suite
vendor/bin/phpunit tests/Algorithms
# With coverage
vendor/bin/phpunit --coverage-html coverage
```
## Performance Considerations
1. **Chunk Size ** : The default √n chunk size is optimal for most cases, but you can tune it:
```php
SpaceTimeConfig::configure(['chunk_strategy' => 'fixed', 'fixed_chunk_size' => 5000]);
` ``
2. **Compression**: Enable for text-heavy data, disable for already compressed data:
` ``php
SpaceTimeConfig::configure(['compression' => false]);
` ``
3. **Storage Location**: Use fast local SSDs for external storage:
` ``php
SpaceTimeConfig::configure(['external_storage_path' => '/mnt/fast-ssd/spacetime']);
` ``
## Framework Integration
### Laravel
` ``php
// config/spacetime.php
return [
'memory_limit' => env('SPACETIME_MEMORY_LIMIT', '256M'),
'storage_driver' => env('SPACETIME_STORAGE', 'file'),
'redis_connection' => env('SPACETIME_REDIS', 'default'),
];
// In controller
public function exportOrders()
{
return SpaceTimeResponse::stream(function() {
Order::orderByExternal('created_at')
->chunkBySqrtN(function($orders) {
foreach ($orders as $order) {
echo $order->toCsv() . "\n";
}
});
});
}
` ``
### Symfony
2025-12-28 17:25:50 +00:00
For a complete Symfony integration example, see our [Symfony bundle documentation](https://git.marketally.com/sqrtspace/sqrtspace-php/wiki/Symfony-Integration).
2025-07-20 04:08:08 -04:00
` ``yaml
# config/bundles.php
return [
// ...
SqrtSpace\SpaceTime\Symfony\SpaceTimeBundle::class => ['all' => true],
];
` ``
` ``yaml
# config/packages/spacetime.yaml
spacetime:
memory_limit: '%env(SPACETIME_MEMORY_LIMIT)%'
storage_path: '%kernel.project_dir%/var/spacetime'
chunk_strategy: 'sqrt_n'
enable_checkpointing: true
compression: true
` ``
` ``php
// In controller
use SqrtSpace\SpaceTime\Batch\BatchProcessor;
use SqrtSpace\SpaceTime\File\CsvReader;
#[Route('/import')]
public function import(BatchProcessor $processor): Response
{
$reader = new CsvReader($this->getParameter('import_file'));
$result = $processor->process(
$reader->stream(),
fn($batch) => $this->importBatch($batch)
);
return $this->json([
'imported' => $result->getSuccessCount(),
'errors' => $result->getErrorCount()
]);
}
` ``
` ``bash
# Console command
php bin/console spacetime:process-file input.csv output.csv --format=csv --checkpoint
` ``
## Troubleshooting
### Out of Memory Errors
1. Reduce chunk size:
` ``php
SpaceTimeConfig::configure(['chunk_strategy' => 'fixed', 'fixed_chunk_size' => 1000]);
` ``
2. Enable more aggressive memory handling:
` ``php
$monitor = new MemoryPressureMonitor('128M'); // Lower threshold
` ``
3. Use external storage earlier:
` ``php
$array = new SpaceTimeArray(100); // Smaller threshold
` ``
### Performance Issues
1. Check disk I/O speed
2. Enable compression for text data
3. Use memory-based external storage:
` ``php
SpaceTimeConfig::configure(['external_storage_path' => '/dev/shm/spacetime']);
` ``
### Checkpoint Recovery
` ``php
$checkpoint = new CheckpointManager('job_id');
if ($checkpoint->exists()) {
$state = $checkpoint->load();
echo "Resuming from: " . json_encode($state) . "\n";
}
` ``
## Requirements
- PHP 8.1 or higher
- ext-json
- ext-mbstring
## Optional Extensions
- ext-apcu for faster caching
- ext-redis for distributed operations
- ext-zlib for compression
## Contributing
Please see [CONTRIBUTING.md ](CONTRIBUTING.md ) for details.
## License
The Apache 2.0 License. Please see [LICENSE ](LICENSE ) for details.