Module ScanDirectory

§ScanDirectory

§File: Indexing/Scan/ScanDirectory.rs

§Role in Air Architecture

Provides directory scanning functionality for the File Indexer service, handling recursive traversal of directories to discover files for indexing.

§Primary Responsibility

Scan directories recursively to discover files matching include patterns while respecting exclude patterns and filesystem limits.

§Secondary Responsibilities

  • Validate directory permissions before scanning
  • Enumerate files in parallel for performance
  • Skip directories like node_modules, target, .git
  • Collect files with metadata for batch processing
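The responsibilities above can be illustrated with a minimal, std-only sketch. It is not the module's implementation: `FoundFile`, `EXCLUDED_DIRS`, and `scan_sketch` are hypothetical names, include patterns are reduced to a list of file extensions, and `std::fs::read_dir` stands in for the `ignore` crate's walker.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical record for one discovered file (not the module's real type).
#[derive(Debug)]
pub struct FoundFile {
    pub path: PathBuf,
    pub size_bytes: u64,
}

/// Directory names skipped entirely, taken from the examples listed above.
const EXCLUDED_DIRS: [&str; 3] = ["node_modules", "target", ".git"];

/// Recursively collect files whose extension appears in `include_exts`.
/// Assumption: "include patterns" are simplified to bare extensions here.
pub fn scan_sketch(
    root: &Path,
    include_exts: &[&str],
    out: &mut Vec<FoundFile>,
) -> io::Result<()> {
    // A failure to open the directory doubles as the permission check.
    for entry in fs::read_dir(root)? {
        let entry = entry?;
        let path = entry.path();
        // `DirEntry::file_type` does not follow symbolic links.
        let file_type = entry.file_type()?;
        if file_type.is_dir() {
            let name = entry.file_name();
            if EXCLUDED_DIRS.iter().any(|d| name.to_str() == Some(*d)) {
                continue; // skip the whole excluded subtree
            }
            scan_sketch(&path, include_exts, out)?;
        } else if file_type.is_file() {
            let wanted = path
                .extension()
                .and_then(|ext| ext.to_str())
                .map_or(false, |ext| include_exts.iter().any(|inc| *inc == ext));
            if wanted {
                // Keep metadata next to the path so later stages can batch it.
                let size_bytes = entry.metadata()?.len();
                out.push(FoundFile { path, size_bytes });
            }
        }
    }
    Ok(())
}
```

Calling `scan_sketch(Path::new("."), &["rs"], &mut files)` would fill `files` with every `.rs` file outside the excluded directories. Unlike the behaviour described for this module, the sketch aborts on the first I/O error; it only shows the traversal and filtering.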

§Dependencies

External Crates:

  • ignore - .gitignore-aware directory walking
  • tokio - Async runtime for I/O operations

Internal Modules:

  • crate::Result - Error handling type
  • crate::AirError - Error types
  • crate::Configuration::IndexingConfig - Indexing configuration

§Dependents

  • Indexing::mod::FileIndexer - Main file indexer implementation
  • Indexing::Background::StartWatcher - Background task scanning

§VSCode Pattern Reference

Inspired by VSCode’s file system scanning in src/vs/base/common/files/

§Security Considerations

  • Path traversal protection through canonicalization
  • Symbolic link following disabled by default
  • Depth limits prevent infinite recursion
  • Permission checking before access
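As a rough illustration of the protections listed above, the following std-only sketch shows a canonicalization check and a depth-bounded walk that never follows symbolic links. `resolve_inside` and `count_files_bounded` are hypothetical helpers written for this example; the module's actual checks may be structured differently.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Resolve `candidate` and confirm that it stays inside `root`.
/// Both paths are canonicalized first, so `..` segments and symlinks
/// cannot be used to escape the scan root.
pub fn resolve_inside(root: &Path, candidate: &Path) -> io::Result<PathBuf> {
    let root = fs::canonicalize(root)?;
    let resolved = fs::canonicalize(candidate)?;
    if resolved.starts_with(&root) {
        Ok(resolved)
    } else {
        Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "path escapes the scan root",
        ))
    }
}

/// Count regular files under `dir`, skipping symlinks and stopping at
/// `max_depth` (the starting directory is depth 0).
pub fn count_files_bounded(dir: &Path, depth: usize, max_depth: usize) -> io::Result<usize> {
    if depth > max_depth {
        return Ok(0); // depth limit reached: do not descend further
    }
    let mut count = 0;
    // `read_dir` fails here if the directory is not readable.
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let file_type = entry.file_type()?;
        if file_type.is_symlink() {
            continue; // symlink following is disabled in this sketch
        }
        if file_type.is_dir() {
            count += count_files_bounded(&entry.path(), depth + 1, max_depth)?;
        } else if file_type.is_file() {
            count += 1;
        }
    }
    Ok(count)
}
```

Skipping symlinks already rules out cycles in this sketch; the depth limit additionally bounds the work done on very deep trees.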

§Performance Considerations

  • Parallel directory scanning with limited concurrency
  • Batch collection of files for processing
  • Lazy evaluation with ignore crate
  • Early filtering by file patterns
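One way to picture "parallel scanning with limited concurrency" is the sketch below. It swaps in scoped threads from the standard library for the tokio tasks the module depends on, and caps concurrency by processing roots in batches of `max_workers`. `list_files` and `scan_roots_limited` are hypothetical names, not the signature of `ScanDirectoriesParallel`.

```rust
use std::fs;
use std::path::{Path, PathBuf};
use std::thread;

/// Walk one root iteratively and return every regular file found.
/// Unreadable directories and entries are silently skipped in this sketch.
fn list_files(root: &Path) -> Vec<PathBuf> {
    let mut found = Vec::new();
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        let Ok(entries) = fs::read_dir(&dir) else { continue };
        for entry in entries {
            let Ok(entry) = entry else { continue };
            match entry.file_type() {
                Ok(t) if t.is_dir() => pending.push(entry.path()),
                Ok(t) if t.is_file() => found.push(entry.path()),
                _ => {}
            }
        }
    }
    found
}

/// Scan several roots with at most `max_workers` threads alive at once.
/// Results are returned in the same order as `roots`.
pub fn scan_roots_limited(roots: &[PathBuf], max_workers: usize) -> Vec<Vec<PathBuf>> {
    let mut results = Vec::with_capacity(roots.len());
    for batch in roots.chunks(max_workers.max(1)) {
        let batch_results: Vec<Vec<PathBuf>> = thread::scope(|scope| {
            // Spawn one worker per root in this batch...
            let handles: Vec<_> = batch
                .iter()
                .map(|root| scope.spawn(move || list_files(root)))
                .collect();
            // ...then wait for all of them before starting the next batch.
            handles
                .into_iter()
                .map(|handle| handle.join().unwrap_or_default())
                .collect()
        });
        results.extend(batch_results);
    }
    results
}
```

Batching is the simplest way to bound concurrency but leaves workers idle while the slowest root in a batch finishes; a semaphore or work queue would keep them busier at the cost of more code.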

§Error Handling Strategy

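Scan operations log a warning for each individual entry that fails and continue with the rest of the scan. An error is returned only if the top-level operation itself fails, for example when the root directory cannot be read.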
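The log-and-continue strategy might look like the following std-only sketch. `scan_tolerant` is a hypothetical function, `eprintln!` stands in for whatever logging facility the crate actually uses, and `io::Result` stands in for `crate::Result`.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Return the files found under `root` plus the number of entries skipped.
/// Only a failure to open `root` itself is reported as an error.
pub fn scan_tolerant(root: &Path) -> io::Result<(Vec<PathBuf>, usize)> {
    // Top-level failure: propagate to the caller with `?`.
    let top = fs::read_dir(root)?;

    let mut files = Vec::new();
    let mut skipped = 0usize;
    let mut pending = vec![top];

    while let Some(entries) = pending.pop() {
        for entry in entries {
            // Per-entry failure: warn, count it, and keep going.
            let entry = match entry {
                Ok(entry) => entry,
                Err(err) => {
                    eprintln!("warning: skipping unreadable entry: {err}");
                    skipped += 1;
                    continue;
                }
            };
            let path = entry.path();
            match entry.file_type() {
                Ok(t) if t.is_dir() => match fs::read_dir(&path) {
                    Ok(sub) => pending.push(sub),
                    Err(err) => {
                        eprintln!("warning: cannot read {}: {err}", path.display());
                        skipped += 1;
                    }
                },
                Ok(t) if t.is_file() => files.push(path),
                Ok(_) => {} // symlinks and special files are ignored here
                Err(err) => {
                    eprintln!("warning: cannot stat {}: {err}", path.display());
                    skipped += 1;
                }
            }
        }
    }
    Ok((files, skipped))
}
```

Returning the skipped count alongside the files lets a caller decide whether a partially successful scan is good enough.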

§Thread Safety

Scan operations are designed to be called from async tasks and return collectable results for parallel processing.

Structs§

DirectoryStatistics
Directory statistics
ScanDirectoryResult
Scan directory result with statistics

Functions§

GetDefaultExcludePatterns
Get default exclude patterns for directory scanning
GetDirectoryStatistics
Get file count statistics for a directory without full scan
MatchesPattern
Check if filename matches a single pattern
MatchesPatterns
Check if file path matches any of the provided patterns
ScanAndRemoveDeleted
Scan a directory and remove deleted files from index
ScanDirectoriesParallel
Parallel scan of multiple directories
ScanDirectory
Scan a directory recursively and collect matching files
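
The signatures and pattern syntax of `MatchesPattern` and `MatchesPatterns` are not shown on this page. As a hedged illustration of the idea only, the sketch below assumes a simple glob syntax where `*` matches any run of characters and `?` matches exactly one; `matches_pattern_sketch` and `matches_any_sketch` are hypothetical names.

```rust
/// Match `name` against a glob `pattern` supporting `*` and `?`.
/// Assumption: this syntax is illustrative, not the module's documented one.
pub fn matches_pattern_sketch(name: &str, pattern: &str) -> bool {
    let n: Vec<char> = name.chars().collect();
    let p: Vec<char> = pattern.chars().collect();
    let (mut ni, mut pi) = (0usize, 0usize);
    // Position to resume from after the most recent `*`:
    // (pattern index just past the star, name index it currently covers up to).
    let mut star: Option<(usize, usize)> = None;

    while ni < n.len() {
        if pi < p.len() && p[pi] == '*' {
            // Record the star and first try matching it against nothing.
            star = Some((pi + 1, ni));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == n[ni]) {
            ni += 1;
            pi += 1;
        } else if let Some((star_pi, star_ni)) = star {
            // Mismatch: let the last star absorb one more character and retry.
            pi = star_pi;
            ni = star_ni + 1;
            star = Some((star_pi, star_ni + 1));
        } else {
            return false;
        }
    }
    // Any trailing stars can match the empty string.
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}

/// True if `name` matches at least one of `patterns`.
pub fn matches_any_sketch(name: &str, patterns: &[&str]) -> bool {
    patterns.iter().any(|pattern| matches_pattern_sketch(name, pattern))
}
```

For example, `matches_any_sketch("lib.rs", &["*.ts", "*.rs"])` evaluates to `true`, while an empty pattern list never matches.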