Tags: fogfish/iq
(fix) Housekeeping after Claude-generated code (#88)

* release [x.Y.z]
* (fix) define abstraction for composition of computation elements
* (fix) all computation elements, defining the runtime abstraction
* (fix) improve AST nodes
* (fix) data exchange using the type rather than `any`
* (fix) the context: use Event instead of context-based data exchange
* (fix) reporting context
* (fix) emit intermediate results to sink
* (fix) caching of intermediate results
* (fix) consolidate file format codec
* (fix) integration testing
* (fix) update ADRs
* (fix) remove outdated/unused code
* (fix) simple integration testing via it.sh
* (fix) foreach merges results into the target format (JSON array, plain text)
* (fix) remove jsonify feature
* (fix) /dev/null runs-on target for templates, generate report
Implement #81: Pipeline integration with storage sink and emit execution (#87)

* Implement #81: Pipeline integration with storage sink and emit execution

  Added a storage-based sink with emit context support for pipeline integration:

  - Created StorageSink, which wraps the Storage interface with emit support
  - Updated all step types (Agent, Router, Foreach, Run) to set the emit context
  - ForeachStep now manages iteration counters for nested loops
  - Each step propagates the emit prefix through the execution context
  - The storage sink applies the emit prefix and counters to output keys
  - Added a Storage() method to the sink builder for emit-aware output
  - Added comprehensive tests for StorageSink with emit and counters

  This enables emit-based output control, where workflow steps specify output
  key prefixes via the 'emit' attribute. Foreach loops add iteration counters
  (e.g., file.000001.txt, file.000002.txt) for multi-item processing.

  The implementation:
  - Preserves the emit context across step execution using context.Context
  - Supports nested foreach with stacked counters
  - Maintains backward compatibility (emit is optional)
  - Uses the existing Storage interface for filesystem operations

  Example usage:

      steps:
        - uses: prompts/summarize.md
          emit: summaries    # Output to summaries/doc.txt
        - foreach:
            selector: document
          emit: processed    # Output to processed/doc.000001.txt, etc.

* Fix emit context propagation using the EmitContextCapture pattern

  Issue: the emit context set in workflow steps wasn't reaching StorageSink
  because Go contexts are immutable; workflow execution creates new contexts
  that don't propagate back.

  Solution: implemented the EmitContextCapture pattern, a mutable struct passed
  via context that workflow steps can write to during execution (sketched below).

  Changes:
  - Added the EmitContextCapture struct and helpers (WithEmitCapture,
    GetEmitCapture) to compiler/context.go
  - Updated all workflow steps (AgentStep, RouterStep, ForeachStep, RunStep)
    to capture emit in EmitContextCapture
  - Modified the Agent processor to create the emit capture before workflow
    execution and store the captured emit in document metadata
  - Fixed the Agent processor to preserve the document Key from input to output
  - Added workflow detection (workflowUsesEmit) and conditional sink selection
    (buildWithEmit) in the agent command
  - StorageSink reads emit from document metadata and applies the prefix and
    counters to the output path

  Result: the emit attribute now correctly controls the output directory
  structure (e.g., emit: processed → output/processed/file.txt)
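A minimal sketch of that capture pattern, using the names stated in the commit message (EmitContextCapture, WithEmitCapture, GetEmitCapture); the struct's field and the exact signatures are assumptions, not the repository's actual API:

```go
package compiler

import "context"

// EmitContextCapture is a mutable cell threaded through an immutable
// context.Context: steps write the emit prefix into it during execution,
// and the caller reads it back after the workflow returns.
type EmitContextCapture struct {
	Emit string // output key prefix captured from the step's emit attribute
}

// emitCaptureKey is an unexported context key type to avoid collisions.
type emitCaptureKey struct{}

// WithEmitCapture attaches a capture cell to the context before the
// workflow runs.
func WithEmitCapture(ctx context.Context, c *EmitContextCapture) context.Context {
	return context.WithValue(ctx, emitCaptureKey{}, c)
}

// GetEmitCapture returns the attached capture cell, or nil if none exists.
func GetEmitCapture(ctx context.Context) *EmitContextCapture {
	if c, ok := ctx.Value(emitCaptureKey{}).(*EmitContextCapture); ok {
		return c
	}
	return nil
}
```

Because the context carries a pointer, a value written deep inside step execution stays visible to the caller that created the capture, which is what lets the sink observe emit despite context.Context itself being immutable.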
feat: Implement skip-if-exists flag for incremental processing (ADR-0006 Phase 3) (#86)

* feat: Implement skip-if-exists flag for incremental processing (ADR-0006 Phase 3)

  This commit implements the --skip-if-exists CLI flag to enable incremental
  processing and recovery from failures. The flag checks whether anchor output
  exists before processing each document, allowing expensive LLM operations to
  be skipped when results already exist.

  Implementation Details:

  1. CLI Flag Integration (cmd/opts.go):
     - Added skipIfExists field to the optsAgent struct
     - Registered the --skip-if-exists flag with a description
     - Integrated skip logic into the agent build pipeline
     - Validates that --output-dir is specified when using skip-if-exists

  2. Anchor Key Computation (internal/blueprint/compiler/anchor.go):
     - Created AnchorKeyComputer to calculate expected output keys
     - Supports all step types: AgentStep, RouterStep, ForeachStep, RunStep
     - Handles the emit attribute to compute prefixed output paths
     - For foreach steps, the anchor is the array file (not individual elements)
     - Defaults to the input key when no emit is specified

  3. Skip Checking Logic (internal/blueprint/compiler/skip.go):
     - Created SkipChecker using the Storage interface to check file existence
     - Uses AnchorKeyComputer to determine the expected output location
     - Reports skipped documents via the progress reporter
     - Returns early if the anchor file already exists

  4. Pipeline Integration (internal/service/worker/worker.go):
     - Added a workflow field to Builder to store the compiled workflow
     - Created the SkipIfExists() builder method
     - Integrates storage, anchor computer, and progress reporter
     - Adds the skip processor to the pipeline before agent execution

  5. Processor Implementation (internal/iosystem/processor/skipifexists.go):
     - Created the SkipIfExists processor implementing the Processor interface
     - Filters documents based on the anchor existence check
     - Passes through EOF markers and empty document sets
     - Implements Close() for proper resource cleanup

  6. Blueprint API Extension (internal/blueprint/blueprint.go):
     - Added the Workflow() method to expose the compiled workflow
     - Enables access to the workflow structure for anchor computation
     - Maintains encapsulation while allowing internal use

  7. Comprehensive Testing:
     - anchor_test.go: tests anchor computation for all step types
     - skip_test.go: tests skip logic with various scenarios
     - All tests passing with no regressions

  Usage Example:

      iq agent batch -f workflow.yml -I input/ -O output/ --skip-if-exists

  This implementation completes Phase 3 of ADR-0006, enabling fault-tolerant,
  resumable execution by skipping documents that have already been processed.

  Resolves #80

* Fix #80: Implement step-level output caching (skip-if-exists)

  Corrected the implementation from document-level filtering to per-step caching:

  - Added CacheContext with TryLoadCached/SaveCached for step output caching
    (sketched below)
  - Each step type (Agent, Router, Foreach, Run) now checks the cache before
    operations
  - Cache key is computed from the emit attribute plus the document path
  - Steps skip LLM calls/commands when cached output exists
  - Added StepSkipped() progress reporting for cache hits
  - Removed the obsolete document-level filtering approach (SkipIfExists
    processor)

  This enables true incremental processing, where workflows can resume from any
  partially completed document rather than only skipping entire documents.
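A minimal sketch of the per-step caching flow. The CacheContext, TryLoadCached, and SaveCached names come from the commit message; the simplified Storage interface, the signatures, and the key scheme are assumptions inferred from "emit attribute + document path":

```go
package compiler

import "path"

// Storage is a narrowed stand-in for the project's storage interface;
// the signatures here are illustrative only.
type Storage interface {
	Get(key string) ([]byte, error)
	Put(key string, data []byte) error
	Has(key string) (bool, error)
}

// CacheContext checks for and persists per-step outputs so that a rerun
// can skip LLM calls and commands whose results already exist.
type CacheContext struct {
	store Storage
}

// cacheKey derives the step's anchor key from the emit attribute and the
// document path, mirroring "emit attribute + document path" above.
func (c *CacheContext) cacheKey(emit, docPath string) string {
	if emit == "" {
		return docPath
	}
	return path.Join(emit, docPath)
}

// TryLoadCached returns the cached output and true when the anchor key
// already exists in storage.
func (c *CacheContext) TryLoadCached(emit, docPath string) ([]byte, bool) {
	key := c.cacheKey(emit, docPath)
	if ok, err := c.store.Has(key); err != nil || !ok {
		return nil, false
	}
	data, err := c.store.Get(key)
	return data, err == nil
}

// SaveCached persists step output under the same key for future runs.
func (c *CacheContext) SaveCached(emit, docPath string, data []byte) error {
	return c.store.Put(c.cacheKey(emit, docPath), data)
}
```

A step would call TryLoadCached first and fall through to its LLM call or command only on a miss, saving the result with SaveCached afterwards; this is what lets a workflow resume mid-document rather than skipping whole documents.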
feat: Add emit attribute for step-level output control (ADR-0006 Phase 2) (#85)

* feat: Add emit attribute for step-level output control (ADR-0006 Phase 2)

  This commit implements the emit attribute as specified in issue #79,
  enabling explicit output key prefixes for workflow steps.

  Changes:
  - AST: Added an Emit field to all step node types (AgentStepNode,
    RouterStepNode, ForeachStepNode, RunStepNode)
  - Parser: Added emit YAML field parsing for all step types
  - Compiler: Added EmitContext for tracking output prefixes and foreach
    iteration counters
  - Compiler: Updated all compiled step structures to include the Emit field
  - Compiler: Implemented the ApplyEmit and ApplyEmitWithCounters functions
    for key transformation
  - Workflow: Initialized EmitContext in Job.Prompt execution

  Tests:
  - Added parser tests for emit attribute parsing across all step types
  - Added unit tests for the ApplyEmit and ApplyEmitWithCounters functions
  - Added tests for EmitContext push/pop operations
  - All existing tests pass without regressions

  The emit attribute allows multi-stage workflows to organize outputs into
  separate directories. For example:

      - uses: prompts/summarize.md
        emit: summary

  transforms output keys like 'a.txt' into 'summary/a.txt'. For foreach steps
  with counters, keys are transformed with the iteration counter inserted
  before the file extension (see the sketch after this entry):

      emit: research, key: a.txt, iteration: 1  →  research/a.000001.txt

  Note: this is Phase 2 of ADR-0006. Actual I/O integration with emit
  prefixes will be implemented in Phase 3.

* (fix) file formatting
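A sketch of the key transformation described above. The function names ApplyEmit and ApplyEmitWithCounters are from the commit message; the signatures and the six-digit zero-padding are assumptions inferred from the a.000001.txt example:

```go
package compiler

import (
	"fmt"
	"path"
	"strings"
)

// ApplyEmit prefixes an output key with the step's emit attribute, so
// emit "summary" turns "a.txt" into "summary/a.txt".
func ApplyEmit(emit, key string) string {
	if emit == "" {
		return key
	}
	return path.Join(emit, key)
}

// ApplyEmitWithCounters additionally inserts zero-padded iteration
// counters before the file extension: emit "research", key "a.txt",
// iteration 1 yields "research/a.000001.txt". Nested foreach loops
// contribute one counter per level.
func ApplyEmitWithCounters(emit, key string, counters ...int) string {
	ext := path.Ext(key)
	base := strings.TrimSuffix(key, ext)
	for _, n := range counters {
		base = fmt.Sprintf("%s.%06d", base, n)
	}
	return ApplyEmit(emit, base+ext)
}
```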
Phase 1: Implement Key/Value Storage Infrastructure (#78) (#84)

* Phase 1: Implement Key/Value Storage Infrastructure (#78)

  Implements the core infrastructure for the ADR 0006 key/value I/O system:

  - Add Key type as a simple string for identity
  - Create Storage interface (Put/Get/Has/Walk); a sketch follows below
  - Implement FSStorage wrapping github.com/fogfish/stream
  - Update Document with a Key field and a Metadata struct
  - Maintain backward compatibility via the Document.Path field
  - Update FSSource to construct relative-path keys
  - Comprehensive unit tests for the storage layer

  All existing tests pass with backward compatibility maintained.

* (fix) adr update emit spec
* (fix) storage get api
* (fix) auto-closing requirement
* (fix) failing tests
* (fix) file format
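A minimal sketch of how this storage layer could look. Only the method set (Put/Get/Has/Walk), the string-based Key, and the Document fields are stated in the commit messages; the signatures and the Metadata contents are illustrative assumptions, and the real FSStorage wraps github.com/fogfish/stream:

```go
package storage

import "io"

// Key identifies a document; per ADR 0006 it is a simple string,
// typically a path relative to the input root.
type Key = string

// Storage is the key/value abstraction behind sources and sinks.
type Storage interface {
	Put(key Key, r io.Reader) error
	Get(key Key) (io.ReadCloser, error)
	Has(key Key) (bool, error)
	Walk(prefix Key, visit func(Key) error) error
}

// Document pairs content with its identity and metadata.
type Document struct {
	Key      Key
	Path     string // retained for backward compatibility
	Metadata Metadata
	Content  []byte
}

// Metadata carries per-document attributes; the Emit field reflects the
// later commits that store the captured emit prefix here.
type Metadata struct {
	Emit string
}
```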
Add integration tests for ForeachStep formatter integration (Issue #68) (#77)

- Added comprehensive tests for the JSON, JSONL, and text formatters
- Tests verify formatted output types and correct storage in the workflow
  context
- Tests verify error handling when a formatter fails
- Tests verify default format behavior (JSON)
- All tests pass successfully

This completes ADR-0007 Phase 5: formatter integration into ForeachStep. The
core implementation (the Formatter field, compiler integration, and Prompt()
method updates) was already completed by a previous agent. These tests validate
that the integration works correctly; a sketch of this style of test follows below.
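For illustration, a sketch of the kind of assertion such a test makes for the default (JSON) format. The interface and type names below are hypothetical stand-ins, not the repository's actual Formatter API:

```go
package compiler_test

import (
	"encoding/json"
	"testing"
)

// formatter merges per-iteration foreach results into a single output;
// hypothetical stand-in for the real Formatter field on ForeachStep.
type formatter interface {
	Format(results []string) (string, error)
}

// jsonFormatter merges results into a JSON array, the default format.
type jsonFormatter struct{}

func (jsonFormatter) Format(results []string) (string, error) {
	b, err := json.Marshal(results)
	return string(b), err
}

func TestForeachDefaultFormatIsJSONArray(t *testing.T) {
	var f formatter = jsonFormatter{}
	out, err := f.Format([]string{"a", "b"})
	if err != nil {
		t.Fatal(err)
	}
	if out != `["a","b"]` {
		t.Fatalf("unexpected merged output: %s", out)
	}
}
```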