@dyxushuai dyxushuai commented Dec 28, 2025

Problem

Window.print(.word) computed gwidth twice for each grapheme in a word: once while measuring the word's width for the wrap check, and again while writing cells. The duplicated width work adds overhead when rendering word-wrapped text.

Fix

Cache grapheme slices + widths while computing the word width, then reuse that cache when writing cells. If the fixed buffer fills up, reuse the cached prefix and fall back to the original per-grapheme path for the remainder.

Note: Caching uses a fixed 4KB stack buffer to avoid heap allocation for the per-word cache (ArrayListUnmanaged append).
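
For readers who want to see the shape of the idea, here is a minimal sketch of the measure-then-reuse scheme described above. It is not the code from this PR: it swaps the 4KB buffer and ArrayListUnmanaged for a plain fixed-size array, treats each UTF-8 codepoint as one grapheme, and uses a made-up `stubWidth` in place of gwidth. `Measured`, `measureWord`, and `writeWord` are hypothetical names; `WordPiece` borrows the struct name mentioned in the review, but its fields here are assumed.

```zig
const std = @import("std");

/// Hypothetical cache entry: one grapheme's bytes plus its display width.
const WordPiece = struct {
    bytes: []const u8,
    width: u16,
};

/// Stand-in for the real width function (illustrative only):
/// single-byte codepoints are 1 column, multi-byte ones are 2.
fn stubWidth(bytes: []const u8) u16 {
    return if (bytes.len == 1) 1 else 2;
}

/// Result of the measuring pass.
const Measured = struct {
    width: usize, // total display width of the whole word
    cached: usize, // number of pieces stored in the cache
    cached_bytes: usize, // bytes of the word covered by the cached prefix
};

/// First pass: measure the word and remember as many pieces as fit.
fn measureWord(word: []const u8, cache: []WordPiece) Measured {
    var m: Measured = .{ .width = 0, .cached = 0, .cached_bytes = 0 };
    var i: usize = 0;
    while (i < word.len) {
        // Assumption: one codepoint == one grapheme; invalid bytes count as 1.
        const len = std.unicode.utf8ByteSequenceLength(word[i]) catch 1;
        const end = @min(i + len, word.len);
        const piece = word[i..end];
        const w = stubWidth(piece);
        m.width += w;
        if (m.cached < cache.len) {
            cache[m.cached] = .{ .bytes = piece, .width = w };
            m.cached += 1;
            m.cached_bytes = end;
        }
        i = end;
    }
    return m;
}

/// Second pass: "write" the word and return the columns consumed.
fn writeWord(word: []const u8, cache: []const WordPiece, m: Measured) usize {
    var col: usize = 0;
    // Cached prefix: reuse the stored widths, no width recomputation.
    for (cache[0..m.cached]) |piece| {
        col += piece.width;
    }
    // Fallback: per-grapheme path for whatever did not fit in the cache.
    var i: usize = m.cached_bytes;
    while (i < word.len) {
        const len = std.unicode.utf8ByteSequenceLength(word[i]) catch 1;
        const end = @min(i + len, word.len);
        col += stubWidth(word[i..end]);
        i = end;
    }
    return col;
}

pub fn main() void {
    const word = "héllo";
    var cache: [2]WordPiece = undefined; // deliberately too small: forces the fallback
    const m = measureWord(word, &cache);
    std.debug.print("measured={d} written={d}\n", .{ m.width, writeWord(word, &cache, m) });
}
```

With a cache large enough for the whole word, the second pass never calls the width function. With the undersized cache in `main`, only the uncached tail pays for a second width computation, which mirrors the overflow behaviour described in the Fix section.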

Bench (local, zig build bench, iterations=200, 80x24)

Baseline = print + extra per-word gwidth pass to mirror the old double-work
Cached = current print implementation

| Case | Baseline ns/frame | Cached ns/frame | Improvement | Speedup |
|---|---:|---:|---:|---:|
| Small | 81,281 | 67,043 | -17.5% | 1.21x |
| Medium | 318,911 | 264,232 | -17.1% | 1.21x |
| Large | 632,170 | 526,237 | -16.8% | 1.20x |
| Overflow | 4,600,554 | 3,894,809 | -15.3% | 1.18x |

Improvement: Small -17.5% (1.21x); Medium -17.1% (1.21x); Large -16.8% (1.20x); Overflow -15.3% (1.18x).
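
The two derived columns measure different things, which is easy to misread: Improvement is the relative change in ns/frame, while Speedup is the ratio of baseline to cached time. A small sketch of that arithmetic, using the Small row as input (`improvementPercent` and `speedup` are hypothetical helper names, not part of the benchmark code):

```zig
const std = @import("std");

/// Relative change in time per frame, in percent (negative means faster).
fn improvementPercent(baseline_ns: f64, cached_ns: f64) f64 {
    return (cached_ns - baseline_ns) / baseline_ns * 100.0;
}

/// How many times faster the cached path is than the baseline.
fn speedup(baseline_ns: f64, cached_ns: f64) f64 {
    return baseline_ns / cached_ns;
}

pub fn main() void {
    // Small case from the benchmark results.
    const baseline: f64 = 81281;
    const cached: f64 = 67043;
    std.debug.print("{d:.1}% {d:.2}x\n", .{
        improvementPercent(baseline, cached),
        speedup(baseline, cached),
    });
}
```

This is why a 1.21x speedup corresponds to a 17.5% reduction in frame time rather than a 21% one.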

Tests

  • zig build test
  • zig build bench

Copilot AI review requested due to automatic review settings December 28, 2025 14:08
@dyxushuai dyxushuai marked this pull request as draft December 28, 2025 14:11

Copilot AI left a comment

Pull request overview

This PR optimizes word-wrapping performance in Window.print by caching grapheme slice positions and widths during the initial width calculation pass, eliminating redundant gwidth() calls when rendering. The optimization uses a fixed 4KB stack buffer to avoid heap allocations and falls back to the original per-grapheme iteration if the buffer overflows.

Key Changes:

  • Introduced a caching mechanism that stores grapheme boundaries and widths in a fixed buffer during width calculation
  • Refactored the word-rendering path into two branches: a cached path that reuses computed widths, and a fallback path for cache overflow scenarios
  • Added benchmarks showing a roughly 15-18% reduction in ns/frame (1.18x-1.21x speedup) across different text sizes

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
|---|---|
| src/Window.zig | Implements grapheme width caching with WordPiece struct and FixedBufferAllocator, adds cached and fallback rendering paths |
| bench/bench.zig | Adds benchmark infrastructure and test cases (small/medium/large) to measure the caching optimization, includes helper iterators for baseline comparison |

@dyxushuai dyxushuai marked this pull request as ready for review December 28, 2025 14:17