Sliding window KV cache #12975

Merged: 1 commit, Jul 29, 2025

@sxu sxu commented Jul 29, 2025

Summary: Make the KV cache a sliding window by default. If this behavior is not desired, the user can check the number of tokens generated in the `should_stop` callback. The interface still accepts only a single cache length, but internally the class stores the cache length and cache position independently for each layer, in preparation for local-global attention.

Reviewed By: billmguo

Differential Revision: D79128246
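
To illustrate the behavior described in the summary, here is a minimal Python sketch. It is not the code from this PR: `SlidingWindowKVCache`, `generate`, and `stop_after` are hypothetical names, the cache holds plain token ids instead of key/value tensors, and the `should_stop` signature (one token in, a boolean out) is an assumption. The sketch shows a single cache length being accepted at construction while length and write position are tracked per layer, and a `should_stop` callback that counts generated tokens.

```python
from typing import Callable, List, Optional


class SlidingWindowKVCache:
    """Toy per-layer ring buffer (hypothetical; not the ExecuTorch class)."""

    def __init__(self, n_layers: int, cache_len: int) -> None:
        # One cache length comes in, but length and position are stored
        # per layer so that layers could later use different window sizes.
        self.cache_lens: List[int] = [cache_len] * n_layers
        self.cache_pos: List[int] = [0] * n_layers
        self.slots: List[List[Optional[int]]] = [
            [None] * cache_len for _ in range(n_layers)
        ]

    def update(self, layer: int, entries: List[int]) -> None:
        # Write each entry at the current position, wrapping around so the
        # oldest entry is overwritten once the window is full.
        for entry in entries:
            pos = self.cache_pos[layer]
            self.slots[layer][pos] = entry
            self.cache_pos[layer] = (pos + 1) % self.cache_lens[layer]

    def window(self, layer: int) -> List[int]:
        # Return the cached entries for one layer, oldest first.
        pos = self.cache_pos[layer]
        ordered = self.slots[layer][pos:] + self.slots[layer][:pos]
        return [e for e in ordered if e is not None]


def stop_after(max_tokens: int) -> Callable[[int], bool]:
    # Build a should_stop callback that ends generation after max_tokens,
    # for callers who do not want the window to slide indefinitely.
    count = 0

    def should_stop(token: int) -> bool:
        nonlocal count
        count += 1
        return count >= max_tokens

    return should_stop


def generate(
    cache: SlidingWindowKVCache,
    next_token: Callable[[int], int],
    first_token: int,
    should_stop: Callable[[int], bool],
) -> List[int]:
    # Stand-in decode loop: cache the current token in every layer,
    # produce the next token, then ask the callback whether to stop.
    tokens: List[int] = []
    token = first_token
    while True:
        for layer in range(len(cache.cache_lens)):
            cache.update(layer, [token])
        token = next_token(token)
        tokens.append(token)
        if should_stop(token):
            return tokens


cache = SlidingWindowKVCache(n_layers=2, cache_len=4)
out = generate(cache, lambda t: t + 1, 0, stop_after(6))
print(out)
print(cache.window(0))
```

In this sketch, setting the limit passed to `stop_after` no larger than the cache length would keep the window from ever wrapping, which approximates the non-sliding behavior mentioned in the summary.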

@sxu sxu requested review from lucylq and jackzhxng as code owners July 29, 2025 20:13
pytorch-bot bot commented Jul 29, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12975

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 1d8bfd8 with merge base 8b2ddb2 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label Jul 29, 2025
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D79128246

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with `release notes:`. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@sxu sxu requested review from billmguo and YIWENX14 July 29, 2025 20:59
@facebook-github-bot facebook-github-bot merged commit cd34c47 into pytorch:main Jul 29, 2025
100 of 105 checks passed
Labels
CLA Signed, fb-exported
3 participants