GitHub - infiniteregrets/kv-psi: Use Linux Pressure Stall Information to trim an LLM KV cache

Latest commit: Initial commit (`df8802c`) by infiniteregrets, Jun 27, 2026. 1 commit on `main`.

| Name | Last commit message | Last commit date |
| --- | --- | --- |
| [benchmarks](https://github.com/infiniteregrets/kv-psi/tree/main/benchmarks) | [Initial commit](https://github.com/infiniteregrets/kv-psi/commit/df8802ce14a8e314e9d780ffbb9df56e88657672) | Jun 27, 2026 |
| [examples](https://github.com/infiniteregrets/kv-psi/tree/main/examples) | [Initial commit](https://github.com/infiniteregrets/kv-psi/commit/df8802ce14a8e314e9d780ffbb9df56e88657672) | Jun 27, 2026 |
| [integrations/llama.cpp](https://github.com/infiniteregrets/kv-psi/tree/main/integrations/llama.cpp) | [Initial commit](https://github.com/infiniteregrets/kv-psi/commit/df8802ce14a8e314e9d780ffbb9df56e88657672) | Jun 27, 2026 |
| [scripts](https://github.com/infiniteregrets/kv-psi/tree/main/scripts) | [Initial commit](https://github.com/infiniteregrets/kv-psi/commit/df8802ce14a8e314e9d780ffbb9df56e88657672) | Jun 27, 2026 |
| [src/psi_kv_governor](https://github.com/infiniteregrets/kv-psi/tree/main/src/psi_kv_governor) | [Initial commit](https://github.com/infiniteregrets/kv-psi/commit/df8802ce14a8e314e9d780ffbb9df56e88657672) | Jun 27, 2026 |
| [tests](https://github.com/infiniteregrets/kv-psi/tree/main/tests) | [Initial commit](https://github.com/infiniteregrets/kv-psi/commit/df8802ce14a8e314e9d780ffbb9df56e88657672) | Jun 27, 2026 |
| [.gitignore](https://github.com/infiniteregrets/kv-psi/blob/main/.gitignore) | [Initial commit](https://github.com/infiniteregrets/kv-psi/commit/df8802ce14a8e314e9d780ffbb9df56e88657672) | Jun 27, 2026 |
| [README.md](https://github.com/infiniteregrets/kv-psi/blob/main/README.md) | [Initial commit](https://github.com/infiniteregrets/kv-psi/commit/df8802ce14a8e314e9d780ffbb9df56e88657672) | Jun 27, 2026 |
| [pyproject.toml](https://github.com/infiniteregrets/kv-psi/blob/main/pyproject.toml) | [Initial commit](https://github.com/infiniteregrets/kv-psi/commit/df8802ce14a8e314e9d780ffbb9df56e88657672) | Jun 27, 2026 |

PSI KV Governor

PSI KV Governor is a small reference implementation for using Linux Pressure Stall Information to trim an LLM KV cache when the system is under memory pressure.

Requirements

- Linux with PSI enabled: cgroup `memory.pressure` or `/proc/pressure/memory`

- Python 3.10+

- llama.cpp build dependencies for the runner

- a GGUF model, for example `models/SmolLM2-135M-Instruct-Q2_K.gguf`

Check PSI:

```shell
cat /proc/pressure/memory
PYTHONPATH=src python benchmarks/pressure_bench.py --preflight-only
```
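For readers who want to inspect the same file from Python, here is a minimal sketch of parsing the kernel's PSI text format (`some`/`full` lines with `avg10`, `avg60`, `avg300`, and `total` fields). The helper names `parse_psi` and `read_memory_pressure` and the sample values are illustrative assumptions, not part of this repository's API.

```python
from pathlib import Path
from typing import Dict


def parse_psi(text: str) -> Dict[str, Dict[str, float]]:
    """Parse PSI text into {"some": {"avg10": ...}, "full": {...}}."""
    result: Dict[str, Dict[str, float]] = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind, pairs = fields[0], fields[1:]  # kind is "some" or "full"
        result[kind] = {
            key: float(value)
            for key, value in (pair.split("=", 1) for pair in pairs)
        }
    return result


def read_memory_pressure(path: str = "/proc/pressure/memory") -> Dict[str, Dict[str, float]]:
    """Read and parse a PSI file; pass a cgroup memory.pressure path if preferred."""
    return parse_psi(Path(path).read_text())


# Made-up sample in the kernel's format, so the sketch runs without PSI support.
SAMPLE = (
    "some avg10=1.50 avg60=0.40 avg300=0.08 total=123456\n"
    "full avg10=1.25 avg60=0.38 avg300=0.07 total=120000\n"
)

if __name__ == "__main__":
    psi = parse_psi(SAMPLE)
    print(f"some avg10 = {psi['some']['avg10']:.2f}")
    print(f"full avg10 = {psi['full']['avg10']:.2f}")
```

On a machine with PSI enabled, `read_memory_pressure()` should return the same structure for the live file.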

Basic Usage

Run the reference simulator:

```shell
PYTHONPATH=src python -m psi_kv_governor.cli simulate
```

Build the llama.cpp runner:

```shell
scripts/build_llama_runner.sh
```

Download the small benchmark model if needed:

```shell
python scripts/download_demo_model.py
```

PSI Benchmark

Run both variant orders. This matters because PSI `avg10`, cache, and zram/swap state can carry over from the first pressure run into the second.

```shell
PYTHONPATH=src python benchmarks/pressure_bench.py \
  -c 2048 \
  -n 1536 \
  --keep 64 \
  --tail 256 \
  --min-prune 64 \
  --pressure-mib 6000 \
  --pressure-step-mib 1024 \
  --pressure-warmup-s 10 \
  --variant-cooldown-s 45 \
  --out-dir data/bench-pressure/fixed-first

PYTHONPATH=src python benchmarks/pressure_bench.py \
  --variant-order psi-first \
  -c 2048 \
  -n 1536 \
  --keep 64 \
  --tail 256 \
  --min-prune 64 \
  --pressure-mib 6000 \
  --pressure-step-mib 1024 \
  --pressure-warmup-s 10 \
  --variant-cooldown-s 45 \
  --out-dir data/bench-pressure/psi-first
```
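The README does not spell out what `--keep`, `--tail`, and `--min-prune` do. One plausible reading, offered here only as an assumption, is that the first `keep` cache positions and the last `tail` positions are protected, and that a prune drops at least `min_prune` positions from the span between them. The sketch below illustrates that reading. The function `prune_range` and its `want` parameter are hypothetical and do not come from the repository.

```python
from typing import Optional, Tuple


def prune_range(
    n_cached: int,
    keep: int = 64,
    tail: int = 256,
    min_prune: int = 64,
    want: int = 64,
) -> Optional[Tuple[int, int]]:
    """Return a half-open [start, end) range of cache positions to drop, or None.

    Assumed policy (not confirmed by the source): protect the first `keep`
    and the last `tail` positions, and drop the oldest unprotected ones.
    """
    prunable = n_cached - keep - tail  # positions between prefix and tail
    if prunable < min_prune:
        return None  # not enough unprotected positions to justify a prune
    count = min(max(want, min_prune), prunable)
    return keep, keep + count


if __name__ == "__main__":
    print(prune_range(1547))            # cache large enough to prune
    print(prune_range(300))             # too small: nothing to drop
    print(prune_range(400, want=500))   # request capped at the prunable span
```

Under this reading, a governor would call such a function only when PSI exceeds its threshold and then ask the runtime to remove the returned range. The actual selection logic lives in `src/psi_kv_governor` and may differ.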

Recent Jetson result:

| run | variant | decoded | tok/s | prunes | final KV | external PSI some/full |
| --- | --- | ---: | ---: | ---: | ---: | --- |
| fixed-first | fixed | 1536 | 94.00 | 0 | 1547 | 1.61/1.61 |
| fixed-first | PSI | 1536 | 88.80 | 4 | 1291 | 4.14/3.94 |
| psi-first | PSI | 1536 | 96.16 | 2 | 1004 | 2.46/2.33 |
| psi-first | fixed | 1536 | 89.76 | 0 | 1547 | 5.56/5.56 |

Result directories:

- `data/bench-pressure/real-psi-6000m-1536tok-cooldown`

- `data/bench-pressure/real-psi-6000m-1536tok-cooldown-psi-first`
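To see why both orders matter, the Jetson numbers above can be averaged across runs. The sketch below only redoes arithmetic on the reported figures. It assumes that the rows of each run are listed in execution order, which the run names suggest but the README does not state outright.

```python
# Rows copied from the Jetson result table: (run, variant, tok/s, prunes, final KV).
ROWS = [
    ("fixed-first", "fixed", 94.00, 0, 1547),
    ("fixed-first", "PSI", 88.80, 4, 1291),
    ("psi-first", "PSI", 96.16, 2, 1004),
    ("psi-first", "fixed", 89.76, 0, 1547),
]


def mean_tok_s(variant: str) -> float:
    """Average throughput of one variant across both run orders."""
    values = [row[2] for row in ROWS if row[1] == variant]
    return sum(values) / len(values)


def second_vs_first_pct(run: str) -> float:
    """Percent change in tok/s from the first to the second variant of a run."""
    first, second = [row[2] for row in ROWS if row[0] == run]
    return (second / first - 1.0) * 100.0


if __name__ == "__main__":
    for variant in ("fixed", "PSI"):
        print(f"{variant:5s} mean tok/s: {mean_tok_s(variant):.2f}")
    for run in ("fixed-first", "psi-first"):
        print(f"{run}: second variant {second_vs_first_pct(run):+.2f}% vs first")
```

In both runs the variant that goes second is roughly 5 to 7 percent slower, whichever variant it is. That fits the carry-over effect described above. Averaged over both orders, the two variants land within about 1 tok/s of each other, while the PSI variant ends with a smaller KV cache.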

Contributors 1

- **infiniteregrets** (Mehul Arora)

Languages

- Python 72.1%

- C++ 27.2%

- Shell 0.7%
