[3.14] gh-151497: Avoid huge pre-allocation for oversized tarfile extended headers (GH-151498) #151979

Merged: encukou merged 1 commit into python:3.14 from miss-islington:backport-da99711-3.14 on Jun 24, 2026


@miss-islington miss-islington commented Jun 23, 2026

tarfile reads a member's extended header (a GNU long name/link or a pax
header) with a single read sized by the header's size field:

    buf = tarfile.fileobj.read(self._block(self.size))

The size is taken from the archive and is not validated, so a ~512-byte
crafted file can claim several gigabytes (or, via base-256 encoding, far
more) and make read() pre-allocate that much memory -- on open/iterate,
before any extraction filter runs.
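
To illustrate why the size field alone cannot be trusted, here is a minimal sketch of how a 12-byte tar size field decodes. `decode_size` is a hypothetical helper written for this illustration, not tarfile's own parser. It assumes the usual octal encoding plus the GNU base-256 form marked by a leading 0x80 byte, and it ignores negative base-256 values.

```python
def decode_size(field: bytes) -> int:
    """Decode a tar header size field (illustrative helper, not tarfile's API)."""
    if field[0] == 0x80:
        # GNU base-256: the remaining bytes are a big-endian integer.
        return int.from_bytes(field[1:], "big")
    # Otherwise: NUL- or space-terminated ASCII octal digits.
    text = field.split(b"\0", 1)[0].decode("ascii").strip()
    return int(text or "0", 8)


# Largest value expressible with 11 octal digits: 8**11 - 1, just under 8 GiB.
octal_max = decode_size(b"77777777777\0")

# Largest positive base-256 value in the same 12 bytes: 2**88 - 1.
base256_max = decode_size(b"\x80" + b"\xff" * 11)
```

Either value fits in a single 512-byte header block, so the claimed size says nothing about how much data actually follows.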

Read the extended-header data in bounded chunks instead, so an oversized
or truncated header can no longer force a huge allocation. The bytes
returned for valid archives are unchanged.
(cherry picked from commit da99711)

Co-authored-by: Shardul Deshpande <iamsharduld@users.noreply.github.com>
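
The bounded-chunk idea described above can be sketched roughly as follows. This is not the code from the commit: `read_bounded` and the 64 KiB `CHUNK_SIZE` are hypothetical names and values chosen for illustration, and how the real fix reports a truncated header is not shown here.

```python
import io

CHUNK_SIZE = 64 * 1024  # hypothetical bound on any single read


def read_bounded(fileobj, size, chunk_size=CHUNK_SIZE):
    """Read up to `size` bytes without ever requesting more than `chunk_size` at once."""
    chunks = []
    remaining = size
    while remaining > 0:
        data = fileobj.read(min(chunk_size, remaining))
        if not data:
            # The stream ended early: the header claimed more than exists.
            break
        chunks.append(data)
        remaining -= len(data)
    return b"".join(chunks)


# A stream holding 1000 bytes whose header claims 1 TiB: memory use stays
# proportional to the data actually present, not to the claimed size.
payload = read_bounded(io.BytesIO(b"x" * 1000), 2**40)
```

For a well-formed archive the loop returns the same bytes a single `read(size)` would, which matches the note that results for valid archives are unchanged.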

@encukou encukou merged commit 2cf26d0 into python:3.14 Jun 24, 2026
49 of 52 checks passed
@miss-islington miss-islington deleted the backport-da99711-3.14 branch June 24, 2026 09:23