Messing with CVE-2022-30190 by Understanding Compound File Binary Format and OLE Structures

By:

Created: June 22, 2022

Updated: September 20, 2023

Initially, I began this research to generate weaponized RTF files delivering the CVE-2022-30190 (Follina) exploit.

Why RTF files? 

Because a payload delivered in an RTF file works on (probably) all Windows versions as of the time of writing this report, and it executes as soon as the document is viewed in File Explorer with the preview pane enabled. In contrast, the payload does not execute on all Windows versions when loaded from a DOCX file.

To generate RTF files containing the exploit, I used Cas Van Cooten’s POC code to generate a DOCX file and then saved a new copy of the same document in RTF format. This creates a valid RTF file weaponized with the exploit.

What’s the problem then? 

The problem was that every time I wanted to generate a valid RTF file, I had to first generate a DOCX file and then regenerate an RTF file from within Microsoft Word. What if I don’t have MS Word? What if I’m lazy? Well, I thought to myself – “How hard could it be automating this?” I opened the RTF file, saw where my payload was saved in plain text, replaced it, and there we go. 

It should work, right? 

Well, it doesn’t!

If we take a simple look at how an RTF file can be loaded with a malicious hyperlink, we can apply the method specified in this article regarding CVE-2017-0199. We can see that

This was right, as long as the last field – objdata – was loaded with a proper OLE object. Honestly, the assumption I initially had about this implementation was due to my laziness and wishful thinking that implementing this within Windows would be THAT easy. This error, which I totally missed as I did not test the POC properly, prompted a person with very creative user handles to raise an issue on GitHub for Cas’s POC. This issue was raised regarding the RTF generation feature I contributed.

MSisfuckedupmanimaginepayingtogetRCEd wrote:

And indeed, after further inspection, Cas confirmed that:

Apparently (and honestly, this makes a lot of sense now), when regenerating an RTF file containing a hyperlink to a remote template, Office generates an OLE object. Actually, two objects are generated, but only one of the two is needed. Viewing the OLE object in HxD confirms this: we find the Compound File signature, as can be seen in this hex blob taken from the beginning of the OLE object stored within the RTF file and dumped using oletools.

This prompted me to learn how OLE objects are stored and understand how they work, so I could automate their creation. One might ask why I am doing this: generating an RTF loaded with Follina is easy to do, so why not just regenerate it with Word?

Well… I was annoyed by the GitHub issue and simply curious. Anyway, put on your seatbelts and take your sanity pills because we are about to dive deep into some DEEP Microsoft RFCs and specifications!

OLE, Compound Binary File Format, COM, and Windows theory

First, let’s examine the RTF file specifications to understand how RTF stores embedded objects such as files, hyperlinks, and other data streams.

Microsoft OLE links, Microsoft OLE embedded objects, and Macintosh Edition Manager subscriber objects are represented in RTF as objects. Objects are destinations that contain a data part and a result part. The data part is generally hidden from the application that produced the document. A separate application uses the data and supplies the appearance of the data. This appearance is the result part of the object.

We can see how this is implemented with an RTF file that I generated using Word and which should contain the Follina Payload:

The fields of interest are \objautlink, which specifies an auto-link object (essentially a link within the Word document that executes automatically), and \objupdate. According to the RTF specification, \objupdate should force the object to update, but my own testing shows that this works inconsistently. Finally, the most interesting field is \objdata.

This sub-destination contains the data for the object in the appropriate format; OLE objects are in OLESaveToStream format. This is a destination control word.

This is where things start to get a bit convoluted. The payload generated is stored within an OLE object within the RTF file. It is a hex-encoded object that looks like this:
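To get from the hex text back to raw bytes, the \objdata destination can be decoded by hand. The following is a minimal sketch, not what oletools does internally: `extract_objdata` is a hypothetical helper name, and it assumes the hex digits follow \objdata directly, with no nested groups or control words in between.

```python
import re


def extract_objdata(rtf_text: str) -> list[bytes]:
    """Return the decoded bytes of each \\objdata destination in the RTF text."""
    blobs = []
    # Capture everything after \objdata up to the next brace or control word.
    # Assumption: the hex payload is not interrupted by nested groups.
    for match in re.finditer(r"\\objdata(?![a-zA-Z])([^{}\\]*)", rtf_text):
        # Whitespace and line breaks between hex digits are not significant.
        hex_digits = re.sub(r"[^0-9a-fA-F]", "", match.group(1))
        # Ignore a dangling nibble instead of failing on odd-length input.
        hex_digits = hex_digits[: len(hex_digits) // 2 * 2]
        blobs.append(bytes.fromhex(hex_digits))
    return blobs
```

For real-world samples, which are often deliberately malformed, a dedicated tool such as rtfobj from oletools is the safer choice.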

To understand what these hex numerals mean, we must first understand what OLE objects are and in which format they are stored. According to Wikipedia:

“OLE allows an editing application to export part of a document to another editing application and then import it with additional content. For example, a desktop publishing system might send some text to a word processor or a picture to a bitmap editor using OLE. The main benefit of OLE is to add different kinds of data to a document from different applications, like a text editor and an image editor. This creates a Compound File Binary Format document and a master file to which the document makes reference. Changes to data in the master file immediately affect the document that references it. This is called “linking” (instead of “embedding”). OLE objects essentially allow for File Explorer add-ins in your apps, the drag-and-drop feature, links to Excel documents within a Word document, or GIFs added to email messages. OLE objects are stored using the Compound File Binary Format (CFBF, also named CBF or CFB), which is based on the FAT file system specifications. And yes, if this sounds crazy: OLE objects use the Component Object Model (COM).

COM is a binary interface that is the basis for a lot of Microsoft technology. It allows for inter-process communication, which lets Windows objects be used in environments other than the ones in which they were created. For example, Word and Excel documents are unrelated, but using COM I can either link or embed a usable Excel document into a Word document. The COM technology knows how to do this using its various interfaces”.

If, for example, I wanted to embed an Excel file in a Word document and display it to anyone, I would embed an OLE object within the Word document, which would include either a link or an embedded Excel file. This OLE object would contain “instructions” written using the COM interface for the Word process, which would explain how to load this Excel file. Word would process the OLE object and the COM “instructions”, then call the COM interface specified, and then load the Excel document properly into Word. Though Word does not understand what Excel is, the COM interface does all the heavy lifting and allows Word to either link to the referenced Excel document or to literally embed an Excel document within it.

Just writing this hurts my brain, but, in summary, this picture should explain everything:

OLE and Compound File Binary Format in practice

Let’s look at the OLE file within the generated RTF file mentioned previously. I took the raw object data stored within it and loaded it into my hex editor of choice, HxD.

To simplify reading, I created a nice-to-read diagram that explains what’s going on (Important note: all structures are stored in little-endian format). The first 33 bytes specify the OLE Object header (the last 2 bytes are missing from the picture).

Using the MS-OLEDS specification, we can infer that this is an OLE embedded object container, as the FormatID field contains the value 0x2.

The class name field contains the name “OLE2Link”, which hints at what this OLE object is meant to do. Finally, after the ObjectHeader, we have the value at offset 0x1D, which is 0x00000A00 (stored little-endian as the bytes 00 0A 00 00) and represents the total stream size of this object. This value is quite crucial, as modifying the OLE object requires altering this value as well. Otherwise, the Word process would not read the OLE object in its entirety.
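The header layout described above can be walked with a few `struct` calls. This is a sketch based on my reading of the ObjectHeader and EmbeddedObject structures in MS-OLEDS (two DWORDs, three length-prefixed ANSI strings, then the size DWORD); the function and key names are made up for illustration.

```python
import struct


def read_ansi_string(blob: bytes, pos: int) -> tuple[str, int]:
    """Read a length-prefixed ANSI string; return (text, next position)."""
    (length,) = struct.unpack_from("<I", blob, pos)
    pos += 4
    raw = blob[pos : pos + length]
    # The length counts the terminating NUL, so strip it for display.
    return raw.rstrip(b"\x00").decode("ascii", errors="replace"), pos + length


def parse_object_header(blob: bytes) -> dict:
    """Parse the OLE 1.0 header that precedes the native data."""
    ole_version, format_id = struct.unpack_from("<II", blob, 0)
    pos = 8
    class_name, pos = read_ansi_string(blob, pos)
    topic_name, pos = read_ansi_string(blob, pos)
    item_name, pos = read_ansi_string(blob, pos)
    (native_data_size,) = struct.unpack_from("<I", blob, pos)
    return {
        "ole_version": ole_version,
        "format_id": format_id,  # 0x2 means embedded object
        "class_name": class_name,
        "topic_name": topic_name,
        "item_name": item_name,
        "native_data_size_offset": pos,
        "native_data_size": native_data_size,
        "native_data_offset": pos + 4,
    }
```

With the class name “OLE2Link” and empty topic and item names, the size field lands at offset 0x1D and the native data starts at byte 33, which matches the diagram.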

Following it is the NativeData. This data is actually a Compound File Binary Format file that stores (or should store) OLE objects, embedded files or documents, links, and pictures. This can be confirmed by the first 8 bytes found in the NativeData stream.

According to the MS-CFB specification, this value is the CFB file signature.

What is a Compound Binary File (CBF)?

According to Wikipedia:

“Compound File Binary Format (CFBF), also called Compound File, Compound Document format, or Composite Document File V2 (CDF), is a compound document file format for storing numerous files and streams within a single file on a disk. CFBF is developed by Microsoft and is an implementation of Microsoft COM Structured Storage. At its simplest, the Compound File Binary Format is a container, with little restriction on what can be stored within it. A CFBF file structure loosely resembles a FAT filesystem. The file is partitioned into Sectors which are chained together with a File Allocation Table (not to be mistaken with the file system of the same name) which contains chains of sectors related to each file, a Directory holds information for contained files with a Sector ID (SID) for the starting sector of a chain and so on.”

Microsoft stores OLE objects within CBFs and COM objects within those OLE objects. (Why Microsoft chose CBFs as the main format to store these objects can be read here.) This format has largely been replaced by Office Open XML, but it is still used within RTF objects and old Office extensions such as:

Anyhow, the RTF hyperlink should be stored somewhere within this CBF file, so let’s check out how.

General Guidelines about CBFs

CBF files are divided into 512-byte sectors. The two tables below should help in understanding what the first sector of the CBF file looks like (read about it here). For OLE objects, we’re primarily interested in the Directory sector that contains information about OLE data object streams.

At offset 0x30, you find the DWORD stored as the bytes 01 00 00 00, which indicates the location of the Directory sector. Since CBFs are stored in little-endian format, the starting location is one. To calculate the offset of the starting location, we follow the formula (1+DirectoryStartingSectorLocation)*512, which drops us at offset 0x400.
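The signature check and the directory offset formula fit in a few lines. A minimal sketch, assuming a version 3 compound file with 512-byte sectors as in this post; `directory_offset` is a name chosen for illustration.

```python
import struct

CFB_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")
SECTOR_SIZE = 512  # assumption: version 3 compound file


def directory_offset(cfb: bytes) -> int:
    """Return the file offset of the first Directory sector."""
    if cfb[:8] != CFB_SIGNATURE:
        raise ValueError("missing Compound File signature")
    # The first directory sector location is a little-endian DWORD at 0x30.
    (first_dir_sector,) = struct.unpack_from("<I", cfb, 0x30)
    # Sector 0 starts right after the 512-byte header, hence the +1.
    return (1 + first_dir_sector) * SECTOR_SIZE
```

Feeding it a header whose DWORD at 0x30 is 1 yields 0x400, the same offset reached by hand above.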

While the CLSID field indicates the type of the COM object associated with the activation of the document (in this case, it’s the SAX XML Reader 6.0), the more exciting fields are located at offsets 0x74 and 0x78 of the directory entry: the starting sector location (SSL) and the stream size. The starting location of an OLE stream is calculated from the start of the Mini stream sector, which is at offset 0x600 in our case, using the formula (SSL*0x40), since mini sectors are 64 bytes each. This can be conveniently viewed using the olebrowse tool from the oletools collection. The stream size field specifies the size of the stream. It will be modified to shrink or grow in accordance with the length of the remote template URI.

The next sector specifies the MiniFAT chain, which gives information regarding the chained streams within the CFB. It’s not a very important sector for the blog, but it’s worthwhile to see how it looks. The chain shows how the streams are linked. Each cell in the chain represents 0x40 (64) bytes of a stream. A chain continues until it reaches the value 0xFFFFFFFE, so the first stream goes for 5 blocks of 64 bytes (320 bytes, or 0x140 in hex).
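To make the two offsets concrete, here is a sketch that reads the name, starting sector, and stream size out of one 128-byte directory entry and turns a starting sector into a file offset. The field positions follow the directory entry layout in MS-CFB; the function names and the `mini_stream_base` parameter are my own, and the 64-byte mini sector size is assumed rather than read from the header.

```python
import struct

MINI_SECTOR_SIZE = 64  # assumption: default mini sector size


def parse_directory_entry(entry: bytes) -> dict:
    """Extract the fields of interest from a 128-byte directory entry."""
    # The name length at 0x40 is in bytes and includes the UTF-16 terminator.
    (name_len,) = struct.unpack_from("<H", entry, 0x40)
    name = entry[: max(name_len - 2, 0)].decode("utf-16-le")
    (start_sector,) = struct.unpack_from("<I", entry, 0x74)
    # Stored as 8 bytes; small streams only use the low DWORD.
    (stream_size,) = struct.unpack_from("<Q", entry, 0x78)
    return {"name": name, "start_sector": start_sector, "stream_size": stream_size}


def mini_stream_offset(mini_stream_base: int, start_sector: int) -> int:
    """File offset of a stream stored in the mini stream."""
    return mini_stream_base + start_sector * MINI_SECTOR_SIZE
```

For example, a stream whose starting sector is 5 sits at 0x600 + 5 * 64 = 0x740 when the mini stream begins at 0x600.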

This can be confirmed by reading the first 320 bytes in the last sector.

However, the directory entry specifies the stream size will only use 275 bytes out of the 320 bytes or ( 0x130 ).
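Walking the chain can be sketched as follows. Each MiniFAT cell is a 4-byte little-endian index of the next mini sector, and 0xFFFFFFFE marks the end of a chain; `follow_chain` is an illustrative name, and the loop guard is an addition of mine to avoid spinning on a corrupt table.

```python
import struct

ENDOFCHAIN = 0xFFFFFFFE
MINI_SECTOR_SIZE = 64


def follow_chain(minifat_sector: bytes, start: int) -> list[int]:
    """Return the mini sector indices of the chain beginning at `start`."""
    count = len(minifat_sector) // 4
    entries = struct.unpack_from(f"<{count}I", minifat_sector, 0)
    chain, seen, current = [], set(), start
    while current != ENDOFCHAIN:
        if current in seen or current >= count:
            raise ValueError("corrupt MiniFAT chain")
        seen.add(current)
        chain.append(current)
        current = entries[current]
    return chain
```

The allocated size of a stream is then `len(chain) * MINI_SECTOR_SIZE`; a chain of five cells gives the 0x140 bytes seen above.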

The OLE Stream and Monikers

As usual, I created a diagram below of the OLE Stream structure. From offset 0x810 within the OLE stream, we reach the first moniker stream. A moniker is an object (or component) in Microsoft’s Component Object Model (COM) that refers to a specific instance of another object. A moniker stream always starts with a CLSID, which describes what type of moniker it is, followed by a data stream. There are quite a few moniker specifications; read about them here.

This OLE stream “instructs” the COM interface to load and launch the malicious hyperlink. In addition to the OLE stream, the LinkInfo stream contains data regarding the hyperlink, which also needs to be modified.

My hypothesis is that, if I can somehow control the size of the OLE stream, the LinkInfo stream, and their components, I can generate different hyperlinks.

Luckily for the reader and me, this blog is the aftermath of my success in doing so, so first, let’s name the important size fields:

NativeDataSize – This field specifies the size of the entire OLE Data object. This value cannot be easily modified unless I modify and reconstruct the MiniFAT chain and the FAT chain.
Directory Stream Size – This field specifies the size of each stream. (Important note: most streams don’t use their entire size limits and are padded with null bytes.)
OLE Stream AbsoluteMonikerStreamSize – This field specifies the size of the entire HyperLinkMoniker and its components.
URLMoniker length – This field specifies the byte size of the URI string plus 24 (really, it’s weird, I know – but it’s specified in the URLMoniker specification).
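As a worked example of the last field, the URLMoniker length can be computed from the URI alone. This sketch assumes, per my reading of the URLMoniker structure in MS-OSHARED, that the URI is stored as null-terminated UTF-16 and that the 24 extra bytes are the optional trailing fields; `url_moniker_length` is a hypothetical helper.

```python
import struct


def url_moniker_length(url: str) -> int:
    """Byte size of the null-terminated UTF-16 URI plus the 24 trailing bytes."""
    url_bytes = (url + "\x00").encode("utf-16-le")
    return len(url_bytes) + 24


def packed_length(url: str) -> bytes:
    """The length as it would appear in the stream (little-endian DWORD)."""
    return struct.pack("<I", url_moniker_length(url))
```

For a URI of 8 characters this gives (8 + 1) * 2 + 24 = 42 bytes.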

Crafting an OLE Stream

The first problem that needs addressing is the total size of the streams. While I could manually adjust the CFB file myself, a much easier solution is to just generate a very large OLE stream. Using Cas Van Cooten’s POC code for Follina that was mentioned at the beginning of this blog post, I simply set the port to 65535 instead of the default 80, and this, in turn, generates a larger stream. This time the NativeDataSize contains the value 0xC00, which is 512 bytes (just one sector) larger than before.

Additionally, the MiniFAT chain is much larger now.

So, this marks the problem as solved.

Next, I decided to look at the RootEntry stream size field. Currently, it’s set to 0x142 (322) bytes.

But I know for a fact that it’s padded with null bytes to align with the MiniFAT sector size specification of 64 bytes.

So, essentially, this stream size can be increased to 0x180 (384) bytes. The same can be implemented for the LinkInfo object.
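The headroom comes from rounding the stream size up to the next multiple of the 64-byte mini sector size. A small sketch of that arithmetic (the helper name is mine):

```python
MINI_SECTOR_SIZE = 64


def allocated_size(stream_size: int, sector_size: int = MINI_SECTOR_SIZE) -> int:
    """Round a stream size up to a whole number of mini sectors."""
    # Ceiling division without floating point.
    return -(-stream_size // sector_size) * sector_size
```

Applied to 0x142 (322) bytes, this returns 0x180 (384), so 62 bytes of null padding are available before the alignment changes.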

The current LinkInfo stream size is 0xf0 (240) but can be increased to 0x240 (576) bytes! This is as simple as just changing the values within the blob.

The final two problems are quite easy to solve with some simple math: calculate the total size of the objects, subtract the size of the modified URL fields from the original size, and pad what’s left with zeros in the appropriate locations.
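The pad-to-original-size idea can be sketched like this. It is not the actual generation script: `swap_and_pad` is an illustrative helper that swaps one byte string for a shorter one and appends zeros so the stream keeps its length, and it assumes the length fields named earlier are updated separately.

```python
def swap_and_pad(stream: bytes, old: bytes, new: bytes) -> bytes:
    """Replace `old` with `new` and zero-pad back to the original length."""
    if old not in stream:
        raise ValueError("original value not found in stream")
    if len(new) > len(old):
        raise ValueError("replacement must not be longer than the original")
    swapped = stream.replace(old, new)
    # Whatever the shorter value freed up is returned as trailing null bytes,
    # so the stream still fills the same mini sectors.
    return swapped + b"\x00" * (len(stream) - len(swapped))
```

Because the output is exactly as long as the input, the MiniFAT and FAT chains stay valid and do not need to be rebuilt.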

By doing this, I’m essentially just modifying the objects and not changing the stream alignment! This, in fact, works perfectly! Here is a demonstration of Follina where one VM in a VLAN serves the payload at a typically long URI, and another VM in the VLAN retrieves the payload using the RTF script.