lucidiot's cybrecluster

Microsoft Compound File Binary

Compound File Binary (CFB) is a file format designed by Microsoft as part of the COM API, as an implementation of COM Structured Storage. It may also be referred to as a Composite Document File V2, OLE container, or OLE file. It is used by a lot of Microsoft software, and even by some non-Microsoft software. I commonly encounter it while working on weird things at the Morgue.

Detector script

I wrote a VBScript script to look for any file starting with the CFB file signature in a Windows 98SE virtual machine:

On Error Resume Next

header = Chr(&HD0) & Chr(&HCF) & Chr(&H11) & Chr(&HE0) & Chr(&HA1) & Chr(&HB1) & Chr(&H1A) & Chr(&HE1)

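Sub CFBFinder(folder)
    'error handling is per-procedure in VBScript, so enable it inside the Sub too
    On Error Resume Next
    For Each subfolder In folder.SubFolders
        CFBFinder subfolder 'recurse into the subfolder, not the current folder
    Next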
    For Each file In folder.Files
        If file.Size > 19 Then
            Err.Clear 'reset any error left over from a previous file
            Set stream = file.OpenAsTextStream(1, 0) 'open for reading in ASCII
            'handle possible permission errors
            If Err.Number = 0 Then
                If stream.Read(Len(header)) = header Then
                    WScript.Echo file.Path
                End If
                stream.Close
            End If
        End If
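    Next
End Sub

'start the search; the C:\ root below is an assumption, adjust as needed
Set fso = CreateObject("Scripting.FileSystemObject")
CFBFinder fso.GetFolder("C:\")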
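
The same signature check can be sketched in Python for hunting through a directory tree on Linux. This is a rough equivalent rather than a tested tool: the names `CFB_SIGNATURE`, `has_cfb_signature` and `find_cfb_files` are made up for illustration, and unlike the VBScript it does not skip small files, it just compares the first eight bytes.

```python
import os
from pathlib import Path
from typing import Iterator

# Same eight bytes as the `header` string in the VBScript above.
CFB_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")


def has_cfb_signature(path: Path) -> bool:
    """Return True if `path` is a regular file starting with the CFB signature."""
    try:
        if not path.is_file():
            return False
        with path.open("rb") as f:
            return f.read(len(CFB_SIGNATURE)) == CFB_SIGNATURE
    except OSError:
        # Unreadable files (permissions, broken symlinks, ...) are skipped.
        return False


def find_cfb_files(root: Path) -> Iterator[Path]:
    """Walk `root` recursively and yield every file with the CFB signature."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            candidate = Path(dirpath, name)
            if has_cfb_signature(candidate):
                yield candidate
```

Looping over `find_cfb_files(Path("."))` and printing each result gives roughly the same output as the VBScript, one path per line.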

Extractor script

I wrote a smol Python script to extract a CFB file into a directory structure, to make inspection easier on Linux.

#!/usr/bin/env python3
import argparse
import shutil
from dataclasses import dataclass
from pathlib import Path
from olefile import OleFileIO


@dataclass
class Args:
    cfb_file: OleFileIO
    output_dir: Path
    verbose: bool = False


def main() -> None:
    parser = argparse.ArgumentParser(description="Microsoft Compound File Binary extractor")
    parser.add_argument("cfb_file", type=OleFileIO)
    parser.add_argument("-o", "--output-dir", type=Path, default=".")
    parser.add_argument("-v", "--verbose", action="store_true", default=False)
    args = parser.parse_args(namespace=Args)
    # parents=True so that a nested output directory can be created in one go
    args.output_dir.mkdir(parents=True, exist_ok=True)

    with args.cfb_file as ole:
        for storage in ole.listdir(storages=True, streams=False):
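            # avoid shadowing the dir() builtin
            storage_dir = args.output_dir.joinpath(*storage)
            if args.verbose:
                print(f"Creating directory {storage_dir} for storage {storage!r}")
            # parents=True in case a nested storage is listed before its parent
            storage_dir.mkdir(parents=True, exist_ok=True)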

        for stream_path in ole.listdir(storages=False, streams=True):
            output_path = args.output_dir.joinpath(*stream_path)
            if args.verbose:
                print(f"Extracting stream {stream_path!r} to {output_path}")
            with ole.openstream(stream_path) as stream, output_path.open("wb") as f:
                shutil.copyfileobj(stream, f)

if __name__ == '__main__':
    main()

Occurrences

I have observed CFB in use in the following cases:


Licensed under Creative Commons Attribution 4.0 International. Generated on 2024-05-17T00:39:45+02:00 using pandoc 2.9.2.1.