Excel Formula/Macro in .xlsb?

Excel Formula, or XLM – does it ever stop giving pain to researchers?

Last week I received a new sample using the xlsb file format that supposedly contained malicious code. I had a quick look, and wow – this was different. An initial check on

VirusTotal

(

VT

) showed that it hadn’t been uploaded to VT yet. So, with nothing to go on, I started looking into the sample.

Structurally, it’s a

Microsoft Excel 2007+

document (ZIP) containing the following files:

Excel-Formula-fig1.png

Naturally, we look at the xl/macrosheets/sheet1.bin, right? First we need to enumerate these records. The xl/macrosheets/sheet1.bin looks like this:

Excel-Formula-fig2.png

How are the records stored? The answer is in

Microsoft

’s documentation. To establish the recordId, you read the first byte (0x81). Since the high-bit (0x80) is set, this means there is another byte to add to the recordId. Remove this bit for now and we get 0x01. The next byte is 0x01 and as the high-bit (0x80) isn’t set, we can use the value of the byte multiplied with 0x80. This means that the recordId is (1*128)+1 = 129 – which is BrtBeginSheet. To get the length you do the same, read the next byte (0x00) which means there is no high-bit (0x80), so there is no other byte. The rest of the seven bits say 0, so the record has no data.

The next record is BrtWsProp with recordId 147 and length 23.

recordId: (0x93 & 0x7F) + (0x01*0x80) = 147 (0x93)
length: (0x17 & 7x7F) = 23 (0x17)

Now you can parse all the records and get a nice list. Unfortunately, when parsing the records of the xl/macrosheets/sheet1.bin I see nothing unusual.

So we move on to look at the other sheets, what can we find here? Quite a lot actually. The records we are interested in, while we learn, are:


RecordId

Name

Description
0 BrtRowHdr Tells you what row you currently are on
8 BrtFmlaString Tells you about an embedded string and the pcode (parsed expression) to build this string
11 BrtFmlaError Tells you the pcode (parsed-expression)

Let’s have a brief look at the data we need.

BrtFmlaString


Microsoft

has documented this well in a

PDF

. To start with, it contains an eight-byte cell information structure, a variable XLWideString (which looks like a Unicode string), two bytes of grbitFlags, and then you get to the formula itself (CellParsedFormula structure).

The first one you’ll find is this:

Excel-Formula-fig3.png

After decoding, we get this:

RECORD: BrtFmlaString (Id 8,offset 58d), LENGTH: 30
col: 26, row: 20 | strlen=1 : "/"
         1E 2F 00            PtgInt: 47
         41 6F 00            PtgFunc: CHAR (111)

The record has no information about the row, so you need to get this from the BrtRowHdr record. When you get to the CellParsedFormula structure you parse it (as mentioned in my

previous article

).

BrtFmlaError

This record also starts with a eight-byte cell structure, then a one-byte fErr, and two-byte grbitFlags before you reach the formula itself (CellParsedFormula structure).

Excel-Formula-fig4.png

When you parse the first record of this stream you’ll get:

RECORD: BrtFmlaError (Id 11,offset 1e3), LENGTH: 62
     49 27 00            PtgMemFunc: 27
     19 40 00 01         PtgAttrSpace: 0100
     23 04 00 00 00      PtgName: index 4
     23 14 00 00 00      PtgName: index 20
     0F                  PtgIsect:
     23 5D 00 00 00      PtgName: index 93
     0F                  PtgIsect:
     23 46 00 00 00      PtgName: index 70
     0F                  PtgIsect:
     23 15 00 00 00      PtgName: index 21
     0F                  PtgIsect:
     23 2F 00 00 00      PtgName: index 47
     0F                  PtgIsect:
     13                  PtgUminus:

BrtRowHdr

This is a simple structure, but for now we just want the row.

Excel-Formula-fig5.png

The first DWORD gives you the sequence you need (in this case, 2).

The result

When you have parsed all these records from all these binary worksheets, you’ll end up with a virtual sheet that looks like this:

Excel-Formula-fig6.png

This is more informative, but it was a bit of work to get there. At least it is context you can relate to.

As I am writing this article I see that

VT

has received a copy of the sample, and that when it was first checked (on entry) a single engine was detecting it:

Excel-Formula-fig7.png

Kudos to

Ikarus

!

I think my little project is over – when I have a problem like this I can’t let it go until it’s solved, but now I can finally relax!

Get in touch with me if you need help! I think tools should give this kind of context automagically.


Posted

in

by

Tags:

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *