16 minute read

I played HITCON Quals CTF 2024 with merger team World Wide Union.

This challenge provided a run.sh and print_flag.cbc file.

Content of run.sh is as follow:

#!/bin/sh

docker run -v /home/ctf/clamav/:/test/ --rm -it clamav/clamav clamscan --bytecode-unsigned -d/test/print_flag.cbc /test/sample.exe

Looking at it, this is obviously a bytecode challenge, specifically ClamAV’s bytecode.

Setting up ClamAV

Below are the commands used to install ClamAV from scratch, please ensure you have ninja installed in prior:

sudo apt-get update && apt-get install -y \
  `# install tools` \
  gcc make pkg-config python3 python3-pip python3-pytest valgrind cmake \
  `# install clamav dependencies` \
  check libbz2-dev libcurl4-openssl-dev libjson-c-dev libmilter-dev \
  libncurses5-dev libpcre2-dev libssl-dev libxml2-dev zlib1g-dev


git clone https://github.com/Cisco-Talos/clamav.git
mkdir build
cd build
cmake .. -G Ninja \
    -D CMAKE_BUILD_TYPE=Debug \
    -D OPTIMIZE=OFF \
    -D CMAKE_INSTALL_PREFIX=`pwd`/install \
    -D ENABLE_EXAMPLES=ON \
    -D ENABLE_STATIC_LIB=ON \
    -D ENABLE_SYSTEMD=OFF

cmake --build . --target install

# To debug, getting output from function cli_dbgmsg
./clamscan/clamscan --bytecode-unsigned -dprint_flag.cbc ./example_pe_file.exe --debug 2>&1

# To run normally
./clamscan/clamscan --bytecode-unsigned -dprint_flag.cbc ./example_pe_file.exe

Bytecode Disassemble

At first, I spent a lot of time modifying the source code in /libclamav/bytecode.c to call two apparent unused functions (cli_byteinst_describe & cli_bytefunc_describe) to spit out the bytecode debugging information runtime.

However, later our teammate tamponlover69 mentioned that we can get the bytecode IR with the following command, which is a tool provided by ClamAV:

clambc --printbcir print_flag.cbc

The output IR is as follows (with some parts truncated due to length):

found 25 extra types of 89 total, starting at tid 69
TID  KIND                INTERNAL
------------------------------------------------------------------------
 65: DPointerType        i8*
 66: DPointerType        i16*
 67: DPointerType        i32*
 68: DPointerType        i64*
 69: DArrayType          [1 x i8]
 70: DArrayType          [2 x i8]
 71: DArrayType          [3 x i8]
 72: DArrayType          [4 x i8]
 73: DArrayType          [5 x i8]
 74: DArrayType          [6 x i8]
 75: DArrayType          [7 x i8]
 76: DPointerType        [32 x i8]*
 77: DPointerType        [396 x i8]*
 78: DPointerType        [16 x i8]*
 79: DPointerType        i8**
 80: DArrayType          [1024 x i8]
 81: DPointerType        [1024 x i8]*
 82: DFunctionType       i32 func ( i32 i32 )
 83: DFunctionType       i32 func ( i32 i32 )
 84: DFunctionType       i0 func ( i0 i0 i0 i0 )
 85: DFunctionType       i0 func ( i0 i0 i0 i0 )
 86: DArrayType          [16 x i8]
 87: DArrayType          [396 x i8]
 88: DArrayType          [32 x i8]
------------------------------------------------------------------------
########################################################################
####################### Function id   0 ################################
########################################################################
found a total of 13 globals
GID  ID    VALUE
------------------------------------------------------------------------
  0 [  0]: i0 unknown
  1 [  1]: [32 x i8] unknown
  2 [  2]: [396 x i8] unknown
  3 [  3]: [16 x i8] unknown
  4 [  4]: [16 x i8] unknown
  5 [  5]: i8* unknown
  6 [  6]: i8* unknown
  7 [  7]: i8* unknown
  8 [  8]: i8* unknown
  9 [  9]: i8* unknown
 10 [ 10]: i8* unknown
 11 [ 11]: i8* unknown
 12 [ 12]: i8* unknown
------------------------------------------------------------------------
found 30 values with 0 arguments and 30 locals
VID  ID    VALUE
------------------------------------------------------------------------
  0 [  0]: alloc i64
  1 [  1]: alloc i64
  2 [  2]: alloc i8*
  3 [  3]: alloc [1024 x i8]
  4 [  4]: i8*
  5 [  5]: i32
  6 [  6]: i1
  7 [  7]: i32
  8 [  8]: i32
  9 [  9]: i32
 10 [ 10]: i32
 11 [ 11]: i1
 12 [ 12]: i64
 13 [ 13]: i64
 14 [ 14]: i64
 15 [ 15]: i32
 16 [ 16]: i8*
 17 [ 17]: i8*
 18 [ 18]: i8
 19 [ 19]: i64
 20 [ 20]: i64
 21 [ 21]: i32
 22 [ 22]: i8*
 23 [ 23]: i8*
 24 [ 24]: i8
 25 [ 25]: i1
 26 [ 26]: i64
 27 [ 27]: i32
 28 [ 28]: i64
 29 [ 29]: i32
------------------------------------------------------------------------
found a total of 23 constants
CID  ID    VALUE
------------------------------------------------------------------------
  0 [ 30]: 0(0x0)
  1 [ 31]: 0(0x0)
  2 [ 32]: 2(0x2)
  3 [ 33]: 0(0x0)
  4 [ 34]: 1024(0x400)
  5 [ 35]: 396(0x18c)
  6 [ 36]: 15(0xf)
  7 [ 37]: 1(0x1)
  8 [ 38]: 0(0x0)
  9 [ 39]: 0(0x0)
 10 [ 40]: 396(0x18c)
 11 [ 41]: 396(0x18c)
 12 [ 42]: 0(0x0)
 13 [ 43]: 396(0x18c)
 14 [ 44]: 32(0x20)
 15 [ 45]: 32(0x20)
 16 [ 46]: 32(0x20)
 17 [ 47]: 32(0x20)
 18 [ 48]: 0(0x0)
 19 [ 49]: 1(0x1)
 20 [ 50]: 0(0x0)
 21 [ 51]: 15(0xf)
 22 [ 52]: 1(0x1)
------------------------------------------------------------------------
found a total of 53 total values
------------------------------------------------------------------------
FUNCTION ID: F.0 -> NUMINSTS 40
BB   IDX  OPCODE              [ID /IID/MOD]  INST
------------------------------------------------------------------------
  0    0  OP_BC_GEPZ          [36 /184/  4]  4 = gepz p.3 + (30)
  0    1  OP_BC_CALL_API      [33 /168/  3]  5 = seek[3] (31, 32)
  0    2  OP_BC_MEMSET        [40 /200/  0]  0 = memset (p.4, 33, 34)
  0    3  OP_BC_ICMP_EQ       [21 /108/  3]  6 = (5 == 35)
  0    4  OP_BC_BRANCH        [17 / 85/  0]  br 6 ? bb.2 : bb.1

  1    5  OP_BC_CALL_API      [33 /168/  3]  7 = setvirusname[4] (p.-2147483636, 36)
  1    6  OP_BC_COPY          [34 /174/  4]  cp 37 -> 0
  1    7  OP_BC_JMP           [18 / 90/  0]  jmp bb.6

  2    8  OP_BC_CALL_API      [33 /168/  3]  8 = seek[3] (38, 39)
  2    9  OP_BC_CALL_API      [33 /168/  3]  9 = read[1] (p.4, 40)
  2   10  OP_BC_CALL_DIRECT   [32 /163/  3]  10 = call F.1 (4, 41)
  2   11  OP_BC_COPY          [34 /174/  4]  cp 42 -> 1
  2   12  OP_BC_JMP           [18 / 90/  0]  jmp bb.4

  3   13  OP_BC_ICMP_ULT      [25 /129/  4]  11 = (26 < 43)
  3   14  OP_BC_COPY          [34 /174/  4]  cp 26 -> 1
  3   15  OP_BC_BRANCH        [17 / 85/  0]  br 11 ? bb.4 : bb.5

  4   16  OP_BC_COPY          [34 /174/  4]  cp 1 -> 12
  4   17  OP_BC_SHL           [8  / 44/  4]  13 = 12 << 44
  4   18  OP_BC_ASHR          [10 / 54/  4]  14 = 13 >> 45
  4   19  OP_BC_TRUNC         [14 / 73/  3]  15 = 14 trunc ffffffffffffffff
  4   20  OP_BC_COPY          [34 /174/  4]  cp -2147483640 -> 2
  4   21  OP_BC_COPY          [34 /174/  4]  cp 2 -> 16
  4   22  OP_BC_GEP1          [35 /179/  4]  17 = gep1 p.16 + (15 * 65)
  4   23  OP_BC_LOAD          [39 /196/  1]  load  18 <- p.17
  4   24  OP_BC_SHL           [8  / 44/  4]  19 = 12 << 46
  4   25  OP_BC_ASHR          [10 / 54/  4]  20 = 19 >> 47
  4   26  OP_BC_TRUNC         [14 / 73/  3]  21 = 20 trunc ffffffffffffffff
  4   27  OP_BC_GEPZ          [36 /184/  4]  22 = gepz p.3 + (48)
  4   28  OP_BC_GEP1          [35 /179/  4]  23 = gep1 p.22 + (21 * 65)
  4   29  OP_BC_LOAD          [39 /196/  1]  load  24 <- p.23
  4   30  OP_BC_ICMP_EQ       [21 /106/  1]  25 = (18 == 24)
  4   31  OP_BC_ADD           [1  /  9/  0]  26 = 12 + 49
  4   32  OP_BC_COPY          [34 /174/  4]  cp 50 -> 0
  4   33  OP_BC_BRANCH        [17 / 85/  0]  br 25 ? bb.3 : bb.6

  5   34  OP_BC_CALL_API      [33 /168/  3]  27 = setvirusname[4] (p.-2147483638, 51)
  5   35  OP_BC_COPY          [34 /174/  4]  cp 52 -> 0
  5   36  OP_BC_JMP           [18 / 90/  0]  jmp bb.6

  6   37  OP_BC_COPY          [34 /174/  4]  cp 0 -> 28
  6   38  OP_BC_TRUNC         [14 / 73/  3]  29 = 28 trunc ffffffffffffffff
  6   39  OP_BC_RET           [19 / 98/  3]  ret 29
------------------------------------------------------------------------
########################################################################
####################### Function id   1 ################################
########################################################################
found a total of 13 globals
GID  ID    VALUE
------------------------------------------------------------------------
  0 [  0]: i0 unknown
  1 [  1]: [32 x i8] unknown
  2 [  2]: [396 x i8] unknown
  3 [  3]: [16 x i8] unknown
  4 [  4]: [16 x i8] unknown
  5 [  5]: i8* unknown
  6 [  6]: i8* unknown
  7 [  7]: i8* unknown
  8 [  8]: i8* unknown
  9 [  9]: i8* unknown
 10 [ 10]: i8* unknown
 11 [ 11]: i8* unknown
 12 [ 12]: i8* unknown
------------------------------------------------------------------------
found 303 values with 2 arguments and 301 locals
VID  ID    VALUE
------------------------------------------------------------------------
  0 [  0]: i8* argument
  1 [  1]: i32 argument
  2 [  2]: alloc i64
  3 [  3]: alloc i64
  4 [  4]: alloc i64
  5 [  5]: alloc i64
  6 [  6]: alloc i64
  7 [  7]: alloc i64
  8 [  8]: alloc i64
  9 [  9]: alloc i64
 10 [ 10]: alloc i64
 11 [ 11]: alloc i64
 12 [ 12]: alloc i8*
 13 [ 13]: alloc i8*
 14 [ 14]: alloc i8*
 15 [ 15]: alloc i8*
 <SNIP>
299 [299]: i8
300 [300]: i8
301 [301]: i64
302 [302]: i1
------------------------------------------------------------------------
found a total of 154 constants
CID  ID    VALUE
------------------------------------------------------------------------
  0 [303]: 7(0x7)
  1 [304]: 0(0x0)
  2 [305]: 0(0x0)
  3 [306]: 32(0x20)
  4 [307]: 32(0x20)
  5 [308]: 255(0xff)
  6 [309]: 0(0x0)
  7 [310]: 0(0x0)
  8 [311]: 4290493196(0xffbbbb0c)
  9 [312]: 1(0x1)
 10 [313]: 0(0x0)
 11 [314]: 4290772926(0xffbfffbe)
 12 [315]: 1(0x1)
 13 [316]: 0(0x0)
 14 [317]: 16(0x10)
 15 [318]: 22(0x16)
 16 [319]: 22(0x16)
 17 [320]: 22(0x16)
 18 [321]: 6(0x6)
 19 [322]: 0(0x0)
 20 [323]: 4290509612(0xffbbfb2c)
 21 [324]: 1(0x1)
 22 [325]: 0(0x0)
 <SNIP>
150 [453]: 1(0x1)
151 [454]: 32(0x20)
152 [455]: 1(0x1)
153 [456]: 1(0x1)
------------------------------------------------------------------------
found a total of 457 total values
------------------------------------------------------------------------
FUNCTION ID: F.1 -> NUMINSTS 453
BB   IDX  OPCODE              [ID /IID/MOD]  INST
------------------------------------------------------------------------
  0    0  OP_BC_TRUNC         [14 / 71/  1]  19 = 1 trunc ffffffff
  0    1  OP_BC_AND           [11 / 56/  1]  20 = 19 & 303
  0    2  OP_BC_ICMP_EQ       [21 /108/  3]  21 = (1 == 304)
  0    3  OP_BC_BRANCH        [17 / 85/  0]  br 21 ? bb.92 : bb.1

  1    4  OP_BC_ZEXT          [16 / 84/  4]  22 = 1 zext ffffffff
  1    5  OP_BC_COPY          [34 /174/  4]  cp 305 -> 11
  1    6  OP_BC_JMP           [18 / 90/  0]  jmp bb.2

  <SNIP>
 92  452  OP_BC_RET           [19 / 98/  3]  ret 456
------------------------------------------------------------------------

From the looks of it, there are two functions (F.0 and F.1), we can assumed that F.0 is like the typical main function in C code. Let’s analyze the first 5 opcodes from F.0 to understand what it does:

0    0  OP_BC_GEPZ          [36 /184/  4]  4 = gepz p.3 + (30)
0    1  OP_BC_CALL_API      [33 /168/  3]  5 = seek[3] (31, 32)
0    2  OP_BC_MEMSET        [40 /200/  0]  0 = memset (p.4, 33, 34)
0    3  OP_BC_ICMP_EQ       [21 /108/  3]  6 = (5 == 35)
0    4  OP_BC_BRANCH        [17 / 85/  0]  br 6 ? bb.2 : bb.1

The opcode OP_BC_GEPZ is in charge of resolving the pointer value, the (30) in this case is the a constant value 0, which can be obtain by referencing the constants table : 0 [ 30]: 0(0x0).

The opcode OP_BC_CALL_API, is in charge of calling API, in this case it is calling seek with argument (31, 32), by referencing the constants table again, we will get:

1 [ 31]: 0(0x0)
2 [ 32]: 2(0x2)

And looking at the implementation of seek function call, this suggest that it is actually getting the sizes of input. The value 2 indicate SEEK_END. Since our input is an EXE file, we can assume that it is getting the sizes of our input file.

The opcode OP_BC_MEMSET is pretty straight forward, is setting the 0x400 bytes in memory to NULL (0x00), again, you can obtain the values from the constants table above.

The opcode OP_BC_ICMP_EQ in charge of comparison, its comparing 5 (our input file size) with 35 (correspond to constant value 396). This suggest that our input EXE file has to be file size of 396 bytes.

Extra Note:

It is worth mention that for example the value -2147483638 appears below in F.0, must be converted to an unsigned value. This conversion is done by performing -2147483638 & 0x7FFFFFFF, which results in 6. This value points to the global value [6]:

5   34  OP_BC_CALL_API      [33 /168/  3]  27 = setvirusname[4] (p.-2147483638, 51)

F.0

This function is calculating our input file size, ensure it is 396 bytes before proceed, and later perform values comparison of the ciphertext from global constant with the input file byte values that is encrypted in F.1.

F.1

This function is huge, the conclusion from our teammate is it will generates a keystream and then XOR it with each bytes from the input EXE file. You can use the script in SECCON CTF 2022 Quals - Devil Hunter to generate a C file from the IR then proceed with analyzing in your favourite decompiler.

Solution

We all the information given, we can patch /libclamav/bytecode_vm.c like below to make it spit out important information for us, like the for the example what values being compared during the opcode OP_BC_ICMP_EQ:

diff --git a/libclamav/bytecode_vm.c b/libclamav/bytecode_vm.c
index 6c4d46c23..46dbb828d 100644
--- a/libclamav/bytecode_vm.c
+++ b/libclamav/bytecode_vm.c
@@ -831,16 +831,26 @@ cl_error_t cli_vm_execute(const struct cli_bc *bc, struct cli_bc_ctx *ctx, const
             DEFINE_OP_BC_RET_VOID(OP_BC_RET_VOID * 5 + 3, uint8_t);
             DEFINE_OP_BC_RET_VOID(OP_BC_RET_VOID * 5 + 4, uint8_t);

-            DEFINE_ICMPOP(OP_BC_ICMP_EQ, res = (op0 == op1));
-            DEFINE_ICMPOP(OP_BC_ICMP_NE, res = (op0 != op1));
-            DEFINE_ICMPOP(OP_BC_ICMP_UGT, res = (op0 > op1));
-            DEFINE_ICMPOP(OP_BC_ICMP_UGE, res = (op0 >= op1));
-            DEFINE_ICMPOP(OP_BC_ICMP_ULT, res = (op0 < op1));
-            DEFINE_ICMPOP(OP_BC_ICMP_ULE, res = (op0 <= op1));
-            DEFINE_ICMPOP(OP_BC_ICMP_SGT, res = (sop0 > sop1));
-            DEFINE_ICMPOP(OP_BC_ICMP_SGE, res = (sop0 >= sop1));
-            DEFINE_ICMPOP(OP_BC_ICMP_SLE, res = (sop0 <= sop1));
-            DEFINE_ICMPOP(OP_BC_ICMP_SLT, res = (sop0 < sop1));
+            // DEFINE_ICMPOP(OP_BC_ICMP_EQ, res = (op0 == op1));
+            // DEFINE_ICMPOP(OP_BC_ICMP_NE, res = (op0 != op1));
+            // DEFINE_ICMPOP(OP_BC_ICMP_UGT, res = (op0 > op1));
+            // DEFINE_ICMPOP(OP_BC_ICMP_UGE, res = (op0 >= op1));
+            // DEFINE_ICMPOP(OP_BC_ICMP_ULT, res = (op0 < op1));
+            // DEFINE_ICMPOP(OP_BC_ICMP_ULE, res = (op0 <= op1));
+            // DEFINE_ICMPOP(OP_BC_ICMP_SGT, res = (sop0 > sop1));
+            // DEFINE_ICMPOP(OP_BC_ICMP_SGE, res = (sop0 >= sop1));
+            // DEFINE_ICMPOP(OP_BC_ICMP_SLE, res = (sop0 <= sop1));
+            // DEFINE_ICMPOP(OP_BC_ICMP_SLT, res = (sop0 < sop1));
+            DEFINE_ICMPOP(OP_BC_ICMP_EQ, printf("OP_BC_ICMP_EQ : %d = %x == %x\n", bb_inst, op0, op1); res = (op0 == op1));
+            DEFINE_ICMPOP(OP_BC_ICMP_NE, printf("OP_BC_ICMP_NE : %d = %x != %x\n", bb_inst, op0, op1); res = (op0 != op1));
+            DEFINE_ICMPOP(OP_BC_ICMP_UGT, printf("OP_BC_ICMP_UGT : %d = %x > %x\n", bb_inst, op0, op1); res = (op0 > op1));
+            DEFINE_ICMPOP(OP_BC_ICMP_UGE, printf("OP_BC_ICMP_UGE : %d = %x >= %x\n", bb_inst, op0, op1); res = (op0 >= op1));
+            DEFINE_ICMPOP(OP_BC_ICMP_ULT, printf("OP_BC_ICMP_ULT : %d = %x < %x\n", bb_inst, op0, op1); res = (op0 < op1));
+            DEFINE_ICMPOP(OP_BC_ICMP_ULE, printf("OP_BC_ICMP_ULE : %d = %x <= %x\n", bb_inst, op0, op1); res = (op0 <= op1));
+            DEFINE_ICMPOP(OP_BC_ICMP_SGT, printf("OP_BC_ICMP_SGT : %d = %x > %x\n", bb_inst, sop0, sop1); res = (sop0 > sop1));
+            DEFINE_ICMPOP(OP_BC_ICMP_SGE, printf("OP_BC_ICMP_SGE : %d = %x >= %x\n", bb_inst, sop0, sop1); res = (sop0 >= sop1));
+            DEFINE_ICMPOP(OP_BC_ICMP_SLE, printf("OP_BC_ICMP_SLE : %d = %x <= %x\n", bb_inst, sop0, sop1); res = (sop0 <= sop1));
+            DEFINE_ICMPOP(OP_BC_ICMP_SLT, printf("OP_BC_ICMP_SLT : %d = %x < %x\n", bb_inst, sop0, sop1); res = (sop0 < sop1));

             case OP_BC_SELECT * 5: {
                 uint8_t t0, t1, t2;
@@ -1073,30 +1083,40 @@ cl_error_t cli_vm_execute(const struct cli_bc *bc, struct cli_bc_ctx *ctx, const
                 uint8_t op;
                 READ1(op, BINOP(0));
                 WRITE8(BINOP(1), op);
+                printf("OP_BC_COPY * 5: op=%lu\n", op);
+                printf("SRC=%u , DST=%u\n", BINOP(0), BINOP(1));
                 break;
             }
             case OP_BC_COPY * 5 + 1: {
                 uint8_t op;
                 READ8(op, BINOP(0));
                 WRITE8(BINOP(1), op);
+                printf("OP_BC_COPY * 5 + 1: op=%lu\n", op);
+                printf("SRC=%u , DST=%u\n", BINOP(0), BINOP(1));
                 break;
             }
             case OP_BC_COPY * 5 + 2: {
                 uint16_t op;
                 READ16(op, BINOP(0));
                 WRITE16(BINOP(1), op);
+                printf("OP_BC_COPY * 5 + 2: op=%lu\n", op);
+                printf("SRC=%u , DST=%u\n", BINOP(0), BINOP(1));
                 break;
             }
             case OP_BC_COPY * 5 + 3: {
                 uint32_t op;
                 READ32(op, BINOP(0));
                 WRITE32(BINOP(1), op);
+                printf("OP_BC_COPY * 5 + 3: op=%lu\n", op);
+                printf("SRC=%u , DST=%u\n", BINOP(0), BINOP(1));
                 break;
             }
             case OP_BC_COPY * 5 + 4: {
                 uint64_t op;
                 READ64(op, BINOP(0));
                 WRITE64(BINOP(1), op);
+                printf("OP_BC_COPY * 5 + 4: op=%lu\n", op);
+                printf("SRC=%u , DST=%u\n", BINOP(0), BINOP(1));
                 break;
             }

@@ -1105,24 +1125,28 @@ cl_error_t cli_vm_execute(const struct cli_bc *bc, struct cli_bc_ctx *ctx, const
                 uint8_t *ptr;
                 READPOP(ptr, inst->u.unaryop, 1);
                 WRITE8(inst->dest, (*ptr));
+                printf("OP_BC_LOAD * 5: value=%x\n", *ptr);
                 break;
             }
             case OP_BC_LOAD * 5 + 2: {
                 const union unaligned_16 *ptr;
                 READPOP(ptr, inst->u.unaryop, 2);
                 WRITE16(inst->dest, (ptr->una_u16));
+                printf("OP_BC_LOAD * 5 + 2: value=%x\n", *ptr);
                 break;
             }
             case OP_BC_LOAD * 5 + 3: {
                 const union unaligned_32 *ptr;
                 READPOP(ptr, inst->u.unaryop, 4);
                 WRITE32(inst->dest, (ptr->una_u32));
+                printf("OP_BC_LOAD * 5 + 3: value=%x\n", *ptr);
                 break;
             }
             case OP_BC_LOAD * 5 + 4: {
                 const union unaligned_64 *ptr;
                 READPOP(ptr, inst->u.unaryop, 8);
                 WRITE64(inst->dest, (ptr->una_u64));
+                printf("OP_BC_LOAD * 5 + 4: value=%x\n", *ptr);
                 break;
             }

Now when we run the command to load the bytecode on example EXE again, we will get tons of information, what we want specifically is the 4 lines after SRC=16 , DST=1120, like below:

SRC=16 , DST=1120
OP_BC_LOAD * 5: value=45
OP_BC_LOAD * 5: value=45
OP_BC_ICMP_EQ : 14 = 45 == 45
OP_BC_COPY * 5 + 4: op=0
--
SRC=16 , DST=1120
OP_BC_LOAD * 5: value=5f
OP_BC_LOAD * 5: value=5f
OP_BC_ICMP_EQ : 14 = 5f == 5f

From the output it is pretty clear that it is doing comparison, and we get the ciphertext too, therefore what we need to do now is write a script to solve it !

Solve Script

import subprocess
import regex as re


def check(buffer):
    # save buffer to file
    open("inp", "wb").write(buffer)

    # run command and get input "clamscan --bytecode-unsigned -dprint_flag.cbc inp"
    p = subprocess.Popen(
        ["./clamscan/clamscan", "--bytecode-unsigned", "-dprint_flag.cbc", "inp"],
        stdout=subprocess.PIPE,
    )
    out, _ = p.communicate()
    return out


def filtering_output(output):
    # SRC=16 , DST=1120
    # OP_BC_LOAD * 5: value=45
    # OP_BC_LOAD * 5: value=45
    # OP_BC_ICMP_EQ : 14 = 45 == 45
    # OP_BC_COPY * 5 + 4: op=0
    m = re.findall(
        r"""SRC=16 , DST=1120
OP_BC_LOAD \* 5: value=([\dabcdefABCDEF]+)
OP_BC_LOAD \* 5: value=([\dabcdefABCDEF]+)
OP_BC_ICMP_EQ : 14 = [\dabcdefABCDEF]+ == [\dabcdefABCDEF]+
OP_BC_COPY \* 5 \+ 4: op=0""",
        output,
    )
    return m


len_buffer = 0x18C

buffer = bytearray(b"\x00" * len_buffer)
buffer[0] = 0x4D
buffer[1] = 0x5A
known = 2

while known != len_buffer:
    out = check(buffer)
    list_known = filtering_output(out.decode())
    len_list_known = len(list_known)

    assert len_list_known >= known, "Must be more known bytes"

    buffer[len_list_known-1] = int(list_known[len_list_known-1][0], 16) ^ int(list_known[len_list_known-1][1], 16) ^ 0x0

    # print(hex(buffer[len_list_known-1]) + "=" + list_known[len_list_known-1][0] + "^" + list_known[len_list_known-1][1])

    known = len_list_known

open("flag.exe", "wb").write(buffer)

Or if you prefer a slow script that doesn’t required you to analyze the F.1 at all:

#!/usr/bin/env python3
import subprocess
import re
import tqdm

def create_pe_file(filename, file_content, file_size):
    # Write the content to a file
    with open(filename, 'wb') as file:
        file.write(file_content)

def main():

    known = 6
    file_size = 0x0000018C

    command = ["./clamscan/clamscan", "--bytecode-unsigned", "-dprint_flag.cbc", "./example_pe.exe"]
    file_content = bytearray(b'MZ\x00\x00PE' + b'\x00' * (file_size - 6))

    while tqdm.tqdm(known < file_size):

        max_count_len = 0
        latest_known_byte = 0

        for byte in range(0xFF):
            file_content[known] = byte
            create_pe_file('example_pe.exe', file_content, file_size)

            result = subprocess.run(command, capture_output=True, text=True, check=True)

            count_len_result = re.findall(r"""SRC=16 , DST=1120
OP_BC_LOAD \* 5: value=([\dabcdefABCDEF]+)
OP_BC_LOAD \* 5: value=([\dabcdefABCDEF]+)
OP_BC_ICMP_EQ : 14 = [\dabcdefABCDEF]+ == [\dabcdefABCDEF]+
OP_BC_COPY \* 5 \+ 4: op=0""",
                result.stdout
                )
            if len(count_len_result) > max_count_len:
                max_count_len = len(count_len_result)
                latest_known_byte = byte

        file_content[known] = latest_known_byte
        print(file_content)

        known += 1


if __name__ == '__main__':
    main()

The correct EXE file should be like follow:

└─$ xxd flag.exe
00000000: 4d5a 0000 5045 0000 6486 0100 0000 0000  MZ..PE..d.......
00000010: 0000 0000 0000 0000 8000 2200 0b02 0000  ..........".....
00000020: 8201 0000 0000 0000 0000 0000 2601 0000  ............&...
00000030: 0a00 0000 0000 0040 0100 0000 0400 0000  .......@........
00000040: 0400 0000 0600 0000 0000 0000 0600 0000  ................
00000050: 0000 0000 8c01 0000 e900 0000 0000 0000  ................
00000060: 0300 6081 0000 1000 0000 0000 0010 0000  ..`.............
00000070: 0000 0000 0000 1000 0000 0000 0010 0000  ................
00000080: 0000 0000 0000 0000 0200 0000 0000 0000  ................
00000090: 0000 0000 6001 0000 2c00 0000 2e00 0000  ....`...,.......
000000a0: 0000 0000 8201 0000 1601 0000 8201 0000  ................
000000b0: 1601 0000 a0a1 bcab a7a6 b3bb adab baad  ................
000000c0: bc97 bda6 b8a9 aba3 adba 97a1 a697 aba4  ................
000000d0: a9a5 a9be 97aa b1bc adab a7ac ad97 bba1  ................
000000e0: afa6 a9bc bdba adb5 0094 0247 6574 5374  ...........GetSt
000000f0: 6448 616e 646c 6500 9502 5772 6974 6543  dHandle...WriteC
00000100: 6f6e 736f 6c65 4100 4b45 524e 454c 3332  onsoleA.KERNEL32
00000110: 2e64 6c6c 0000 e900 0000 0000 0000 f800  .dll............
00000120: 0000 0000 0000 b9f5 ffff ffff 15e5 ffff  ................
00000130: ff45 31c9 458d 4135 488d 1575 ffff ff48  .E1.E.A5H..u...H
00000140: 31c9 8034 0ac8 48ff c148 83f9 3575 f348  1..4..H..H..5u.H
00000150: 8d15 5eff ffff 4889 c1ff 15bf ffff ffc3  ..^...H.........
00000160: 7401 0000 0000 0000 0000 0000 0801 0000  t...............
00000170: 1601 0000 e900 0000 0000 0000 f800 0000  ................
00000180: 0000 0000 0000 0000 0000 0000            ............

Execute it in Windows environment and we will get a flag.

Flag:

hitcon{secret_unpacker_in_clamav_bytecode_signature}

References