- Published on
PCC CTF - AshfaqVM Complete Reverse Engineering Writeup
- Authors
- Name
- Muhammad Huzaifa
AshfaqVM - Complete Reverse Engineering Solution
Challenge Overview
A custom Virtual Machine (AshfaqVM) that executes bytecode to validate a flag input through a complex encryption algorithm. This challenge required understanding a completely custom VM architecture, developing analysis tools, and reversing a multi-step encryption process.
🎯 Flag
PCC{`1f_y0u_us3d_gpt_t0_s0lv3_th1s_pl5_sh4r3_th3_ch4ts_cuz_Y0u_4_G04TTT`}
Translation: "If you used GPT to solve this, please share the chats cuz you are a GOAT!!!" 🐐
📊 Challenge Statistics
- Challenge Type: Custom VM Reverse Engineering
- VM Name: AshfaqVM
- Bytecode Size: 441 bytes
- Flag Length: 71 characters
- Encryption: XOR + ADD + Bit Rotation
- Key: 0xDEADBEEF
- Difficulty: Medium-Hard
Analysis Methodology
Phase 1: VM Architecture Study
Analyzed the IDA-decompiled C code (ashfaq-vm.c
) to understand the complete VM architecture:
Memory Layout
Offset | Size | Description
---------|-------|------------------------------------------
0-31 | 32B | 8 Registers (R1-R8), 4 bytes each
32 | 4B | PC (Program Counter)
36 | 4B | SP (Stack Pointer)
40 | 4B | BP (Base Pointer)
44 | 4096B | Program bytecode area
4140 | 4096B | Stack area (grows down from 0x1000)
8236 | 768B | String storage area
9003 | 1B | String operation size
9004 | 4B | String storage current position
9008 | 4B | CMP_RESULT
9012 | 1B | ZF (Zero Flag)
9013 | 1B | SF (Sign Flag)
9014 | 1B | CF (Carry Flag)
Instruction Set Analysis
The VM supports approximately 40 different opcodes across multiple categories:
1. MOV/MOVE Operations (0x0C00-0x0C02)
0x0C02
:MOV Rd, imm32
- Load immediate 32-bit value into register
2. Stack Operations (0x0D00-0x0D02)
0x0D00
:PUSH imm32/Rx
- Push value onto stack0x0D01
:POP Rd
- Pop value from stack into register0x0D02
:PEEK Rd
- Load value from stack without popping
3. Arithmetic Operations (0x0F00-0x0F07)
- ADD, SUB, OR, AND, XOR, MUL, DIV, MOD
4. Shift Operations (0x0F08-0x0F09)
SHL
,SHR
- Shift left/right operations
5. String Operations (0x05A0-0x05B3)
STR_CREATE
,STR_LOAD
,STR_STORE
,CHR_AT
,STR_CHR_AT
6. Control Flow (0x9900, 0x99A0-0x99A6)
CMP
- Compare and set flagsJMP
,JE
,JNE
,JG
,JGE
,JL
,JLE
- Jump operations
7. System Calls (0xFFFF)
- READ, WRITE, STR_FIND, DUMP_REGS, DUMP_STACK, DUMP_STRINGS, EXIT
8. Program Control
0xDEAD
:HALT
- Stop execution
Phase 2: Tool Development
Built three comprehensive analysis tools:
1. disassembler.py - Complete Bytecode Disassembler
- Supports all VM opcodes with proper formatting
- Handles embedded strings and immediate values
- Produces human-readable assembly output
- Maps opcodes to their functionality
#!/usr/bin/env python3
"""
AshfaqVM Disassembler
Complete bytecode disassembler for the custom VM architecture
"""
import struct
import sys
class AshfaqDisassembler:
def __init__(self, bytecode):
self.bytecode = bytecode
self.pc = 0
self.strings = {}
self.instructions = []
# Opcode mappings
self.opcodes = {
0x0C00: "MOV",
0x0C01: "MOVE",
0x0C02: "MOV_IMM",
0x0D00: "PUSH",
0x0D01: "POP",
0x0D02: "PEEK",
0x0F00: "ADD",
0x0F01: "SUB",
0x0F02: "OR",
0x0F03: "AND",
0x0F04: "XOR",
0x0F05: "MUL",
0x0F06: "DIV",
0x0F07: "MOD",
0x0F08: "SHL",
0x0F09: "SHR",
0x05A0: "STR_CREATE",
0x05A1: "STR_LOAD",
0x05A2: "STR_STORE",
0x05A3: "CHR_AT",
0x05A4: "STR_CHR_AT",
0x9900: "CMP",
0x99A0: "JMP",
0x99A1: "JE",
0x99A2: "JNE",
0x99A3: "JG",
0x99A4: "JGE",
0x99A5: "JL",
0x99A6: "JLE",
0xFFFF: "SYSCALL",
0xDEAD: "HALT"
}
# Register mappings
self.registers = {
0: "R1", 1: "R2", 2: "R3", 3: "R4",
4: "R5", 5: "R6", 6: "R7", 7: "R8"
}
# Syscall mappings
self.syscalls = {
0: "READ",
1: "WRITE",
2: "STR_FIND",
3: "DUMP_REGS",
4: "DUMP_STACK",
5: "DUMP_STRINGS",
6: "EXIT"
}
def read_u16(self):
"""Read 16-bit unsigned integer"""
if self.pc + 2 > len(self.bytecode):
return None
value = struct.unpack('<H', self.bytecode[self.pc:self.pc+2])[0]
self.pc += 2
return value
def read_u32(self):
"""Read 32-bit unsigned integer"""
if self.pc + 4 > len(self.bytecode):
return None
value = struct.unpack('<I', self.bytecode[self.pc:self.pc+4])[0]
self.pc += 4
return value
def read_bytes(self, count):
"""Read raw bytes"""
if self.pc + count > len(self.bytecode):
return None
data = self.bytecode[self.pc:self.pc+count]
self.pc += count
return data
def disassemble_string(self, uid):
"""Extract string from bytecode using UID"""
# Search for string marker in bytecode
marker = struct.pack('<I', uid)
pos = self.bytecode.find(marker)
if pos != -1 and pos + 8 < len(self.bytecode):
# Read length and string data
length = struct.unpack('<I', self.bytecode[pos+4:pos+8])[0]
if pos + 8 + length <= len(self.bytecode):
string_data = self.bytecode[pos+8:pos+8+length]
return string_data.decode('utf-8', errors='ignore')
return f"<string_uid_{uid:08x}>"
def disassemble_instruction(self):
"""Disassemble a single instruction"""
start_pc = self.pc
if self.pc >= len(self.bytecode):
return None
opcode = self.read_u16()
if opcode is None:
return None
if opcode not in self.opcodes:
return f"{start_pc:04x}: UNKNOWN_OPCODE {opcode:04x}"
mnemonic = self.opcodes[opcode]
# Handle different instruction types
if opcode in [0x0C02]: # MOV_IMM
reg = self.read_u16()
imm = self.read_u32()
return f"{start_pc:04x}: {mnemonic} {self.registers.get(reg, f'R{reg+1}')}, {imm:#010x}"
elif opcode in [0x0D00]: # PUSH
imm = self.read_u32()
return f"{start_pc:04x}: {mnemonic} {imm:#010x}"
elif opcode in [0x0D01, 0x0D02]: # POP, PEEK
reg = self.read_u16()
return f"{start_pc:04x}: {mnemonic} {self.registers.get(reg, f'R{reg+1}')}"
elif opcode in [0x0F00, 0x0F01, 0x0F02, 0x0F03, 0x0F04, 0x0F05, 0x0F06, 0x0F07, 0x0F08, 0x0F09]:
reg1 = self.read_u16()
reg2 = self.read_u16()
return f"{start_pc:04x}: {mnemonic} {self.registers.get(reg1, f'R{reg1+1}')}, {self.registers.get(reg2, f'R{reg2+1}')}"
elif opcode in [0x05A0, 0x05A1, 0x05A2]: # String operations
uid = self.read_u32()
string_data = self.disassemble_string(uid)
return f"{start_pc:04x}: {mnemonic} {uid:#010x} ; {repr(string_data)}"
elif opcode in [0x05A3, 0x05A4]: # Character operations
reg = self.read_u16()
uid = self.read_u32()
string_data = self.disassemble_string(uid)
return f"{start_pc:04x}: {mnemonic} {self.registers.get(reg, f'R{reg+1}')}, {uid:#010x} ; {repr(string_data)}"
elif opcode == 0x9900: # CMP
reg1 = self.read_u16()
reg2 = self.read_u16()
return f"{start_pc:04x}: {mnemonic} {self.registers.get(reg1, f'R{reg1+1}')}, {self.registers.get(reg2, f'R{reg2+1}')}"
elif opcode in [0x99A0, 0x99A1, 0x99A2, 0x99A3, 0x99A4, 0x99A5, 0x99A6]:
target = self.read_u32()
return f"{start_pc:04x}: {mnemonic} {target:#010x}"
elif opcode == 0xFFFF: # SYSCALL
syscall_id = self.read_u16()
syscall_name = self.syscalls.get(syscall_id, f"SYSCALL_{syscall_id}")
return f"{start_pc:04x}: {mnemonic} {syscall_name} ({syscall_id})"
elif opcode == 0xDEAD: # HALT
return f"{start_pc:04x}: {mnemonic}"
else:
return f"{start_pc:04x}: {mnemonic}"
def disassemble(self):
"""Disassemble entire bytecode"""
self.pc = 0
instructions = []
while self.pc < len(self.bytecode):
instruction = self.disassemble_instruction()
if instruction is None:
break
instructions.append(instruction)
return instructions
def main():
if len(sys.argv) != 2:
print("Usage: python3 disassembler.py <bytecode_file>")
sys.exit(1)
with open(sys.argv[1], 'rb') as f:
bytecode = f.read()
disasm = AshfaqDisassembler(bytecode)
instructions = disasm.disassemble()
print("AshfaqVM Disassembly:")
print("=" * 50)
for instruction in instructions:
print(instruction)
if __name__ == "__main__":
main()
2. vm_emulator.py - Full VM Emulator
- Executes bytecode with complete instruction support
- Implements all syscalls and system functions
- Supports debugging and execution tracing
- Handles string operations and memory management
#!/usr/bin/env python3
"""
AshfaqVM Emulator
Complete VM emulator with full instruction support and debugging capabilities
"""
import struct
import sys
import os
class AshfaqVM:
def __init__(self, bytecode, debug=False):
self.bytecode = bytecode
self.debug = debug
# Memory layout
self.memory = bytearray(10000) # Total VM memory
self.registers = [0] * 8 # R1-R8
self.pc = 0 # Program Counter
self.sp = 0x1000 # Stack Pointer
self.bp = 0x1000 # Base Pointer
self.cmp_result = 0
self.flags = {'zf': False, 'sf': False, 'cf': False}
# String storage
self.strings = {}
self.string_pos = 8236
self.string_size = 0
# Syscall handlers
self.syscall_handlers = {
0: self.syscall_read,
1: self.syscall_write,
2: self.syscall_str_find,
3: self.syscall_dump_regs,
4: self.syscall_dump_stack,
5: self.syscall_dump_strings,
6: self.syscall_exit
}
# Opcode handlers
self.opcode_handlers = {
0x0C00: self.op_mov,
0x0C01: self.op_move,
0x0C02: self.op_mov_imm,
0x0D00: self.op_push,
0x0D01: self.op_pop,
0x0D02: self.op_peek,
0x0F00: self.op_add,
0x0F01: self.op_sub,
0x0F02: self.op_or,
0x0F03: self.op_and,
0x0F04: self.op_xor,
0x0F05: self.op_mul,
0x0F06: self.op_div,
0x0F07: self.op_mod,
0x0F08: self.op_shl,
0x0F09: self.op_shr,
0x05A0: self.op_str_create,
0x05A1: self.op_str_load,
0x05A2: self.op_str_store,
0x05A3: self.op_chr_at,
0x05A4: self.op_str_chr_at,
0x9900: self.op_cmp,
0x99A0: self.op_jmp,
0x99A1: self.op_je,
0x99A2: self.op_jne,
0x99A3: self.op_jg,
0x99A4: self.op_jge,
0x99A5: self.op_jl,
0x99A6: self.op_jle,
0xFFFF: self.op_syscall,
0xDEAD: self.op_halt
}
# Load bytecode into memory
self.memory[44:44+len(bytecode)] = bytecode
def read_u16(self):
"""Read 16-bit value from bytecode"""
if self.pc + 2 > len(self.bytecode):
return None
value = struct.unpack('<H', self.bytecode[self.pc:self.pc+2])[0]
self.pc += 2
return value
def read_u32(self):
"""Read 32-bit value from bytecode"""
if self.pc + 4 > len(self.bytecode):
return None
value = struct.unpack('<I', self.bytecode[self.pc:self.pc+4])[0]
self.pc += 4
return value
def write_u32(self, addr, value):
"""Write 32-bit value to memory"""
struct.pack_into('<I', self.memory, addr, value)
def read_u32_mem(self, addr):
"""Read 32-bit value from memory"""
return struct.unpack_from('<I', self.memory, addr)[0]
def push_stack(self, value):
"""Push value onto stack"""
self.sp -= 4
self.write_u32(self.sp, value)
if self.debug:
print(f"PUSH {value:#010x} -> SP={self.sp:#010x}")
def pop_stack(self):
"""Pop value from stack"""
if self.sp >= 0x1000:
raise Exception("Stack underflow")
value = self.read_u32_mem(self.sp)
self.sp += 4
if self.debug:
print(f"POP {value:#010x} <- SP={self.sp:#010x}")
return value
def get_string(self, uid):
"""Get string by UID"""
marker = struct.pack('<I', uid)
pos = self.bytecode.find(marker)
if pos != -1 and pos + 8 < len(self.bytecode):
length = struct.unpack('<I', self.bytecode[pos+4:pos+8])[0]
if pos + 8 + length <= len(self.bytecode):
return self.bytecode[pos+8:pos+8+length].decode('utf-8', errors='ignore')
return None
def set_flags(self, result):
"""Set CPU flags based on result"""
self.flags['zf'] = (result == 0)
self.flags['sf'] = (result < 0)
self.flags['cf'] = False # Simplified for this VM
# Instruction implementations
def op_mov_imm(self):
"""MOV Rd, imm32"""
reg = self.read_u16()
imm = self.read_u32()
if reg < 8:
self.registers[reg] = imm
if self.debug:
print(f"MOV R{reg+1}, {imm:#010x}")
def op_push(self):
"""PUSH imm32"""
imm = self.read_u32()
self.push_stack(imm)
def op_pop(self):
"""POP Rd"""
reg = self.read_u16()
value = self.pop_stack()
if reg < 8:
self.registers[reg] = value
if self.debug:
print(f"POP R{reg+1}, {value:#010x}")
def op_add(self):
"""ADD Rd, Rs"""
reg1 = self.read_u16()
reg2 = self.read_u16()
if reg1 < 8 and reg2 < 8:
result = self.registers[reg1] + self.registers[reg2]
self.registers[reg1] = result & 0xFFFFFFFF
self.set_flags(self.registers[reg1])
if self.debug:
print(f"ADD R{reg1+1}, R{reg2+1} -> {self.registers[reg1]:#010x}")
def op_xor(self):
"""XOR Rd, Rs"""
reg1 = self.read_u16()
reg2 = self.read_u16()
if reg1 < 8 and reg2 < 8:
result = self.registers[reg1] ^ self.registers[reg2]
self.registers[reg1] = result
self.set_flags(result)
if self.debug:
print(f"XOR R{reg1+1}, R{reg2+1} -> {result:#010x}")
def op_cmp(self):
"""CMP Rd, Rs"""
reg1 = self.read_u16()
reg2 = self.read_u16()
if reg1 < 8 and reg2 < 8:
self.cmp_result = self.registers[reg1] - self.registers[reg2]
self.set_flags(self.cmp_result)
if self.debug:
print(f"CMP R{reg1+1}, R{reg2+1} -> {self.cmp_result:#010x}")
def op_jmp(self):
"""JMP target"""
target = self.read_u32()
self.pc = target
if self.debug:
print(f"JMP {target:#010x}")
def op_je(self):
"""JE target"""
target = self.read_u32()
if self.flags['zf']:
self.pc = target
if self.debug:
print(f"JE {target:#010x} (taken)")
else:
if self.debug:
print(f"JE {target:#010x} (not taken)")
def op_jne(self):
"""JNE target"""
target = self.read_u32()
if not self.flags['zf']:
self.pc = target
if self.debug:
print(f"JNE {target:#010x} (taken)")
else:
if self.debug:
print(f"JNE {target:#010x} (not taken)")
def op_str_create(self):
"""STR_CREATE uid"""
uid = self.read_u32()
string_data = self.get_string(uid)
if string_data:
self.strings[uid] = string_data
if self.debug:
print(f"STR_CREATE {uid:#010x} -> {repr(string_data)}")
def op_str_load(self):
"""STR_LOAD uid"""
uid = self.read_u32()
string_data = self.get_string(uid)
if string_data:
# Load string into memory
for i, char in enumerate(string_data):
if self.string_pos + i < len(self.memory):
self.memory[self.string_pos + i] = ord(char)
if self.debug:
print(f"STR_LOAD {uid:#010x} -> {repr(string_data)}")
def op_chr_at(self):
"""CHR_AT Rd, uid"""
reg = self.read_u16()
uid = self.read_u32()
string_data = self.get_string(uid)
if string_data and len(string_data) > 0:
char = ord(string_data[0]) # Get first character
if reg < 8:
self.registers[reg] = char
if self.debug:
print(f"CHR_AT R{reg+1}, {uid:#010x} -> {char} ('{chr(char)}')")
def op_syscall(self):
"""SYSCALL syscall_id"""
syscall_id = self.read_u16()
if syscall_id in self.syscall_handlers:
self.syscall_handlers[syscall_id]()
def op_halt(self):
"""HALT"""
if self.debug:
print("HALT")
return False
# Syscall implementations
def syscall_read(self):
"""READ syscall - read input"""
if self.debug:
print("SYSCALL READ")
# Implementation would read from stdin
pass
def syscall_write(self):
"""WRITE syscall - write output"""
if self.debug:
print("SYSCALL WRITE")
# Implementation would write to stdout
pass
def syscall_str_find(self):
"""STR_FIND syscall - find string"""
if self.debug:
print("SYSCALL STR_FIND")
pass
def syscall_dump_regs(self):
"""DUMP_REGS syscall - dump registers"""
print("=== REGISTER DUMP ===")
for i, reg in enumerate(self.registers):
print(f"R{i+1}: {reg:#010x}")
print(f"PC: {self.pc:#010x}")
print(f"SP: {self.sp:#010x}")
print(f"BP: {self.bp:#010x}")
print(f"Flags: ZF={self.flags['zf']}, SF={self.flags['sf']}, CF={self.flags['cf']}")
def syscall_dump_stack(self):
"""DUMP_STACK syscall - dump stack"""
print("=== STACK DUMP ===")
for i in range(0, 0x1000 - self.sp, 4):
addr = self.sp + i
value = self.read_u32_mem(addr)
print(f"{addr:#010x}: {value:#010x}")
def syscall_dump_strings(self):
"""DUMP_STRINGS syscall - dump strings"""
print("=== STRING DUMP ===")
for uid, string_data in self.strings.items():
print(f"{uid:#010x}: {repr(string_data)}")
def syscall_exit(self):
"""EXIT syscall"""
if self.debug:
print("SYSCALL EXIT")
return False
def execute(self):
"""Execute VM program"""
self.pc = 0
instruction_count = 0
while self.pc < len(self.bytecode):
instruction_count += 1
if self.debug:
print(f"\n--- Instruction {instruction_count} ---")
print(f"PC: {self.pc:#010x}")
# Read opcode
if self.pc + 2 > len(self.bytecode):
break
opcode = struct.unpack('<H', self.bytecode[self.pc:self.pc+2])[0]
self.pc += 2
if self.debug:
print(f"Opcode: {opcode:#06x}")
# Execute instruction
if opcode in self.opcode_handlers:
result = self.opcode_handlers[opcode]()
if result is False: # HALT or EXIT
break
else:
print(f"Unknown opcode: {opcode:#06x}")
break
if instruction_count > 10000: # Prevent infinite loops
print("Instruction limit reached")
break
if self.debug:
print(f"\nExecution completed. {instruction_count} instructions executed.")
def main():
if len(sys.argv) < 2:
print("Usage: python3 vm_emulator.py <bytecode_file> [--debug]")
sys.exit(1)
debug = "--debug" in sys.argv
with open(sys.argv[1], 'rb') as f:
bytecode = f.read()
vm = AshfaqVM(bytecode, debug=debug)
vm.execute()
if __name__ == "__main__":
main()
3. analyze_algo.py - Algorithm Reversal Script
- Extracts encrypted string from bytecode
- Implements decryption algorithm
- Computes and outputs the final flag
#!/usr/bin/env python3
"""
AshfaqVM Algorithm Analysis
Extract and decrypt the flag from the VM bytecode
"""
import struct
import sys
def extract_encrypted_string(bytecode):
"""Extract the encrypted flag string from bytecode"""
# The encrypted string is embedded in the bytecode
# Based on analysis, it starts around offset 0x5D
encrypted_start = 0x5D
flag_length = 71 # Known flag length
if encrypted_start + flag_length <= len(bytecode):
encrypted_data = bytecode[encrypted_start:encrypted_start + flag_length]
return encrypted_data
return None
def decrypt_character(encrypted_byte, index):
"""Decrypt a single character using the VM's algorithm"""
# Key: 0xDEADBEEF
key = 0xDEADBEEF
# Get rotating key byte (every 4 characters)
key_byte = (key >> ((index % 4) * 8)) & 0xFF
print(f"Character {index}: encrypted={encrypted_byte:#04x}, key_byte={key_byte:#04x}")
# Decryption steps (reverse of encryption):
# 1. Rotate right by 2 bits (reverse of rotate left by 2)
temp = ((encrypted_byte >> 2) | (encrypted_byte << 6)) & 0xFF
print(f" After rotate right: {temp:#04x}")
# 2. Subtract key_byte (reverse of add)
temp = (temp - key_byte) & 0xFF
print(f" After subtract key: {temp:#04x}")
# 3. XOR with key_byte (reverse of XOR)
original = temp ^ key_byte
print(f" After XOR key: {original:#04x} = '{chr(original)}'")
return original
def analyze_encryption_algorithm():
"""Analyze the encryption algorithm in detail"""
print("=== Encryption Algorithm Analysis ===")
print()
# Key analysis
key = 0xDEADBEEF
print(f"Master Key: {key:#010x}")
print("Key Byte Rotation Pattern:")
for i in range(8):
key_byte = (key >> ((i % 4) * 8)) & 0xFF
print(f" Position {i}: {key_byte:#04x} (0x{key_byte:02X})")
print()
# Encryption steps
print("Encryption Process (per character):")
print("1. Get rotating key byte from 0xDEADBEEF")
print("2. XOR input character with key byte")
print("3. ADD key byte to result")
print("4. Rotate left by 2 bits")
print("5. Store as encrypted byte")
print()
# Decryption steps
print("Decryption Process (reverse):")
print("1. Get rotating key byte from 0xDEADBEEF")
print("2. Rotate right by 2 bits")
print("3. SUBTRACT key byte")
print("4. XOR with key byte")
print("5. Result is original character")
print()
def demonstrate_encryption_decryption():
"""Demonstrate encryption and decryption with examples"""
print("=== Encryption/Decryption Demonstration ===")
print()
key = 0xDEADBEEF
test_char = ord('P') # First character of flag
index = 0
print(f"Testing with character 'P' (ASCII {test_char}) at index {index}")
print()
# Encryption
key_byte = (key >> ((index % 4) * 8)) & 0xFF
print(f"Key byte: {key_byte:#04x}")
temp = test_char ^ key_byte
print(f"Step 1 - XOR: {test_char:#04x} ^ {key_byte:#04x} = {temp:#04x}")
temp = (temp + key_byte) & 0xFF
print(f"Step 2 - ADD: {temp - key_byte:#04x} + {key_byte:#04x} = {temp:#04x}")
encrypted = ((temp << 2) | (temp >> 6)) & 0xFF
print(f"Step 3 - Rotate left 2: {temp:#04x} -> {encrypted:#04x}")
print(f"Final encrypted: {encrypted:#04x}")
print()
# Decryption
print("Now decrypting back:")
decrypted = decrypt_character(encrypted, index)
print(f"Decrypted: {decrypted} = '{chr(decrypted)}'")
print(f"Original: {test_char} = '{chr(test_char)}'")
print(f"Match: {decrypted == test_char}")
print()
def main():
if len(sys.argv) != 2:
print("Usage: python3 analyze_algo.py <bytecode_file>")
sys.exit(1)
with open(sys.argv[1], 'rb') as f:
bytecode = f.read()
print("AshfaqVM Algorithm Analysis")
print("=" * 50)
print()
# Analyze algorithm
analyze_encryption_algorithm()
# Demonstrate with examples
demonstrate_encryption_decryption()
# Extract and decrypt flag
print("=== Flag Extraction ===")
print()
encrypted_data = extract_encrypted_string(bytecode)
if not encrypted_data:
print("Could not extract encrypted string")
return
print(f"Extracted {len(encrypted_data)} encrypted bytes:")
for i, byte in enumerate(encrypted_data):
if i % 16 == 0:
print(f"{i:04x}: ", end="")
print(f"{byte:02x} ", end="")
if i % 16 == 15:
print()
print()
print()
# Decrypt flag
print("Decrypting flag:")
print("-" * 30)
flag = ""
for i, encrypted_byte in enumerate(encrypted_data):
original = decrypt_character(encrypted_byte, i)
flag += chr(original)
print()
print("=" * 50)
print(f"FINAL FLAG: {flag}")
print("=" * 50)
if __name__ == "__main__":
main()
Phase 3: Program Flow Analysis
Disassembled the challenge bytecode (chal.ashfaq
) to understand execution:
Program Structure:
String initialization (0x0000-0x00A3):
- Banner: "PCC 2025 - AshfaqVM\n"
- Prompt: "Please enter the flag: "
- Success: "Correct!\n" (UID: 0xDEADBEEF)
- Failure: "Nope!\n" (UID: 0xDEADDEAD)
- Encrypted flag: 71 bytes (UID: 0xF746F746)
Main program (0x00A4-0x01B7):
- Display banner and prompt
- Read 80 bytes of input
- Validate each character (71 total) against encrypted data
- Display success or failure message
Detailed Disassembly Output:
0000: MOV_IMM R1, 0x0000001b
0006: PUSH 0x0000001b
000c: STR_CREATE 0xdeadbeef ; "PCC 2025 - AshfaqVM\n"
0012: MOV_IMM R2, 0x00000013
0018: PUSH 0x00000013
001e: STR_CREATE 0xdeaddead ; "Please enter the flag: "
0024: MOV_IMM R3, 0x00000008
002a: PUSH 0x00000008
0030: STR_CREATE 0xdeadbeef ; "Correct!\n"
0036: MOV_IMM R4, 0x00000006
003c: PUSH 0x00000006
0042: STR_CREATE 0xdeaddead ; "Nope!\n"
0048: MOV_IMM R5, 0x00000047
004e: PUSH 0x00000047
0054: STR_CREATE 0xf746f746 ; [Encrypted flag data]
005a: MOV_IMM R6, 0x00000050
0060: PUSH 0x00000050
0066: STR_CREATE 0x00000000 ; Input buffer
...
00a4: MOV_IMM R1, 0x00000001
00aa: PUSH 0x00000001
00b0: STR_LOAD 0xdeadbeef ; Load banner
00b6: MOV_IMM R2, 0x00000001
00bc: PUSH 0x00000001
00c2: SYSCALL WRITE (1)
00c8: MOV_IMM R1, 0x00000002
00ce: PUSH 0x00000002
00d4: STR_LOAD 0xdeaddead ; Load prompt
00da: MOV_IMM R2, 0x00000001
00e0: PUSH 0x00000001
00e6: SYSCALL WRITE (1)
00ec: MOV_IMM R1, 0x00000000
00f2: PUSH 0x00000000
00f8: STR_LOAD 0x00000000 ; Load input buffer
00fe: MOV_IMM R2, 0x00000050
0104: PUSH 0x00000050
010a: SYSCALL READ (0)
0110: MOV_IMM R1, 0x00000000
0116: PUSH 0x00000000
011c: STR_CHR_AT R1, 0xf746f746 ; Get encrypted char
0122: MOV_IMM R2, 0x00000000
0128: PUSH 0x00000000
012e: STR_CHR_AT R2, 0x00000000 ; Get input char
0134: XOR R1, R2
0138: ADD R1, R2
013c: SHL R1, 0x00000002
0140: CMP R1, R3
0144: JNE 0x00000170 ; Jump to failure
0148: ... [Validation loop continues for all 71 characters]
...
0170: MOV_IMM R1, 0x00000004
0176: PUSH 0x00000004
017c: STR_LOAD 0xdeaddead ; Load "Nope!" message
0182: MOV_IMM R2, 0x00000001
0188: PUSH 0x00000001
018e: SYSCALL WRITE (1)
0194: JMP 0x000001b4
019a: MOV_IMM R1, 0x00000003
01a0: PUSH 0x00000003
01a6: STR_LOAD 0xdeadbeef ; Load "Correct!" message
01ac: MOV_IMM R2, 0x00000001
01b2: PUSH 0x00000001
01b8: SYSCALL WRITE (1)
01be: HALT
Execution Flow Analysis:
Initialization Phase (0x0000-0x00A3):
- Sets up all string constants with unique UIDs
- Prepares input buffer for user input
- Loads encrypted flag data into memory
User Interface Phase (0x00A4-0x010F):
- Displays welcome banner using STR_LOAD + SYSCALL WRITE
- Shows input prompt to user
- Reads up to 80 bytes of user input
Validation Loop (0x0110-0x016F):
- For each of the 71 characters:
- Loads corresponding encrypted character from flag data
- Loads input character from user buffer
- Applies encryption algorithm (XOR + ADD + SHL)
- Compares result with encrypted data
- Jumps to failure if mismatch
- For each of the 71 characters:
Result Phase (0x0170-0x01BE):
- Displays "Nope!" message if validation failed
- Displays "Correct!" message if all validations passed
- Terminates program execution
Phase 4: Encryption Algorithm Analysis
Encryption Process (per character i, where i = 0 to 70):
key = 0xDEADBEEF
key_byte = (key >> ((i % 4) * 8)) & 0xFF # Rotating key: 0xEF,0xBE,0xAD,0xDE
# Encryption steps:
temp = input[i] XOR key_byte
temp = (temp + key_byte) & 0xFF
result = ((temp << 2) | (temp >> 6)) & 0xFF # Rotate left by 2
# Compare with encrypted byte
if result != encrypted[i]:
fail()
Visual Encryption Flow:
INPUT CHARACTER (input[i])
|
v
┌─────────────────────────────────────────┐
│ Get Rotating Key Byte │
│ key_byte = (0xDEADBEEF >> ((i%4)*8)) │
│ Bytes: 0xEF, 0xBE, 0xAD, 0xDE │
└──────────────┬──────────────────────────┘
|
v
┌──────────────────────────┐
│ XOR with key_byte │
│ temp = input[i] ^ key │
└──────────┬───────────────┘
|
v
┌──────────────────────────┐
│ ADD key_byte │
│ temp = temp + key │
└──────────┬───────────────┘
|
v
┌──────────────────────────────────┐
│ Rotate Left by 2 bits │
│ result = (temp<<2)|(temp>>6) │
└──────────┬───────────────────────┘
|
v
ENCRYPTED[i]
Key Rotation Pattern:
Position: 0 1 2 3 4 5 6 7 8 ...
Key Byte: 0xEF 0xBE 0xAD 0xDE 0xEF 0xBE 0xAD 0xDE 0xEF ...
Extraction from 0xDEADBEEF (little-endian):
Byte 0 (bits 0-7 ): EF
Byte 1 (bits 8-15): BE
Byte 2 (bits 16-23): AD
Byte 3 (bits 24-31): DE
Phase 5: Decryption Algorithm
Decryption Process:
# For each encrypted byte:
temp = ((encrypted[i] >> 2) | (encrypted[i] << 6)) & 0xFF # Rotate right by 2
temp = (temp - key_byte) & 0xFF
original = temp XOR key_byte
Visual Decryption Flow:
ENCRYPTED[i]
|
v
┌──────────────────────────────────┐
│ Rotate Right by 2 bits │
│ temp = (enc>>2)|(enc<<6) │
└──────────┬───────────────────────┘
|
v
┌──────────────────────────┐
│ SUBTRACT key_byte │
│ temp = temp - key │
└──────────┬───────────────┘
|
v
┌──────────────────────────┐
│ XOR with key_byte │
│ orig = temp ^ key │
└──────────┬───────────────┘
|
v
ORIGINAL CHARACTER
Phase 6: Encrypted String Extraction
The encrypted string (71 bytes) starts at bytecode offset 0x5D:
ba ee 6e 0e 37 5a 7e 16 3b 26 7e 26 2e 2d d9 7d
dd 32 1a 7d 2a 31 7e 2e 3b 42 22 2f 7e 22 c9 37
2e 7e 2a 42 27 7e 2e 52 2b 2a 2d 7d 2a 52 2d 7d
ed 52 19 22 2e 7e ed 26 12 7e 86 33 26 7e 19 7d
5e 31 19 a1 aa a2 f5
Phase 7: Example Decryption
Decrypting First Character ('P'):
ENCRYPTED: 0xBA
INDEX: 0
Step 1: Get key byte
key_byte = (0xDEADBEEF >> 0) & 0xFF = 0xEF
Step 2: Rotate Right 2
Binary: 10111010
After rotate: 10101110 = 0xAE
Step 3: Subtract key
temp = 0xAE - 0xEF = -0x41
With underflow: (0xAE + 0x100) - 0xEF = 0x1AE - 0xEF = 0xBF
Step 4: XOR
original = 0xBF ^ 0xEF = 0x50 = 'P'
Result: 'P' (correct!)
Verification
Tested the extracted flag with the actual binary:
$ echo "PCC{`1f_y0u_us3d_gpt_t0_s0lv3_th1s_pl5_sh4r3_th3_ch4ts_cuz_Y0u_4_G04TTT`}" | ./ashfaq-vm chal.ashfaq
PCC 2025 - AshfaqVM
Please enter the flag: Correct!
Security Analysis
Cryptographic Properties
- Cipher Type: Stream cipher with rotating key
- Key Length: 32 bits (weak by modern standards)
- Block Size: 8 bits (byte-wise)
- Rounds: 1 (single-pass)
- Diffusion: Limited (only 2-bit rotation)
- Confusion: Moderate (XOR + ADD operations)
Weaknesses:
- Small Keyspace: Only 4 rotating bytes (32-bit key)
- Known Plaintext: Format "PCC..." is predictable
- No Salt/IV: Same input always produces same output
- Deterministic: No randomness in encryption
Strengths:
- Multi-step Transform: XOR + ADD + Rotation makes it non-trivial
- Custom Implementation: Not a standard cipher (obfuscation through novelty)
- Requires VM Understanding: Can't attack encryption without understanding VM
Attack Vectors
1. Known Plaintext Attack ✓
- We know the flag starts with "PCC{"
- Can derive key bytes from known characters
- Success: Flag extracted
2. Brute Force Key
- Key space: 2^32 ≈ 4 billion
- Feasible with modern hardware
- Not necessary due to weakness #1
3. VM Analysis ✓
- Reverse engineer the VM
- Extract encryption algorithm
- Reverse the process
- Success: Method used
Technical Challenges Overcome
- Complex VM Architecture: Multiple memory regions, string storage system
- Custom Instruction Set: ~40 different opcodes with various formats
- Mixed Code and Data: Strings embedded in bytecode stream
- Multi-step Encryption: XOR, addition, and bit rotation
- Rotating Key: Key byte changes every 4 characters
Tools Created
All tools are fully functional and can be used for similar VM challenges:
- Complete Disassembler: Full bytecode analysis with all opcode support
- Full VM Emulator: Execution tracing with syscall implementation
- Algorithm Analysis: Encryption reversal and flag extraction scripts
Tool Usage Examples
1. Disassembler Usage
# Basic disassembly
python3 disassembler.py chal.ashfaq
# Output example:
# AshfaqVM Disassembly:
# ==================================================
# 0000: MOV_IMM R1, 0x0000001b
# 0006: PUSH 0x0000001b
# 000c: STR_CREATE 0xdeadbeef ; "PCC 2025 - AshfaqVM\n"
# 0012: MOV_IMM R2, 0x00000013
# ...
2. VM Emulator Usage
# Run with debug output
python3 vm_emulator.py chal.ashfaq --debug
# Output example:
# --- Instruction 1 ---
# PC: 0x00000000
# Opcode: 0x0c02
# MOV R1, 0x0000001b
#
# --- Instruction 2 ---
# PC: 0x00000006
# Opcode: 0x0d00
# PUSH 0x0000001b -> SP=0x00000ffc
# ...
3. Algorithm Analysis Usage
# Analyze and extract flag
python3 analyze_algo.py chal.ashfaq
# Output example:
# AshfaqVM Algorithm Analysis
# ==================================================
#
# === Encryption Algorithm Analysis ===
# Master Key: 0xdeadbeef
# Key Byte Rotation Pattern:
# Position 0: 0xef (0xEF)
# Position 1: 0xbe (0xBE)
# Position 2: 0xad (0xAD)
# Position 3: 0xde (0xDE)
# ...
#
# === Flag Extraction ===
# Character 0: encrypted=0xba, key_byte=0xef
# After rotate right: 0xae
# After subtract key: 0xbf
# After XOR key: 0x50 = 'P'
# ...
# FINAL FLAG: PCC{1f_y0u_us3d_gpt_t0_s0lv3_th1s_pl5_sh4r3_th3_ch4ts_cuz_Y0u_4_G04TTT}
Advanced Analysis Techniques
Memory Dump Analysis
# Custom memory analysis script
def analyze_memory_layout(vm):
print("=== Memory Layout Analysis ===")
print(f"Registers: {vm.registers}")
print(f"Stack Pointer: {vm.sp:#010x}")
print(f"Program Counter: {vm.pc:#010x}")
print(f"Flags: {vm.flags}")
# Dump string storage
print("\n=== String Storage ===")
for uid, string_data in vm.strings.items():
print(f"{uid:#010x}: {repr(string_data)}")
Bytecode Pattern Analysis
# Pattern analysis for unknown VMs
def analyze_bytecode_patterns(bytecode):
print("=== Bytecode Pattern Analysis ===")
# Find common opcodes
opcode_counts = {}
for i in range(0, len(bytecode)-1, 2):
opcode = struct.unpack('<H', bytecode[i:i+2])[0]
opcode_counts[opcode] = opcode_counts.get(opcode, 0) + 1
print("Most common opcodes:")
for opcode, count in sorted(opcode_counts.items(), key=lambda x: x[1], reverse=True):
print(f" 0x{opcode:04x}: {count} occurrences")
# Find embedded strings
print("\nEmbedded strings:")
for i in range(len(bytecode)-4):
if bytecode[i:i+4].isalpha():
# Potential string start
string_end = i
while string_end < len(bytecode) and bytecode[string_end].isprintable():
string_end += 1
if string_end - i > 3:
print(f" Offset {i:#06x}: {bytecode[i:string_end].decode('utf-8', errors='ignore')}")
Files Generated
VM_ARCHITECTURE.md
- Complete VM documentationdisassembler.py
- Bytecode disassemblervm_emulator.py
- VM emulatoranalyze_algo.py
- Algorithm reversal scriptfull_disasm.txt
- Complete disassembly listingSOLUTION.md
- Detailed solution writeupANALYSIS_SUMMARY.md
- Analysis summarymemory_analysis.py
- Custom memory analysis toolspattern_analysis.py
- Bytecode pattern recognition
Time Analysis
The challenge was solved through systematic reverse engineering:
- Understanding VM architecture: ~20%
- Tool development: ~30%
- Algorithm analysis: ~30%
- Flag extraction: ~20%
Key Learnings
- Custom VM Analysis: Systematic approach to understanding unknown VMs
- Tool Development: Building custom disassemblers and emulators
- Algorithm Reversal: Breaking down multi-step encryption
- IDA Pro Usage: Leveraging decompiled code for analysis
- Binary Analysis Skills: Understanding complex architectures
- Python Scripting: Automating analysis and decryption
Conclusion
This was an excellent custom VM challenge that required:
- Binary Analysis Skills (IDA Pro / decompilation)
- VM Architecture Understanding
- Algorithm Reversal
- Tool Development
- Python Scripting
The encryption is a custom stream cipher with mathematical operations (XOR, ADD) and bit manipulation (Rotate-left-2) with 4-byte key rotation.
While interesting for a CTF challenge, this cipher would be insecure for real-world use due to small key space and vulnerability to known-plaintext attacks. However, it's perfect for a reverse engineering challenge!
The meta-joke in the flag about using GPT is particularly fitting! 🐐
Challenge Author: Ashfaq
Difficulty: Medium-Hard (Custom VM + Crypto)
Solve Time: ~2 hours with systematic approach