Uddip Ranjan

Harp6x's Personal website

Malware Analysis Deep Dive: From Static to Dynamic Analysis

Introduction

Throughout my cybersecurity career, malware analysis has been one of the most challenging yet rewarding aspects of threat research. From my early days at Innobuzz conducting training sessions to analyzing nation-state malware at the Ministry of Defence, I’ve developed a comprehensive methodology that combines traditional static analysis with advanced dynamic techniques.

The Evolution of Malware Analysis

Traditional vs. Modern Approaches

The malware landscape has evolved dramatically, and this evolution requires analysts to adapt their methodologies continuously.

Static Analysis Fundamentals

Static analysis involves examining malware without executing it. This is always the first step in my analysis workflow.

File Metadata Analysis

# Basic file information
file suspicious_sample.exe
strings suspicious_sample.exe | grep -i "http\|ftp\|\.exe\|\.dll"
hexdump -C suspicious_sample.exe | head -20

# Hash calculation for threat intelligence
md5sum suspicious_sample.exe
sha1sum suspicious_sample.exe
sha256sum suspicious_sample.exe
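
When hashing needs to happen inside a script rather than at the shell, the same three digests can be computed in a single pass over the file. The following is a minimal sketch using only the standard library; the function name `hash_file` and the 1 MiB chunk size are illustrative choices, not part of any existing tool.

```python
import hashlib


def hash_file(path, chunk_size=1024 * 1024):
    """Return MD5, SHA-1 and SHA-256 hex digests, reading the file once in chunks."""
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b""
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}
```

Reading in chunks keeps memory use flat for large samples, and the resulting dictionary can be passed straight to a threat intelligence lookup.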

PE Structure Analysis

For Windows executables, PE structure analysis reveals crucial information:

import pefile
import hashlib

def analyze_pe_structure(file_path):
    """Comprehensive PE file analysis"""
    
    try:
        # Read the sample once and reuse the bytes for hashing and parsing
        with open(file_path, 'rb') as f:
            data = f.read()
        pe = pefile.PE(data=data)
        
        analysis_results = {
            'file_info': {
                'md5': hashlib.md5(data).hexdigest(),
                'sha256': hashlib.sha256(data).hexdigest(),
                'size': len(data)
            },
            'pe_info': {
                'machine_type': hex(pe.FILE_HEADER.Machine),
                'timestamp': pe.FILE_HEADER.TimeDateStamp,
                'entry_point': hex(pe.OPTIONAL_HEADER.AddressOfEntryPoint),
                'image_base': hex(pe.OPTIONAL_HEADER.ImageBase)
            },
            'sections': [],
            'imports': [],
            'exports': [],
            'suspicious_indicators': []
        }
        
        # Analyze sections
        for section in pe.sections:
            entropy = section.get_entropy()
            section_info = {
                # Section names are raw bytes and are not guaranteed to be valid UTF-8
                'name': section.Name.decode('utf-8', errors='replace').rstrip('\x00'),
                'virtual_address': hex(section.VirtualAddress),
                'raw_size': section.SizeOfRawData,
                'entropy': entropy
            }
            
            # High entropy might indicate packing/encryption
            if entropy > 7.0:
                analysis_results['suspicious_indicators'].append(
                    f"High entropy section: {section_info['name']} ({entropy:.2f})"
                )
            
            analysis_results['sections'].append(section_info)
        
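        # Analyze imports
        # Win32 imports usually carry an A/W suffix (CreateProcessA, CreateProcessW),
        # so match on prefixes rather than exact names
        suspicious_apis = ('CreateProcess', 'WriteProcessMemory', 'VirtualAlloc')
        if hasattr(pe, 'DIRECTORY_ENTRY_IMPORT'):
            for entry in pe.DIRECTORY_ENTRY_IMPORT:
                dll_name = entry.dll.decode('utf-8', errors='replace')
                functions = []
                
                for function in entry.imports:
                    if function.name:  # imports by ordinal have no name
                        func_name = function.name.decode('utf-8', errors='replace')
                        functions.append(func_name)
                        
                        # Check for suspicious API calls
                        if func_name.startswith(suspicious_apis):
                            analysis_results['suspicious_indicators'].append(
                                f"Suspicious API: {func_name} from {dll_name}"
                            )
                
                analysis_results['imports'].append({
                    'dll': dll_name,
                    'functions': functions
                })
        
        # Collect named exports (mostly relevant for DLL samples)
        if hasattr(pe, 'DIRECTORY_ENTRY_EXPORT'):
            for symbol in pe.DIRECTORY_ENTRY_EXPORT.symbols:
                if symbol.name:
                    analysis_results['exports'].append(
                        symbol.name.decode('utf-8', errors='replace')
                    )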
        
        return analysis_results
        
    except (pefile.PEFormatError, OSError) as e:
        # PEFormatError: not a valid PE file; OSError: the file could not be read
        return {'error': f"PE analysis failed: {e}"}
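
To make the header fields above less of a black box, here is a minimal sketch that reads the machine type, section count and compile timestamp directly from the raw bytes with the standard library. It assumes the standard PE layout (the `e_lfanew` pointer at offset 0x3C, followed by the `PE\0\0` signature and the COFF file header); the function name `read_coff_header` and the small `MACHINE_NAMES` table are illustrative, not part of pefile.

```python
import struct
from datetime import datetime, timezone

# Small, non-exhaustive lookup of common Machine values
MACHINE_NAMES = {0x014C: "x86", 0x8664: "x64", 0xAA64: "ARM64"}


def read_coff_header(data):
    """Parse machine type, section count and compile time from raw PE bytes."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew: file offset of the PE signature, stored at 0x3C in the DOS header
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    if len(data) < pe_offset + 12:
        raise ValueError("truncated COFF header")
    # COFF header starts right after the signature: Machine, NumberOfSections, TimeDateStamp
    machine, n_sections, timestamp = struct.unpack_from("<HHI", data, pe_offset + 4)
    return {
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "sections": n_sections,
        "compiled_utc": datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat(),
    }
```

Keep in mind that the compile timestamp is just a field in the file: it can be forged or zeroed, so treat it as a hint rather than evidence.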

String Analysis

String analysis often reveals the most actionable intelligence:

import re

def extract_suspicious_strings(file_path):
    """Extract and categorize suspicious strings"""
    
    with open(file_path, 'rb') as f:
        data = f.read()
    
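    # Decode as Latin-1: it maps every byte to exactly one character, so no
    # bytes are dropped and neighbouring strings are not merged together.
    # UTF-16 (wide) strings, common in Windows binaries, are not covered here.
    text = data.decode('latin-1')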
    
    patterns = {
        'urls': re.findall(r'https?://[^\s<>"]{2,}', text, re.IGNORECASE),
        'ip_addresses': re.findall(r'\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b', text),
        'email_addresses': re.findall(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', text),
        'file_paths': re.findall(r'[A-Za-z]:\\[^<>:"|?*\s]+', text),
        'registry_keys': re.findall(r'HKEY_[A-Z_]+\\[^<>:"|?*\s]+', text, re.IGNORECASE),
        'crypto_indicators': re.findall(r'\b[A-Fa-f0-9]{32,}\b', text)  # Potential hashes/keys
    }
    
    return {k: list(set(v)) for k, v in patterns.items() if v}  # Remove duplicates
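
Windows binaries often store strings as UTF-16LE, where each ASCII character is followed by a null byte, so a single-byte pass can miss them. The following is a minimal sketch that pulls both ASCII and wide strings out of raw bytes, similar in spirit to running `strings` in both modes. The function name `extract_strings` and the default minimum length of 4 are assumptions for illustration.

```python
import re


def extract_strings(data, min_len=4):
    """Return printable ASCII and UTF-16LE strings of at least min_len characters."""
    # Runs of printable ASCII bytes (space through tilde)
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # Runs of printable ASCII characters each followed by a null byte (UTF-16LE)
    wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)

    found = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_re.finditer(data)]
    return found
```

The returned list can be joined with newlines and fed to the same regular expressions used for URLs, IP addresses and registry keys.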

Dynamic Analysis Techniques

Dynamic analysis involves executing malware in a controlled environment to observe its behavior.

Sandbox Environment Setup

My standard analysis environment includes:

sandbox_configuration:
  hypervisor: VMware Workstation Pro
  operating_systems:
    - Windows 10 x64 (latest patches)
    - Windows 7 x86 (legacy analysis)
    - Ubuntu 20.04 LTS (Linux malware)
  
  monitoring_tools:
    process_monitor: ProcMon
    network_capture: Wireshark
    api_monitor: API Monitor
    memory_analysis: Volatility Framework
    
  isolation:
    network: Isolated virtual network
    snapshots: Pre-execution snapshots for rollback
    time_limits: 30-minute maximum execution

Behavioral Analysis Framework

import psutil
import time
from datetime import datetime

class BehaviorAnalyzer:
    def __init__(self):
        self.baseline_processes = set(p.pid for p in psutil.process_iter())
        self.baseline_connections = set(self.get_network_connections())
        self.analysis_start = datetime.now()
        self.behaviors = []
    
    def get_network_connections(self):
        """Get current network connections"""
        connections = []
        for conn in psutil.net_connections():
            # raddr is empty for sockets without a remote endpoint, so check it first
            if conn.status == psutil.CONN_ESTABLISHED and conn.raddr:
                connections.append(f"{conn.laddr.ip}:{conn.laddr.port} -> {conn.raddr.ip}:{conn.raddr.port}")
        return connections
    
    def monitor_process_creation(self):
        """Monitor for new process creation"""
        current_processes = set(p.pid for p in psutil.process_iter())
        new_processes = current_processes - self.baseline_processes
        
        for pid in new_processes:
            try:
                process = psutil.Process(pid)
                self.behaviors.append({
                    'timestamp': datetime.now().isoformat(),
                    'type': 'process_creation',
                    'details': {
                        'pid': pid,
                        'name': process.name(),
                        'cmdline': ' '.join(process.cmdline()),
                        'parent_pid': process.ppid()
                    }
                })
            except (psutil.NoSuchProcess, psutil.AccessDenied):
                pass
        
        self.baseline_processes = current_processes
    
    def monitor_network_activity(self):
        """Monitor for new network connections"""
        current_connections = set(self.get_network_connections())
        new_connections = current_connections - self.baseline_connections
        
        for conn in new_connections:
            self.behaviors.append({
                'timestamp': datetime.now().isoformat(),
                'type': 'network_connection',
                'details': {'connection': conn}
            })
        
        self.baseline_connections = current_connections
    
    def monitor_file_system(self, watch_paths):
        """Monitor file system changes in specified paths"""
        # Implementation would use file system monitoring
        # This is a simplified version
        pass
    
    def analyze_sample(self, sample_path, duration=300):
        """Run complete behavioral analysis"""
        print(f"Starting analysis of {sample_path}")
        
        # Execute the sample (in a real scenario, this would be more sophisticated)
        # Pass the path as an argument list so it is never interpreted by a shell
        import subprocess
        process = subprocess.Popen([sample_path])
        
        # Monitor behavior for specified duration
        end_time = time.time() + duration
        
        while time.time() < end_time:
            self.monitor_process_creation()
            self.monitor_network_activity()
            time.sleep(1)
        
        # Terminate the process if still running
        if process.poll() is None:
            try:
                process.terminate()
            except OSError:
                pass  # the process exited between the poll and the terminate call
        
        return {
            'sample': sample_path,
            'analysis_duration': duration,
            'behaviors': self.behaviors,
            'summary': self.generate_summary()
        }
    
    def generate_summary(self):
        """Generate analysis summary"""
        process_count = len([b for b in self.behaviors if b['type'] == 'process_creation'])
        network_count = len([b for b in self.behaviors if b['type'] == 'network_connection'])
        
        return {
            'total_behaviors': len(self.behaviors),
            'process_creations': process_count,
            'network_connections': network_count,
            'risk_assessment': self.assess_risk()
        }
    
    def assess_risk(self):
        """Simple risk assessment based on observed behaviors"""
        risk_score = 0
        
        # Risk factors
        process_count = len([b for b in self.behaviors if b['type'] == 'process_creation'])
        network_count = len([b for b in self.behaviors if b['type'] == 'network_connection'])
        
        if process_count > 5:
            risk_score += 30
        if network_count > 0:
            risk_score += 40
        
        # Check for suspicious process names
        suspicious_processes = ['cmd.exe', 'powershell.exe', 'reg.exe']
        for behavior in self.behaviors:
            if behavior['type'] == 'process_creation':
                if any(susp in behavior['details']['name'].lower() for susp in suspicious_processes):
                    risk_score += 20
        
        return min(risk_score, 100)
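
The `monitor_file_system` method above is left as a stub. One simple way to fill it in is to snapshot the watched directories before and after execution and compare the two. The following is a minimal sketch of that idea using only the standard library; the names `snapshot` and `diff_snapshots` are illustrative, and a production setup would more likely rely on ProcMon or an OS-level file notification API to catch short-lived files.

```python
import os


def snapshot(paths):
    """Map every file under the given directories to (size, mtime_ns)."""
    state = {}
    for root in paths:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                full = os.path.join(dirpath, name)
                try:
                    info = os.stat(full)
                except OSError:
                    continue  # file vanished or is unreadable; skip it
                state[full] = (info.st_size, info.st_mtime_ns)
    return state


def diff_snapshots(before, after):
    """Return created, deleted and modified paths between two snapshots."""
    common = before.keys() & after.keys()
    return {
        "created": sorted(set(after) - set(before)),
        "deleted": sorted(set(before) - set(after)),
        "modified": sorted(p for p in common if before[p] != after[p]),
    }
```

A snapshot taken in `__init__` and another at the end of `analyze_sample` would be enough to report dropped or tampered files, though anything created and deleted between the two snapshots goes unnoticed.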

Advanced Analysis Techniques

Memory Forensics

Memory analysis reveals runtime behavior that file analysis cannot:

# Volatility 2 commands for memory analysis
# imageinfo suggests a suitable profile, so it runs without --profile
volatility -f memory_dump.raw imageinfo
volatility -f memory_dump.raw --profile=Win10x64_19041 pslist
volatility -f memory_dump.raw --profile=Win10x64_19041 netscan
volatility -f memory_dump.raw --profile=Win10x64_19041 malfind
volatility -f memory_dump.raw --profile=Win10x64_19041 yarascan -y malware_rules.yar

Network Traffic Analysis

from scapy.all import rdpcap, IP, Raw

def analyze_network_traffic(pcap_file):
    """Analyze network traffic for malicious indicators"""
    
    packets = rdpcap(pcap_file)
    analysis = {
        'total_packets': len(packets),
        'protocols': {},
        'destinations': {},
        'suspicious_activities': []
    }
    
    for packet in packets:
        # Protocol analysis
        if packet.haslayer(IP):
            dst_ip = packet[IP].dst
            analysis['destinations'][dst_ip] = analysis['destinations'].get(dst_ip, 0) + 1
            
            # Check for suspicious destinations
            if is_suspicious_ip(dst_ip):
                analysis['suspicious_activities'].append({
                    'type': 'suspicious_destination',
                    'ip': dst_ip,
                    'timestamp': packet.time
                })
        
        # HTTP analysis
        if packet.haslayer(Raw):
            payload = packet[Raw].load.decode('utf-8', errors='ignore')
            if 'User-Agent:' in payload:
                user_agent = extract_user_agent(payload)
                if is_suspicious_user_agent(user_agent):
                    analysis['suspicious_activities'].append({
                        'type': 'suspicious_user_agent',
                        'user_agent': user_agent
                    })
    
    return analysis

def is_suspicious_ip(ip):
    """Check if IP is suspicious based on threat intelligence"""
    # This would integrate with threat intelligence feeds
    # For demo purposes only: these prefixes are placeholders, not a real
    # blocklist, and each one covers a large number of legitimate hosts
    suspicious_ranges = ['91.', '185.', '194.']
    return any(ip.startswith(prefix) for prefix in suspicious_ranges)
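
The traffic analyzer above calls `extract_user_agent` and `is_suspicious_user_agent` without defining them. The following is a minimal sketch of what they could look like. The marker list is a hypothetical watchlist of scripted HTTP clients chosen for illustration; these clients are also used legitimately, so a real deployment would tune the list against known-good traffic.

```python
import re

# Hypothetical watchlist: substrings of scripted HTTP clients (illustrative only)
SUSPICIOUS_AGENT_MARKERS = ("python-requests", "curl/", "wget/", "go-http-client")


def extract_user_agent(payload):
    """Return the User-Agent header value from raw HTTP text, or None if absent."""
    match = re.search(
        r"^User-Agent:[ \t]*(.*?)[ \t]*\r?$",
        payload,
        re.IGNORECASE | re.MULTILINE,
    )
    return match.group(1) if match else None


def is_suspicious_user_agent(user_agent):
    """Flag user agents that contain one of the watchlist markers."""
    if not user_agent:
        return False  # nothing to judge; the caller may handle missing headers separately
    lowered = user_agent.lower()
    return any(marker in lowered for marker in SUSPICIOUS_AGENT_MARKERS)
```

Matching on substrings keeps the check cheap, but it is easy to evade by copying a browser's user agent, so it works best as one weak signal among several.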

Automated Analysis Pipeline

Based on my experience analyzing thousands of samples, I developed this automated pipeline:

class MalwareAnalysisPipeline:
    def __init__(self):
        self.stages = [
            'file_identification',
            'static_analysis',
            'dynamic_analysis',
            'memory_analysis',
            'network_analysis',
            'report_generation'
        ]
    
    def analyze_sample(self, sample_path):
        """Complete automated analysis pipeline"""
        
        results = {'sample': sample_path, 'stages': {}}
        
        try:
            # Stage 1: File Identification
            results['stages']['file_identification'] = self.identify_file_type(sample_path)
            
            # Stage 2: Static Analysis
            results['stages']['static_analysis'] = {
                'pe_analysis': analyze_pe_structure(sample_path),
                'strings': extract_suspicious_strings(sample_path),
                'entropy': calculate_file_entropy(sample_path)
            }
            
            # Stage 3: Dynamic Analysis (if safe)
            if self.is_safe_to_execute(results['stages']['static_analysis']):
                analyzer = BehaviorAnalyzer()
                results['stages']['dynamic_analysis'] = analyzer.analyze_sample(sample_path)
            
            # Stage 4: Generate IOCs
            results['iocs'] = self.extract_iocs(results)
            
            # Stage 5: Threat Classification
            results['classification'] = self.classify_threat(results)
            
            return results
            
        except Exception as e:
            results['error'] = str(e)
            return results
    
    def extract_iocs(self, analysis_results):
        """Extract Indicators of Compromise"""
        iocs = {
            'file_hashes': [],
            'network_indicators': [],
            'registry_keys': [],
            'file_paths': []
        }
        
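        # Extract from static analysis
        static = analysis_results['stages'].get('static_analysis', {})
        
        # File hashes come from the PE analysis stage, when it succeeded
        file_info = static.get('pe_analysis', {}).get('file_info', {})
        for algorithm in ('md5', 'sha256'):
            if file_info.get(algorithm):
                iocs['file_hashes'].append(file_info[algorithm])
        
        if 'strings' in static:
            strings = static['strings']
            iocs['network_indicators'].extend(strings.get('urls', []))
            iocs['network_indicators'].extend(strings.get('ip_addresses', []))
            iocs['registry_keys'].extend(strings.get('registry_keys', []))
            iocs['file_paths'].extend(strings.get('file_paths', []))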
        
        # Extract from dynamic analysis
        dynamic = analysis_results['stages'].get('dynamic_analysis', {})
        if 'behaviors' in dynamic:
            for behavior in dynamic['behaviors']:
                if behavior['type'] == 'network_connection':
                    iocs['network_indicators'].append(behavior['details']['connection'])
        
        return iocs
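
The pipeline above relies on `identify_file_type` and `calculate_file_entropy`, which are not shown. The following is a minimal sketch of both as standalone functions, assuming magic-byte matching for identification and Shannon entropy over the whole file. The signature table is deliberately small and illustrative, and `identify_file_type` would need to be attached to the class (or called without `self`) to fit the pipeline as written.

```python
import math
from collections import Counter

# Small, non-exhaustive table of leading magic bytes
MAGIC_SIGNATURES = [
    (b"MZ", "PE/DOS executable"),
    (b"\x7fELF", "ELF executable"),
    (b"%PDF", "PDF document"),
    (b"PK\x03\x04", "ZIP archive (also DOCX, XLSX, JAR, APK)"),
]


def identify_file_type(path):
    """Guess the file type from its leading magic bytes."""
    with open(path, "rb") as handle:
        header = handle.read(8)
    for magic, label in MAGIC_SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"


def calculate_file_entropy(path):
    """Shannon entropy of the file's bytes, in bits per byte (0.0 to 8.0)."""
    with open(path, "rb") as handle:
        data = handle.read()
    if not data:
        return 0.0
    total = len(data)
    # Sum of -p * log2(p) over the frequency of each byte value
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())
```

Values close to 8.0 are typical of compressed or encrypted content, which is why the section-level check earlier uses 7.0 as its threshold.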

Case Studies from the Field

Case Study 1: Nation-State Malware Analysis

During my time at the Ministry of Defence, I analyzed a sophisticated APT sample:

Initial Observations:

Static Analysis Findings:

Dynamic Analysis Results:

{
  "execution_flow": [
    "Unpacking routine executed",
    "Process hollowing of legitimate Windows binary",
    "Registry persistence established",
    "C2 communication initiated"
  ],
  "network_indicators": [
    "https://legitimate-looking-domain.com/api/v1/status",
    "DNS queries to compromised domains"
  ],
  "persistence_methods": [
    "HKEY_CURRENT_USER\\Software\\Microsoft\\Windows\\CurrentVersion\\Run"
  ]
}

Case Study 2: Banking Trojan Analysis

Sample: Emotet variant targeting financial institutions

Key Findings:

Attribution Indicators:

Tools and Techniques Comparison

Static Analysis Tools

| Tool | Strength | Use Case | Cost |
| --- | --- | --- | --- |
| IDA Pro | Advanced disassembly | Complex reverse engineering | Commercial |
| Ghidra | Free NSA tool | General reverse engineering | Free |
| PEiD | Packer detection | Quick packer identification | Free |
| Strings | String extraction | Basic IOC discovery | Free |

Dynamic Analysis Tools

| Tool | Strength | Use Case | Cost |
| --- | --- | --- | --- |
| Cuckoo Sandbox | Automated analysis | High-volume processing | Free |
| Any.run | Interactive analysis | Manual behavior observation | Freemium |
| Joe Sandbox | Comprehensive reports | Enterprise analysis | Commercial |
| Process Monitor | Real-time monitoring | Live system analysis | Free |

Advanced Evasion Techniques (What to Watch For)

Anti-Analysis Techniques

Modern malware employs sophisticated evasion:

def detect_analysis_evasion(sample_behavior):
    """Detect common analysis evasion techniques"""
    
    evasion_indicators = []
    
    # VM detection techniques
    vm_artifacts = [
        'VMware', 'VirtualBox', 'QEMU', 'Xen',
        'vmtoolsd.exe', 'VBoxService.exe'
    ]
    
    # Sandbox detection
    sandbox_artifacts = [
        'cuckoo', 'analyst', 'malware', 'sample',
        'virus', 'sandbox'
    ]
    
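    # Time-based evasion: a sample that exits almost immediately may have
    # detected the analysis environment and bailed out early
    if sample_behavior.get('execution_time', 0) < 60:
        evasion_indicators.append('Short execution time (possible early exit after environment checks)')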
    
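    # Check for environment enumeration
    # Expects 'process_creations' to be a list of process names or command lines
    processes = sample_behavior.get('process_creations', [])
    for process in processes:
        name = process.lower()
        # Lowercase both sides; mixed-case artifacts such as 'VMware' would
        # otherwise never match the lowercased process name
        if any(artifact.lower() in name for artifact in vm_artifacts):
            evasion_indicators.append(f'VM detection attempt: {process}')
        
        if any(artifact.lower() in name for artifact in sandbox_artifacts):
            evasion_indicators.append(f'Sandbox detection attempt: {process}')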
    
    return evasion_indicators

Best Practices and Lessons Learned

Analysis Workflow

  1. Always start with static analysis - Never execute unknown samples first
  2. Use isolated environments - Dedicated analysis networks and systems
  3. Document everything - Maintain detailed analysis notes and screenshots
  4. Validate findings - Cross-reference IOCs with threat intelligence
  5. Share intelligence - Contribute findings to community databases

Common Pitfalls

Future of Malware Analysis

Analysis Evolution

Conclusion

Malware analysis remains both an art and a science. The keys to effective analysis are:

  1. Systematic Approach: Following consistent methodologies while adapting to new techniques
  2. Continuous Learning: The threat landscape evolves rapidly, requiring constant skill development
  3. Tool Mastery: Understanding both the capabilities and limitations of analysis tools
  4. Intelligence Integration: Connecting analysis findings to broader threat intelligence

Based on my experience analyzing thousands of samples across different organizations, the analysts who succeed are those who combine technical depth with strategic thinking, always asking not just “what does this malware do?” but “what does this tell us about the adversary’s objectives and capabilities?”


This deep dive represents techniques and methodologies developed over years of hands-on malware analysis. For specific questions about advanced techniques or tool recommendations, feel free to reach out.