Malware Analysis Deep Dive: From Static to Dynamic Analysis

Introduction

Throughout my cybersecurity career, malware analysis has been one of the most challenging yet rewarding aspects of threat research. From my early days at Innobuzz conducting training sessions to analyzing nation-state malware at the Ministry of Defence, I’ve developed a comprehensive methodology that combines traditional static analysis with advanced dynamic techniques.

The Evolution of Malware Analysis

Traditional vs. Modern Approaches

The malware landscape has evolved dramatically:

Early 2000s: Simple file-based analysis, signature detection
2010s: Packed malware, polymorphic techniques, sandbox evasion
2020s: Living-off-the-land, fileless malware, AI-powered evasion

This evolution requires analysts to adapt their methodologies continuously.

Static Analysis Fundamentals

Static analysis involves examining malware without executing it. This is always the first step in my analysis workflow.

File Metadata Analysis

# Basic file information
file suspicious_sample.exe
strings suspicious_sample.exe | grep -i "http\|ftp\|\.exe\|\.dll"
hexdump -C suspicious_sample.exe | head -20

# Hash calculation for threat intelligence
md5sum suspicious_sample.exe
sha1sum suspicious_sample.exe
sha256sum suspicious_sample.exe
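
The same triage can be scripted when many samples arrive at once. The following is a minimal sketch, not part of the original workflow: `triage_file` and `MAGIC_SIGNATURES` are hypothetical names, and the signature table is a tiny illustrative subset of what the `file` command recognizes. It reads the sample in chunks so large files are hashed without loading them fully into memory.

```python
import hashlib

# Leading "magic" bytes for a few common formats (illustrative, not exhaustive)
MAGIC_SIGNATURES = {
    b"MZ": "PE/DOS executable",
    b"\x7fELF": "ELF executable",
    b"%PDF": "PDF document",
    b"PK\x03\x04": "ZIP archive (also DOCX/XLSX/JAR/APK)",
}

def triage_file(file_path, chunk_size=65536):
    """Guess the file type from its header and compute MD5/SHA-1/SHA-256 in one pass."""
    hashers = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    header = b""
    size = 0
    with open(file_path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            if not header:
                header = chunk[:8]  # Keep the first bytes for signature matching
            size += len(chunk)
            for hasher in hashers.values():
                hasher.update(chunk)
    file_type = next(
        (label for magic, label in MAGIC_SIGNATURES.items() if header.startswith(magic)),
        "unknown",
    )
    result = {"type": file_type, "size": size}
    result.update({name: hasher.hexdigest() for name, hasher in hashers.items()})
    return result
```

The returned hashes can be looked up in threat intelligence sources in the same way as the output of the shell commands above.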

PE Structure Analysis

For Windows executables, PE structure analysis reveals crucial information:

import pefile
import hashlib

def analyze_pe_structure(file_path):
    """Comprehensive PE file analysis"""
    
    try:
        # Read the file once and reuse the bytes for parsing and hashing
        with open(file_path, 'rb') as f:
            data = f.read()
        pe = pefile.PE(data=data)
        
        analysis_results = {
            'file_info': {
                'md5': hashlib.md5(data).hexdigest(),
                'sha256': hashlib.sha256(data).hexdigest(),
                'size': len(data)
            },
            'pe_info': {
                'machine_type': hex(pe.FILE_HEADER.Machine),
                'timestamp': pe.FILE_HEADER.TimeDateStamp,
                'entry_point': hex(pe.OPTIONAL_HEADER.AddressOfEntryPoint),
                'image_base': hex(pe.OPTIONAL_HEADER.ImageBase)
            },
            'sections': [],
            'imports': [],
            'exports': [],
            'suspicious_indicators': []
        }
        
        # Analyze sections
        for section in pe.sections:
            # Section names are raw bytes and are not guaranteed to be valid UTF-8
            name = section.Name.decode('utf-8', errors='replace').rstrip('\x00')
            entropy = section.get_entropy()  # Compute once and reuse
            section_info = {
                'name': name,
                'virtual_address': hex(section.VirtualAddress),
                'raw_size': section.SizeOfRawData,
                'entropy': entropy
            }
            
            # High entropy might indicate packing/encryption
            if entropy > 7.0:
                analysis_results['suspicious_indicators'].append(
                    f"High entropy section: {name} ({entropy:.2f})"
                )
            
            analysis_results['sections'].append(section_info)
        
        # Analyze imports
        if hasattr(pe, 'DIRECTORY_ENTRY_IMPORT'):
            for entry in pe.DIRECTORY_ENTRY_IMPORT:
                dll_name = entry.dll.decode('utf-8')
                functions = []
                
                for function in entry.imports:
                    if function.name:
                        func_name = function.name.decode('utf-8')
                        functions.append(func_name)
                        
                        # Check for suspicious API calls. Prefix matching catches the
                        # exported A/W/Ex variants (CreateProcessW, VirtualAllocEx).
                        if func_name.startswith(('CreateProcess', 'WriteProcessMemory', 'VirtualAlloc')):
                            analysis_results['suspicious_indicators'].append(
                                f"Suspicious API: {func_name} from {dll_name}"
                            )
                
                analysis_results['imports'].append({
                    'dll': dll_name,
                    'functions': functions
                })
        
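        # Analyze exports (the 'exports' list is otherwise never populated)
        if hasattr(pe, 'DIRECTORY_ENTRY_EXPORT'):
            for symbol in pe.DIRECTORY_ENTRY_EXPORT.symbols:
                if symbol.name:
                    analysis_results['exports'].append(
                        symbol.name.decode('utf-8', errors='replace')
                    )
        
        return analysis_results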
        
    except Exception as e:
        return {'error': f"PE analysis failed: {str(e)}"}
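
The `timestamp` field above is returned as a raw integer. The PE `TimeDateStamp` counts seconds since the Unix epoch, so it can be converted to a readable UTC time and sanity-checked. This is a small sketch under that assumption; `describe_pe_timestamp` is a hypothetical helper and the 1995 cutoff is an arbitrary illustrative threshold, not an established rule.

```python
from datetime import datetime, timezone

def describe_pe_timestamp(timestamp, now=None):
    """Convert a PE TimeDateStamp to UTC and note values that look manipulated."""
    now = now or datetime.now(timezone.utc)
    compiled = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    notes = []
    if timestamp == 0:
        notes.append("timestamp is zero (stripped or zeroed)")
    elif compiled > now:
        notes.append("timestamp is in the future (possibly forged)")
    elif compiled.year < 1995:  # Illustrative cutoff, an assumption
        notes.append("timestamp is implausibly old")
    return {"compiled_utc": compiled.isoformat(), "notes": notes}
```

A suspicious timestamp is only a hint: compile times are trivial to forge, so they are best treated as one weak signal among many.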

String Analysis

String analysis often reveals the most actionable intelligence:

import re

def extract_suspicious_strings(file_path):
    """Extract and categorize suspicious strings"""
    
    with open(file_path, 'rb') as f:
        data = f.read()
    
    # Decode as Latin-1: every byte maps to exactly one character, so decoding
    # never fails and no bytes are silently dropped
    text = data.decode('latin-1')
    
    patterns = {
        'urls': re.findall(r'https?://[^\s<>"]{2,}', text, re.IGNORECASE),
        'ip_addresses': re.findall(r'\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b', text),
        'email_addresses': re.findall(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', text),
        'file_paths': re.findall(r'[A-Za-z]:\\[^<>:"|?*\s]+', text),
        'registry_keys': re.findall(r'HKEY_[A-Z_]+\\[^<>:"|?*\s]+', text, re.IGNORECASE),
        'crypto_indicators': re.findall(r'\b[A-Fa-f0-9]{32,}\b', text)  # Potential hashes/keys
    }
    
    return {k: list(set(v)) for k, v in patterns.items() if v}  # Remove duplicates
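
One gap worth noting: Windows binaries frequently store strings as UTF-16LE ("wide" strings), which single-byte pattern matching does not see. The sketch below shows one way to pull out both kinds before applying the regexes above. `extract_strings` is a hypothetical helper, and the minimum length of 4 is a common default rather than a requirement.

```python
import re

def extract_strings(data, min_length=4):
    """Return printable ASCII and UTF-16LE strings found in raw bytes."""
    n = str(min_length).encode()
    # Runs of printable ASCII bytes
    ascii_pattern = re.compile(rb"[\x20-\x7e]{" + n + rb",}")
    # Runs of printable ASCII characters, each followed by a zero byte (UTF-16LE)
    wide_pattern = re.compile(rb"(?:[\x20-\x7e]\x00){" + n + rb",}")
    ascii_strings = [m.decode("ascii") for m in ascii_pattern.findall(data)]
    wide_strings = [m.decode("utf-16-le") for m in wide_pattern.findall(data)]
    return {"ascii": ascii_strings, "utf16le": wide_strings}
```

The wide pattern only covers characters in the ASCII range; strings in other scripts would need a broader approach.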

Dynamic Analysis Techniques

Dynamic analysis involves executing malware in a controlled environment to observe its behavior.

Sandbox Environment Setup

My standard analysis environment includes:

sandbox_configuration:
  hypervisor: VMware Workstation Pro
  operating_systems:
    - Windows 10 x64 (latest patches)
    - Windows 7 x86 (legacy analysis)
    - Ubuntu 20.04 LTS (Linux malware)
  
  monitoring_tools:
    process_monitor: ProcMon
    network_capture: Wireshark
    api_monitor: API Monitor
    memory_analysis: Volatility Framework
    
  isolation:
    network: Isolated virtual network
    snapshots: Pre-execution snapshots for rollback
    time_limits: 30-minute maximum execution

Behavioral Analysis Framework

import psutil
import time
import json
from datetime import datetime

class BehaviorAnalyzer:
    def __init__(self):
        self.baseline_processes = set(p.pid for p in psutil.process_iter())
        self.baseline_connections = set(self.get_network_connections())
        self.analysis_start = datetime.now()
        self.behaviors = []
    
    def get_network_connections(self):
        """Get current network connections"""
        connections = []
        for conn in psutil.net_connections():
            if conn.status == 'ESTABLISHED':
                connections.append(f"{conn.laddr.ip}:{conn.laddr.port} -> {conn.raddr.ip}:{conn.raddr.port}")
        return connections
    
    def monitor_process_creation(self):
        """Monitor for new process creation"""
        current_processes = set(p.pid for p in psutil.process_iter())
        new_processes = current_processes - self.baseline_processes
        
        for pid in new_processes:
            try:
                process = psutil.Process(pid)
                self.behaviors.append({
                    'timestamp': datetime.now().isoformat(),
                    'type': 'process_creation',
                    'details': {
                        'pid': pid,
                        'name': process.name(),
                        'cmdline': ' '.join(process.cmdline()),
                        'parent_pid': process.ppid()
                    }
                })
            except (psutil.NoSuchProcess, psutil.AccessDenied):
                pass
        
        self.baseline_processes = current_processes
    
    def monitor_network_activity(self):
        """Monitor for new network connections"""
        current_connections = set(self.get_network_connections())
        new_connections = current_connections - self.baseline_connections
        
        for conn in new_connections:
            self.behaviors.append({
                'timestamp': datetime.now().isoformat(),
                'type': 'network_connection',
                'details': {'connection': conn}
            })
        
        self.baseline_connections = current_connections
    
    def monitor_file_system(self, watch_paths):
        """Monitor file system changes in specified paths"""
        # Implementation would use file system monitoring
        # This is a simplified version
        pass
    
    def analyze_sample(self, sample_path, duration=300):
        """Run complete behavioral analysis"""
        print(f"Starting analysis of {sample_path}")
        
        # Execute the sample (in a real scenario, this would be more sophisticated).
        # Passing an argument list avoids shell interpretation of the path.
        import subprocess
        process = subprocess.Popen([sample_path])
        
        # Monitor behavior for specified duration
        end_time = time.time() + duration
        
        while time.time() < end_time:
            self.monitor_process_creation()
            self.monitor_network_activity()
            time.sleep(1)
        
        # Terminate the process if still running
        try:
            process.terminate()
        except OSError:
            pass  # The process already exited or cannot be signalled
        
        return {
            'sample': sample_path,
            'analysis_duration': duration,
            'behaviors': self.behaviors,
            'summary': self.generate_summary()
        }
    
    def generate_summary(self):
        """Generate analysis summary"""
        process_count = len([b for b in self.behaviors if b['type'] == 'process_creation'])
        network_count = len([b for b in self.behaviors if b['type'] == 'network_connection'])
        
        return {
            'total_behaviors': len(self.behaviors),
            'process_creations': process_count,
            'network_connections': network_count,
            'risk_assessment': self.assess_risk()
        }
    
    def assess_risk(self):
        """Simple risk assessment based on observed behaviors"""
        risk_score = 0
        
        # Risk factors
        process_count = len([b for b in self.behaviors if b['type'] == 'process_creation'])
        network_count = len([b for b in self.behaviors if b['type'] == 'network_connection'])
        
        if process_count > 5:
            risk_score += 30
        if network_count > 0:
            risk_score += 40
        
        # Check for suspicious process names
        suspicious_processes = ['cmd.exe', 'powershell.exe', 'reg.exe']
        for behavior in self.behaviors:
            if behavior['type'] == 'process_creation':
                if any(susp in behavior['details']['name'].lower() for susp in suspicious_processes):
                    risk_score += 20
        
        return min(risk_score, 100)
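
The `monitor_file_system` method above is left as a stub. One simple way to fill that gap is to snapshot the watched directories before and after execution and compare the two. The sketch below takes that approach using only the standard library; `snapshot_tree` and `diff_snapshots` are hypothetical helpers, and a production setup would more likely rely on ProcMon or OS-level change notifications.

```python
import hashlib
import os

def snapshot_tree(root):
    """Map each file path under root to (size, sha256) for later comparison."""
    snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    data = f.read()
            except OSError:
                continue  # Locked or vanished files are skipped
            snapshot[path] = (len(data), hashlib.sha256(data).hexdigest())
    return snapshot

def diff_snapshots(before, after):
    """Return created, deleted, and modified paths between two snapshots."""
    return {
        "created": sorted(set(after) - set(before)),
        "deleted": sorted(set(before) - set(after)),
        "modified": sorted(
            p for p in before.keys() & after.keys() if before[p] != after[p]
        ),
    }
```

Snapshotting misses files that are created and deleted between the two passes, which is one reason real-time monitoring is preferable for short-lived droppers.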

Advanced Analysis Techniques

Memory Forensics

Memory analysis reveals runtime behavior that file analysis cannot:

# Volatility 2 commands for memory analysis (Volatility 3 uses a different syntax)
# imageinfo suggests a profile, so it runs without --profile
volatility -f memory_dump.raw imageinfo
volatility -f memory_dump.raw --profile=Win10x64_19041 pslist
volatility -f memory_dump.raw --profile=Win10x64_19041 netscan
volatility -f memory_dump.raw --profile=Win10x64_19041 malfind
volatility -f memory_dump.raw --profile=Win10x64_19041 yarascan -y malware_rules.yar

Network Traffic Analysis

from scapy.all import rdpcap, IP, Raw

def analyze_network_traffic(pcap_file):
    """Analyze network traffic for malicious indicators"""
    
    packets = rdpcap(pcap_file)
    analysis = {
        'total_packets': len(packets),
        'protocols': {},
        'destinations': {},
        'suspicious_activities': []
    }
    
    for packet in packets:
        # Protocol analysis
        if packet.haslayer(IP):
            dst_ip = packet[IP].dst
            analysis['destinations'][dst_ip] = analysis['destinations'].get(dst_ip, 0) + 1
            
            # Check for suspicious destinations
            if is_suspicious_ip(dst_ip):
                analysis['suspicious_activities'].append({
                    'type': 'suspicious_destination',
                    'ip': dst_ip,
                    'timestamp': float(packet.time)  # Plain float keeps the result JSON-serializable
                })
        
        # HTTP analysis
        if packet.haslayer(Raw):
            payload = packet[Raw].load.decode('utf-8', errors='ignore')
            if 'User-Agent:' in payload:
                user_agent = extract_user_agent(payload)
                if is_suspicious_user_agent(user_agent):
                    analysis['suspicious_activities'].append({
                        'type': 'suspicious_user_agent',
                        'user_agent': user_agent
                    })
    
    return analysis

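def is_suspicious_ip(ip):
    """Check if IP is suspicious based on threat intelligence"""
    # This would integrate with threat intelligence feeds.
    # For demo purposes only: these prefixes are placeholders that cover
    # entire /8 blocks and would flag a great deal of legitimate traffic.
    suspicious_ranges = ['91.', '185.', '194.']
    return any(ip.startswith(prefix) for prefix in suspicious_ranges)

def extract_user_agent(payload):
    """Return the User-Agent header value from a raw HTTP payload, or ''"""
    for line in payload.splitlines():
        if line.lower().startswith('user-agent:'):
            return line.split(':', 1)[1].strip()
    return ''

def is_suspicious_user_agent(user_agent):
    """Demo heuristic (an assumption, not a vetted rule): flag empty or script-like agents"""
    markers = ['python-requests', 'curl/', 'wget/']
    return not user_agent or any(m in user_agent.lower() for m in markers)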
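
String-prefix matching as in `is_suspicious_ip` is easy to read but coarse. A sketch of a stricter variant using the standard `ipaddress` module follows. The `BLOCKLIST` entries are placeholders taken from the RFC 5737 documentation ranges, not real threat data, and `classify_ip` is a hypothetical name; real entries would come from a threat intelligence feed.

```python
import ipaddress

# Placeholder blocklist (RFC 5737 documentation ranges, not real indicators)
BLOCKLIST = [ipaddress.ip_network(cidr) for cidr in ("192.0.2.0/24", "198.51.100.0/24")]

def classify_ip(ip_text, blocklist=BLOCKLIST):
    """Classify an address as invalid, blocklisted, internal, or external."""
    try:
        ip = ipaddress.ip_address(ip_text)
    except ValueError:
        return "invalid"  # e.g. 999.1.1.1, which a loose regex would accept
    if any(ip in network for network in blocklist):
        return "blocklisted"
    if ip.is_private or ip.is_loopback or ip.is_link_local:
        return "internal"
    return "external"
```

Matching on CIDR networks avoids flagging whole /8 blocks and also rejects malformed strings that a regex-based extractor lets through.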

Automated Analysis Pipeline

Based on my experience analyzing thousands of samples, I developed this automated pipeline:

class MalwareAnalysisPipeline:
    def __init__(self):
        self.stages = [
            'file_identification',
            'static_analysis',
            'dynamic_analysis',
            'memory_analysis',
            'network_analysis',
            'report_generation'
        ]
    
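    def analyze_sample(self, sample_path):
        """Complete automated analysis pipeline

        The helpers identify_file_type, is_safe_to_execute, classify_threat
        and calculate_file_entropy are not shown here and are assumed to be
        defined elsewhere.
        """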
        
        results = {'sample': sample_path, 'stages': {}}
        
        try:
            # Stage 1: File Identification
            results['stages']['file_identification'] = self.identify_file_type(sample_path)
            
            # Stage 2: Static Analysis
            results['stages']['static_analysis'] = {
                'pe_analysis': analyze_pe_structure(sample_path),
                'strings': extract_suspicious_strings(sample_path),
                'entropy': calculate_file_entropy(sample_path)
            }
            
            # Stage 3: Dynamic Analysis (if safe)
            if self.is_safe_to_execute(results['stages']['static_analysis']):
                analyzer = BehaviorAnalyzer()
                results['stages']['dynamic_analysis'] = analyzer.analyze_sample(sample_path)
            
            # Stage 4: Generate IOCs
            results['iocs'] = self.extract_iocs(results)
            
            # Stage 5: Threat Classification
            results['classification'] = self.classify_threat(results)
            
            return results
            
        except Exception as e:
            results['error'] = str(e)
            return results
    
    def extract_iocs(self, analysis_results):
        """Extract Indicators of Compromise"""
        iocs = {
            'file_hashes': [],
            'network_indicators': [],
            'registry_keys': [],
            'file_paths': []
        }
        
        # Extract from static analysis
        static = analysis_results['stages'].get('static_analysis', {})
        if 'strings' in static:
            strings = static['strings']
            iocs['network_indicators'].extend(strings.get('urls', []))
            iocs['network_indicators'].extend(strings.get('ip_addresses', []))
            iocs['registry_keys'].extend(strings.get('registry_keys', []))
            iocs['file_paths'].extend(strings.get('file_paths', []))
        
        # Extract from dynamic analysis
        dynamic = analysis_results['stages'].get('dynamic_analysis', {})
        if 'behaviors' in dynamic:
            for behavior in dynamic['behaviors']:
                if behavior['type'] == 'network_connection':
                    iocs['network_indicators'].append(behavior['details']['connection'])
        
        return iocs
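
The pipeline calls `calculate_file_entropy`, which is not defined in the listing. A minimal sketch of what such a helper could look like is shown below, using Shannon entropy over byte frequencies; `shannon_entropy` is a hypothetical name, and this is an assumption about the intended behavior rather than the original implementation.

```python
import math
from collections import Counter

def shannon_entropy(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # Frequency of each byte value
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def calculate_file_entropy(file_path):
    """Entropy of a whole file; values near 8 suggest compression or encryption."""
    with open(file_path, "rb") as f:
        return shannon_entropy(f.read())
```

Whole-file entropy is a blunt measure. A packed section can hide inside an otherwise low-entropy file, which is why the per-section values from the PE analysis are usually more informative.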

Case Studies from the Field

Case Study 1: Nation-State Malware Analysis

During my time at the Ministry of Defence, I analyzed a sophisticated APT sample:

Initial Observations:

File size: 2.3 MB (unusually large for initial payload)
High entropy sections (7.8+) indicating packing
No obvious strings in static analysis

Static Analysis Findings:

Custom packer with anti-analysis techniques
Legitimate digital signature (stolen certificate)
Import table obfuscation

Dynamic Analysis Results:

{
  "execution_flow": [
    "Unpacking routine executed",
    "Process hollowing of legitimate Windows binary",
    "Registry persistence established",
    "C2 communication initiated"
  ],
  "network_indicators": [
    "https://legitimate-looking-domain.com/api/v1/status",
    "DNS queries to compromised domains"
  ],
  "persistence_methods": [
    "HKEY_CURRENT_USER\\Software\\Microsoft\\Windows\\CurrentVersion\\Run"
  ]
}
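
Registry keys like the `Run` entry above can be triaged automatically once they have been extracted. The following sketch flags keys that sit at or below a few well-known autorun locations. `AUTORUN_PATHS` is a deliberately small illustrative subset and `find_persistence_keys` is a hypothetical helper, not something taken from the case itself.

```python
# Well-known autorun locations, lowercase, without the hive (illustrative subset)
AUTORUN_PATHS = (
    r"software\microsoft\windows\currentversion\run",
    r"software\microsoft\windows\currentversion\runonce",
    r"software\microsoft\windows nt\currentversion\winlogon",
)

def find_persistence_keys(registry_keys):
    """Return the keys that sit at or below a known autorun location."""
    hits = []
    for key in registry_keys:
        # Split "HKEY_...\rest\of\path" into the hive and the remaining path
        _hive, _, path = key.partition("\\")
        path = path.lower().rstrip("\\")
        if any(path == p or path.startswith(p + "\\") for p in AUTORUN_PATHS):
            hits.append(key)
    return hits
```

The output of `extract_suspicious_strings` or of dynamic registry monitoring could be passed through such a filter to surface likely persistence mechanisms first.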

Case Study 2: Banking Trojan Analysis

Sample: Emotet variant targeting financial institutions

Key Findings:

Modular architecture with plugin system
Email harvesting capabilities
Credential theft targeting specific banks
P2P communication for resilience

Attribution Indicators:

Code similarities to known Emotet samples
Infrastructure overlaps with previous campaigns
TTP alignment with TA542 group

Tools and Techniques Comparison

Static Analysis Tools

Tool	Strength	Use Case	Cost
IDA Pro	Advanced disassembly	Complex reverse engineering	Commercial
Ghidra	Open-source suite with built-in decompiler	General reverse engineering	Free
PEiD	Packer detection	Quick packer identification	Free
Strings	String extraction	Basic IOC discovery	Free

Dynamic Analysis Tools

Tool	Strength	Use Case	Cost
Cuckoo Sandbox	Automated analysis	High-volume processing	Free
Any.run	Interactive analysis	Manual behavior observation	Freemium
Joe Sandbox	Comprehensive reports	Enterprise analysis	Commercial
Process Monitor	Real-time monitoring	Live system analysis	Free

Advanced Evasion Techniques (What to Watch For)

Anti-Analysis Techniques

Modern malware employs sophisticated evasion:

def detect_analysis_evasion(sample_behavior):
    """Detect common analysis evasion techniques"""
    
    evasion_indicators = []
    
    # VM detection techniques (lowercase, because matching below is case-insensitive)
    vm_artifacts = [
        'vmware', 'virtualbox', 'qemu', 'xen',
        'vmtoolsd.exe', 'vboxservice.exe'
    ]
    
    # Sandbox detection
    sandbox_artifacts = [
        'cuckoo', 'analyst', 'malware', 'sample',
        'virus', 'sandbox'
    ]
    
    # Time-based evasion: a sample that exits almost immediately may have
    # detected the analysis environment and quit early
    if sample_behavior.get('execution_time', 0) < 60:
        evasion_indicators.append('Short execution time (possible early exit after environment checks)')
    
    # Check for environment enumeration. 'process_creations' is expected to be
    # a list of process names or command lines (strings).
    processes = sample_behavior.get('process_creations', [])
    for process in processes:
        lowered = process.lower()
        if any(artifact in lowered for artifact in vm_artifacts):
            evasion_indicators.append(f'VM detection attempt: {process}')
        
        if any(artifact in lowered for artifact in sandbox_artifacts):
            evasion_indicators.append(f'Sandbox detection attempt: {process}')
    
    return evasion_indicators

Best Practices and Lessons Learned

Analysis Workflow

Always start with static analysis - Never execute unknown samples first
Use isolated environments - Dedicated analysis networks and systems
Document everything - Maintain detailed analysis notes and screenshots
Validate findings - Cross-reference IOCs with threat intelligence
Share intelligence - Contribute findings to community databases
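
When sharing IOCs, it is common practice to "defang" URLs and IP addresses so they cannot be clicked or auto-resolved by accident. The sketch below shows one widely used convention (`hxxp`, `[.]`, `[://]`); the function names are hypothetical and the exact notation varies between teams and platforms.

```python
import re

def defang(indicator):
    """Make a URL, domain, or IP address non-clickable for safe sharing."""
    defanged = re.sub(r"^http", "hxxp", indicator, flags=re.IGNORECASE)
    defanged = defanged.replace("://", "[://]")
    return defanged.replace(".", "[.]")

def refang(indicator):
    """Reverse defang() so the indicator can be fed back into tooling."""
    restored = indicator.replace("[.]", ".").replace("[://]", "://")
    return re.sub(r"^hxxp", "http", restored, flags=re.IGNORECASE)

# Example: defang("https://example.test/api/v1") returns "hxxps[://]example[.]test/api/v1"
```

Keeping a matching `refang` step makes it straightforward to round-trip indicators between reports and detection tooling.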

Common Pitfalls

Rushing to execute: Static analysis often provides sufficient intelligence
Insufficient isolation: Malware can escape poorly configured sandboxes
Ignoring metadata: File creation times, paths, and attributes provide context
Over-relying on automation: Manual analysis catches what automated tools miss

Future of Malware Analysis

Emerging Trends

AI-Powered Malware: Malware using machine learning for evasion
Fileless Attacks: Living-off-the-land techniques reducing file-based artifacts
Cloud-Native Threats: Malware designed for cloud environments
IoT Malware: Embedded system threats requiring specialized analysis

Analysis Evolution

ML-Assisted Analysis: Using AI to identify patterns and classify samples
Behavior-Based Detection: Focus on actions rather than signatures
Threat Intelligence Integration: Real-time feeds enhancing analysis context
Collaborative Analysis: Shared platforms for community-driven research

Conclusion

Malware analysis remains both an art and a science. The key to effective analysis is:

Systematic Approach: Following consistent methodologies while adapting to new techniques
Continuous Learning: The threat landscape evolves rapidly, requiring constant skill development
Tool Mastery: Understanding both the capabilities and limitations of analysis tools
Intelligence Integration: Connecting analysis findings to broader threat intelligence

Based on my experience analyzing thousands of samples across different organizations, the analysts who succeed are those who combine technical depth with strategic thinking, always asking not just “what does this malware do?” but “what does this tell us about the adversary’s objectives and capabilities?”

This deep dive represents techniques and methodologies developed over years of hands-on malware analysis. For specific questions about advanced techniques or tool recommendations, feel free to reach out.

Uddip Ranjan
