Cyber Pulse Academy

Collection (TA0009)

The Attacker's Critical Harvest & How to Stop It


Collection is the phase where attackers, having gained access to your environment, systematically gather the data they came to steal. It's the quiet, methodical harvest before the storm of exfiltration.


Introduction: The "So What?" Hook

Collection is the cybersecurity equivalent of a thief rummaging through your house after picking the lock. They're not just trespassing; they're actively searching for your cash, jewelry, and sensitive documents.


Why does this tactic matter so much? Because this is the critical pivot from "we have a foothold" to "we have the goods." Success in Collection directly enables data theft, espionage, and ransomware leverage. If defenders fail to detect activity at this stage, the adversary walks away with the keys to the kingdom, and the eventual breach becomes both far more likely and far more damaging.



The Core Analogy: The Corporate Archeologist

Imagine your network is an ancient, sprawling library archive (like the Library of Alexandria). The attacker isn't a smash-and-grab vandal. They are a determined, unethical archeologist who has secretly tunneled into the basement (Initial Access).

Their goal isn't to burn the library down, yet. It's to methodically locate and catalog the most valuable scrolls. They check the main catalog (file directory listing), search for specific keywords in scroll titles (data searching), carefully unroll promising documents to photograph them (screen capture), and even make notes about which shelves contain the real treasures (target data location). They gather all this into their satchel, preparing for the moment they can sneak it all out through their tunnel (Exfiltration).

This is Collection: the meticulous, targeted gathering of information before the final theft.



Vocabulary Decoder Ring

  • Data Staging: The process of consolidating collected files into a single directory (a "staging area") before exfiltration. Why it matters here: It's a major detection opportunity. An attacker creating zip files or moving gigabytes of data into one folder is highly suspicious.

  • Input Capture: Techniques such as keylogging or fake credential prompts that steal data as the user enters it. Why it matters here: This is how attackers sidestep encryption. They grab credentials and sensitive data at the point of entry, before it is encrypted in transit or at rest.

  • Archive Collected Data: Compressing (e.g., using RAR or 7-Zip) or encrypting the stolen data to prepare it for transfer. Why it matters here: This action can trigger AV/EDR alerts for suspicious command-line archiving, and it leaves large archive files in locations where they are not normally found.

  • Data from Network Shared Drive: Targeting files on network shares such as SMB or NFS file servers (SharePoint and similar platforms fall under T1213, Data from Information Repositories). Why it matters here: It shifts the focus from individual endpoints to the network, requiring defenders to monitor access patterns to sensitive file servers.

  • Screen Capture: Taking screenshots or recording the victim's desktop. Why it matters here: It's a stealthy way to collect information that isn't stored in a file, like dashboards, emails open on screen, or confidential diagrams.
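
The "archive files in unexpected locations" signal from the Archive Collected Data entry can be sketched in a few lines. The example below is a minimal illustration, not a production scanner: it identifies archives by their leading "magic" bytes rather than by file extension, since a staged archive may be renamed. The function names (`sniff_archive_type`, `find_archives`) are hypothetical, chosen for this sketch.

```python
from pathlib import Path
from typing import Optional

# Leading "magic" bytes for common archive formats.
ARCHIVE_SIGNATURES = {
    b"PK\x03\x04": "zip",
    b"7z\xbc\xaf\x27\x1c": "7z",
    b"Rar!\x1a\x07": "rar",
}


def sniff_archive_type(header: bytes) -> Optional[str]:
    """Return the archive format implied by the first bytes of a file, if any."""
    for magic, name in ARCHIVE_SIGNATURES.items():
        if header.startswith(magic):
            return name
    return None


def find_archives(directory: str, min_size_bytes: int = 0) -> list:
    """List (path, format, size) for archives directly inside `directory`."""
    hits = []
    for path in Path(directory).iterdir():
        if not path.is_file():
            continue
        size = path.stat().st_size
        if size < min_size_bytes:
            continue
        # Only the first few bytes are needed to recognise the format.
        with path.open("rb") as handle:
            kind = sniff_archive_type(handle.read(8))
        if kind is not None:
            hits.append((str(path), kind, size))
    return hits
```

Pointing `find_archives` at a temp or public directory surfaces archives even when they carry a misleading extension such as `.log`. Note that Office documents (`.docx`, `.xlsx`) are ZIP containers too, so a real hunt would need an allowlist or a size threshold to keep noise down.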


The Attacker's Playbook (Red Team View)

Red Team Analogy: The Unethical Archeologist at Work

From the attacker's seat, Collection is a race against time and detection. The goal is to be the most efficient archeologist possible: find the crown jewels (domain admin hashes, source code, financial records) without tripping any pressure plates (alerts). The feeling is one of focused intensity. The methodology involves using native tools (Living-off-the-Land Binaries) as much as possible to blend in, while systematically exploring every "room" in the digital library.

Common Collection Techniques

  • T1005 - Data from Local System: Searching and collecting files from the local victim's file system.
  • T1074 - Data Staged: Aggregating stolen data into a central location (like `C:\Windows\Temp\report.zip`) prior to exfiltration.
  • T1056 - Input Capture: Using keyloggers, fake GUI prompts, or API hooks to capture keystrokes and other user input. (Clipboard contents are tracked separately as T1115 - Clipboard Data.)
  • T1113 - Screen Capture: Capturing screenshots or recording the desktop to obtain sensitive information.
  • T1530 - Data from Cloud Storage: Targeting data stored in cloud repositories like AWS S3 buckets, Azure Blobs, or Google Drive.

The Attacker's Toolbox

Commonly cited tools include Mimikatz (for harvesting credentials from memory), LaZagne (for extracting passwords from local applications), and Nigilant32 (for live memory capture and analysis). More advanced groups often favor custom PowerShell scripts for stealthy Collection.

Command-Line Glimpse: The Archeologist's Notes

rem Example: using built-in commands to stage sensitive documents before exfiltration.
rem An attacker might search for finance-related files and copy them to a staging folder.

rem List files containing either keyword (/s recurse, /i ignore case, /m print file names only).
findstr /s /i /m /c:"confidential" /c:"quarterly" *.docx *.xlsx *.pdf > filelist.txt

rem Copy each listed file to the staging directory (use %%a instead of %a inside a .bat file).
mkdir C:\Windows\Temp\Stage
for /f "tokens=*" %a in (filelist.txt) do copy "%a" C:\Windows\Temp\Stage\

rem Archive all collected files into a password-protected zip.
cd /d C:\Windows\Temp\Stage && "C:\Program Files\7-Zip\7z.exe" a -tzip archive.zip * -pMaliciousPassword


The Defender's Handbook (Blue Team View)

Blue Team Analogy: The Library's Silent Alarm System

As a defender, you're not patrolling every aisle 24/7. You've installed motion sensors near the rare book section (sensitive file servers), weight sensors on shelves (monitoring for bulk file reads), and acoustic detectors that listen for the sound of a camera shutter (API calls for screen capture). Your goal is to detect the archeologist's activity the moment they stop browsing and start collecting.

SOC Reality Check: What You'll Actually See

These are the log lines that should make your spine straighten:

  • Windows Event ID 4663 (File Access): An unusual process (`powershell.exe`, `cmd.exe`) successfully accessing hundreds of files in `\\fileserver\Finance\` in a short timeframe.
  • Sysmon Event ID 1 (Process Creation): `7z.exe`, `rar.exe`, or `makecab.exe` spawned from a user's temporary directory or a suspicious parent process, with command lines compressing files in `C:\Windows\Temp\`.
  • EDR Alert: "Process mimikatz.exe or reflective DLL load detected performing credential dumping from lsass memory."
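
The first signal above (one process touching hundreds of files in a short timeframe) comes down to a sliding-window count. The following is a rough sketch under stated assumptions: events have already been parsed from Event ID 4663 records into `(timestamp, user, process, path)` tuples, and the threshold of 100 files in 5 minutes is an illustrative default, not a recommended baseline. The function name `find_bulk_readers` is hypothetical.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def find_bulk_readers(events, threshold=100, window=timedelta(minutes=5)):
    """Return {(user, process): peak_count} for actors that accessed at least
    `threshold` distinct files within any `window`-long span.

    `events` is an iterable of (timestamp, user, process, path) tuples
    (an assumed, pre-parsed form of file-access audit records).
    """
    recent = defaultdict(deque)  # actor -> (timestamp, path) pairs still in window
    flagged = {}
    for ts, user, process, path in sorted(events, key=lambda e: e[0]):
        actor = (user, process)
        queue = recent[actor]
        queue.append((ts, path))
        # Drop accesses that have aged out of the window.
        while queue and ts - queue[0][0] > window:
            queue.popleft()
        distinct = len({p for _, p in queue})
        if distinct >= threshold:
            flagged[actor] = max(flagged.get(actor, 0), distinct)
    return flagged
```

Counting distinct paths rather than raw events keeps a process that rereads one file repeatedly from tripping the alert. In practice the threshold would be tuned per share, since backup agents and indexers legitimately read many files.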

Threat Hunter’s Eye: A Specific Hypothesis

Hunt for anomalous archive utility execution. Most users don't run 7-Zip or WinRAR from the command line. Look for `7z.exe` or `rar.exe` processes, or `powershell.exe` invoking `Compress-Archive`, where the command line includes paths to common staging locations (`Temp`, `Public`, `$Recycle.Bin`) and the source files come from multiple, disparate user or business directories. Correlate this with outbound network connections shortly after.
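
One way to sketch this hypothesis is to filter process-creation events for archivers writing to staging paths, then pair each hit with outbound connections from the same host shortly afterwards. The example below is illustrative only. The event dictionaries and their keys (`host`, `time`, `image`, `command_line`, `dest`) are an assumed, simplified schema rather than any specific product's format, and the 30-minute correlation window is an arbitrary starting point.

```python
from datetime import datetime, timedelta

ARCHIVERS = ("\\7z.exe", "\\rar.exe")
STAGING_HINTS = ("\\temp\\", "\\users\\public\\", "$recycle.bin")


def looks_like_staging(image: str, command_line: str) -> bool:
    """True when an archiver (or PowerShell Compress-Archive) references a staging path."""
    image_l = image.lower()
    cmd_l = command_line.lower()
    is_archiver = image_l.endswith(ARCHIVERS) or (
        image_l.endswith("\\powershell.exe") and "compress-archive" in cmd_l
    )
    return is_archiver and any(hint in cmd_l for hint in STAGING_HINTS)


def correlate(process_events, network_events, within=timedelta(minutes=30)):
    """Pair each suspicious archive event with later outbound connections
    from the same host that occur no more than `within` afterwards."""
    findings = []
    for proc in process_events:
        if not looks_like_staging(proc["image"], proc["command_line"]):
            continue
        followups = [
            net for net in network_events
            if net["host"] == proc["host"]
            and timedelta(0) <= net["time"] - proc["time"] <= within
        ]
        if followups:
            findings.append((proc, followups))
    return findings
```

Requiring both halves (archive creation and a follow-up connection) trades recall for precision: an attacker who waits longer than the window slips through, so the archive-only hits are still worth reviewing on their own.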

Defensive Tools & Categories

Endpoint Detection and Response (EDR) platforms are crucial for detecting process-level Collection (keyloggers, screen capture). Data Loss Prevention (DLP) solutions can flag the movement of data that matches sensitive patterns. File Integrity Monitoring (FIM) or User and Entity Behavior Analytics (UEBA) can spot abnormal access to large volumes of files.
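
To make "sensitive data patterns" concrete, here is a minimal sketch of the kind of content check a DLP engine might run: a regular expression finds card-number-like digit runs, and a Luhn checksum filters out most random numbers. This is an illustration of the idea only. Commercial DLP products use far richer detectors, and the names `CARD_CANDIDATE`, `luhn_valid`, and `find_card_numbers` are hypothetical.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, test mod 10."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str) -> list:
    """Return digit strings in `text` that look like payment card numbers."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits
```

The checksum step matters because a bare digit-run regex fires on order numbers, timestamps, and phone numbers; validating structure is what keeps a pattern-based rule usable at volume.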

Blue Team Command: A Sigma Rule Snippet

title: Suspicious Archive Creation in Staging Directory
description: Detects use of archiving tools in temporary or staging directories, common in data collection phases.
logsource:
  category: process_creation
  product: windows
detection:
  selection:
    Image|endswith:
      - '\7z.exe'
      - '\rar.exe'
      - '\powershell.exe'
    CommandLine|contains:
      - ' a '  # 7z add command
      - ' -tzip'
      - 'Compress-Archive'
    CurrentDirectory|contains:
      - '\Temp'
      - '\TEMP'
      - '\Windows\Temp'
      - '\Users\Public'
  condition: selection
falsepositives:
  - Legitimate admin or user software installation/compression tasks
level: high
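
It helps to be explicit about how the rule above combines its conditions: values listed under one field are alternatives (OR), while the separate fields inside `selection` must all match (AND). The sketch below mimics that logic in plain Python for a single event. It is a simplified stand-in for illustration, not the Sigma toolchain, and the `SELECTION` structure and function names are hypothetical.

```python
def field_matches(value: str, modifier: str, patterns: list) -> bool:
    """OR across the patterns listed under one field (case-insensitive)."""
    value_l = value.lower()
    for pattern in patterns:
        pattern_l = pattern.lower()
        if modifier == "endswith" and value_l.endswith(pattern_l):
            return True
        if modifier == "contains" and pattern_l in value_l:
            return True
    return False


# Mirrors the rule's selection block: (field, modifier) -> alternatives.
SELECTION = {
    ("Image", "endswith"): ["\\7z.exe", "\\rar.exe", "\\powershell.exe"],
    ("CommandLine", "contains"): [" a ", " -tzip", "Compress-Archive"],
    ("CurrentDirectory", "contains"): ["\\Temp", "\\Users\\Public"],
}


def selection_matches(event: dict) -> bool:
    """AND across fields; an event missing a field never matches."""
    return all(
        field in event and field_matches(event[field], modifier, patterns)
        for (field, modifier), patterns in SELECTION.items()
    )
```

Because Sigma string matching is case-insensitive by default, the `\TEMP` entry in the rule duplicates `\Temp`, and `\Windows\Temp` is already covered by the shorter `\Temp` substring. The sketch therefore lists only two directory patterns while matching the same events.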


Real-World Example: From Headlines to Logs

The SolarWinds Sunburst Attack

The SolarWinds compromise was a masterclass in stealthy, long-term Collection. After gaining access via the poisoned SolarWinds Orion update, the threat group (identified as NOBELIUM) spent months quietly exploring victim networks.

Explicit Connection: NOBELIUM's "Sunburst" backdoor enumerated Active Directory, browsed file systems, and collected system information. That reconnaissance let the group identify high-value targets (like email servers and security tools) and selectively gather specific emails and security credentials for its espionage goals.



Mapping the MITRE ATT&CK Collection Landscape

Below is a high-level map of key Techniques under the Collection tactic (TA0009). Think of this as your index to the attacker's playbook. Several of these techniques have sub-techniques for deeper study.

| Technique ID | Name | Brief Purpose |
| --- | --- | --- |
| T1560 | Archive Collected Data | Compress or encrypt stolen data to prepare for exfiltration and avoid detection. |
| T1113 | Screen Capture | Capture screenshots or record the victim's desktop to obtain information. |
| T1005 | Data from Local System | Collect files from the local file system of the compromised host. |
| T1074.001 | Data Staged: Local Data Staging | Gather and group stolen files in a central location on the victim system before exfiltration. |
| T1056 | Input Capture | Intercept user input via keyloggers, GUI input capture, or API hooking. |
| T1530 | Data from Cloud Storage | Collect data from cloud storage services (AWS S3, Azure, Google Drive). |
| T1213 | Data from Information Repositories | Gather data from databases, wikis, SharePoint, or other shared knowledge stores. |


Key Takeaways & Immediate Actions

For Everyone:

  • Collection is the "shopping cart" phase of the cyber attack. The attacker has gotten in and is now filling their cart with your most valuable data.
  • Detecting Collection is about spotting anomalous behavior, not just malicious files. Look for abnormal file access, archiving, and data aggregation.

For Leadership:

  • Business Risk: Successful Collection directly enables devastating data breaches and intellectual property theft, and it provides the leverage needed for ransomware extortion. It turns a security incident into a business catastrophe.

For Defenders:

  • Enable and tune file access auditing on critical file servers and data shares. Alert on users or processes accessing hundreds of files in minutes.
  • Monitor for execution of archiving utilities (7z, rar, PowerShell Compress-Archive) from unexpected processes or locations, especially if they target staging directories.
  • Implement robust credential hygiene (regular rotation, privileged access management) to reduce the value of credentials an attacker might collect via input capture or memory dumping.

