Cyber Pulse Academy

T1593.003 MITRE ATT&CK

Code Repositories

Adversaries may search public code repositories for information about victims that can be used during targeting, including leaked credentials, API keys, internal URLs, and configuration secrets exposed in source code.

Tactic: TA0043 (Reconnaissance)

Simulation: Repository Recon in Action

[CSS-only animation not reproduced here: an adversary browses a public GitHub repository, discovers sensitive files, and extracts hardcoded secrets from source code and configuration files.]

The simulation demonstrates how sensitive credentials are frequently exposed in public repositories, from database passwords and AWS access keys to payment processing secrets and JWT signing keys.

Why It Matters

Public code hosting platforms such as GitHub, GitLab, and Bitbucket have become one of the most prolific sources of leaked credentials and sensitive organizational data in modern cyber operations. The scale of the problem is staggering: in 2024 alone, GitHub detected more than 39 million leaked secrets across its platform, and Snyk reported that 28.65 million hardcoded secrets were added to public GitHub repositories in 2025, a 25% increase year-over-year. According to GitHub's own analysis, over 80% of data breaches involving leaked credentials trace back to sensitive information exposed in public repositories. Perhaps most alarmingly, GitGuardian found that 70% of leaked secrets remain active for two years or more, giving adversaries an extended window of opportunity.

39M+: Secrets leaked on GitHub (2024). Source: GitHub Insights
28.65M: New hardcoded secrets (2025). Source: Snyk Research
80%: Breaches linked to repo credentials. Source: CTI Labs / GitHub
70%: Leaked secrets still active after 2 years. Source: GitGuardian 2025
5M: Web servers exposing Git metadata. Source: Security Affairs 2026
250K: Servers leaking deployment credentials. Source: Security Affairs 2026
📄 HAFNIUM Reference: Microsoft's analysis of the Chinese state-sponsored group HAFNIUM found that the group regularly searched public GitHub repositories for leaked corporate credentials belonging to its targets. These credentials (see T1589.001) were then used to gain initial access to victim environments. See CISA Advisory AA23-320A for details on related threat actor techniques.

The types of sensitive information commonly found in public repositories include API keys for cloud services (AWS, Azure, GCP), database credentials, encryption keys, internal infrastructure URLs, CI/CD pipeline tokens, SSH private keys, OAuth tokens, and payment processing secrets. The attack surface extends beyond GitHub to include GitLab, Bitbucket, and self-hosted Git instances: nearly 5 million web servers were found exposing Git metadata in 2026, with 250,000 leaking deployment credentials through exposed .git/config files.

GitHub Code Search: Advanced query-based code search
GitGuardian: Automated secrets detection
TruffleHog: Git history secret scanning
gitleaks: Open-source secret detector
GitDorker: GitHub dorking automation
Nuclear: Repo enumeration tool
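
The pattern matching that secret-detection tools perform can be illustrated with a minimal sketch. The three rules below are illustrative assumptions, not the rule set of any tool listed above; production scanners use far larger and more carefully tuned rule sets, often combined with entropy checks and live validation.

```python
import re

# Illustrative patterns only (assumptions for this sketch, not a real tool's rules).
SECRET_PATTERNS = {
    # Long-term AWS access key IDs start with "AKIA" followed by 16 characters.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM header that precedes RSA/EC/OpenSSH private keys.
    "private_key_header": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
    # Heuristic: NAME=value where the name hints at a secret and the value is 8+ chars.
    "generic_assignment": re.compile(
        r"(?i)\b[A-Z0-9_]*(?:password|secret|api_key|token)[A-Z0-9_]*"
        r"\s*[=:]\s*['\"]?([^\s'\"]{8,})"
    ),
}


def scan_text(text):
    """Return (line_number, rule_name) pairs for every line matching a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, rule))
    return findings


# Placeholder values only; the key ID is the example value used in AWS documentation.
sample = """\
DEBUG=true
AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
DB_PASSWORD=example-not-a-real-password
"""
print(scan_text(sample))
```

A heuristic like `generic_assignment` will produce false positives, which is one reason dedicated tools are preferable to hand-rolled patterns outside of demonstrations.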

Key Terms & Concepts

Term: Definition
Code Repository: A version-controlled storage system (e.g., GitHub, GitLab, Bitbucket) where developers store and collaborate on source code. Repositories may be public (accessible to anyone) or private (restricted to authorized users).
Hardcoded Secret: A credential, API key, token, or password embedded directly in source code rather than being loaded from a secure vault or environment variable at runtime.
Secret Sprawl: The uncontrolled spread of secrets across multiple code repositories, branches, and systems, often resulting from developers copying configuration files or credentials across projects.
Git History: The complete record of all changes made to a repository, including deleted files. Even if a developer removes a secret from the current code, it remains accessible in the Git history unless the history is rewritten.
GitHub Dorking: The practice of using advanced search queries on GitHub to find specific types of sensitive information, such as credentials, API keys, internal URLs, or configuration files within public repositories.
Pre-Commit Hook: A script that runs automatically before a Git commit is finalized, commonly used to scan code for secrets before they are pushed to a repository.
Secret Scanning: Automated detection of exposed credentials in code repositories using pattern-matching tools. Platforms like GitHub offer built-in secret scanning that detects known secret formats across all public repositories.
.env File: A configuration file conventionally used to store environment variables, including database credentials, API keys, and other sensitive runtime configuration. It is frequently committed to public repositories by accident.
🏠 Everyday Analogy: Imagine leaving your house keys under the doormat. Now imagine that doormat is in the middle of a busy public park, where anyone walking by can see the keys. That is essentially what happens when developers commit credentials to public code repositories: the "door" to the organization's infrastructure is wide open, and adversaries only need to walk by and pick up the keys. Even if you later remove the keys from under the doormat (delete the file), the park's security cameras (Git history) have permanently recorded where they were placed. Anyone who knows to check the footage can still find them.

Real-World Scenario

Jordan Martinez

Senior Backend Developer, Acme Aerospace

A skilled developer working on Acme Aerospace's production infrastructure, Jordan accidentally committed a .env.production file containing database credentials, AWS access keys, and a Stripe payment secret to a public GitHub repository during a late-night debugging session.

❌ Before: The Mistake

On November 3rd, 2024, at 11:47 PM, Jordan was debugging a production database connection issue. Frustrated after hours of troubleshooting, they copied the production environment file into the repository root as .env.production to test locally. The file contained 6 critical secrets: the PostgreSQL admin password, the AWS access key and secret key, a Stripe live API key, a Redis password, and the JWT signing secret. The .gitignore file was configured to exclude .env but not .env.production, so the file was committed and pushed along with the rest of the work. Jordan intended to delete it the next morning but forgot.

Severity: CRITICAL (6 secrets exposed for 47 days)

✅ After: The Discovery

On December 20th, Acme Aerospace's security team received an alert from their GitGuardian integration: 6 secrets detected in a public repository. Investigation revealed the AWS keys had been used to spin up $14,200 in unauthorized cloud compute resources. The Stripe key had been tested against the payment API 47 times. The database password had been used in 3 failed login attempts against their production PostgreSQL instance from IP addresses in Eastern Europe. Total incident cost: $87,000 including forensic investigation, credential rotation across 12 services, security audit, and lost productivity.

Remediation: Complete (all secrets rotated, pre-commit hooks deployed)

🔒 Key Lesson: A .gitignore file only keeps untracked files from being added; it does nothing for files that have already been committed, and it only helps when its patterns actually match (here, .env did not cover .env.production). Once a secret enters Git history, it persists even after the file is deleted. Organizations should use a scanner such as GitGuardian or Snyk to check both the current files and the entire commit history. See also T1589.001 (Credentials) and T1592.002 (Software) for related techniques.
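
The gap in the scenario's ignore rules can be reproduced with a short sketch. It approximates .gitignore matching with Python's `fnmatch`, which is an assumption that holds only for simple basename patterns; real .gitignore rules also support slashes, negation, and `**`, none of which are modeled here.

```python
from fnmatch import fnmatch


def is_ignored(filename, patterns):
    """Approximate .gitignore matching for simple basename glob patterns.

    Assumption: no slashes, "!" negation, or "**" in the patterns.
    """
    return any(fnmatch(filename, pattern) for pattern in patterns)


original_rules = [".env"]            # the rule described in the scenario
stricter_rules = [".env", ".env.*"]  # also covers .env.production, .env.local, ...

print(is_ignored(".env", original_rules))             # True
print(is_ignored(".env.production", original_rules))  # False: the file slips through
print(is_ignored(".env.production", stricter_rules))  # True
```

Note that a `.env.*` rule also ignores a template such as .env.example; in an actual .gitignore file, a negated entry like `!.env.example` placed after it keeps the template tracked.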

Step-by-Step Protection Guide

Follow these seven critical steps to protect your organization from code repository-based reconnaissance. Each step includes actionable measures aligned with CISA guidance and NIST Cybersecurity Framework recommendations.

  1. Audit All Public Repositories

    Conduct a comprehensive inventory of every public repository associated with your organization. Search GitHub, GitLab, and Bitbucket for repositories created by current and former employees. Use tools like GitGuardian and gitleaks to scan existing repositories for accidentally committed secrets, API keys, and credentials. Pay special attention to forked repositories and archived projects.

    PREVENT · T1596: Search Open Technical Databases
  2. Deploy Pre-Commit Secret Scanning

    Install pre-commit hooks (such as detect-secrets, gitleaks, or TruffleHog) on every developer workstation to automatically scan code for secrets before commits are created. Configure CI/CD pipelines to fail builds when secrets are detected. This prevents credentials from entering the codebase at the source. See T1592.002 (Software) for related software identification techniques.

    PREVENT
  3. Implement Proper .gitignore Configuration

    Maintain strict .gitignore rules that exclude all environment files (.env, .env.*), credential files, private keys, and configuration directories. Use template files (e.g., .env.example) with placeholder values instead of real credentials. Review and update .gitignore rules quarterly as new file types and tools are adopted.

    PREVENT
  4. Rotate All Previously Exposed Secrets

    If any secret has been committed to a public repository, even briefly and even if deleted immediately, treat it as compromised. Rotate database passwords, API keys, tokens, SSH keys, and encryption keys. Use automated secret rotation where available (AWS IAM access keys, GitHub tokens, etc.). Monitor for unauthorized usage of previously exposed credentials for at least 90 days post-rotation. Reference T1589.001 (Credentials) for credential management best practices.

    RESPOND
  5. Enable Platform Secret Scanning

    Activate GitHub's native secret scanning (available for free on public repositories) and push protection features. Configure GitLab's secret detection in CI/CD pipelines. Set up automated alerts so security teams are notified within minutes when a new secret is detected. On public repositories, GitHub's secret scanning partner program also notifies participating service providers, which can then revoke exposed tokens automatically.

    DETECT · T1593.002: Search Engines
  6. Adopt Secrets Management Solutions

    Replace all hardcoded credentials with secrets pulled from secure vaults at runtime. Deploy secrets management tools such as HashiCorp Vault, AWS Secrets Manager, Azure Key Vault, or GitHub Actions Secrets. Use platform-native secret injection mechanisms so developers never need to handle raw credentials directly. This addresses the root cause of secret exposure rather than its symptoms.

    PREVENT
  7. Monitor for Repository-Based Reconnaissance

    Implement continuous monitoring of public code repositories for organizational information leakage. Track mentions of your company name, domain names, internal project names, and employee usernames across GitHub, GitLab, and Bitbucket. Set up Google Alerts and GitHub search notifications. Investigate any forked repositories or code snippets that reference your infrastructure. Leverage threat intelligence from CISA and the MITRE ATT&CK framework to stay current on adversary techniques.

    DETECT · T1593: Search Open Websites/Domains
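
As an illustration of step 6, the following minimal sketch reads credentials from the process environment at runtime instead of embedding them in source. The variable names (`DB_PASSWORD`, `DB_HOST`), the user `app`, and the database `appdb` are hypothetical; in practice a vault or the deployment platform would populate the environment.

```python
import os
from urllib.parse import quote


class MissingSecretError(RuntimeError):
    """Raised when a required secret is absent from the environment."""


def require_secret(name, environ=None):
    """Return the secret stored under `name`, failing loudly if it is unset."""
    env = os.environ if environ is None else environ
    value = env.get(name)
    if not value:
        raise MissingSecretError(f"required secret {name!r} is not set")
    return value


def build_database_url(environ=None):
    """Assemble a PostgreSQL URL without any credential appearing in source."""
    password = quote(require_secret("DB_PASSWORD", environ), safe="")
    host = require_secret("DB_HOST", environ)
    # "app" and "appdb" are placeholder names for this sketch.
    return f"postgresql://app:{password}@{host}:5432/appdb"


# Usage with an explicit mapping; omit the argument to read os.environ.
demo_env = {"DB_PASSWORD": "p@ss", "DB_HOST": "db.internal.example"}
print(build_database_url(demo_env))
```

Failing at startup when a secret is missing is a deliberate choice here: it avoids the temptation to fall back to a hardcoded default.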

Common Mistakes & Best Practices

❌ Common Mistakes

Deleting files instead of rotating secrets. Removing a secret file from the current branch does not remove it from the repository: the credential remains in Git history and can be recovered by anyone who clones or forks the repository.
Relying solely on .gitignore. Gitignore only prevents untracked files from being staged. If a file was ever committed, .gitignore will not remove it from history. Many developers discover this too late.
Using personal GitHub accounts for work code. When employees leave, their personal repositories may contain proprietary code and credentials. There's no organizational control over personal accounts.
Ignoring forked repositories. Forking preserves the entire Git history, including all previously committed secrets. An adversary can fork your repo and access all historical commits even after you clean the original.
Not scanning Git history. Scanning only the current HEAD commit misses secrets in earlier commits, branches, stashes, and tags. Full-history scanning with TruffleHog or gitleaks is essential.
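
Why history matters can be shown with a small sketch that scans patch output rather than the working tree. It assumes text in the shape produced by `git log -p --all`; the sample below is a hand-written, abbreviated excerpt (not real Git output), and the single AWS key pattern is illustrative.

```python
import re

AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def secrets_in_history(log_text):
    """Yield (commit, line) for added lines in patch output that match.

    Lines starting with a single "+" are additions; "+++" lines are file headers.
    """
    commit = None
    for line in log_text.splitlines():
        if line.startswith("commit "):
            commit = line.split()[1]
        elif line.startswith("+") and not line.startswith("+++"):
            if AWS_KEY_ID.search(line):
                yield commit, line[1:].strip()


# Abbreviated, hand-written excerpt: the newest commit deletes the file,
# but the older commit that added it is still part of the history.
SAMPLE_LOG = """\
commit bbbbbbb
    Remove env file

--- a/.env.production
+++ /dev/null
-AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
commit aaaaaaa
    Add env file

--- /dev/null
+++ b/.env.production
+AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
"""
print(list(secrets_in_history(SAMPLE_LOG)))
```

A scan of the current files would find nothing in this example, since the file no longer exists at HEAD; only the history scan surfaces the key.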

✅ Best Practices

Treat every exposed secret as compromised. Immediately rotate any credential that has ever appeared in a public repository, regardless of how briefly. Assume adversaries captured it within minutes.
Use pre-commit hooks universally. Deploy secret scanning hooks across all developer machines and CI/CD pipelines. Tools like detect-secrets, gitleaks, and TruffleHog catch 95%+ of accidental credential commits.
Enforce organization-level repository policies. Use GitHub Organization settings to require 2FA, restrict repository creation, enable push protection, and mandate secret scanning for all repositories.
Adopt zero-trust secrets management. Use HashiCorp Vault, AWS Secrets Manager, or Azure Key Vault to inject secrets at runtime. Never store secrets in code, configuration files, or container images.
Implement continuous monitoring. Monitor GitHub, GitLab, and Bitbucket for organizational mentions, forked repositories, and leaked credentials. Use GitGuardian, GitHub Secret Scanning, and custom alerts.

Red Team vs Blue Team

🔴 Red Team: Attacker Perspective

01. Target Identification: Use GitHub code search with advanced queries to find repositories belonging to the target organization. Search for company name, domain names, internal project codenames, and employee usernames. Identify repositories that appear to contain infrastructure or deployment code.
02. Secret Harvesting: Search for sensitive file patterns using GitHub dorks: filename:.env password, filename:credentials aws_access_key, filename:id_rsa. Clone repositories and run automated secret scanning tools (TruffleHog, gitleaks) against the full Git history.
03. Infrastructure Mapping: Extract internal URLs, database connection strings, API endpoints, and server hostnames from configuration files. Build a complete map of the target's internal infrastructure including subdomains, service ports, and technology stack.
04. Credential Validation: Test discovered credentials against target services. Attempt database connections with exposed PostgreSQL/MySQL credentials. Validate AWS keys using the STS GetCallerIdentity API. Test API keys against documented endpoints. See T1589.001 (Credentials).
05. Code Analysis for Vulnerabilities: Review source code for hardcoded logic flaws, insecure authentication patterns, vulnerable dependency versions (identifying T1592.002 targets), and proprietary business logic that reveals attack surfaces and exploitation paths.

🔵 Blue Team: Defender Perspective

01. Repository Inventory & Classification: Maintain a complete inventory of all repositories associated with the organization. Classify each as public or private. Audit for orphaned repositories from former employees, discontinued projects, and unintentionally publicized private repos.
02. Automated Secret Scanning: Deploy GitHub Secret Scanning, GitGuardian, or Snyk across all repositories. Configure push protection to block commits containing detected secrets. Set up real-time alerts to security teams for immediate investigation and response.
03. Developer Security Training: Educate all developers on the dangers of committing secrets, proper use of secrets management tools, .gitignore best practices, and the irreversibility of Git history. Include real-world breach case studies (HAFNIUM, Uber, Toyota) in training materials.
04. Incident Response Playbooks: Maintain documented procedures for responding to discovered secret leaks: immediate secret rotation, Git history sanitization (git-filter-repo or BFG Repo Cleaner), notification to affected service providers, and forensic investigation. Coordinate with CISA for significant incidents.
05. External Reconnaissance Defense: Proactively search public repositories for organizational information using the same techniques adversaries use. Monitor GitHub, GitLab, and Bitbucket for company mentions, leaked configurations, and exposed infrastructure details. Leverage T1596 (Search Open Technical Databases) monitoring for correlated threats.

Threat Hunter's Eye

Threat hunters can detect code repository reconnaissance by monitoring for specific patterns of behavior that indicate an adversary is systematically searching public repositories for organizational secrets. The following indicators help security teams identify this activity early and respond before credentials can be weaponized.

🔍 Repository Enumeration Pattern

Adversaries often clone or fork multiple repositories from the same organization in rapid succession. Monitor for unusual spikes in repository access, especially from unfamiliar user accounts or IP addresses. Searches that outsiders run against public repositories are generally not visible to the repository owner, so rely on the logs you do control (organization audit logs, self-hosted Git server logs, API access logs) and look there for queries that combine terms like "password," "secret," "key," "token," or "credential" with your company name or domain.

🔒 Credential Validation Attempts

After harvesting credentials from public repositories, adversaries validate them by attempting authentication against target services. Monitor for failed login attempts using recently exposed credential formats, database or API authentication attempts from unusual geographic locations, and AWS CloudTrail events showing GetCallerIdentity calls made with your access keys from unrecognized source IP addresses. Correlate with T1589.001 (Credentials) threat intelligence.
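
The CloudTrail check described above might be sketched as a filter over already-parsed event records. The field names (`eventSource`, `eventName`, `sourceIPAddress`, `eventTime`, `userIdentity.accessKeyId`) follow the CloudTrail record format; the allow-list of known IP addresses and the sample events are assumptions made for illustration.

```python
def suspicious_identity_checks(events, known_ips):
    """Return summaries of sts:GetCallerIdentity calls from IPs not in known_ips."""
    flagged = []
    for event in events:
        if event.get("eventSource") != "sts.amazonaws.com":
            continue
        if event.get("eventName") != "GetCallerIdentity":
            continue
        if event.get("sourceIPAddress") in known_ips:
            continue
        flagged.append({
            "time": event.get("eventTime"),
            "ip": event.get("sourceIPAddress"),
            "access_key": event.get("userIdentity", {}).get("accessKeyId"),
        })
    return flagged


# Hypothetical sample records using documentation-reserved IP ranges.
KNOWN_IPS = {"198.51.100.10"}
events = [
    {"eventSource": "sts.amazonaws.com", "eventName": "GetCallerIdentity",
     "eventTime": "2024-12-01T03:14:00Z", "sourceIPAddress": "203.0.113.50",
     "userIdentity": {"accessKeyId": "AKIAIOSFODNN7EXAMPLE"}},
    {"eventSource": "sts.amazonaws.com", "eventName": "GetCallerIdentity",
     "eventTime": "2024-12-01T09:00:00Z", "sourceIPAddress": "198.51.100.10",
     "userIdentity": {"accessKeyId": "AKIAIOSFODNN7EXAMPLE"}},
    {"eventSource": "s3.amazonaws.com", "eventName": "ListBuckets",
     "eventTime": "2024-12-01T09:05:00Z", "sourceIPAddress": "203.0.113.50",
     "userIdentity": {"accessKeyId": "AKIAIOSFODNN7EXAMPLE"}},
]
print(suspicious_identity_checks(events, KNOWN_IPS))
```

A static IP allow-list is the simplest possible baseline; real deployments would more likely compare against historical usage per access key.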

📊 Infrastructure Reconnaissance Correlation

Adversaries combine repository data with other reconnaissance sources. Monitor for correlation between publicly exposed infrastructure details (from code repositories) and active scanning of those targets. If internal URLs, database hostnames, or API endpoints discovered in your code suddenly receive external traffic or vulnerability scan probes, this strongly indicates repository-based reconnaissance has progressed to active targeting.

📄 Fork and Clone Anomalies

Watch for suspicious forking activity on public repositories, especially forks created by newly registered accounts, accounts with no other activity, or accounts from regions not associated with your organization. Adversaries fork repositories to preserve Git history (including deleted secrets) before the organization can clean them. Use GitHub's audit log to track fork events and clone operations.
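
The fork heuristics above can be expressed as a small filter. The record shape is hypothetical: it assumes fork events have already been joined with account metadata into dictionaries carrying `login`, `account_created`, `forked_at`, and `public_repos`, and the seven-day threshold is an arbitrary illustrative choice.

```python
from datetime import datetime, timedelta


def flag_suspicious_forks(forks, max_account_age_days=7):
    """Return logins whose account is very new or has no other public activity.

    Each record is a hypothetical pre-joined dict with ISO 8601 timestamps.
    """
    flagged = []
    for fork in forks:
        created = datetime.fromisoformat(fork["account_created"])
        forked = datetime.fromisoformat(fork["forked_at"])
        is_new = forked - created <= timedelta(days=max_account_age_days)
        is_empty = fork["public_repos"] <= 1  # the fork itself may be the only repo
        if is_new or is_empty:
            flagged.append(fork["login"])
    return flagged


# Invented sample data for illustration only.
forks = [
    {"login": "long-time-contributor", "account_created": "2019-05-01T10:00:00",
     "forked_at": "2024-11-04T09:00:00", "public_repos": 42},
    {"login": "fresh-account-123", "account_created": "2024-11-03T23:00:00",
     "forked_at": "2024-11-04T01:30:00", "public_repos": 1},
]
print(flag_suspicious_forks(forks))
```

New contributors also create accounts and fork immediately, so a result from this kind of filter is a lead for review rather than evidence of reconnaissance.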

🔎 Hunting Queries for Security Teams

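# GitHub Code Search Dorks (defensive monitoring)
# Note: filename: and extension: are legacy qualifiers; GitHub's current code search uses path: instead (for example, path:.env or path:*.pem).
org:"your-org-name" filename:.env password
org:"your-org-name" filename:credentials aws_access_key
org:"your-org-name" extension:pem PRIVATE KEY
org:"your-org-name" filename:id_rsa
org:"your-org-name" DB_PASSWORD OR SECRET_KEY OR API_KEY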
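
For teams that monitor several organizations, the queries above can be generated rather than maintained by hand. The sketch below only builds query strings and does not call any API; the templates are copied from the list above, and the organization-name check is an assumed approximation of GitHub's naming rules.

```python
import re

# Templates copied from the hunting queries above; {org} is filled in per organization.
QUERY_TEMPLATES = [
    'org:"{org}" filename:.env password',
    'org:"{org}" filename:credentials aws_access_key',
    'org:"{org}" extension:pem PRIVATE KEY',
    'org:"{org}" filename:id_rsa',
    'org:"{org}" DB_PASSWORD OR SECRET_KEY OR API_KEY',
]

# Assumed approximation: alphanumerics and inner hyphens, at most 39 characters.
ORG_NAME = re.compile(r"^[A-Za-z0-9](?:[A-Za-z0-9-]{0,37}[A-Za-z0-9])?$")


def build_queries(orgs):
    """Return a dict mapping each organization name to its list of search queries."""
    queries = {}
    for org in orgs:
        if not ORG_NAME.match(org):
            raise ValueError(f"unexpected organization name: {org!r}")
        queries[org] = [template.format(org=org) for template in QUERY_TEMPLATES]
    return queries


for query in build_queries(["your-org-name"])["your-org-name"]:
    print(query)
```

Validating the name before formatting keeps stray quotes or spaces from silently changing the meaning of a query.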

Take Action Now

🛡 Secure Your Code Repositories Today

Every secret exposed in a public repository is a potential entry point for adversaries. The statistics are clear: 80% of credential-related breaches trace back to publicly exposed code. Act now to audit, detect, and remediate secret exposure across all your repositories.

📚 Continue Your Reconnaissance Education

Code repository reconnaissance (T1593.003) is one sub-technique of MITRE ATT&CK's broader Search Open Websites/Domains technique (T1593), which belongs to the Reconnaissance tactic (TA0043). Understanding the full spectrum of adversary reconnaissance methods helps security teams build comprehensive defense strategies.

🕑 Quick Self-Assessment Checklist

☐ Have you scanned all public repositories for your organization's name and domain?

☐ Are pre-commit hooks deployed to all developer workstations for secret scanning?

☐ Is GitHub Secret Scanning / Push Protection enabled for all repositories?

☐ Have all previously exposed secrets been rotated, even if they were "only briefly" committed?

☐ Do you use a secrets management vault (HashiCorp Vault, AWS Secrets Manager, etc.)?

☐ Is developer security training conducted at least quarterly, including secret management?

☐ Are orphaned repositories from former employees audited and secured?

☐ Do you monitor for suspicious fork/clone activity on your public repositories?
