Adversaries may search public code repositories for information about victims that can be used during targeting, including leaked credentials, API keys, internal URLs, and configuration secrets exposed in source code.
Tactic: TA0043 (Reconnaissance)

Sensitive credentials are frequently exposed in plaintext in public repositories, from database passwords and AWS access keys to payment processing secrets (such as Stripe keys) and JWT signing keys.
Public code repositories like GitHub, GitLab, and Bitbucket have become one of the most prolific sources of leaked credentials and sensitive organizational data in modern cyber operations. The scale of the problem is staggering: in 2024 alone, GitHub detected more than 39 million leaked secrets across its platform, and Snyk reported 28.65 million hardcoded secrets were added to public GitHub in 2025, a 25% increase year-over-year. According to GitHub's own analysis, over 80% of data breaches involving leaked credentials trace back to sensitive information exposed in public repositories. Perhaps most alarmingly, GitGuardian found that 70% of leaked secrets remain active for two years or more, providing adversaries with extended windows of opportunity.
The types of sensitive information commonly found in public repositories include API keys for cloud services (AWS, Azure, GCP), database credentials, encryption keys, internal infrastructure URLs, CI/CD pipeline tokens, SSH private keys, OAuth tokens, and payment processing secrets. The attack surface extends beyond GitHub to include GitLab, Bitbucket, and self-hosted Git instances: nearly 5 million web servers were found exposing Git metadata in 2026, with 250,000 leaking deployment credentials through exposed .git/config files.
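One way deployment credentials can end up in an exposed .git/config file is through remote URLs that embed a username and password or token. The following is a minimal Python sketch, not a complete scanner, that flags such URLs in the text of a config file. The function name and the sample hostname are hypothetical, and the line-based parsing assumes the usual "url = ..." layout.

```python
from urllib.parse import urlsplit


def remotes_with_embedded_credentials(git_config_text):
    """Return (host, username) pairs for remote URLs that embed credentials."""
    findings = []
    for line in git_config_text.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep or key.strip() != "url":
            continue  # not a "url = ..." entry
        parts = urlsplit(value.strip())
        # A password, or any userinfo on an HTTP(S) URL (tokens are often
        # placed in the username position), is treated as a credential.
        if parts.password or (parts.username and parts.scheme in ("http", "https")):
            # Report host and username only, so the secret is not echoed.
            findings.append((parts.hostname, parts.username))
    return findings
```

SSH-style remotes such as git@github.com:acme/app.git are not flagged, because they carry no password and are not HTTP(S) URLs.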
| Term | Definition |
|---|---|
| Code Repository | A version-controlled storage system (e.g., GitHub, GitLab, Bitbucket) where developers store and collaborate on source code. Repositories may be public (accessible to anyone) or private (restricted to authorized users). |
| Hardcoded Secret | A credential, API key, token, or password embedded directly in source code rather than being loaded from a secure vault or environment variable at runtime. |
| Secret Sprawl | The uncontrolled spread of secrets across multiple code repositories, branches, and systems, often resulting from developers copying configuration files or credentials across projects. |
| Git History | The complete record of all changes made to a repository, including deleted files. Even if a developer removes a secret from the current code, it remains accessible in the Git history unless the entire history is rewritten. |
| GitHub Dorking | The practice of using advanced search queries on GitHub to find specific types of sensitive information, such as credentials, API keys, internal URLs, or configuration files within public repositories. |
| Pre-Commit Hook | A script that runs automatically before a Git commit is finalized, commonly used to scan code for secrets before they are pushed to a repository. |
| Secret Scanning | Automated detection of exposed credentials in code repositories using pattern-matching tools. Platforms like GitHub offer built-in secret scanning that detects known secret formats across all public repositories. |
| .env File | A configuration file conventionally used to store environment variables, including database credentials, API keys, and other sensitive runtime configuration. It is frequently committed to public repositories by accident. |
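To make the "Secret Scanning" entry concrete, here is a small Python sketch of pattern-matching detection. It is illustrative only: the rule names are hypothetical, the Stripe key length is an assumption, and production scanners ship far larger and better-tuned rule sets.

```python
import re

# Illustrative rules only. The AKIA prefix and PEM header are well-known
# formats; the length used for Stripe keys is an assumption.
RULES = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "stripe_live_secret_key": re.compile(r"\bsk_live_[0-9a-zA-Z]{24,}\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
}


def scan_text(text):
    """Return (line_number, rule_name) for every rule that matches a line."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in RULES.items():
            if pattern.search(line):
                findings.append((line_no, rule_name))
    return findings
```

Matching on known formats keeps false positives low, but it misses secrets with no fixed shape, such as arbitrary database passwords.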
A skilled developer working on Acme Aerospace's production infrastructure, Jordan accidentally committed a .env.production file containing database credentials, AWS access keys, and a Stripe payment secret to a public GitHub repository during a late-night debugging session.
On November 3rd, 2024, at 11:47 PM, Jordan was debugging a production database connection issue. Frustrated after hours of troubleshooting, they copied the production .env file into the repository root directory to test locally. The file contained 6 critical secrets: the PostgreSQL admin password, AWS access key and secret key, a Stripe live API key, a Redis password, and the JWT signing secret. Jordan intended to delete the file the next morning but forgot. The .gitignore file was configured to exclude .env but not .env.production.
Severity: CRITICAL (6 secrets exposed for 47 days)
On December 20th, Acme Aerospace's security team received an alert from their GitGuardian integration: 6 secrets detected in a public repository. Investigation revealed the AWS keys had been used to spin up $14,200 in unauthorized cloud compute resources. The Stripe key had been tested against the payment API 47 times. The database password had been used in 3 failed login attempts against their production PostgreSQL instance from IP addresses in Eastern Europe. Total incident cost: $87,000 including forensic investigation, credential rotation across 12 services, security audit, and lost productivity.
Remediation: Complete (all secrets rotated, pre-commit hooks deployed)
A .gitignore file only prevents untracked files from being added; it does nothing to protect files that have already been committed. Once a secret enters Git history, it persists even after deletion. Organizations should use tools such as GitGuardian or Snyk to scan both active files and the entire Git commit history. See also T1589.001 (Credentials) and T1592.002 (Software) for related techniques.
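The point about Git history can be illustrated with a short Python sketch that searches the patch output of git log for added lines containing secret-like strings. This is a simplified assumption-laden example: the function names are hypothetical, only two patterns are checked, and dedicated tools handle many more cases.

```python
import re
import subprocess

SECRET_RE = re.compile(r"AKIA[0-9A-Z]{16}|-----BEGIN [A-Z ]*PRIVATE KEY-----")


def find_secrets_in_patch_log(patch_log):
    """Return (commit, line) for added lines in `git log -p` output that match."""
    findings = []
    commit = None
    for line in patch_log.splitlines():
        if line.startswith("commit "):
            commit = line.split()[1]  # header of the next commit
        elif line.startswith("+") and not line.startswith("+++"):
            if SECRET_RE.search(line):
                findings.append((commit, line[1:].strip()))
    return findings


def scan_repo_history(repo_path):
    """Run git over every branch's history and scan the resulting patches."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "-p", "--all"],
        capture_output=True, text=True, errors="replace", check=True,
    )
    return find_secrets_in_patch_log(result.stdout)
```

Because only added lines are inspected, a later commit that deletes the file does not hide the finding: the commit that introduced the secret is still reported.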
Follow these seven critical steps to protect your organization from code repository-based reconnaissance. Each step includes actionable measures aligned with CISA guidance and NIST Cybersecurity Framework recommendations.
Conduct a comprehensive inventory of every public repository associated with your organization. Search GitHub, GitLab, and Bitbucket for repositories created by current and former employees. Use tools like GitGuardian and gitleaks to scan existing repositories for accidentally committed secrets, API keys, and credentials. Pay special attention to forked repositories and archived projects.
PREVENT | T1596 (Open Technical Databases)

Install pre-commit hooks (such as detect-secrets, gitleaks, or TruffleHog) on every developer workstation to automatically scan code for secrets before commits are created. Configure CI/CD pipelines to fail builds when secrets are detected. This prevents credentials from entering the codebase at the source. See T1592.002 (Software) for related software identification techniques.
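As a rough illustration of what a pre-commit check can do, the Python sketch below flags long, high-entropy strings in staged changes. It is not how any particular tool is implemented; the entropy threshold, token pattern, and function names are assumptions chosen for the example, and a real deployment would use one of the tools named above.

```python
import math
import re
import subprocess
import sys
from collections import Counter

# Candidate tokens: 20+ characters drawn from a base64/URL-safe alphabet.
TOKEN_RE = re.compile(r"[A-Za-z0-9+/_-]{20,}")


def shannon_entropy(s):
    """Shannon entropy of a string, in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def high_entropy_tokens(text, threshold=4.0):
    """Return long tokens whose entropy meets the (assumed) threshold."""
    return [t for t in TOKEN_RE.findall(text) if shannon_entropy(t) >= threshold]


def run_hook():
    """Intended to be called from .git/hooks/pre-commit."""
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, errors="replace", check=True,
    ).stdout
    added = "\n".join(
        line[1:] for line in diff.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    )
    tokens = high_entropy_tokens(added)
    if tokens:
        print(f"Blocked: {len(tokens)} high-entropy string(s) staged", file=sys.stderr)
        sys.exit(1)  # a non-zero exit status aborts the commit
```

Entropy checks catch secrets with no fixed format but also flag hashes and long identifiers, so the threshold usually needs tuning and an allowlist.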
PREVENT

Maintain strict .gitignore rules that exclude all environment files (.env, .env.*), credential files, private keys, and configuration directories. Use template files (e.g., .env.example) with placeholder values instead of real credentials. Review and update .gitignore rules quarterly as new file types and tools are adopted.
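The gap from the Acme Aerospace example, where .env was ignored but .env.production was not, can be checked with a few lines of Python. This sketch uses fnmatch as an approximation of gitignore matching for simple file-name patterns only; it does not handle directories, negation, or anchoring, and the git check-ignore command remains the authoritative test. The function name is hypothetical.

```python
from fnmatch import fnmatchcase


def uncovered_files(filenames, ignore_patterns):
    """Return the file names that none of the ignore patterns match."""
    return [
        name for name in filenames
        if not any(fnmatchcase(name, pattern) for pattern in ignore_patterns)
    ]


sensitive = [".env", ".env.production", ".env.local"]

# Only ".env" is listed, so the variants slip through.
gap = uncovered_files(sensitive, [".env"])

# Adding ".env.*" closes the gap.
covered = uncovered_files(sensitive, [".env", ".env.*"])
```

Note that a .env.* pattern also matches .env.example, so a template file that should be committed needs an explicit exception in the real .gitignore.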
If any secret has been committed to a public repository, even briefly and even if deleted immediately, treat it as compromised. Rotate database passwords, API keys, tokens, SSH keys, and encryption keys. Use automated secret rotation where available (AWS IAM access keys, GitHub tokens, etc.). Monitor for unauthorized usage of previously exposed credentials for at least 90 days post-rotation. Reference T1589.001 (Credentials) for credential management best practices.
RESPOND

Activate GitHub's native secret scanning (available for free on public repositories) and push protection features. Configure GitLab's secret detection in CI/CD pipelines. Set up automated alerts so security teams are notified within minutes when a new secret is detected. Through GitHub's secret scanning partner program, participating service providers are notified of exposed tokens in public repositories and can revoke them automatically.
DETECT | T1593.002 (Search Engines)

Replace all hardcoded credentials with secrets pulled from secure vaults at runtime. Deploy enterprise secrets management tools such as HashiCorp Vault, AWS Secrets Manager, Azure Key Vault, or GitHub Actions Secrets. Use platform-native secret injection mechanisms so developers never need to handle raw credentials directly. This removes the root cause of most secret exposure: credentials stored in source code.
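On the application side, the change can be as small as reading each secret from the process environment and failing fast when it is absent. The sketch below assumes the secrets manager or deployment platform injects values as environment variables; the helper and exception names are hypothetical.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret was not injected at runtime."""


def require_secret(name, env=None):
    """Return the named secret from the environment, or raise if it is unset."""
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        raise MissingSecretError(f"required secret {name!r} is not set")
    return value
```

A call such as require_secret("DB_PASSWORD") at startup replaces a hardcoded literal, so the repository contains only the variable name and never the value.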
PREVENT

Implement continuous monitoring of public code repositories for organizational information leakage. Track mentions of your company name, domain names, internal project names, and employee usernames across GitHub, GitLab, and Bitbucket. Set up Google Alerts and GitHub search notifications. Investigate any forked repositories or code snippets that reference your infrastructure. Leverage threat intelligence from CISA and the MITRE ATT&CK framework to stay current on adversary techniques.
DETECT | T1593 (Search Open Websites/Domains)

Search queries such as filename:.env password, filename:credentials aws_access_key, and filename:id_rsa surface exposed secrets in public repositories. Clone repositories and run automated secret scanning tools (TruffleHog, gitleaks) against the full Git history.

Threat hunters can detect code repository reconnaissance by monitoring for specific patterns of behavior that indicate an adversary is systematically searching public repositories for organizational secrets. The following indicators help security teams identify this activity early and respond before credentials can be weaponized.
Adversaries often clone or fork multiple repositories from the same organization in rapid succession. Monitor for unusual spikes in repository access patterns, especially from unfamiliar user accounts or IP addresses. Track GitHub API access logs for queries targeting your organization's repositories, particularly searches that include terms like "password," "secret," "key," "token," or "credential" combined with your company name or domain.
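The rapid-succession pattern can be expressed as a sliding-window count over access events. The Python sketch below assumes a hypothetical event format of (timestamp, actor, repository) tuples already extracted from whatever logs are available; the threshold and window size are arbitrary starting points, not recommended values.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def actors_with_clone_bursts(events, min_repos=5, window=timedelta(minutes=10)):
    """Return actors who touched at least min_repos distinct repos within window."""
    by_actor = defaultdict(list)
    for timestamp, actor, repo in events:
        by_actor[actor].append((timestamp, repo))

    flagged = set()
    for actor, items in by_actor.items():
        items.sort()
        start = 0
        for end in range(len(items)):
            # Shrink the window until it spans no more than `window`.
            while items[end][0] - items[start][0] > window:
                start += 1
            repos = {repo for _, repo in items[start:end + 1]}
            if len(repos) >= min_repos:
                flagged.add(actor)
                break
    return sorted(flagged)


# Hypothetical sample: one account clones five repositories in five minutes.
base = datetime(2024, 11, 3, 23, 0)
sample_events = [(base + timedelta(minutes=i), "scraper", f"repo{i}") for i in range(5)]
sample_events += [(base, "dev", "repo0"), (base + timedelta(minutes=1), "dev", "repo1")]
```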
After harvesting credentials from public repositories, adversaries validate them by attempting authentication against target services. Monitor for failed login attempts using recently exposed credential formats, unusual geographic locations attempting database or API authentication, and AWS CloudTrail events showing GetCallerIdentity calls from unrecognized principals. Correlate with T1589.001 (Credentials) threat intelligence.
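A simple filter over CloudTrail records can surface the GetCallerIdentity calls described above. This Python sketch assumes the records have already been parsed into dictionaries with the standard eventSource, eventName, userIdentity, and sourceIPAddress fields. The source-IP check is an addition for this example: a stolen key authenticates as its legitimate owner, so the principal alone may look familiar. The function name, allowlists, and sample values are hypothetical.

```python
import ipaddress


def suspicious_identity_checks(records, known_arns, trusted_cidrs):
    """Return GetCallerIdentity events from unknown principals or untrusted IPs."""
    networks = [ipaddress.ip_network(cidr) for cidr in trusted_cidrs]

    def is_trusted(ip_text):
        try:
            ip = ipaddress.ip_address(ip_text)
        except ValueError:
            return False  # e.g. a service DNS name instead of an IP address
        return any(ip in network for network in networks)

    hits = []
    for record in records:
        if record.get("eventSource") != "sts.amazonaws.com":
            continue
        if record.get("eventName") != "GetCallerIdentity":
            continue
        arn = record.get("userIdentity", {}).get("arn")
        source_ip = record.get("sourceIPAddress", "")
        if arn not in known_arns or not is_trusted(source_ip):
            hits.append({"arn": arn, "sourceIPAddress": source_ip,
                         "eventTime": record.get("eventTime")})
    return hits
```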
Adversaries combine repository data with other reconnaissance sources. Monitor for correlation between publicly exposed infrastructure details (from code repositories) and active scanning of those targets. If internal URLs, database hostnames, or API endpoints discovered in your code suddenly receive external traffic or vulnerability scan probes, this strongly indicates repository-based reconnaissance has progressed to active targeting.
Watch for suspicious forking activity on public repositories, especially forks created by newly registered accounts, accounts with no other activity, or accounts from regions not associated with your organization. Adversaries fork repositories to preserve Git history (including deleted secrets) before the organization can clean them. Use GitHub's audit log to track fork events and clone operations.
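The fork heuristics above can be turned into a small triage function. In this Python sketch the input is a hypothetical list of dictionaries (login, account_created_at, forked_at, other_public_repos) that a team would populate from its own data sources; the field names and the 30-day cutoff are assumptions for illustration.

```python
from datetime import datetime, timedelta


def suspicious_forks(forks, max_account_age=timedelta(days=30)):
    """Return (login, reasons) for forks matching the heuristics."""
    flagged = []
    for fork in forks:
        reasons = []
        if fork["forked_at"] - fork["account_created_at"] < max_account_age:
            reasons.append("new account")
        if fork["other_public_repos"] == 0:
            reasons.append("no other activity")
        if reasons:
            flagged.append((fork["login"], reasons))
    return flagged


# Hypothetical sample data.
sample_forks = [
    {"login": "fresh-acct", "account_created_at": datetime(2024, 11, 1),
     "forked_at": datetime(2024, 11, 4), "other_public_repos": 0},
    {"login": "longtime-dev", "account_created_at": datetime(2018, 5, 1),
     "forked_at": datetime(2024, 11, 4), "other_public_repos": 42},
]
```

Results like these are leads for review rather than proof of malicious intent, since new contributors often match the same profile.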
# GitHub Code Search Dorks (defensive monitoring)
org:"your-org-name" filename:.env password
org:"your-org-name" filename:credentials aws_access_key
org:"your-org-name" extension:pem PRIVATE KEY
org:"your-org-name" filename:id_rsa
org:"your-org-name" DB_PASSWORD OR SECRET_KEY OR API_KEY
Every secret exposed in a public repository is a potential entry point for adversaries. The statistics are clear: over 80% of credential-related breaches trace back to publicly exposed code. Act now to audit, detect, and remediate secret exposure across all your repositories.
Code repository reconnaissance is one sub-technique (T1593.003) of MITRE ATT&CK's broader Search Open Websites/Domains technique (T1593), which belongs to the Reconnaissance tactic. Understanding the full spectrum of adversary reconnaissance methods helps security teams build comprehensive defense strategies.
☐ Have you scanned all public repositories for your organization's name and domain?
☐ Are pre-commit hooks deployed to all developer workstations for secret scanning?
☐ Is GitHub Secret Scanning / Push Protection enabled for all repositories?
☐ Have all previously exposed secrets been rotated, even if they were "only briefly" committed?
☐ Do you use a secrets management vault (HashiCorp Vault, AWS Secrets Manager, etc.)?
☐ Is developer security training conducted at least quarterly, including secret management?
☐ Are orphaned repositories from former employees audited and secured?
☐ Do you monitor for suspicious fork/clone activity on your public repositories?