A comprehensive, open-source tool for analyzing GitHub organization statistics, including repository metrics, contributor activity, code quality insights, and multi-organization analysis with GitHub Apps.
Analyze multiple organizations in a single command. Use the new --org-ids parameter to process several GitHub organizations in one run, combining all data into unified output files while maintaining organization attribution.
# Analyze multiple organizations in one command
github-org-stats --org-ids "org1:install_id1,org2:install_id2,org3:install_id3" --format all
- Single Command Multi-Org: Analyze multiple organizations in one run with --org-ids
- Unified Output: All repository data combined into single files with organization attribution
- Enhanced Excel Reports: Additional "Organization_Breakdown" sheet for multi-org analysis
- Smart Authentication: Automatic GitHub App token management across organizations
- Efficient Processing: Intelligent distribution of API limits across organizations
- Repository Analysis: Comprehensive metrics including stars, forks, issues, languages, and activity
- Contributor Insights: Detailed contributor analysis with bot filtering capabilities
- Code Quality Metrics: Language statistics, dependency analysis, and security insights
- GitHub App Integration: Enterprise-grade authentication for analyzing multiple organizations
- Flexible Output: JSON, CSV, and Excel formats with rich formatting
- Advanced Filtering: Include/exclude forks, archived repos, empty repos, and bot accounts
- Rate Limit Management: Intelligent rate limiting and retry mechanisms
- Error Handling: Robust error handling with detailed logging and recovery
- Language Name Sanitization: Intelligent handling of problematic language names (C#, C++, F#) to prevent Excel column conflicts
- Dependency Analysis: Detect and analyze dependencies from package.json, requirements.txt, Gemfile, pom.xml, build.gradle, Cargo.toml, and go.mod
- Submodule Detection: Identify and catalog Git submodules
- GitHub Actions Integration: Analyze workflow configurations and recent runs
- Branch Protection Analysis: Check default branch protection settings
- Release Tracking: Monitor latest releases and version information
- Security Insights: Collaborator analysis, team permissions, and admin detection
- Bot Detection: Advanced bot account filtering with configurable patterns
- Performance Optimization: Adaptive batch sizing and memory management
- Python 3.7 or higher
- pip package manager
# Install from source in development mode
git clone https://github.com/zoharbabin/github-org-stats.git
cd github-org-stats
pip install -e .
# Or install the published package
pip install github-org-stats
Analyze multiple organizations in a single command:
# Set environment variables
export GITHUB_APP_ID=12345
export GITHUB_PRIVATE_KEY_PATH=/path/to/private-key.pem
# Analyze multiple organizations
github-org-stats \
--org-ids "org1:install_id1,org2:install_id2,org3:install_id3" \
--include-forks \
--include-archived \
--exclude-bots \
--max-repos 6000 \
--days-back 365 \
--format all \
--output-dir ./multi_org_reports
Analyze a single GitHub organization:
# With personal access token
python github_org_stats.py --org your-org --token ghp_your_token_here
# With GitHub App
python github_org_stats.py \
--org your-org \
--app-id 12345 \
--private-key /path/to/private-key.pem \
--installation-id 67890
# Generate all output formats with comprehensive analysis
python github_org_stats.py \
--org your-org \
--token ghp_token \
--format all \
--exclude-bots \
--include-forks \
--include-archived \
--max-repos 1000 \
--days-back 365 \
--output-dir ./reports
- --token: GitHub personal access token
- --app-id: GitHub App ID for authentication
- --private-key: Path to GitHub App private key file
- --installation-id: GitHub App installation ID (supports multiple: "org1:id1,org2:id2" or single: "12345")
- --installation-ids: Alias for --installation-id
- --org: GitHub organization name to analyze (single organization mode)
- --org-ids: (New) Multiple organizations with installation IDs in format "org1:id1,org2:id2" (multi-organization mode)
- --repos: Specific repositories to analyze (space-separated list)
- --days-back: Number of days to look back for activity (default: 30)
The new --org-ids parameter enables analyzing multiple organizations in a single command run:
- Format: "org1:installation_id1,org2:installation_id2,org3:installation_id3"
- All data is combined into unified output files
- Excel output includes an additional "Organization_Breakdown" sheet
- Each repository record includes an "organization" field
- Cannot be used together with --org (choose single or multi-organization mode)
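The --org-ids format described above can be illustrated with a short Python sketch. This is not the tool's actual parser: parse_org_ids is a hypothetical helper, and the rules that installation IDs must be numeric and that an organization may appear only once are assumptions made for the example.

```python
from typing import Dict


def parse_org_ids(value: str) -> Dict[str, int]:
    """Parse "org1:id1,org2:id2" into {"org1": id1, "org2": id2}.

    Hypothetical helper; the real CLI may validate differently.
    """
    mapping: Dict[str, int] = {}
    for pair in value.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate stray or trailing commas
        org, sep, install_id = pair.partition(":")
        org, install_id = org.strip(), install_id.strip()
        # Assumption: installation IDs are plain ASCII digits.
        if not sep or not org or not (install_id.isascii() and install_id.isdigit()):
            raise ValueError(f"expected 'org:installation_id', got {pair!r}")
        if org in mapping:
            raise ValueError(f"organization listed twice: {org!r}")
        mapping[org] = int(install_id)
    return mapping


example = parse_org_ids("org1:111,org2:222,org3:333")
print(sorted(example))
```

Validating the whole string before any API call is made means a typo in one pair fails fast instead of partway through a long multi-organization run.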
- --output-dir: Output directory for reports (default: output)
- --format: Output format: json, csv, excel, all (default: excel)
- --config: Configuration file path (JSON format)
- --log-level: Logging level: DEBUG, INFO, WARNING, ERROR, CRITICAL (default: INFO)
- --log-file: Log file path (default: console only)
- --include-forks: Include forked repositories in analysis
- --include-archived: Include archived repositories in analysis
- --max-repos: Maximum number of repositories to analyze (default: 100)
- --exclude-bots: Exclude bot accounts from contributor analysis and commit statistics
- --include-empty: Include repositories with no commits in the specified timeframe
For individual use or small-scale analysis:
- Go to GitHub Settings → Developer settings → Personal access tokens
- Generate a new token with these permissions:
  - repo: Full control of private repositories
  - read:org: Read organization membership
  - read:user: Read user profile data
# Using token directly
python github_org_stats.py --org your-org --token ghp_your_token_here
# Using environment variable
export GITHUB_TOKEN=ghp_your_token_here
python github_org_stats.py --org your-org --token $GITHUB_TOKEN
For enterprise use, multi-organization analysis, and higher rate limits:
- Go to GitHub Settings → Developer settings → GitHub Apps
- Create a new GitHub App with these permissions:
  - Repository permissions:
    - Contents: Read
    - Issues: Read
    - Metadata: Read
    - Pull requests: Read
    - Actions: Read
  - Organization permissions:
    - Members: Read
    - Administration: Read
- Generate and download a private key
- Install the app on target organizations
- Note the App ID and Installation IDs
# Single organization
python github_org_stats.py \
--org your-org \
--app-id 12345 \
--private-key /path/to/private-key.pem \
--installation-id 67890
# Multiple organizations
python github_org_stats.py \
--org your-org \
--app-id 12345 \
--private-key /path/to/private-key.pem \
--installation-id "org1:111,org2:222,org3:333"
export GITHUB_APP_ID=12345
export GITHUB_PRIVATE_KEY_PATH=/path/to/private-key.pem
python github_org_stats.py --org your-org
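The fallback from command-line options to the GITHUB_APP_ID and GITHUB_PRIVATE_KEY_PATH environment variables could work roughly as in the sketch below. resolve_auth is a hypothetical function, not the tool's code, and both the precedence (explicit arguments over environment variables) and the choice to prefer GitHub App credentials over a token when both are present are assumptions.

```python
import os
from typing import Mapping, Optional


def resolve_auth(token: Optional[str] = None,
                 app_id: Optional[str] = None,
                 private_key: Optional[str] = None,
                 env: Optional[Mapping[str, str]] = None) -> dict:
    """Pick an authentication mode from CLI values and environment variables."""
    env = os.environ if env is None else env
    # Assumption: explicit arguments win over environment variables.
    app_id = app_id or env.get("GITHUB_APP_ID")
    private_key = private_key or env.get("GITHUB_PRIVATE_KEY_PATH")
    # Assumption: a complete GitHub App configuration takes priority.
    if app_id and private_key:
        return {"mode": "app", "app_id": int(app_id),
                "private_key_path": private_key}
    if token:
        return {"mode": "token", "token": token}
    raise ValueError(
        "Authentication required: provide --token or both --app-id and --private-key"
    )


print(resolve_auth(token="ghp_your_token_here", env={})["mode"])
```

Passing the environment in as a parameter keeps the function easy to test without touching the real process environment.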
Professional Excel workbook with multiple sheets:
- Repository_Data: Complete repository information with all metrics and organization attribution
- Summary: High-level statistics across all analyzed organizations
- Organization_Breakdown: Per-organization statistics (multi-org mode only)
- Contributors: Top contributors analysis with contribution counts
- Languages: Language distribution and code statistics
- Errors: Error tracking and debugging information
Structured JSON with complete data hierarchy:
Single Organization:
{
  "organizations": ["your-org"],
  "analysis_mode": "single-organization",
  "analyzed_at": "2025-05-29T22:30:00",
  "total_repositories": 150,
  "repositories": [...]
}
Multi-Organization:
{
  "organizations": ["org1", "org2", "org3"],
  "analysis_mode": "multi-organization",
  "analyzed_at": "2025-05-29T22:30:00",
  "total_repositories": 450,
  "repositories": [
    {
      "organization": "org1",
      "name": "repo1",
      "full_name": "org1/repo1",
      ...
    }
  ]
}
Flattened data suitable for spreadsheet analysis and data processing tools, with an organization column for multi-org analysis.
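Because every repository record carries an "organization" field, a per-organization breakdown can be rebuilt from the JSON output with a few lines of Python. The sketch below uses an inline sample shaped like the multi-organization example; repos_per_org is a hypothetical helper, and the name of the real report file is not assumed here.

```python
import json
from collections import Counter


def repos_per_org(report: dict) -> Counter:
    """Count repositories per organization in a parsed JSON report."""
    return Counter(
        repo.get("organization", "unknown")
        for repo in report.get("repositories", [])
    )


# Inline sample mirroring the documented structure (values are made up).
sample = json.loads("""
{
  "organizations": ["org1", "org2"],
  "analysis_mode": "multi-organization",
  "repositories": [
    {"organization": "org1", "name": "repo1", "full_name": "org1/repo1"},
    {"organization": "org1", "name": "repo2", "full_name": "org1/repo2"},
    {"organization": "org2", "name": "repo3", "full_name": "org2/repo3"}
  ]
}
""")

for org, count in sorted(repos_per_org(sample).items()):
    print(org, count)
```

For a real report, replace the inline sample with json.load on the file written to the output directory.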
Use a JSON configuration file for complex setups:
python github_org_stats.py --config config/example_config.json --org your-org
Example configuration:
{
  "authentication": {
    "app_id": 12345,
    "private_key_path": "/path/to/private-key.pem",
    "installation_mappings": {
      "org1": 67890,
      "org2": 11111
    }
  },
  "analysis": {
    "days_back": 60,
    "max_repos": 200,
    "include_forks": false,
    "exclude_bots": true
  },
  "output": {
    "format": "excel",
    "output_dir": "./reports"
  }
}
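One way to think about a configuration file like this is as a set of overrides layered on top of the documented defaults (30 days back, 100 repositories, Excel output, the output directory). The Python sketch below shows that layering; merge_config and DEFAULTS are hypothetical names, the false defaults for the boolean flags are an assumption, and the tool's real loading logic may differ.

```python
import json
from copy import deepcopy

# Defaults taken from the documented option defaults; the boolean
# defaults are an assumption (the flags are opt-in on the command line).
DEFAULTS = {
    "analysis": {"days_back": 30, "max_repos": 100,
                 "include_forks": False, "exclude_bots": False},
    "output": {"format": "excel", "output_dir": "output"},
}


def merge_config(user: dict, defaults: dict = DEFAULTS) -> dict:
    """Overlay user settings on defaults, one section at a time."""
    merged = deepcopy(defaults)
    for section, values in user.items():
        if isinstance(values, dict) and isinstance(merged.get(section), dict):
            merged[section].update(values)  # override only the given keys
        else:
            merged[section] = deepcopy(values)  # new section: copy as-is
    return merged


user_config = json.loads('{"analysis": {"days_back": 60, "exclude_bots": true}}')
config = merge_config(user_config)
print(config["analysis"]["days_back"], config["output"]["format"])
```

Merging per section lets a configuration file set only the values it cares about while everything else keeps its default.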
Intelligent handling of problematic programming language names that can cause issues in Excel exports:
- C# → CSharp: Prevents conflicts with C language statistics
- C++ → CPlusPlus: Avoids Excel column name parsing issues
- F# → FSharp: Ensures proper Excel compatibility
Benefits:
- Eliminates Excel column name conflicts
- Preserves accurate language statistics and byte counts
- Maintains data integrity across all output formats
- Automatic transformation with comprehensive logging
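The renaming described above amounts to a small lookup table applied to each language name. The following Python sketch shows the idea; sanitize_languages and LANGUAGE_RENAMES are hypothetical names, and summing byte counts when two names map to the same key is an assumption rather than documented behavior.

```python
from typing import Dict

# Mapping taken from the three renames listed in this section.
LANGUAGE_RENAMES = {"C#": "CSharp", "C++": "CPlusPlus", "F#": "FSharp"}


def sanitize_languages(languages: Dict[str, int]) -> Dict[str, int]:
    """Rename problematic language keys while preserving byte counts."""
    cleaned: Dict[str, int] = {}
    for name, byte_count in languages.items():
        safe_name = LANGUAGE_RENAMES.get(name, name)
        # Assumption: if two names collapse to one key, add their bytes.
        cleaned[safe_name] = cleaned.get(safe_name, 0) + byte_count
    return cleaned


print(sanitize_languages({"C#": 1500000, "C": 800000, "C++": 500000}))
```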
Example:
// Before sanitization
"languages": {"C#": 1500000, "C": 800000, "C++": 500000}
// After sanitization
"languages": {"CSharp": 1500000, "C": 800000, "CPlusPlus": 500000}
Automatically detects and analyzes dependencies from:
- Node.js: package.json
- Python: requirements.txt
- Ruby: Gemfile
- Java: pom.xml
- Gradle: build.gradle
- Rust: Cargo.toml
- Go: go.mod
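Detection of this kind can be pictured as matching file names in a repository against a table of known manifest files. The Python sketch below works on a plain list of paths; detect_ecosystems is a hypothetical helper, and how the tool actually lists and parses repository files is not shown or assumed.

```python
from pathlib import PurePosixPath
from typing import Dict, Iterable, List

# Manifest file names and labels as listed in this section.
MANIFESTS = {
    "package.json": "Node.js",
    "requirements.txt": "Python",
    "Gemfile": "Ruby",
    "pom.xml": "Java",
    "build.gradle": "Gradle",
    "Cargo.toml": "Rust",
    "go.mod": "Go",
}


def detect_ecosystems(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Group manifest paths by ecosystem, matching on the file name only."""
    found: Dict[str, List[str]] = {}
    for path in paths:
        ecosystem = MANIFESTS.get(PurePosixPath(path).name)
        if ecosystem is not None:
            found.setdefault(ecosystem, []).append(path)
    return found


print(detect_ecosystems(["package.json", "services/api/go.mod", "README.md"]))
```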
Advanced bot account filtering with configurable patterns:
- GitHub Actions bots
- Dependabot and Renovate
- Code quality bots (CodeCov, SonarCloud)
- Security bots (Snyk, WhiteSource)
- Custom bot patterns
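Pattern-based bot filtering of the kind listed above can be sketched in Python with regular expressions. The patterns below are illustrative guesses based on the bot families named in this section, not the tool's built-in list, and compile_patterns, is_bot, and filter_contributors are hypothetical names.

```python
import re
from typing import Iterable, List, Pattern

# Illustrative patterns only; the tool's real defaults may differ.
DEFAULT_BOT_PATTERNS = (
    r"\[bot\]$",          # GitHub App accounts such as "dependabot[bot]"
    r"^github-actions",
    r"^dependabot",
    r"^renovate",
    r"^codecov",
    r"^sonarcloud",
    r"^snyk",
    r"^whitesource",
)


def compile_patterns(extra: Iterable[str] = ()) -> List[Pattern[str]]:
    """Compile the default patterns plus any custom ones."""
    return [re.compile(p, re.IGNORECASE) for p in (*DEFAULT_BOT_PATTERNS, *extra)]


def is_bot(login: str, patterns: List[Pattern[str]]) -> bool:
    """Return True if the login matches any bot pattern."""
    return any(p.search(login) for p in patterns)


def filter_contributors(contributors: List[dict],
                        patterns: List[Pattern[str]]) -> List[dict]:
    """Drop contributor records whose "login" looks like a bot."""
    return [c for c in contributors if not is_bot(c["login"], patterns)]


patterns = compile_patterns(extra=[r"^ci-"])  # custom pattern for this example
people = [{"login": "octocat"}, {"login": "dependabot[bot]"}, {"login": "ci-runner"}]
print(filter_contributors(people, patterns))
```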
GitHub Actions analysis covers:
- Workflow count and status
- Recent workflow runs
- Action configuration analysis
Security insights cover:
- Branch protection settings
- Collaborator permissions
- Team access analysis
- Admin user identification
Run the comprehensive test suite:
cd tests
python test_github_org_stats.py
Run specific test categories:
python test_github_org_stats.py --category auth
python test_github_org_stats.py --category data
python test_github_org_stats.py --category excel
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Install in development mode: pip install -e ".[dev]"
- Make your changes
- Run tests: python -m pytest tests/
- Run linting: black . && flake8
- Commit your changes: git commit -m 'Add amazing feature'
- Push to the branch: git push origin feature/amazing-feature
- Open a Pull Request
Development dependencies are defined in pyproject.toml and can be installed with:
pip install -e ".[dev]"
This project uses:
- Black for code formatting
- Flake8 for linting
- MyPy for type checking
Error: Authentication required
Solution: Ensure you provide either --token or both --app-id and --private-key
Rate limit exceeded
Solution:
- Use GitHub App authentication for higher limits
- Reduce the --max-repos value
- Decrease --days-back to reduce API calls
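Retrying after a rate limit error is commonly done with exponential backoff. The Python sketch below shows the general technique only; it is not the tool's retry code, retry_with_backoff is a hypothetical helper, and the delays and attempt count are arbitrary example values.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(func: Callable[[], T],
                       max_attempts: int = 5,
                       base_delay: float = 1.0,
                       max_delay: float = 60.0,
                       retry_on: Tuple[Type[BaseException], ...] = (Exception,),
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func, retrying with exponentially growing delays plus jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads retries out so parallel workers do not sync up.
            sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("max_attempts must be at least 1")


attempts = {"count": 0}


def flaky_call() -> str:
    """Stand-in for an API call that fails twice before succeeding."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("simulated rate limit")
    return "ok"


print(retry_with_backoff(flaky_call, sleep=lambda seconds: None))
```

In real use, retry_on would be narrowed to the rate limit exception raised by the API client so that unrelated errors are not retried.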
403 Forbidden
Solution:
- Verify token has required permissions (repo, read:org, read:user)
- For GitHub Apps, ensure proper installation and permissions
MemoryError or system slowdown
Solution:
- Reduce the --max-repos value
- Use --format json or --format csv instead of Excel
- Process organizations in smaller batches
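Processing organizations in smaller batches can be as simple as splitting one long --org-ids value into several shorter ones and running the tool once per batch. The Python sketch below only builds the batched strings; batch_org_ids is a hypothetical helper and the batch size is an arbitrary example.

```python
from typing import List


def batch_org_ids(value: str, batch_size: int) -> List[str]:
    """Split "org1:id1,org2:id2,..." into smaller --org-ids values."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    pairs = [pair.strip() for pair in value.split(",") if pair.strip()]
    return [
        ",".join(pairs[start:start + batch_size])
        for start in range(0, len(pairs), batch_size)
    ]


for batch in batch_org_ids("org1:111,org2:222,org3:333", batch_size=2):
    # Each batch would be passed to a separate run of the tool.
    print(f'--org-ids "{batch}"')
```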
Enable debug logging for detailed troubleshooting:
python github_org_stats.py \
--org your-org \
--token your-token \
--log-level DEBUG \
--log-file debug.log
For large organizations:
# Optimize for speed
python github_org_stats.py \
--org large-org \
--token your-token \
--max-repos 500 \
--days-back 30 \
--exclude-bots \
--format json
Analyze multiple organizations in a single run using the new --org-ids parameter:
# Using environment variables (recommended)
export GITHUB_APP_ID=12345
export GITHUB_PRIVATE_KEY_PATH=/secure/enterprise-key.pem
python github_org_stats.py \
--org-ids "kaltura:68242466,kaltura-ps:68357040" \
--include-forks \
--include-archived \
--exclude-bots \
--include-empty \
--max-repos 6000 \
--days-back 365 \
--format all \
--log-level INFO \
--log-file multi_org_analysis.log \
--output-dir ./multi_org_reports
python github_org_stats.py \
--org-ids "first-org:11111,second-org:22222,third-org:33333" \
--app-id 12345 \
--private-key /secure/enterprise-key.pem \
--include-forks \
--include-archived \
--exclude-bots \
--include-empty \
--max-repos 9000 \
--days-back 365 \
--format all \
--log-level INFO \
--log-file multi_org_analysis.log \
--output-dir ./multi_org_reports
For analyzing a single organization:
python github_org_stats.py \
--org enterprise-org \
--installation-id 67890 \
--include-forks \
--include-archived \
--exclude-bots \
--include-empty \
--max-repos 3000 \
--days-back 365 \
--format all \
--log-level INFO \
--log-file enterprise_analysis.log \
--output-dir ./enterprise_reports
python github_org_stats.py \
--org enterprise-org \
--app-id 12345 \
--private-key /secure/enterprise-key.pem \
--installation-id 67890 \
--include-forks \
--include-archived \
--exclude-bots \
--include-empty \
--max-repos 3000 \
--days-back 365 \
--format all \
--log-level INFO \
--log-file enterprise_analysis.log \
--output-dir ./enterprise_reports
python github_org_stats.py \
--org your-org \
--token ghp_token \
--max-repos 10 \
--days-back 7 \
--format json
python github_org_stats.py \
--org your-org \
--token ghp_token \
--include-forks \
--include-archived \
--exclude-bots \
--include-empty \
--max-repos 1000 \
--days-back 365 \
--format all \
--log-level INFO \
--log-file comprehensive_analysis.log \
--output-dir ./comprehensive_reports
For organizations with many repositories:
python github_org_stats.py \
--org large-org \
--installation-id 99999 \
--include-forks \
--include-archived \
--exclude-bots \
--include-empty \
--max-repos 5000 \
--days-back 365 \
--format all \
--log-level INFO \
--log-file large_org_analysis.log \
--output-dir ./large_org_reports
This project is licensed under the MIT License - see the LICENSE file for details.
- Documentation: This comprehensive README and example configurations
- Issues: Report bugs or request features via GitHub Issues
- Discussions: Join the conversation in GitHub Discussions
- Thanks to all contributors who have helped improve this tool
- Built with PyGithub for GitHub API access
- Inspired by the need for comprehensive GitHub organization analysis
- Special thanks to the open-source community for feedback and contributions
Made with ❤️ by the open-source community
Version 1.1.0 | Changelog | Contributing Guidelines
- Multi-Organization Analysis: Analyze multiple GitHub organizations in a single command
- Enhanced Authentication: Better environment variable support and GitHub App integration
- Improved Excel Output: Organization breakdown sheets and enhanced reporting
- Better Error Handling: More robust authentication and API error management
- Updated Documentation: Comprehensive examples and usage guides
Ready to analyze your GitHub organizations? Get started with the Quick Start guide!