When working with GitHub repositories in bash scripts, you often need to extract just the repository name from various URL formats. GitHub URLs can come in several forms:
git://github.com/user/repo.git
git@github.com:user/repo.git
https://github.com/user/repo.git
The challenge is to create a bash solution that reliably extracts "repo" from any of these formats.
Here's a robust bash function that handles all common GitHub URL formats:
extract_repo_name() {
    local url="$1"
    # Remove protocol prefixes
    url=${url#git://}
    url=${url#git@}
    url=${url#https://}
    # Remove domain part (github.com followed by / or :)
    url=${url#github.com[/:]}
    # Remove .git suffix if present
    url=${url%.git}
    # Extract the last part after /
    # (declared local so it does not leak into the caller's scope)
    local repo_name=${url##*/}
    echo "$repo_name"
}
Let's verify it works with all URL formats:
extract_repo_name "git://github.com/some-user/my-repo.git"
# Output: my-repo
extract_repo_name "git@github.com:some-user/my-repo.git"
# Output: my-repo
extract_repo_name "https://github.com/some-user/my-repo.git"
# Output: my-repo
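To see what each parameter expansion contributes, the steps can be traced one at a time on a single URL. This is only an illustrative sketch; the variable names step1 to step4 are made up for the example and are not part of the function.

```shell
#!/usr/bin/env bash
url="git@github.com:some-user/my-repo.git"

step1=${url#git@}                # '#'  removes the shortest match from the front
step2=${step1#github.com[/:]}    # [/:] matches either separator after the domain
step3=${step2%.git}              # '%'  removes the shortest match from the end
step4=${step3##*/}               # '##' removes the longest match of */ from the front

printf '%s\n' "$step1" "$step2" "$step3" "$step4"
# github.com:some-user/my-repo.git
# some-user/my-repo.git
# some-user/my-repo
# my-repo
```

Only the last two steps are needed to get the name; the earlier ones matter mainly if you also want the owner part.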
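For those who prefer a one-liner using sed (this version strips the .git suffix first, so repository names containing dots are not truncated):
echo "git@github.com:some-user/my-repo.git" | sed -E 's#\.git$##; s#.*[/:]##'
# Output: my-repo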
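The function already copes with URLs that lack the .git suffix, because ${url%.git} leaves such input unchanged. You might still want to add validation for:
- URLs with a trailing slash, which produce an empty name
- URLs with extra path segments (for example /tree/main), where the last segment is not the repository
- Invalid or non-GitHub URLs, which are processed without any error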
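One possible way to add that validation is to match the whole URL against a pattern and fail loudly when it does not fit. The sketch below is an assumption-laden example, not part of the original solution: the name extract_repo_name_strict is hypothetical, and the allowed character set for owner and repository names (letters, digits, '.', '_' and '-') is a simplification chosen for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the repository name, or returns 1 when the
# input is not a github.com URL of the form <prefix>owner/repo[.git][/].
extract_repo_name_strict() {
    local url=${1%/}       # tolerate a single trailing slash
    url=${url%.git}        # drop an optional .git suffix
    # Assumed character set for owner and repo: letters, digits, . _ -
    local re='^(git://github\.com/|git@github\.com:|https://github\.com/)([A-Za-z0-9_.-]+)/([A-Za-z0-9_.-]+)$'
    if [[ $url =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[3]}"
    else
        printf 'not a recognised GitHub URL: %s\n' "$1" >&2
        return 1
    fi
}

extract_repo_name_strict "git@github.com:some-user/my-repo.git"      # prints my-repo
extract_repo_name_strict "https://github.com/org/another.repo/"      # prints another.repo
extract_repo_name_strict "https://github.com/user/repo/tree/main" \
    || echo "rejected"                                               # prints rejected
```

Because the pattern is anchored at both ends, extra path segments and foreign hosts are rejected instead of silently yielding a wrong name.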
This technique is useful for:
- Automating git operations in scripts
- Generating local directory names from URLs
- Creating log messages with repository names
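As an illustration of the second and third use cases, the sketch below derives a clone directory and a log line from a URL. The helper names repo_name_from_url and clone_target, and the /tmp/checkouts base directory, are hypothetical choices for this example; the actual git clone call is left as a comment so the snippet runs without network access.

```shell
#!/usr/bin/env bash
# Condensed form of the same idea, repeated so the snippet is self-contained
repo_name_from_url() {
    local url=${1%.git}
    printf '%s\n' "${url##*/}"
}

# Hypothetical helper: directory a clone of "$1" would use under base dir "$2"
clone_target() {
    local url=$1 base=${2:-$HOME/src}
    printf '%s/%s\n' "$base" "$(repo_name_from_url "$url")"
}

url="git@github.com:some-user/my-repo.git"
target=$(clone_target "$url" /tmp/checkouts)
echo "would clone $(repo_name_from_url "$url") into $target"
# prints: would clone my-repo into /tmp/checkouts/my-repo
# A real script would continue with: git clone "$url" "$target"
```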
The key challenge is creating a solution that works reliably across all three formats while handling edge cases like:
- Different protocol prefixes (git://, git@, https://)
- Optional .git suffix
- Potential subdirectories or branch names
Here's a bash function that handles these cases whenever the repository name is the last path component:
extract_repo_name() {
    local url=$1
    # Remove protocol prefixes
    url=${url#git://}
    url=${url#git@}
    url=${url#https://}
    # Remove domain part (github.com followed by : or /)
    url=${url#github.com[:/]}
    # Remove .git suffix if present
    url=${url%.git}
    # Extract the last path component
    # (declared local so it does not leak into the caller's scope)
    local repo_name=${url##*/}
    echo "$repo_name"
}
Let's verify it works with all URL formats:
# Test cases
extract_repo_name "git://github.com/some-user/my-repo.git" # Output: my-repo
extract_repo_name "git@github.com:some-user/my-repo.git" # Output: my-repo
extract_repo_name "https://github.com/some-user/my-repo.git" # Output: my-repo
extract_repo_name "https://github.com/org/another.repo" # Output: another.repo
For those who prefer regular expressions:
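extract_with_regex() {
    # Capture the last run of characters that are neither / nor :,
    # allowing one trailing slash, then strip an optional .git suffix
    [[ $1 =~ ([^/:]+)/?$ ]] && echo "${BASH_REMATCH[1]%.git}"
}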
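A short loop makes it easy to see how the regex variant behaves on different inputs. The URLs below are sample values chosen for illustration; note that the optional /? in the pattern means a trailing slash is tolerated here.

```shell
#!/usr/bin/env bash
extract_with_regex() {
    [[ $1 =~ ([^/:]+)/?$ ]] && echo "${BASH_REMATCH[1]%.git}"
}

for url in \
    "git@github.com:user/repo" \
    "https://github.com/org/another.repo" \
    "https://github.com/user/repo.git/"
do
    printf '%s -> %s\n' "$url" "$(extract_with_regex "$url")"
done
# git@github.com:user/repo -> repo
# https://github.com/org/another.repo -> another.repo
# https://github.com/user/repo.git/ -> repo
```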
The solution should also work with:
- URLs without .git suffix
- URLs with additional path components
- URLs with port numbers or authentication
# Additional test cases
extract_repo_name "git@github.com:user/repo" # no .git; Output: repo
extract_repo_name "https://github.com/org/subdir/repo.git" # with subdir; Output: repo
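The port and authentication cases can be checked the same way. The sketch below wraps the comparisons in a small helper; the name check and the sample URLs (including the placeholder credentials user:token) are made up for this example.

```shell
#!/usr/bin/env bash
extract_repo_name() {
    local url=$1
    url=${url#git://}
    url=${url#git@}
    url=${url#https://}
    url=${url#github.com[:/]}
    url=${url%.git}
    echo "${url##*/}"
}

# Hypothetical test helper: compares the extracted name with the expected one
check() {
    local got
    got=$(extract_repo_name "$1")
    if [[ $got == "$2" ]]; then
        echo "ok   $1"
    else
        echo "FAIL $1 (got '$got', want '$2')"
        return 1
    fi
}

check "ssh://git@github.com:22/user/repo.git" repo        # explicit port
check "https://user:token@github.com/org/repo.git" repo   # embedded credentials
check "git@github.com:user/repo" repo                     # no .git suffix
```

These pass because the final step keeps only the text after the last slash, so whatever precedes it (port, credentials, unknown scheme) does not affect the result.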
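The parameter expansion method is often faster than regex matching for simple cases, though the gap depends on the bash version and the input, so it is worth measuring. Discard the output so terminal I/O does not dominate the timing:
time for i in {1..1000}; do extract_repo_name "https://github.com/user/repo.git" >/dev/null; done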
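To compare the two approaches directly, both loops can be timed in one script. This is a rough sketch rather than a rigorous benchmark; the iteration count n and the sample URL are arbitrary choices, and the numbers will vary between machines and bash versions.

```shell
#!/usr/bin/env bash
extract_repo_name() {
    local url=$1
    url=${url#git://}
    url=${url#git@}
    url=${url#https://}
    url=${url#github.com[:/]}
    url=${url%.git}
    echo "${url##*/}"
}

extract_with_regex() {
    [[ $1 =~ ([^/:]+)/?$ ]] && echo "${BASH_REMATCH[1]%.git}"
}

url="https://github.com/user/repo.git"
n=1000   # arbitrary iteration count

echo "parameter expansion:"
time for ((i = 0; i < n; i++)); do extract_repo_name "$url" >/dev/null; done

echo "regex:"
time for ((i = 0; i < n; i++)); do extract_with_regex "$url" >/dev/null; done
```

Both functions run inside the shell without spawning a process, so any difference is likely to be small next to the cost of calling an external tool such as sed.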