Most developers can write brilliant application code but freeze when faced with a 502 Bad Gateway error. The reality is that understanding basic system administration makes you a more versatile and effective programmer. Here's what you should know:
# Process management
ps aux | grep python     # Find running Python processes
kill -9 [PID]            # Force kill unresponsive process
# Network diagnostics
netstat -tulnp           # List all listening ports
traceroute google.com    # Network path analysis
# Disk space analysis
df -h                    # Show disk usage
du -sh * | sort -hr      # Find space hogs
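If you would rather do the "find space hogs" step from a script, here is a minimal Python sketch of the same idea. The function name `directory_sizes` is my own, and it sums apparent file sizes rather than allocated disk blocks, so the numbers will not match `du` exactly.

```python
import os
from pathlib import Path


def directory_sizes(root):
    """Return (size_in_bytes, name) pairs for each entry in root, largest first."""
    results = []
    for entry in Path(root).iterdir():
        if entry.is_symlink():
            continue  # do not follow links out of the tree
        if entry.is_file():
            total = entry.stat().st_size
        elif entry.is_dir():
            total = 0
            for dirpath, _dirnames, filenames in os.walk(entry):
                for name in filenames:
                    path = os.path.join(dirpath, name)
                    if os.path.islink(path):
                        continue
                    try:
                        total += os.path.getsize(path)
                    except OSError:
                        pass  # file vanished or is unreadable; skip it
        else:
            continue  # sockets, devices and similar
        results.append((total, entry.name))
    return sorted(results, reverse=True)
```

Calling `directory_sizes("/var/log")` gives a list you can print or feed into an alert; the path is only an example.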
Developers often debug using IDE tools but ignore server logs. For web applications:
# Nginx/Apache logs
tail -f /var/log/nginx/error.log             # Real-time log monitoring
grep '" 500 ' /var/log/apache2/access.log    # Filter HTTP 500 responses (status codes are recorded in the access log)
# Application log best practice (Python example)
import logging
logging.basicConfig(
    filename='/var/log/myapp.log',
    level=logging.ERROR,
    format='%(asctime)s - %(levelname)s - %(message)s'
)
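When a single grep is not enough, a few lines of Python can summarize status codes across a whole log. This is a rough sketch that assumes the common/combined access log format, where the status code follows the quoted request line; `count_statuses` and `STATUS_RE` are names I made up for the example.

```python
import re
from collections import Counter

# Matches the three-digit status code right after the closing quote of the request.
STATUS_RE = re.compile(r'"\s(\d{3})\s')


def count_statuses(lines):
    """Count HTTP status codes in an iterable of access-log lines."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

You can pass an open file object directly, since iterating a file yields its lines. Lines that do not look like access-log entries are ignored.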
When you need to access databases or internal services:
ssh -L 3306:localhost:3306 user@production-server   # MySQL tunnel
ssh -D 8080 user@staging-server                     # SOCKS proxy for web debugging
Every developer should apply these basics on the servers they run:
# Firewall basics (UFW example)
sudo ufw allow 22/tcp   # SSH
sudo ufw allow 80/tcp   # HTTP
sudo ufw enable
# SSH security in /etc/ssh/sshd_config
PasswordAuthentication no
PermitRootLogin no
AllowUsers developer1 developer2
Basic shell scripting knowledge saves hours:
#!/bin/bash
# Simple deployment script
set -euo pipefail   # Abort on the first failed command so a broken sync never triggers a restart
rsync -avz --delete ./dist/ user@server:/var/www/html/
ssh user@server "systemctl restart nginx"
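The same "run the steps in order and stop at the first failure" idea can be sketched in Python, which is handy once a deployment grows beyond two commands. `run_steps` is a hypothetical helper, not part of any deployment tool, and the host and paths in the comment are the same placeholders as in the shell script.

```python
import subprocess
import sys


def run_steps(steps):
    """Run each command in order; return the exit code of the first failure, or 0."""
    for cmd in steps:
        print("running:", " ".join(cmd))
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print("step failed with exit code", result.returncode, file=sys.stderr)
            return result.returncode
    return 0


# Example call (placeholders, not executed here):
# run_steps([
#     ["rsync", "-avz", "--delete", "./dist/", "user@server:/var/www/html/"],
#     ["ssh", "user@server", "systemctl restart nginx"],
# ])
```

Passing each command as a list avoids shell quoting surprises, at the cost of losing pipes and globbing.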
strace -p [PID] - Trace system calls
lsof -i :8080 - Find processes using port 8080
journalctl -xe - Systemd service debugging
The line between developer and sysadmin is blurring in modern DevOps environments. Investing time in these skills will make you more self-sufficient and valuable.
As programmers, we often focus on writing clean code while taking infrastructure for granted - until we face production issues without sysadmin support. Here's what I've learned through painful experience about maintaining systems while shipping code.
# Process management
ps aux | grep python
kill -9 [PID]
top # or htop, if installed
# Network diagnostics
netstat -tulnp
ss -plant
traceroute google.com
ping -c 4 8.8.8.8
# Disk space management
df -h
du -sh *
lsof | grep deleted # Find processes holding deleted files
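For a cron-friendly version of the disk check, Python's standard library exposes the same numbers. The sketch below is one way to turn them into a threshold test; the function names and the 90% default are my own choices, and the percentage is used/total, which can differ slightly from the `Use%` column of `df` on filesystems with reserved blocks.

```python
import shutil


def disk_usage_percent(path="/"):
    """Return the percentage of the filesystem containing path that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # some pseudo-filesystems report a size of zero
    return usage.used / usage.total * 100


def is_nearly_full(path="/", threshold=90.0):
    """Return True when usage is at or above threshold percent."""
    return disk_usage_percent(path) >= threshold
```

A tiny wrapper that emails or logs when `is_nearly_full("/")` is true is usually enough to avoid being surprised by a full disk.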
Every developer should understand:
# Basic cron syntax
* * * * * /path/to/command arg1 arg2
# Common patterns:
0 */6 * * * # Every 6 hours
@reboot # Run on startup
30 3 * * 1-5 # 3:30 AM weekdays
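To make the field syntax concrete, here is a small sketch that expands a single cron field into the values it matches. It is a simplification I wrote for illustration: it handles `*`, single numbers, ranges, comma lists and `/step`, but not month or weekday names or shortcuts like `@reboot`.

```python
def expand_cron_field(field, low, high):
    """Expand one cron field (e.g. '*/6', '1-5', '0,30') into a sorted list of ints."""
    values = set()
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/", 1)
            step = int(step_text)
        if part == "*":
            start, end = low, high
        elif "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if step < 1 or not (low <= start <= end <= high):
            raise ValueError(f"invalid cron field: {field!r}")
        values.update(range(start, end + 1, step))
    return sorted(values)
```

For the patterns above, `expand_cron_field("*/6", 0, 23)` gives `[0, 6, 12, 18]` for the hour field, and `expand_cron_field("1-5", 0, 7)` gives `[1, 2, 3, 4, 5]`, Monday through Friday.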
# Log rotation example in /etc/logrotate.d/
/var/log/myapp.log {
    daily
    missingok
    rotate 7
    compress
    delaycompress
    notifempty
    create 640 root adm
    postrotate
        /usr/bin/systemctl reload myapp.service > /dev/null
    endscript
}
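If you cannot install a logrotate rule (a container or shared host, say), Python's logging module can rotate from inside the application. This sketch roughly mirrors the daily schedule and seven kept files of the rule above, but it does not compress old logs; `build_rotating_logger` and the logger name are hypothetical. Use one mechanism or the other on a given file, not both.

```python
import logging
from logging.handlers import TimedRotatingFileHandler


def build_rotating_logger(path, name="myapp"):
    """Return a logger that rotates its file at midnight and keeps 7 old files."""
    handler = TimedRotatingFileHandler(path, when="midnight", backupCount=7)
    handler.setFormatter(
        logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")
    )
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger
```

Call it once at startup, for example `log = build_rotating_logger("/var/log/myapp.log")`, and reuse the returned logger; calling it repeatedly would attach duplicate handlers.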
When you're your own sysadmin:
# ~/.ssh/config example
Host myserver
    HostName 192.168.1.100
    User deploy
    Port 2222
    IdentityFile ~/.ssh/deploy_key
    ServerAliveInterval 60
# Key generation (Ed25519 recommended)
ssh-keygen -t ed25519 -a 100 -f ~/.ssh/production_key
# Basic hardening in /etc/ssh/sshd_config
PermitRootLogin no
PasswordAuthentication no
MaxAuthTries 3
ClientAliveInterval 300
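Hardening settings tend to drift, so it can help to check them automatically. Below is a minimal sketch that scans sshd_config text for two of the settings above. It is an illustration under simplifying assumptions: `audit_sshd_config` and `EXPECTED` are my own names, and it ignores `Match` blocks and `Include` files. On a real server, `sshd -T` prints the effective configuration and is the authoritative check.

```python
# Expected values taken from the hardening settings above.
EXPECTED = {
    "permitrootlogin": "no",
    "passwordauthentication": "no",
}


def audit_sshd_config(text):
    """Return (option, actual_value) pairs that differ from EXPECTED."""
    seen = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts[0].lower(), parts[1].strip().lower()
        # sshd uses the first value it obtains for each keyword.
        seen.setdefault(key, value)
    return [
        (key, seen.get(key, "<unset>"))
        for key, wanted in EXPECTED.items()
        if seen.get(key) != wanted
    ]
```

An empty list means both settings are present with the expected values. Anything else is worth a look before you log out of the session you are using to edit the file.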
Essential commands to memorize:
# When things break:
journalctl -xe -u nginx --no-pager # Service logs
dmesg | tail -20 # Kernel messages
strace -p [PID] # Process system calls
nc -zv example.com 443 # Port testing
# Filesystem repair:
fsck /dev/sda1 # Run only on an unmounted filesystem (e.g. from a rescue shell)
mount -o remount,rw / # Remount read-write if the root filesystem went read-only
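The port test with `nc -zv` is easy to reproduce in Python when you want it inside a health check or a deploy script. A minimal sketch, where `port_is_open` is a name I chose and the 3-second timeout is an arbitrary default:

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False
```

For example, `port_is_open("example.com", 443)` answers the same question as the `nc` command above. It only proves that something accepts TCP connections on that port, not that the service behind it is healthy.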
Quick diagnostics every dev should know:
# Memory usage
free -m
vmstat 1
# IO bottlenecks
iostat -x 1
iotop
# Network throughput
iftop -nP
nethogs
# Python-specific example: raise this process's open-file soft limit to the hard limit
import resource
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
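The memory check can be scripted too. On Linux, `free` gets its numbers from /proc/meminfo, so a few lines of parsing give you the same data in a form you can alert on. This is a sketch with function names of my own; most fields are reported in kB, though a few (such as the HugePages counters) are plain counts.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field_name: integer_value}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def available_percent(info):
    """Return MemAvailable as a percentage of MemTotal."""
    return info["MemAvailable"] / info["MemTotal"] * 100
```

On a Linux host you would feed it the real file, for instance `parse_meminfo(open("/proc/meminfo").read())`. `MemAvailable` is a better "how much can I still use" figure than `MemFree`, because it accounts for reclaimable cache.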
Avoid the common backup pitfalls (no off-host copy, no encryption, no restore tests):
# Simple backup with rsync over SSH (encrypted in transit, not at rest)
# Inside a crontab entry, escape each % as \%
rsync -avz -e "ssh -p 2222" /critical/data user@backup:/storage \
  --delete --backup --backup-dir=$(date +%Y-%m-%d)
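# MySQL backup example
# The passphrase is read from a file (example path, chmod 600) so it never shows up in ps output;
# --pinentry-mode loopback is required with --batch on GnuPG 2.1 and later
mysqldump -u root -p --single-transaction \
  --routines --triggers --all-databases \
  | gpg -c --batch --pinentry-mode loopback \
      --passphrase-file /root/.backup_passphrase \
  > backup_$(date +%F).sql.gpg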
# Test restores monthly!
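A restore test is only convincing if you compare the result with the original. One simple approach, sketched below, is to checksum a file before backup and again after restoring it; the helper names are hypothetical, and for a database dump you would compare the decrypted dump files rather than live data.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_match(original, restored):
    """Return True when both files have identical contents."""
    return sha256_of_file(original) == sha256_of_file(restored)
```

Reading in chunks keeps memory use flat even for multi-gigabyte dumps.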
Recognize your limits - some scenarios require experts:
- RAID array failures
- Complex network partitioning issues
- Kernel panics without clear logs
- Security breaches requiring forensics