Many developers encounter this puzzling scenario: seemingly random form submissions from bots targeting obscure web forms. While CAPTCHA offers a solution, understanding the underlying motives helps build better defenses.
Several technical reasons explain this behavior:
- Vulnerability Scanning: Bots test form handlers for injection flaws (SQL/XSS)
- Email Harvesting: Even fake submissions help bots map server responses, which can expose addresses in confirmation or error messages
- Resource Consumption: Probing server capacity, sometimes as a precursor to a DDoS attack
- SEO Spam: Attempting to create backlinks through form-generated emails
Here are practical solutions with implementation examples:
1. Server-Side Validation
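// PHP example of time-based validation
// Assumes the form has a hidden form_load_time field holding the Unix time at which it was rendered
$submit_time = $_SERVER['REQUEST_TIME_FLOAT'];
$form_load_time = filter_input(INPUT_POST, 'form_load_time', FILTER_VALIDATE_FLOAT);
// Reject missing or malformed timestamps, and forms submitted in under 3 seconds
if ($form_load_time === null || $form_load_time === false || ($submit_time - $form_load_time) < 3) {
    http_response_code(403);
    die('Suspicious activity detected');
}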
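One weakness of the check above is that the load time comes from the client, so a bot can simply send an old timestamp. A common way around that is to sign the timestamp when the form is rendered and verify the signature on submit. Here is a minimal sketch of that idea in JavaScript (Node), assuming a server-side secret; `SECRET`, `issueFormToken`, and `isPlausibleSubmission` are names made up for illustration, and the same logic can be written in PHP with `hash_hmac` and `hash_equals`.

```javascript
const crypto = require('crypto');

// Hypothetical secret: load it from configuration in practice, never hard-code it.
const SECRET = 'replace-with-a-long-random-secret';

// Build the hidden-field value when the form is rendered: "<timestamp>.<signature>".
function issueFormToken(nowMs = Date.now()) {
  const ts = String(nowMs);
  const sig = crypto.createHmac('sha256', SECRET).update(ts).digest('hex');
  return `${ts}.${sig}`;
}

// Verify the token on submission and enforce a minimum fill time.
function isPlausibleSubmission(token, nowMs = Date.now(), minMs = 3000) {
  if (typeof token !== 'string') return false;
  const [ts, sig] = token.split('.');
  if (!ts || !sig) return false;
  const expected = crypto.createHmac('sha256', SECRET).update(ts).digest('hex');
  const a = Buffer.from(sig, 'hex');
  const b = Buffer.from(expected, 'hex');
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  if (a.length !== b.length || !crypto.timingSafeEqual(a, b)) return false;
  // The signature is valid, so the timestamp is the one the server issued.
  return nowMs - Number(ts) >= minMs;
}
```

Put the result of `issueFormToken()` in the hidden field and pass the posted value to `isPlausibleSubmission()`. This does not stop a bot that waits three seconds, but it does stop one that forges the timestamp.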
2. Hidden Honeypot Field
<!-- HTML form snippet -->
<input type="text" name="website" style="display:none;" tabindex="-1" autocomplete="off">
// PHP validation
if (!empty($_POST['website'])) {
// This is a bot - discard submission
log_spam_attempt(); // placeholder for your own logging routine
exit;
}
3. Behavioral Analysis
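// JavaScript interaction tracker
// Listens for mouse, keyboard, and touch input so keyboard-only and mobile users are not blocked
let userInteracted = false;
['mousemove', 'keydown', 'touchstart'].forEach((type) => {
  document.addEventListener(type, () => {
    userInteracted = true;
  });
});
document.forms[0].addEventListener('submit', (e) => {
  if (!userInteracted) {
    e.preventDefault();
    // Likely bot submission
  }
});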
For critical applications, consider:
- Rate limiting via nginx or Apache modules
- IP reputation services (Cloudflare, Akamai)
- Machine-learning-based anomaly detection
Implement logging to identify patterns:
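// PHP logging example (appends one JSON object per line)
$spam_log = [
    'ip' => $_SERVER['REMOTE_ADDR'],
    'user_agent' => $_SERVER['HTTP_USER_AGENT'] ?? '', // header may be absent
    'timestamp' => time(),
    'form_data' => $_POST // consider redacting sensitive fields before logging
];
file_put_contents('spam_log.json', json_encode($spam_log) . "\n", FILE_APPEND | LOCK_EX);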
Analyze these logs periodically to update your defenses against evolving bot techniques.
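As a starting point for that analysis, here is a small sketch that counts submissions per IP. It assumes the log holds one JSON object per line with an `ip` key, which is what the logging snippet above writes; `countByIp` is a name made up for this example.

```javascript
// Count submissions per IP from JSON-lines log text (one JSON object per line).
function countByIp(logText) {
  const counts = new Map();
  for (const line of logText.split('\n')) {
    if (!line.trim()) continue; // ignore blank lines
    let entry;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // skip truncated or corrupt lines
    }
    if (entry === null || typeof entry !== 'object') continue;
    const ip = entry.ip ?? 'unknown';
    counts.set(ip, (counts.get(ip) ?? 0) + 1);
  }
  // Sort by count, highest first, so the noisiest sources come first.
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}
```

In Node you could feed it the file with `countByIp(require('fs').readFileSync('spam_log.json', 'utf8'))`. The same grouping works for user agents or for hour-of-day buckets if you want to see when the bots are active.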
Many developers encounter a puzzling scenario: seemingly random form submissions from non-existent users, targeting forms that aren't even linked from the main site. This isn't personal targeting; your small-town application just happened to appear in automated scans.
Automated scripts constantly crawl the web searching for form endpoints. Their goals include:
- Testing for SQL injection vulnerabilities
- Collecting email addresses for spam lists
- Exploiting unsecured form handlers to send spam
- Looking for open redirect opportunities
- Simple reconnaissance for future attacks
These bots typically work in phases:
1. Discovery phase:
- Google dorking (e.g., inurl:subscribe.php)
- Directory brute-forcing
- Following all links from sitemaps
2. Analysis phase:
- Identifying form parameters
- Checking for common vulnerabilities
3. Exploitation phase:
- Submitting test data
- Attempting XSS/SQLi payloads
- Harvesting data
Here are multiple layers of defense you can implement:
1. Basic Protection
<!-- Simple honeypot field -->
<input type="text" name="website" style="display:none;" tabindex="-1" autocomplete="off">
// Server-side validation (PHP)
if (!empty($_POST['website'])) {
    // This is likely a bot
    die();
}
2. Intermediate Solutions
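// Rate limiting with Redis (requires the phpredis extension)
$redis = new Redis();
$redis->connect('127.0.0.1');
$key = 'form_submit:' . $_SERVER['REMOTE_ADDR'];
// INCR is atomic and creates the key with value 1 if it does not exist
$count = $redis->incr($key);
if ($count === 1) {
    // Start the one-hour window on the first submission only
    $redis->expire($key, 3600);
}
if ($count > 5) {
    http_response_code(429);
    die('Rate limit exceeded');
}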
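If you want to see the fixed-window idea behind the Redis snippet without running Redis, here is a rough in-memory sketch. It assumes a single long-running process, and `FixedWindowLimiter` is a name made up for illustration, not part of any library.

```javascript
// Fixed-window limiter: at most `limit` submissions per `windowMs` for each key.
class FixedWindowLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.windows = new Map(); // key -> { start, count }
  }

  // Returns true if the submission is allowed, false if the key is over its limit.
  allow(key, nowMs = Date.now()) {
    const w = this.windows.get(key);
    if (!w || nowMs - w.start >= this.windowMs) {
      // First submission, or the previous window has expired: start a new one.
      this.windows.set(key, { start: nowMs, count: 1 });
      return true;
    }
    w.count += 1;
    return w.count <= this.limit;
  }
}
```

For example, `new FixedWindowLimiter(5, 3600 * 1000).allow(clientIp)` mirrors the five-per-hour rule. The window starts at the first submission and is not extended by later ones. Entries are never purged here, so a real deployment would need cleanup or a shared store such as Redis.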
3. Advanced Techniques
<!-- JavaScript challenge (bots often don't execute JS); reject submissions server-side when form_token is empty -->
<script>
document.addEventListener('DOMContentLoaded', function() {
var token = Math.random().toString(36).substring(2);
document.getElementById('form_token').value = token;
});
</script>
<input type="hidden" id="form_token" name="form_token">
While CAPTCHA works, consider these alternatives:
- Time-based validation (human users take time to fill forms)
- Browser fingerprinting
- Behavioral analysis (mouse movements, typing patterns)
- Proof-of-work challenges
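Of these, proof-of-work is probably the least familiar, so here is a minimal hashcash-style sketch. The assumptions are mine: the server issues a random, single-use `challenge` string with the form, the client must find a `nonce` so that SHA-256 of `challenge:nonce` starts with `difficulty` zero hex digits, and the server verifies with a single hash. The function names are illustrative, and Node's `crypto` module is used for both halves to keep the sketch runnable; a browser client would need the Web Crypto API or a JavaScript SHA-256 implementation instead.

```javascript
const crypto = require('crypto');

// Hex-encoded SHA-256 of a string.
function sha256Hex(text) {
  return crypto.createHash('sha256').update(text).digest('hex');
}

// Server side: verification costs a single hash.
function verifyProof(challenge, nonce, difficulty) {
  return sha256Hex(`${challenge}:${nonce}`).startsWith('0'.repeat(difficulty));
}

// Client side: brute-force search for the first nonce that satisfies the target.
// Each extra hex digit of difficulty multiplies the expected work by 16.
function solveChallenge(challenge, difficulty) {
  let nonce = 0;
  while (!verifyProof(challenge, nonce, difficulty)) {
    nonce += 1;
  }
  return nonce;
}
```

The cost is negligible for one human submission but adds up for a bot hitting thousands of forms. Tune `difficulty` carefully, since low-powered phones pay the same price as everyone else.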
Implement logging to understand attack patterns:
// Log form submission attempts (one JSON object per line)
$logData = [
    'timestamp' => time(),
    'ip' => $_SERVER['REMOTE_ADDR'],
    'user_agent' => $_SERVER['HTTP_USER_AGENT'] ?? '', // header may be absent
    'form_data' => $_POST // consider redacting sensitive fields before logging
];
file_put_contents('form_submissions.log',
    json_encode($logData) . PHP_EOL,
    FILE_APPEND | LOCK_EX);
For truly orphaned forms like yours, the most secure solution is complete removal. In IIS:
- Open Internet Information Services (IIS) Manager
- Navigate to the site and locate the form file
- Right-click and select "Delete"
- Consider adding URL rewriting rules to block access