Idempotent Ansible Playbook: Creating Non-Root User & Securing SSH While Maintaining Re-Run Capability


When hardening new Linux servers, we typically want to:

  1. Create a privileged non-root user
  2. Disable root SSH access
  3. Disable password authentication

But this creates a chicken-and-egg problem - after the first run, you can't re-run the playbook as root!

The key is implementing conditions in your playbook to handle both initial bootstrap and subsequent runs. Here's a robust approach:


---
- name: Secure server bootstrap
  hosts: all
  gather_facts: yes
  vars:
    admin_user: "deploy"
    ssh_port: 2222
    
  tasks:
    - name: Check if admin user exists
      ansible.builtin.stat:
        path: "/home/{{ admin_user }}"
      register: admin_user_home
      changed_when: false
    
    - name: Create admin user (if missing)
      ansible.builtin.user:
        name: "{{ admin_user }}"
        groups: "sudo"
        shell: "/bin/bash"
        state: present
        append: yes
      when: not admin_user_home.stat.exists
    
    - name: Set up authorized_keys
      ansible.posix.authorized_key:
        user: "{{ admin_user }}"
        state: present
        key: "{{ lookup('file', '~/.ssh/id_rsa.pub') }}"
    
    # create: yes replaces a separate "state: touch" task, which would
    # report "changed" on every run and break idempotency.
    - name: Configure passwordless sudo
      ansible.builtin.lineinfile:
        path: "/etc/sudoers.d/{{ admin_user }}"
        line: "{{ admin_user }} ALL=(ALL) NOPASSWD:ALL"
        create: yes
        mode: "0440"
        validate: "visudo -cf %s"
    
    - name: Configure SSHD (only if we're connecting as root)
      when: ansible_user == "root"
      block:
        - name: Update SSH config
          ansible.builtin.template:
            src: templates/sshd_config.j2
            dest: /etc/ssh/sshd_config
            validate: "sshd -t -f %s"
            mode: "0600"
          notify: restart sshd

        # The block-level condition already applies to every task inside it,
        # so no per-task "when" is needed here.
        - name: Ensure root has temporary SSH key
          ansible.posix.authorized_key:
            user: "root"
            state: present
            key: "{{ lookup('file', '~/.ssh/id_rsa.pub') }}"
            exclusive: no
    
  handlers:
    - name: restart sshd
      ansible.builtin.service:
        name: sshd
        state: restarted

The templates/sshd_config.j2 template referenced by the playbook:

Port {{ ssh_port }}
PermitRootLogin no
PubkeyAuthentication yes
PasswordAuthentication no
ChallengeResponseAuthentication no
UsePAM yes
X11Forwarding no
PrintMotd no
AcceptEnv LANG LC_*
Subsystem sftp /usr/lib/openssh/sftp-server

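Follow this workflow:

  1. First run (as root, with password authentication): ansible-playbook -i inventory.ini bootstrap.yml -u root -k
  2. Subsequent runs (as the admin user, on the new port, with privilege escalation): ansible-playbook -i inventory.ini bootstrap.yml -u deploy -b -e ansible_port=2222

Note that {{ admin_user }} is a playbook variable and is not expanded on the command line, so the user name (deploy here) has to be passed literally. The -k flag prompts for the SSH password and requires sshpass on the control node.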
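
To avoid remembering different command-line flags for each phase, the connection settings can live in the inventory instead. The following is only a sketch in Ansible's YAML inventory format; the file name, the group names (new_servers, hardened) and the documentation-range IP addresses are assumptions for illustration, not part of the original setup.

```yaml
# inventory.yml (hypothetical file; hosts and group names are placeholders)
all:
  children:
    new_servers:
      hosts:
        203.0.113.10:
      vars:
        ansible_user: root        # fresh machine: root login on the default port
        ansible_port: 22
    hardened:
      hosts:
        203.0.113.11:
      vars:
        ansible_user: deploy      # after bootstrap: admin user on the custom port
        ansible_port: 2222
        ansible_become: true      # escalate with sudo for privileged tasks
```

With a layout like this, a host is moved from new_servers to hardened once its first run succeeds, and both phases can be run with a plain ansible-playbook -i inventory.yml bootstrap.yml (adding -k for the first run if root still uses a password).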

For production environments, add these safeguards:

  • Emergency breakglass access (separate bastion host)
  • Playbook pre-flight checks for connectivity
  • Multi-factor authentication setup
  • Fail2ban integration
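
As an illustration of the pre-flight check mentioned above, here is a minimal sketch of a play that could run before any hardening. The play name and the preflight_id variable are made up for this example; wait_for_connection and command are standard ansible.builtin modules.

```yaml
- name: Pre-flight connectivity check (hypothetical play)
  hosts: all
  gather_facts: false
  tasks:
    # Fails the host early if it cannot be reached with the current settings.
    - name: Wait until the host accepts connections
      ansible.builtin.wait_for_connection:
        timeout: 30

    # "id -u" prints 0 only when privilege escalation actually works.
    - name: Confirm privilege escalation works before changing sshd
      ansible.builtin.command: id -u
      become: true
      register: preflight_id
      changed_when: false
      failed_when: preflight_id.stdout != "0"
```

Running a check like this first means a broken key or sudo setup is caught before sshd is reconfigured, rather than after the only working login has been removed.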

When automating server provisioning with Ansible, we often encounter a chicken-and-egg problem with SSH security hardening. The standard workflow involves:

  1. Initial root login (often with password authentication)
  2. Creating a privileged non-root user
  3. Disabling root login and password authentication

This creates an idempotency challenge - after the first run, subsequent playbook executions fail because we've locked out the initial access method. Here's how to solve this elegantly.

The key is implementing conditional logic that detects the current server state:


- name: Check if sudo user exists
  ansible.builtin.command: grep -q '^{{ sudo_username }}:' /etc/passwd
  register: user_exists
  # grep exits with 1 when nothing matches; only other codes are real errors
  failed_when: user_exists.rc not in [0, 1]
  changed_when: false

- name: Create sudo user if it doesn't exist
  ansible.builtin.user:
    name: "{{ sudo_username }}"
    groups: sudo
    append: yes
    shell: /bin/bash
    state: present
  when: user_exists.rc != 0
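
The same existence check can also be done without shelling out to grep. The sketch below uses the ansible.builtin.getent module; it assumes sudo_username is defined as in the snippet above, and the debug task is only there to show where the result ends up. Strictly speaking the user module is already idempotent, so a check like this is mostly useful when other tasks need to branch on the result.

```yaml
- name: Look up the sudo user in the passwd database
  ansible.builtin.getent:
    database: passwd
    key: "{{ sudo_username }}"
    fail_key: false   # a missing user should not fail the task

# With fail_key: false, a missing key is stored with an empty (null) value.
- name: Report whether the user already exists
  ansible.builtin.debug:
    msg: "{{ sudo_username }} exists: {{ ansible_facts.getent_passwd[sudo_username] is not none }}"
```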

Implement a phased approach where security hardening only occurs after initial setup:


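- name: Phase 1 - Initial setup (runs as root)
  hosts: new_servers
  remote_user: root
  vars:
    initial_root_login: true
  tasks:
    - ansible.builtin.include_tasks: tasks/create_user.yml

# sudo_username must be defined for this play (for example in group_vars).
# An ansible_user set in the inventory takes precedence over remote_user.
- name: Phase 2 - Security hardening (runs as sudo user)
  hosts: new_servers
  remote_user: "{{ sudo_username }}"
  become: true
  vars:
    initial_root_login: false
  tasks:
    - ansible.builtin.include_tasks: tasks/harden_ssh.yml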

Here's a more complete playbook that handles both the first run and later re-runs:


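---
- name: Bootstrap server securely
  hosts: all
  gather_facts: false
  vars:
    sudo_username: deploy
    ssh_port: 2222

  tasks:
    # Probe for the hardened state by logging in as the sudo user on the
    # custom port. ignore_errors does not cover an unreachable host, so
    # ignore_unreachable keeps the host in the play when this login fails.
    - name: Check if we can connect as sudo user
      ansible.builtin.ping:
      vars:
        ansible_user: "{{ sudo_username }}"
        ansible_port: "{{ ssh_port }}"
      register: can_connect_as_sudo
      ignore_unreachable: true
      ignore_errors: true

    # Facts set here also apply to the handler at the end of the play.
    # Port 22 for the root phase is an assumption about a fresh server.
    - name: Choose connection settings for the remaining tasks
      ansible.builtin.set_fact:
        ansible_user: "{{ sudo_username if hardened | bool else 'root' }}"
        ansible_port: "{{ ssh_port if hardened | bool else 22 }}"
        ansible_become: "{{ hardened | bool }}"
        already_hardened: "{{ hardened | bool }}"
      vars:
        hardened: "{{ can_connect_as_sudo.ping | default('') == 'pong' }}"

    - name: Create sudo user (root phase)
      when: not already_hardened | bool
      block:
        - name: Create sudo user
          ansible.builtin.user:
            name: "{{ sudo_username }}"
            groups: sudo
            append: yes
            shell: /bin/bash
            state: present

        # The user module has no ssh_key parameter; the public key is
        # installed with authorized_key instead.
        - name: Authorize the control node's public key
          ansible.posix.authorized_key:
            user: "{{ sudo_username }}"
            state: present
            key: "{{ lookup('file', '~/.ssh/id_rsa.pub') }}"

        - name: Ensure sudo user can passwordless sudo
          ansible.builtin.lineinfile:
            path: "/etc/sudoers.d/{{ sudo_username }}"
            create: yes
            mode: "0440"
            line: "{{ sudo_username }} ALL=(ALL) NOPASSWD:ALL"
            validate: 'visudo -cf %s'

    # Runs in both phases: as root on the first run, as the sudo user later.
    - name: Harden SSH config
      block:
        - name: Disable root login
          ansible.builtin.lineinfile:
            path: /etc/ssh/sshd_config
            regexp: '^PermitRootLogin'
            line: "PermitRootLogin no"
            state: present
            backup: yes
            validate: '/usr/sbin/sshd -t -f %s'
          notify: Restart sshd

        - name: Disable password authentication
          ansible.builtin.lineinfile:
            path: /etc/ssh/sshd_config
            regexp: '^PasswordAuthentication'
            line: "PasswordAuthentication no"
            state: present
            validate: '/usr/sbin/sshd -t -f %s'
          notify: Restart sshd

        - name: Change SSH port
          ansible.builtin.lineinfile:
            path: /etc/ssh/sshd_config
            regexp: '^Port'
            line: "Port {{ ssh_port }}"
            state: present
            validate: '/usr/sbin/sshd -t -f %s'
          notify: Restart sshd

  # A handler restarts sshd only when a setting changed, so re-runs that
  # change nothing report no changes.
  handlers:
    - name: Restart sshd
      ansible.builtin.service:
        name: sshd
        state: restarted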

For enterprise deployments, consider these additional measures:

  • Use Ansible vault for sensitive variables
  • Implement MFA for SSH access
  • Configure fail2ban for brute force protection
  • Set up centralized logging for SSH attempts
  • Create separate playbooks for initial provisioning vs ongoing maintenance
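
To make the fail2ban item more concrete, here is a rough sketch of a separate play. It assumes a Debian or Ubuntu host (hence apt), the ssh_port value used earlier, and the jail.d drop-in directory that the Debian fail2ban package provides; the file name sshd.local and the maxretry value are arbitrary choices for the example.

```yaml
- name: Add fail2ban protection for SSH (hypothetical play)
  hosts: all
  become: true
  vars:
    ssh_port: 2222   # must match the port configured in sshd_config
  tasks:
    - name: Install fail2ban
      ansible.builtin.apt:
        name: fail2ban
        state: present
        update_cache: true

    # copy templates the content string, so {{ ssh_port }} is expanded.
    - name: Enable the sshd jail on the custom port
      ansible.builtin.copy:
        dest: /etc/fail2ban/jail.d/sshd.local
        mode: "0644"
        content: |
          [sshd]
          enabled = true
          port = {{ ssh_port }}
          maxretry = 5
      notify: Restart fail2ban

  handlers:
    - name: Restart fail2ban
      ansible.builtin.service:
        name: fail2ban
        state: restarted
```

Keeping this in its own play fits the last point in the list above: it can be part of ongoing maintenance runs without touching the bootstrap logic.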