Ansible troubleshooting
UNREACHABLE / SSH connection failed
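Common causes: the host is down, ansible_host is wrong, a security group
blocks port 22, or the SSH key is not being offered. Test manually with
ssh -i key user@host. Set ansible_ssh_private_key_file or use ssh-agent
(see SSH lab). If connections time out, raise the limit with
ansible_ssh_timeout=60. For bastion jumps, pass a ProxyJump option through
ansible_ssh_common_args, for example
ansible_ssh_common_args='-o ProxyJump=user@bastion'.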
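The connection settings above could be collected in a YAML inventory roughly like this. It is a minimal sketch: the host name, address, user, key path, and bastion host are all hypothetical placeholders.

```yaml
# inventory.yml -- every name, address, and path here is a placeholder
all:
  children:
    web:
      hosts:
        web01:
          ansible_host: 10.0.1.10          # address Ansible actually connects to
      vars:
        ansible_user: deploy
        ansible_ssh_private_key_file: ~/.ssh/deploy_key
        ansible_ssh_timeout: 60            # seconds to wait for the SSH connection
        # Route SSH through a bastion host
        ansible_ssh_common_args: "-o ProxyJump=deploy@bastion.example.com"
```

Check it with ansible -i inventory.yml web -m ping before running a full playbook.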
Permission denied / become failed
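Either the play never requests privilege escalation or the remote user
lacks passwordless sudo. Set become: true (and become_user: root if a
different default is configured). If sudo asks for a password, run with
--ask-become-pass. A sudoers policy that restricts commands can also
break become. Test:
ansible web -b -a "whoami" should return root.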
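The same whoami check can be written as a small play, which is handy in CI. This is a sketch that assumes a web group exists in inventory; the play and task names are illustrative.

```yaml
- name: Verify privilege escalation
  hosts: web
  become: true
  become_user: root
  gather_facts: false
  tasks:
    - name: Report the effective user
      ansible.builtin.command: whoami
      register: whoami_result
      changed_when: false        # read-only check, never a change

    - name: Fail unless escalation reached root
      ansible.builtin.assert:
        that:
          - whoami_result.stdout == "root"
```

Add --ask-become-pass on the command line when sudo on the target requires a password.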
Python interpreter not found
Most modules need Python on the target. Bootstrap it with the raw module,
which does not:
ansible web -b -m raw -a "apt-get install -y python3" on Debian/Ubuntu.
Set ansible_python_interpreter=/usr/bin/python3 in inventory for
mixed distros or minimal images.
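As a playbook, the bootstrap might look like the sketch below. It assumes Debian/Ubuntu targets in a web group and a Python path of /usr/bin/python3; adjust both for other images.

```yaml
- name: Bootstrap Python on minimal Debian/Ubuntu hosts
  hosts: web
  become: true
  gather_facts: false            # fact gathering itself needs Python
  tasks:
    - name: Install python3 only when it is missing
      ansible.builtin.raw: test -e /usr/bin/python3 || (apt-get update && apt-get install -y python3)

- name: Continue with regular modules
  hosts: web
  vars:
    ansible_python_interpreter: /usr/bin/python3   # assumed interpreter path
  tasks:
    - name: Confirm that Python-based modules now work
      ansible.builtin.ping:
```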
Task always shows changed
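The task is not idempotent, typically command or shell used without
creates, removes, or changed_when.
Switch to a proper module (apt, or copy, which compares checksums).
Output or generated content that embeds a timestamp differs on every run,
so comparisons based on it report false changes.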
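The three fixes can be sketched as follows. The installer path and marker file are hypothetical; the module parameters (creates, changed_when) are the standard ones named above.

```yaml
- name: Idempotent command patterns
  hosts: web
  become: true
  tasks:
    - name: Run the installer once (skipped when the marker file exists)
      ansible.builtin.command:
        cmd: /opt/app/install.sh           # hypothetical script
        creates: /opt/app/.installed       # hypothetical marker file

    - name: Read-only command that should never report changed
      ansible.builtin.command: date +%s
      register: now
      changed_when: false

    - name: Prefer a module that tracks state itself
      ansible.builtin.apt:
        name: nginx
        state: present
```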
Template or variable undefined
Jinja error: the variable is not in scope. Check spelling, group_vars,
host_vars, and the precedence of role defaults versus role vars.
Debug with ansible.builtin.debug: var=myvar. When delegating or reading
another host's facts, also check for typos in the
hostvars['otherhost'] key.
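A short diagnostic play along these lines can narrow down where a variable goes missing. It is a sketch: myvar and otherhost are the placeholder names used above, and the messages are illustrative.

```yaml
- name: Inspect variables
  hosts: web
  gather_facts: false
  tasks:
    - name: Show the variable as this host sees it
      ansible.builtin.debug:
        var: myvar

    - name: Fail early with a readable message
      ansible.builtin.assert:
        that:
          - myvar is defined
        fail_msg: "myvar is not defined for {{ inventory_hostname }}"

    - name: Confirm the other host name matches the inventory
      ansible.builtin.assert:
        that:
          - "'otherhost' in hostvars"

    - name: Read the other host's value with a fallback
      ansible.builtin.debug:
        msg: "{{ hostvars['otherhost']['myvar'] | default('undefined on otherhost') }}"
```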
Handler not running
Handlers run only when a task notifies them and that task reports
changed. If the task shows ok, the handler is skipped.
Handlers flush at the end of the play, so a fatal error before the flush
skips them. Use meta: flush_handlers to force them to run mid-play.
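The notify and flush behavior can be sketched as below. The template name, destination path, and smoke-test URL are assumptions chosen for illustration.

```yaml
- name: Handler flush example
  hosts: web
  become: true
  tasks:
    - name: Deploy nginx config (notifies only when the file changes)
      ansible.builtin.template:
        src: nginx.conf.j2                 # hypothetical template
        dest: /etc/nginx/nginx.conf
      notify: Restart nginx

    - name: Run pending handlers now instead of at end of play
      ansible.builtin.meta: flush_handlers

    - name: Smoke test against the restarted service
      ansible.builtin.uri:
        url: http://localhost/

  handlers:
    - name: Restart nginx
      ansible.builtin.service:
        name: nginx
        state: restarted
```

The notify string must match the handler name exactly; a mismatch is a common reason a handler never fires.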
Module not found / wrong collection
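Since Ansible 2.10 most modules live in collections. Use the FQCN
(ansible.builtin.copy, community.general.ufw); short names still resolve
for ansible.builtin but not reliably for other collections. Install a
missing collection:
ansible-galaxy collection install community.general. Pin versions
in requirements.yml so CI matches your laptop.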
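A pinned requirements file might look like this sketch. The version number is a placeholder, not a recommendation; pin whatever version you have tested.

```yaml
# requirements.yml -- the version shown is a placeholder
collections:
  - name: community.general
    version: "8.6.0"
```

Install from it with ansible-galaxy collection install -r requirements.yml, both locally and in CI.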
Playbook very slow
Fact gathering on hundreds of hosts is expensive; disable it with
gather_facts: false when facts are not needed. The default forks value (5)
is low; raise it with -f 20 if the network and targets allow. The Mitogen
strategy plugin (third party) speeds up SSH execution. Package modules
with update_cache on every run hit apt mirrors repeatedly, so limit how
often the cache is refreshed.
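Two of these tunings fit in the play itself, as in this sketch. The one-hour cache window is an arbitrary assumption; forks is set separately with -f or in ansible.cfg.

```yaml
- name: Faster play
  hosts: web
  become: true
  gather_facts: false                      # skip the setup step when facts are unused
  tasks:
    - name: Install nginx, refreshing the apt cache at most once an hour
      ansible.builtin.apt:
        name: nginx
        state: present
        update_cache: true
        cache_valid_time: 3600             # seconds; assumed window
```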
Debugging workflow
1. Connectivity
ansible all -m ping
ansible problematic_host -m ping -vvv
2. Dry run
ansible-playbook site.yml --check --diff -l web01
3. Single task
# Starts at the named task and continues through the rest of the play
ansible-playbook site.yml --start-at-task "Task name" -vvv
# Or: ansible web -m ansible.builtin.service -a "name=nginx state=started" -b