“A few local admin accounts” looks pragmatic on network devices in the short term, but over time it produces three problems: no trail, no roles, no rollback. As the team grows, “who logged into which device, and when?” becomes a question nobody can answer. That is why TACACS+ is more than a “login” tool: it is an operational audit layer.
The goal: solve three things at once
TACACS+ earns its value in production when three functions work together:
- Authentication: who logged in?
- Authorization: which role does this person hold and which commands can they run?
- Accounting: what did they do, which command did they run, how long was the session?
Minimum architecture: two TACACS+ servers + identity source + log pipeline
A practical deployment needs:
- TACACS+ servers (HA): at least 2 nodes (separate site / zone where possible)
- Identity source: AD/LDAP/SSO (with group mapping)
- Role policy: group → role → permitted command set
- Log pipeline: TACACS+ accounting logs → central log / SIEM
Two extras that pay outsized returns:
- Break-glass: controlled emergency access
- Config backup: automatic “running-config snapshot” after every change
Role design: start with 3 roles (expand later)
The starting roles that work best in the field:
- ReadOnly: show / diagnose (no config)
- Operator: limited config (clearly bounded, e.g. interface up/down, BGP neighbor reset)
- Admin: full privilege (held by few people)
Management cost rises with the role count, so the first goal is not “perfect RBAC” but cutting local accounts and producing a trail.
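The group → role mapping can be sketched in a few lines of Python. This is a minimal illustration, not how any particular TACACS+ server stores its policy; the group names (net-readonly, net-operators, net-admins) and the "highest role wins" rule are assumptions made for the example.

```python
from enum import IntEnum


class Role(IntEnum):
    """The three starting roles, ordered from least to most privileged."""
    READ_ONLY = 1
    OPERATOR = 2
    ADMIN = 3


# Hypothetical directory group names; substitute your own AD/LDAP groups.
GROUP_TO_ROLE = {
    "net-readonly": Role.READ_ONLY,
    "net-operators": Role.OPERATOR,
    "net-admins": Role.ADMIN,
}


def resolve_role(groups):
    """Return the highest role granted by the user's groups, or None.

    Assumption: when a user is in several mapped groups, the most
    privileged role wins. Users with no mapped group get no role at all.
    """
    roles = [GROUP_TO_ROLE[g] for g in groups if g in GROUP_TO_ROLE]
    return max(roles) if roles else None
```

Returning None for users without a mapped group keeps the default at "no access", which is usually the safer starting point than a default role.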
Command authorization: prove the boundary technically
Command control delivers two crucial benefits:
- It cuts down on accidental “dangerous” commands
- It backs the “separation of duties” audit clause with technical proof
A practical approach:
- High-risk commands like “configure terminal / write memory / reload” stay admin-only
- Operational intervention commands (e.g. specific resets) belong to the operator role with a tight boundary
- Keep the readonly role at “show only”
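The allowlist idea above can be sketched as a deny-by-default matcher. This is an illustration only: the role names follow the three roles in this article, while the regular expressions and the choice of "clear ip bgp" with a single IPv4 neighbor as the operator's bounded reset are assumptions, not a recommended production policy.

```python
import re

# Illustrative per-role allowlists. Each pattern must match the whole
# command; anything that matches no pattern is denied.
ROLE_COMMANDS = {
    "ReadOnly": [r"show\b.*"],
    "Operator": [
        r"show\b.*",
        # Assumed operator boundary: reset one IPv4 BGP neighbor at a time.
        r"clear ip bgp \d{1,3}(\.\d{1,3}){3}( soft)?",
    ],
    "Admin": [r".*"],
}


def is_permitted(role, command):
    """Return True if the command fully matches a pattern allowed for the role."""
    normalized = " ".join(command.split())  # collapse repeated whitespace
    patterns = ROLE_COMMANDS.get(role, [])  # unknown role: empty list, so deny
    return any(re.fullmatch(p, normalized) for p in patterns)
```

For example, is_permitted("Operator", "clear ip bgp 10.0.0.1 soft") is allowed, while is_permitted("Operator", "clear ip bgp *") is denied because the wildcard falls outside the operator's boundary.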
Accounting: turn the session record into ‘incident evidence’
TACACS+ accounting data accelerates two things during an incident:
- It answers “who changed what?” within minutes
- It puts evidence — not assumptions — into the postmortem
Minimum log fields:
- user, device, source IP
- session start / end
- command(s) executed
- result (permit / deny)
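As a sketch of what one normalized record could look like before it is shipped to the central log or SIEM, the minimum fields above fit in a small data class. The field names, the one-record-per-command layout, and the JSON output are assumptions for illustration; actual accounting formats vary by TACACS+ server.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class AccountingRecord:
    """One executed (or denied) command, carrying the minimum log fields."""
    user: str
    device: str
    source_ip: str
    session_start: str           # ISO 8601 timestamp
    session_end: Optional[str]   # None while the session is still open
    command: str
    result: str                  # "permit" or "deny"

    def to_json(self):
        """Serialize to a JSON line suitable for a log pipeline."""
        if self.result not in ("permit", "deny"):
            raise ValueError("result must be 'permit' or 'deny'")
        return json.dumps(asdict(self), sort_keys=True)
```

Keeping denied commands in the same stream as permitted ones matters: a burst of denies from one user is often the earliest sign of a mistake or a misuse attempt.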
Resilience: what happens when TACACS+ is down?
The “fail-open or fail-closed?” call matters here:
- Fail-open: when TACACS+ is down, local fallback keeps access alive (operations survive, risk goes up)
- Fail-closed: when TACACS+ is down, access is cut (security wins, but losing access in the middle of a crisis can be catastrophic)
A common balance in the field:
- On critical production devices, controlled local fallback (break-glass) plus strong alarms
- On the management network, an out-of-band channel plus a “TACACS down” runbook
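The fail-open / fail-closed trade-off can be written down as a small decision function. This is a conceptual sketch, not device configuration: the function name, its parameters, and the policy strings are hypothetical. It assumes that a deny from a reachable server is final, and that local fallback applies only when the server cannot be reached.

```python
def access_decision(tacacs_reachable, tacacs_permits, policy, local_credential_valid):
    """Return (allowed, raise_alarm) for a login attempt.

    policy is "fail-open" or "fail-closed" and only matters when the
    TACACS+ servers cannot be reached.
    """
    if tacacs_reachable:
        # Normal path: the server's answer is final, no local fallback.
        return (tacacs_permits, False)
    if policy == "fail-open":
        # Outage: fall back to the controlled local (break-glass) credential.
        return (local_credential_valid, True)
    if policy == "fail-closed":
        # Outage: nobody gets in until TACACS+ is back.
        return (False, True)
    raise ValueError(f"unknown policy: {policy!r}")
```

Both outage branches raise the alarm, which reflects the point above: whichever policy is chosen, a TACACS+ outage should never pass silently.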
Break-glass: make emergency access systematic
Break-glass is not “let’s write the password down somewhere”. A healthy model:
- Time-bound activation (e.g. 30 min)
- Two-person approval (at least on critical segments)
- Mandatory session recording
- Rehearsals (a few times per year)
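The first two points of this model, time-bound activation and two-person approval, can be sketched as follows. The class name, the 30-minute default, and the reading of "two-person approval" as two distinct approvers other than the requester are assumptions for illustration; session recording and rehearsals sit outside what this sketch covers.

```python
from datetime import datetime, timedelta, timezone


class BreakGlassGrant:
    """Emergency access that activates after enough approvals and then expires."""

    def __init__(self, requester, ttl_minutes=30, required_approvals=2):
        self.requester = requester
        self.ttl = timedelta(minutes=ttl_minutes)
        self.required_approvals = required_approvals
        self.approvers = set()
        self.activated_at = None

    def approve(self, approver, now=None):
        """Record an approval; activate once the threshold is reached."""
        if approver == self.requester:
            raise ValueError("requester cannot approve their own request")
        self.approvers.add(approver)  # a set, so repeat approvals count once
        if self.activated_at is None and len(self.approvers) >= self.required_approvals:
            self.activated_at = now if now is not None else datetime.now(timezone.utc)

    def is_active(self, now=None):
        """True only between activation and activation + TTL."""
        if self.activated_at is None:
            return False
        if now is None:
            now = datetime.now(timezone.utc)
        return now < self.activated_at + self.ttl
```

The expiry is checked on every call rather than by a timer, so a grant cannot stay alive just because a cleanup job failed to run.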
Operational checklist (first 30 days)
- Local admin account inventory completed
- 3 roles defined (ReadOnly / Operator / Admin)
- Command allowlist draft written
- Accounting logs flowing into central log
- TACACS+ health check and alarm in place
- Break-glass procedure and rehearsal calendar written
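For the health check item, the simplest useful probe is a TCP connect to the TACACS+ port (49 by default). The sketch below only proves the port accepts connections. It does not perform a TACACS+ authentication, so treat it as a first-line signal rather than a full health check. The function name and timeout value are assumptions.

```python
import socket


def tacacs_port_reachable(host, port=49, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Running this against every TACACS+ node from the monitoring host, and alarming when any node fails, covers the "server down" case; a periodic test login with a dedicated account would be needed to catch a server that is up but failing authentications.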
Conclusion
Standing up AAA on network devices with TACACS+ is more than centralizing identity: it secures operations through roles, command control, and an audit trail. The most solid approach I’ve seen in the field is to start with few roles, run accounting from day one, and tie the TACACS+ outage and break-glass scenarios into a runbook. Set up that way, TACACS+ stops being a “security project” and becomes the secure surface on which daily operations actually run.