Lessons from a technician

Introduction

I entered IT because I enjoy solving problems.

Early in my career, the most rewarding moments often involved troubleshooting something complex: restoring a failed server, recovering corrupted data, or fixing a network outage.

Those moments feel heroic.

Systems come back online.
Users are grateful.
Problems are solved.

But over time something becomes clear.

The technicians who build the most reliable environments are rarely the ones performing the most dramatic recoveries.

They are the ones quietly designing systems where those failures happen less often in the first place.

This article is a reflection on lessons that took me years to understand. Many of them came from mistakes, shortcuts and assumptions that seemed reasonable at the time.

None of these ideas are revolutionary. Most experienced engineers already know them.

But they are the principles I wish someone had emphasised when I first started working in IT.

Documentation Matters More Than Hero Troubleshooting

Every IT environment eventually develops a hero technician.

When something breaks, everyone calls the same person. They know where everything lives. They remember which server runs which service and why a configuration was changed three years ago.

The problem is that this knowledge usually lives in one place: their head.

When environments rely on memory instead of documentation, they become fragile.

Two very different IT environments

Memory-driven infrastructure

One technician knows everything
Troubleshooting takes hours
Changes feel risky
New staff struggle to learn the system

Documented infrastructure

Multiple people understand the system
Problems are diagnosed quickly
Infrastructure evolves safely
Knowledge persists when people leave

Documentation rarely feels exciting.

But over time it quietly becomes one of the most valuable assets in any IT environment.

Automation Beats Manual Skill

Early in my career I spent huge amounts of time doing things manually.

Building machines.
Creating accounts.
Deploying software.
Checking logs.

At the time that felt like productivity.

Eventually you realise manual processes do not scale.

As environments grow, repetitive tasks begin to consume the majority of your time. Automation changes that completely.

A few small scripts or management tools can quietly remove hours of routine work every week.

Examples appear everywhere in real environments.

Automatically provisioning user accounts when new staff or students join
Deploying applications across hundreds of machines without visiting each one
Patching devices automatically overnight instead of manually checking updates
Generating system health reports instead of manually reviewing logs
Alerting technicians when disks or certificates are about to expire
Automatically enrolling devices into management systems when they are first powered on

None of these tasks are technically difficult. But they become extremely time-consuming when repeated hundreds or thousands of times.
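As one illustration of the first item in that list, here is a minimal Python sketch of the planning step for account provisioning. It assumes a roster of staff IDs and names exported from an HR or student-records system, plus a record of which IDs and usernames already exist. The function names, the username format and the sample data are all hypothetical, and actually creating the accounts would depend on whichever directory service is in use.

```python
import re


def make_username(full_name: str) -> str:
    """Build a username such as 'j.smith' from 'Jane Smith' (hypothetical format)."""
    parts = re.findall(r"[a-z]+", full_name.lower())
    if not parts:
        raise ValueError(f"cannot derive a username from {full_name!r}")
    if len(parts) == 1:
        return parts[0]
    return f"{parts[0][0]}.{parts[-1]}"


def plan_accounts(
    roster: dict[str, str],
    provisioned_ids: set[str],
    taken_usernames: set[str],
) -> dict[str, str]:
    """Map each staff ID that still needs an account to a free username.

    IDs that are already provisioned are skipped, so running the plan
    again after the accounts exist produces an empty result.
    """
    taken = set(taken_usernames)
    plan: dict[str, str] = {}
    for staff_id, name in sorted(roster.items()):
        if staff_id in provisioned_ids:
            continue
        base = make_username(name)
        candidate, suffix = base, 2
        while candidate in taken:  # resolve clashes: j.smith, j.smith2, ...
            candidate = f"{base}{suffix}"
            suffix += 1
        taken.add(candidate)
        plan[staff_id] = candidate
    return plan


# Hypothetical sample data for illustration only.
roster = {"1001": "Jane Smith", "1002": "John Smith", "1003": "Ada Lovelace"}
plan = plan_accounts(roster, provisioned_ids={"1003"}, taken_usernames={"a.lovelace"})
print(plan)
```

Separating the plan from the action is a deliberate choice here: the plan can be reviewed or logged before anything is created, which makes the script safer to schedule.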

Over the lifetime of an environment, automation compounds.

One script that saves ten minutes a day will save more than 60 hours of technician time each year if it runs every day, and around 40 hours even if it only runs on working days.
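As a rough check of that figure, the arithmetic can be written out in a few lines of Python. The 250-day working year is my own assumption for comparison, not a measured value.

```python
def hours_saved_per_year(minutes_per_day: float, days_per_year: int) -> float:
    """Convert a daily time saving into hours per year."""
    return minutes_per_day * days_per_year / 60


every_day = hours_saved_per_year(10, 365)     # task runs every calendar day
working_days = hours_saved_per_year(10, 250)  # assumed 250 working days

print(f"{every_day:.1f} hours")     # 60.8 hours
print(f"{working_days:.1f} hours")  # 41.7 hours
```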

That time can then be spent improving infrastructure instead of repeating routine tasks.

Security Fundamentals Are Often Ignored

Most security incidents are not sophisticated.

They happen because basic practices are missing.

Across many environments the same problems appear repeatedly:

shared administrator accounts
systems that have not been patched for months
flat internal networks with no segmentation
services exposed unnecessarily to the internet

These problems rarely appear all at once. They accumulate slowly as environments evolve.
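Because these problems build up gradually, a small recurring check is often enough to surface them. The sketch below flags systems that have not been patched recently. It assumes an inventory that records a last-patched date per host; the hostnames, dates and 30-day threshold are hypothetical, and in practice the dates would come from a patch management or inventory tool.

```python
from datetime import date, timedelta


def stale_hosts(
    last_patched: dict[str, date],
    today: date,
    max_age_days: int = 30,
) -> list[str]:
    """Return hosts whose last patch is older than max_age_days, oldest first."""
    cutoff = today - timedelta(days=max_age_days)
    overdue = [
        (patched, host)
        for host, patched in last_patched.items()
        if patched < cutoff
    ]
    return [host for patched, host in sorted(overdue)]


# Hypothetical inventory for illustration only.
inventory = {
    "web01": date(2026, 1, 10),
    "db01": date(2025, 9, 2),
    "file01": date(2026, 2, 20),
}
overdue_hosts = stale_hosts(inventory, today=date(2026, 3, 1))
print(overdue_hosts)  # ['db01', 'web01']
```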

A common pattern in real environments

Systems are built correctly at the start.
Over time small changes are made.
Documentation falls behind.
Temporary fixes become permanent.

Eventually the environment no longer resembles the original design.

Security improves dramatically when teams consistently address the fundamentals.

The UK National Cyber Security Centre (NCSC) 10 Steps to Cyber Security framework exists largely because those fundamentals matter more than complex tools.

Backups Are Still Broken in 2026

One of the most uncomfortable truths in IT is how many backup systems do not actually work.

Backups fail silently. Storage fills up. Jobs stop running.

Everything appears fine until the moment data is actually needed.

Backup maturity

Level 1 - False confidence
Backups exist but restoration has never been tested.

Level 2 - Monitored backups
Backup jobs are checked and failures investigated.

Level 3 - Recovery ready
Restoration procedures are tested regularly and documented.

A backup strategy is only complete when recovery has been verified.
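One way to verify a recovery is to restore into a scratch location and compare the result against a known-good reference. The sketch below does that with SHA-256 checksums. It assumes the reference directory has not changed since the backup was taken, and the function names are my own. It says nothing about how the restore itself is performed, which depends on the backup product.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir: Path, restored_dir: Path) -> list[str]:
    """Compare every file under source_dir with its restored copy.

    Returns a list of problems; an empty list means the restore matched.
    """
    problems: list[str] = []
    source_files = sorted(p for p in source_dir.rglob("*") if p.is_file())
    for source_file in source_files:
        relative = source_file.relative_to(source_dir)
        restored_file = restored_dir / relative
        if not restored_file.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif file_digest(source_file) != file_digest(restored_file):
            problems.append(f"mismatch: {relative.as_posix()}")
    return problems
```

A scheduled test restore could call `verify_restore(Path("/data/reference"), Path("/scratch/restore"))` (both paths hypothetical) and raise an alert whenever the returned list is not empty.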

Monitoring Prevents Most Emergencies

Many outages follow a predictable pattern.

A disk fills up.
A certificate expires.
A service stops responding.

Users discover the problem first.

Monitoring reverses this.

How incidents unfold

Reactive environment

Issue occurs
Users report outage
Technician investigates
Service restored hours later

Monitored environment

Monitoring detects anomaly
Alert triggered automatically
Technician investigates early
Issue resolved before disruption
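The first two steps of the monitored flow can be sketched in a few lines of Python. The thresholds, function names and certificate name below are hypothetical, and the expiry date is passed in directly rather than read from a live certificate. A real deployment would more likely rely on an established monitoring platform than on a hand-rolled script.

```python
import shutil
from datetime import datetime, timezone
from typing import Optional


def check_disk(path: str, max_used_percent: float = 90.0) -> Optional[str]:
    """Return a warning if the filesystem holding path is too full."""
    usage = shutil.disk_usage(path)
    used_percent = usage.used / usage.total * 100
    if used_percent >= max_used_percent:
        return f"disk {path} is {used_percent:.0f}% full"
    return None


def check_certificate(
    name: str,
    expires: datetime,
    now: datetime,
    warn_days: int = 30,
) -> Optional[str]:
    """Return a warning if a certificate has expired or expires within warn_days."""
    if expires <= now:
        return f"certificate {name} has expired"
    days_left = (expires - now).days
    if days_left <= warn_days:
        return f"certificate {name} expires in {days_left} days"
    return None


# Hypothetical checks; a scheduler would run these every few minutes.
now = datetime.now(timezone.utc)
alerts = [
    check_disk("."),
    check_certificate("intranet", datetime(2026, 3, 15, tzinfo=timezone.utc), now),
]
for alert in filter(None, alerts):
    print(alert)  # in practice: send to email, chat or a paging system
```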

Monitoring rarely receives much attention, but it is what lets technicians find problems before users do.

The Most Important Skill Is Systems Thinking

Over time the biggest shift in perspective is understanding that IT is not about individual machines.

It is about systems.

Servers, networks, identity platforms, security controls and applications all interact with each other.

Small design decisions can have large downstream effects.

Technicians who understand these relationships build more stable environments.
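One way to make those relationships concrete is to write the dependencies down and ask what breaks when a single component fails. The sketch below does that with a breadth-first walk over a dependency map. The component names and the map itself are hypothetical examples, not a description of any particular environment.

```python
from collections import deque


def downstream_of(failed: str, depends_on: dict[str, list[str]]) -> list[str]:
    """Return every component that directly or indirectly depends on `failed`."""
    # Invert the map: for each component, which components rely on it?
    dependents: dict[str, list[str]] = {}
    for component, requirements in depends_on.items():
        for requirement in requirements:
            dependents.setdefault(requirement, []).append(component)

    affected: set[str] = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    affected.discard(failed)  # only relevant if the map contains a cycle
    return sorted(affected)


# Hypothetical environment: each key lists what it depends on.
depends_on = {
    "dns": [],
    "network": [],
    "identity": ["dns"],
    "email": ["identity", "dns"],
    "file-share": ["identity", "network"],
    "wifi": ["network", "identity"],
}
print(downstream_of("dns", depends_on))
# ['email', 'file-share', 'identity', 'wifi']
```

In this example a DNS failure reaches four other services, even though only two of them list DNS as a direct dependency.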

Compound Skills in IT

Foundations

Documentation
Monitoring
Backups

Operational Skills

Automation
Security
Device Management

System Design

Identity & Access
Networking
Infrastructure Architecture

Each skill reinforces the others. Over time they compound into environments that are easier to maintain, easier to secure, and far more resilient.

Final Thoughts

Early in your career it is easy to believe that technical excellence comes from solving difficult problems quickly.

In practice, the best technicians build environments where those problems appear less frequently in the first place.

That shift from reactive troubleshooting to proactive engineering is where the real growth in IT happens.
