In the world of enterprise computing, the IBM i platform is renowned for its reliability. However, that stability can sometimes lead to a dangerous complacency. When a system is "always on," it is easy to fall into the trap of reactive troubleshooting: waiting for users to report slowdowns or discovering disk space issues long after they should have been addressed.
In a recent Fortra webinar, "What You Should Be Monitoring on IBM i," Senior Solutions Engineer Bob Butcher and Senior Support Analyst Terri Preston emphasized that maintaining system health isn't just about watching metrics: it’s about shifting from reactive firefighting to proactive operational awareness.
The Challenge of Modern IBM i Management
Today’s IT teams face a unique set of challenges. As Bob Butcher noted, the "tribal knowledge" held by long-time administrators is leaving with them as they retire, and the newer staff who inherit these critical systems often come from networking or programming rather than dedicated IBM i administration. Without deep, historical context for how the system behaves, these teams need tools that bridge the gap by providing clear, actionable signals rather than raw data.
Moving to Exception-Based Monitoring
The core philosophy shared during the webinar is the move toward "managing by exception." Instead of manually checking every subsystem, job queue, and CPU percentage throughout the day, teams use modern monitoring tools like Robot Monitor to define what "normal" looks like and receive alerts only when things drift outside those parameters. This approach gives IT operations four benefits:
- Visibility: Having a clear operational signal across the entire environment.
- Faster Response: Identifying issues before they become business-impacting events.
- Consistency: Ensuring every shift and every operator is looking at the same standard set of performance indicators.
- Confidence: Empowering newer staff to handle issues with the support of automated thresholds and guidance.
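To make "managing by exception" concrete, here is a minimal Python sketch of the idea, not how Robot Monitor implements it. The metric names, threshold values, and the sample reading are all hypothetical; in practice the readings would come from whatever collector your monitoring tool provides.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Threshold:
    """Acceptable range for one metric; None means that side is unbounded."""
    metric: str
    low: Optional[float] = None
    high: Optional[float] = None


def find_exceptions(samples: dict[str, float], thresholds: list[Threshold]) -> list[str]:
    """Return one message per metric that is missing or outside its range."""
    alerts = []
    for t in thresholds:
        value = samples.get(t.metric)
        if value is None:
            # A metric that stops reporting is itself an exception.
            alerts.append(f"{t.metric}: no data collected")
        elif t.high is not None and value > t.high:
            alerts.append(f"{t.metric}: {value} is above the limit of {t.high}")
        elif t.low is not None and value < t.low:
            alerts.append(f"{t.metric}: {value} is below the limit of {t.low}")
    return alerts


# Hypothetical definition of "normal"; tune the values for your own system.
THRESHOLDS = [
    Threshold("cpu_pct", high=90.0),
    Threshold("system_asp_pct", high=80.0),
    Threshold("active_critical_subsystems", low=3),
]

# Hypothetical reading, standing in for real collected data.
sample = {"cpu_pct": 42.5, "system_asp_pct": 86.0, "active_critical_subsystems": 3}

for alert in find_exceptions(sample, THRESHOLDS):
    print(alert)
```

With this sample, only the disk reading produces a message. The CPU and subsystem values stay silent because they are inside their ranges, which is the point: operators see the exception, not the full list of metrics.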
What Should You Actually Monitor?
A practical IBM i monitoring approach should help teams improve visibility, reduce manual system checking, and respond faster to operational issues. Tools like Robot Monitor can support that approach by helping administrators focus on the system conditions that deserve attention, rather than forcing them to dig through every possible metric manually.
While it is tempting to monitor everything, the speakers stressed the importance of focusing on indicators that directly impact system health and business operations. Key areas highlighted for consistent monitoring include:
- System Performance: CPU usage, disk space (in both SYSBAS and independent ASPs, or IASPs), and temporary storage.
- Job and Subsystem Health: Tracking critical subsystems to ensure they are active and identifying which jobs are consuming excessive resources.
- Application Activity: Monitoring message queues and specific application-level performance metrics.
- Operational Integrity: Keeping an eye on PTF levels, logical replication (like MIMIX or Robot HA), and even PowerHA or VIOS environments.
The Power of Dashboards
A critical part of the presentation was the demonstration of how dashboards transform raw data into actionable intelligence. By creating custom dashboards, administrators can visualize the specific health of their production systems.
For example, Terri demonstrated a storage dashboard that allowed her to drill down from total system storage into specific libraries, identifying exactly which ones were growing and why. This level of granularity, combined with the ability to take immediate action, such as holding or ending a runaway job directly from the interface, is what separates a simple monitoring tool from a robust operational workflow.
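The drill-down described above amounts to comparing library sizes over time and ranking what grew. The following Python sketch illustrates that logic under stated assumptions: the library names and sizes are invented, and the two snapshots stand in for data a monitoring tool would collect. It is not the dashboard's actual implementation.

```python
from __future__ import annotations


def library_growth(
    previous: dict[str, int], current: dict[str, int], top: int = 3
) -> list[tuple[str, int]]:
    """Rank libraries by size increase between two snapshots, largest first.

    Sizes are in whatever unit the snapshots use. A library that is missing
    from the earlier snapshot is treated as having grown from zero.
    """
    growth = {name: size - previous.get(name, 0) for name, size in current.items()}
    growing = [(name, delta) for name, delta in growth.items() if delta > 0]
    growing.sort(key=lambda item: item[1], reverse=True)
    return growing[:top]


# Hypothetical snapshots: library name -> size in MB.
yesterday = {"PRODDATA": 120_000, "ARCHIVE": 310_000, "QGPL": 2_000}
today = {"PRODDATA": 121_500, "ARCHIVE": 310_000, "QGPL": 2_050, "TMPWORK": 9_000}

for name, delta in library_growth(yesterday, today):
    print(f"{name}: +{delta} MB")
```

In this made-up data, the newly created TMPWORK library tops the list even though ARCHIVE is by far the largest, which shows why growth is often a more useful signal than total size.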
The takeaway from the webinar is clear: IBM i monitoring should not be a chore, but a strategy. By implementing proactive, exception-based monitoring, teams can reduce operational risks, improve system visibility, and ensure that their IBM i environments remain the reliable backbone of the enterprise. If you find your team spending more time troubleshooting than innovating, it may be time to rethink what you are monitoring and how you are responding to the signals your system is already sending.
Whether your team is trying to reduce manual checks, improve operational response, or build a cleaner view of system health, the first step is getting your IBM i monitoring priorities straight.
Want to See Robot Monitor in Action?
Let us show you how Robot Monitor can help you get a handle on performance and application monitoring. There’s no prep work required on your part. We’ll find a time that fits your schedule and familiarize you with the software in under an hour.