How to build an ironclad on-premises monitoring server
On-premises solutions are not ready for the scrap heap, despite the death knell that has been sounding for them for several years now.
There remain many sound reasons to keep data, apps, and software safe and sound on-premises, not least having full control over your own data and being able to access it during an Internet or service outage.
Retaining (or deploying) on-premises solutions also means you need an IT monitoring system that is ironclad: one that watches, assesses, and alerts you to any breaches or potential security issues.
Here is how you can build a resilient monitoring server in your data centre, one that keeps working even when the services around it fail.
Every step counts
There are many considerations when installing a monitoring server locally. Here I will discuss some of the steps you can take to ensure that your monitoring server remains available during a service outage at your organisation.
1. Name resolution
• Add local hosts-file entries for your database server and any other monitoring servers. These should cover both forward and reverse name resolution. You can also add entries for key infrastructure such as core routers and critical application servers. Be aware, though, that if any of these addresses change you will need to update the hosts file by hand.
Another option is to use IP addresses rather than host names, avoiding any reliance on your DNS infrastructure. The disadvantage is that bare IP addresses are harder to maintain, and a stale address can leave you monitoring the "wrong" device.
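As an illustration, the resolution checks above can be sketched in Python: a small script that verifies forward and reverse lookups for a set of critical hosts against the addresses you expect (the host names and addresses below are placeholders, not taken from any real environment):

```python
import socket

# Hypothetical critical hosts and the addresses we expect them to
# resolve to (mirroring the local hosts-file entries).
EXPECTED = {
    "db-server.example.local": "10.0.0.12",
    "core-router.example.local": "10.0.0.1",
}

def check_resolution(expected):
    """Return a list of (host, problem) tuples for any resolution issues."""
    problems = []
    for host, ip in expected.items():
        try:
            resolved = socket.gethostbyname(host)  # forward lookup
            if resolved != ip:
                problems.append((host, f"resolves to {resolved}, expected {ip}"))
        except socket.gaierror:
            problems.append((host, "forward lookup failed"))
        try:
            socket.gethostbyaddr(ip)  # reverse lookup
        except (socket.herror, socket.gaierror, OSError):
            problems.append((host, f"reverse lookup for {ip} failed"))
    return problems
```

Run periodically from cron, a script like this warns you when a hosts-file entry and live DNS have drifted apart, before a stale address sends your monitoring to the wrong box.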
2. Shared infrastructure
• For a resilient monitoring server, local storage is much preferred: no SAN or other shared storage. A local RAID array of HDDs, SSDs, a hybrid of the two, or anything else installed locally keeps the monitoring server running when shared storage fails.
3. Platform
• If possible, use a physical server; it can be a lifesaver when mysterious, unexplained outages hit the virtualisation platform. This requirement may seem at odds with the goals of your IT organisation, but the cost savings of virtualising your monitoring server can be far outweighed by the impact of a single undetected outage.
If you are required to use a virtual server, you can still add some protection against failure. Options include OS-level HA across multiple physical hosts, application-level redundancy such as DR synchronisation, and database copies.
4. Database
• A database dedicated to the monitoring application is preferred. With a shared database, patching, updates, and upgrades applied in support of other applications can have unintended consequences for your monitoring. Much like virtualisation, further issues can arise from noisy neighbours who use more than their fair share of resources.
5. Local Collector
• When the network fails, a local collector installed in your monitored environment can continue to process events from agents or agentless checks, saving them to a buffer and forwarding them when connectivity is restored. Some local collectors can also run actions, notifications, and fix-it scripts while disconnected from the mothership.
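The store-and-forward behaviour described above can be sketched as follows. This is a minimal illustration of the pattern, not any vendor's collector API; the buffer path and central endpoint are hypothetical:

```python
import json
import os
import socket

BUFFER_FILE = "/var/tmp/collector_buffer.jsonl"        # hypothetical path
CENTRAL = ("monitor-central.example.local", 4433)       # hypothetical endpoint

def central_reachable(addr=CENTRAL, timeout=2):
    """Cheap TCP connect test to see whether the central server is up."""
    try:
        with socket.create_connection(addr, timeout=timeout):
            return True
    except OSError:
        return False

def buffer_event(event, path=BUFFER_FILE):
    """Append one event to the local on-disk buffer (one JSON object per line)."""
    with open(path, "a") as f:
        f.write(json.dumps(event) + "\n")

def flush_buffer(send, path=BUFFER_FILE):
    """Replay buffered events through a caller-supplied `send` callable.

    The buffer file is removed only after every event has been sent,
    so a failure mid-flush leaves the events on disk (at the cost of
    possible resends).
    """
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        events = [json.loads(line) for line in f if line.strip()]
    for ev in events:
        send(ev)
    os.remove(path)
    return len(events)
```

A collector loop would call `buffer_event` while `central_reachable()` is false and `flush_buffer` once connectivity returns; real products add rotation and size limits on the buffer, which are omitted here for brevity.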
6. Backups
• Make sure you have copies of all configuration, including users, dashboards, searches, and so on. This is in addition to any centralised backup software solution. You can save these to a shared drive, but a properly paranoid monitoring administrator will also save them to other, infosec-approved destinations.
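A simple version of that "copy the configuration to several places" habit can be scripted. The sketch below archives a configuration directory and copies the archive to multiple destinations; the paths are placeholders, and which directories actually hold your monitoring configuration depends on the product:

```python
import shutil
import tarfile
import time
from pathlib import Path

def backup_configs(config_dir, destinations):
    """Archive config_dir and copy the archive to each destination.

    A properly paranoid admin lists several independent,
    infosec-approved locations in `destinations`.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = Path(f"/tmp/monitoring-config-{stamp}.tar.gz")
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(config_dir, arcname=Path(config_dir).name)
    copies = []
    for dest in destinations:
        dest = Path(dest)
        dest.mkdir(parents=True, exist_ok=True)  # create destination if absent
        copies.append(shutil.copy2(archive, dest))
    return copies
```

Scheduled daily, a script like this gives you timestamped archives in more than one place, so losing the shared drive does not also lose your last known-good configuration.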
7. Local User Accounts
• Keep a backup set of locally authenticated accounts that you and your users can revert to when shared authentication services fail. LDAP and AD can go down and lock everyone out of the monitoring consoles.
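The fallback pattern looks roughly like this in code: try the shared directory first and, only if it is unreachable, check a local store of salted password hashes. This is a sketch of the idea, not how any particular monitoring product implements its authentication:

```python
import hashlib
import hmac
import secrets

def make_local_entry(password):
    """Create a (salt, hash) pair for the local fallback account store."""
    salt = secrets.token_hex(16)
    digest = hashlib.sha256((salt + password).encode()).hexdigest()
    return salt, digest

def verify_local(entry, password):
    """Constant-time comparison of a password against a stored entry."""
    salt, digest = entry
    candidate = hashlib.sha256((salt + password).encode()).hexdigest()
    return hmac.compare_digest(candidate, digest)

def authenticate(user, password, directory_auth, local_store):
    """Try the shared directory first; fall back to local accounts.

    `directory_auth` stands in for an LDAP/AD bind and is expected to
    raise ConnectionError when the directory is unreachable.
    """
    try:
        return directory_auth(user, password)
    except ConnectionError:
        entry = local_store.get(user)
        return entry is not None and verify_local(entry, password)
```

Note that the fallback only engages when the directory is down, not when it rejects a password, so a locked-out directory account cannot be bypassed via the local store.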
All software fails; the goal is to fail at a different time from your monitored infrastructure. The key to successful monitoring, whether on-premises or in the cloud, is having the right technology.
Opsview can help: it monitors operating systems, networks, cloud, VMs, containers, databases, applications, and more. Learn more about ITRS Opsview.