Digital transformation, the FCA, and operational resilience
Since the UK Financial Conduct Authority (FCA)’s new operational resilience rules came into force in March 2022, financial services institutions have come under increased pressure to build a secure and resilient IT environment.
But the growing complexity of IT environments makes that resilience harder to achieve. Over 80% of financial institutions said that their environments had changed more in the 12 months prior to November 2021 than over the company’s entire lifespan.
And failures are increasing. A recent study said there are more than 365 operational issues per working year – over one per day – and they seem to be climbing in frequency. These new regulations may be the highest level of control that has been placed on a company. It is a matter of great responsibility, both corporate and individual, as a senior member of the bank must personally sign the contract with the regulator.
Thus, both the company and the individual are held accountable under these new regulations, and companies can’t just treat them as a cost of doing business. To follow the rules, certain measures must be enacted: if the right operational risk measures are taken, the pressure on the operational resilience of an IT environment will be lower, giving the environment the freedom to perform flawlessly.
The mission
Operational resilience is a necessity: there are a number of threats and risks that a financial services firm ought to be wary of, from the Ukraine-Russia conflict to a change in exchange rates. Changes in the world economy can arise from many situations, and often bring with them trading values that IT environments may not be equipped to deal with.
In theory, the answer is simple: keep an observant eye over it all, in real time, so that the instant something goes wrong an automated solution can be put in place (ideally before any impact on the service itself). In reality, the IT estate is too complex, with too many parameters to track, so active monitoring is hard to achieve across the whole process. However, there are some key tenets which help reduce the risk of an issue becoming a problem and causing an outage.
Key tenets
There are four key tenets which can help in following these new regulations.
1. Robust Architectures. Look at all the things that could fail and decide what you would do if they did fail, whether it is as large as a whole data centre or as small as an individual web server. Also, make sure that your services architecture is robust end to end, including the electricity supply to your data centre and its connections to the Internet. That way, if one thing fails, it does not take your entire service down.
2. Test! A lot of testing is focused on whether new features are working but doesn’t check for performance. Are you doing performance testing properly? Are you testing all the failure scenarios of that new online service to make sure that you’re going to catch any failures before they go live?
3. Be cautious. Be careful with changes. About 70-80% of failures happen because of a change that humans have made, causing fragility. So, change must be managed from a risk point of view, balancing the number of changes the business needs to keep up with current demand against the risk of problems and outages.
4. Observe. Can you detect a problem before it causes an incident? Set up the detection tools so they can correct things, particularly if you’ve got a good resilient architecture. Spot the server that has gone down and bring it back before a second and third server go down.
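As a rough illustration of the fourth tenet, the Python sketch below runs a single monitoring pass over a pool of servers: it tries to bring back any server that is down, and raises an alert only if too few remain healthy afterwards. Everything in it is hypothetical and chosen for illustration: the `Server` class, the `observe_and_remediate` function, the `restart` callback and the `min_healthy` threshold are assumptions, not part of any particular monitoring product.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Server:
    """Hypothetical server record; a real tool would hold far more state."""
    name: str
    healthy: bool = True


def observe_and_remediate(
    servers: List[Server],
    restart: Callable[[Server], bool],
    min_healthy: int,
) -> Dict[str, List[str]]:
    """Run one monitoring pass and report what was done.

    Unhealthy servers are restarted via the supplied callback. If fewer
    than `min_healthy` servers are healthy after remediation, an alert is
    recorded so that a person can step in.
    """
    report: Dict[str, List[str]] = {
        "restarted": [],
        "failed_restart": [],
        "alerts": [],
    }
    for server in servers:
        if server.healthy:
            continue
        # Try to bring the failed server back before others follow it.
        if restart(server):
            server.healthy = True
            report["restarted"].append(server.name)
        else:
            report["failed_restart"].append(server.name)

    healthy_count = sum(1 for s in servers if s.healthy)
    if healthy_count < min_healthy:
        report["alerts"].append(
            f"only {healthy_count} of {len(servers)} servers healthy; escalate"
        )
    return report


def demo() -> Dict[str, List[str]]:
    """Example pass with a stand-in restart that always succeeds."""
    pool = [Server("web-1"), Server("web-2", healthy=False), Server("web-3")]
    return observe_and_remediate(pool, restart=lambda s: True, min_healthy=2)
```

In practice the health flag would come from real checks and the restart callback would call whatever orchestration the estate already uses. The point of the sketch is only the ordering: detect, attempt an automated fix, and escalate when the fix is not enough.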
Operational risk (things which may go wrong) and operational resilience (recovering from incidents) are by nature intertwined, and yet some organisations focus only on the former or the latter in isolation. They are a combined responsibility. And if you use them together, you have a better chance of running your organisation without having issues of resilience.
The future
We believe that every organisation should have a Chief Resilience Officer, someone to look at the things that could go wrong, how they would be detected and what the remediation would be. Whatever strategies your company uses to take on risk management, you should adopt a holistic approach, combining operational risk with operational resilience management.
From experience with our clients, we are learning that the more organisations embark on digital transformation, the more outages they appear to be having. They need to invest more in transparent testing and try to make their IT environment more robust, keeping the customers and the regulators happy.
You can read the original article in IBS Intelligence here.
Click below to learn how ITRS Group can help your organisation achieve operational resilience.