Articles

Disaster Recovery

Business continuity and disaster recovery are the processes and procedures that return your business systems - hardware, software and data - to full operations following a natural or man-made disaster. As businesses increasingly rely on IT for their mission-critical operations, it is essential to have plans in place to ensure your business viability is not at risk from a critical incident. Here, we look at a few different levels of data recovery:

  1. No disaster plan at all
  2. No disaster plan, but good backup procedures
  3. A disaster plan, with no resources in place
  4. A 'cold site' disaster recovery solution
  5. A 'split site' disaster recovery solution
  6. A 'warm site' disaster recovery solution
  7. A 'hot site' disaster recovery solution
  8. What level of protection is right for you?

 

No disaster plan at all

Despite the risks, millions of businesses globally have no formal business continuity or disaster recovery plan in place. Should a disaster occur, panic and confusion tend to be the result and timely recovery of data, software and hardware is not possible. The chances are very high that these businesses will never recover.

A simple server crash, equipment failure, power surge or human error is all it takes for a critical database to be wiped. Fires, floods, viruses, unauthorised users or hackers can play havoc with your entire business systems. Unlike hardware or software, which can be bought again, data is an asset that cannot be replaced.

If you work in a company which has an IT department that does not plan for disasters, then it is absolutely essential that an effective plan is developed before it is too late. Fortunately there are many reliable and cost-effective solutions available to safeguard your business.

 

No disaster plan, but good backup procedures

The absolute minimum any company must do - even the smallest business - to prevent a disaster from wiping out business information is to back up the data on its computers daily and store the backups offsite with a secure archival company. Never store them at employees' homes.

That way, even if your hardware and software are ruined, you can still replace them and reload all your irreplaceable data. If your IT department is not making good backups of at least the critical systems every single day, then it is simply not doing its job.

Another important thing to remember about backups is that they must be tested regularly to make sure they are working. Nothing is more frustrating than to need a backup and find that the data is corrupt or non-existent.
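One way to make such testing routine is to restore a sample of files from a backup and compare them with the originals by checksum. The sketch below is illustrative only: the function names and the idea of restoring into a separate directory are assumptions for the example, not features of any particular backup product.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir: Path, restored_dir: Path) -> list[str]:
    """List files that are missing or differ in the restored copy."""
    problems = []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        restored_file = restored_dir / relative
        if not restored_file.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif sha256_of(source_file) != sha256_of(restored_file):
            problems.append(f"corrupt: {relative.as_posix()}")
    return problems
```

An empty list means every file in the sample was restored intact. Anything else is a reason to investigate before the backup is actually needed.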

Another smart and reasonably simple step is to build fault tolerance into all of your critical systems. This means installing RAID arrays (sets of disk drives that hold redundant copies of each other's data), clustered systems and other types of local recovery procedures that at least provide an extra layer of protection.

 

A disaster plan, with no resources in place

Once you have a good backup and archival procedure and your critical systems are fault tolerant, the next step is to put together procedures for remote disaster recovery. This simply means you ask and answer the question, "What do we do if the computer centre is utterly destroyed?" You might, for example, make arrangements with another division or company to share equipment and space if either is struck by disaster. Agreements need to be made with critical computer vendors to quickly ship new systems in the event of an emergency. This kind of planning is a good first step, although recovery would be slow in the event of disaster so you need to be sure your business can afford a few days of downtime if required.

 

A 'cold site' disaster recovery solution

A simple yet effective business backup solution, a cold site is simply a reserved area in a data centre where your business can set up new equipment in the event of a disaster. This is a popular disaster recovery method because it tends to be less expensive than other options, yet still gives a company the ability to survive a true disaster.

If you outsource your disaster recovery to a third party, then odds are they will establish this form of disaster recovery solution. This will work as long as your planning is good, your backups are sound and your documentation is excellent. Of course, extended downtime in the event of a disaster must be acceptable for a cold site to be a valid option. Plan on 24 hours for critical systems and as long as a week for less important functions.

 

A 'split site' disaster recovery solution

If your organisation is large enough, it may be feasible to house the IT department across more than one location. In the event of a disaster at one site, operations can then shift relatively easily to the other site, and any new equipment needed can be purchased as necessary, as long as the backups were properly maintained. The advantage of this method is that it eliminates the major up-front costs of building a dedicated disaster centre.

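A 'warm site' disaster recovery solution

A warm site is a standby facility in which the equipment your systems need is already installed and connected, so recovery consists mainly of restoring your data from backups. As your organisation will need to purchase or lease the equipment in the warm site, this option does involve more set-up costs than a cold site, but has the advantage of being able to get your business systems up and running much faster. Even sites with multiple applications can generally be back to full operation within 24 hours.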

 

A 'hot site' disaster recovery solution

A hot site is a premium level of disaster recovery where the business IT systems and up-to-date data are duplicated and maintained at a separate data centre. In this scenario, a duplicate computer centre is set up in a remote location with communication lines set up and actively copying data at all times. The site has a duplicate of every critical server, with data that is up-to-date to within hours, minutes or even seconds. At the highest level it even has desks, phones and whatever else is necessary for operations to continue if the worst happens.

Following a disaster, your business can very quickly 'switch' to the hot site with minimal disruption. This is the ultimate in disaster preparation, reserved for companies with excellent management and highly skilled IT staff. Hot sites are expensive, difficult to set up and require constant maintenance, but in the event of a disaster operations can continue with a minimum of downtime. This is a popular option for institutions such as finance companies and stock exchanges where downtime is not an option.

 

What level of protection is right for you?

Before determining exactly what business continuity and disaster recovery plans you need in place for your business, it is essential to analyse your systems, data and requirements and develop a solution that cost-effectively meets your needs today and into the future.
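As a rough illustration of that analysis, the choice between site types can be framed as finding the least expensive option that still recovers within the downtime your business can tolerate. The sketch below is a simplification: the hour figures are assumptions loosely based on the descriptions in this article, and the function name is hypothetical.

```python
# Tiers ordered from least to most expensive. The hour figures are
# illustrative assumptions loosely based on the descriptions in this
# article, not guarantees from any provider.
TIERS = [
    ("cold site", 168),  # up to a week for full recovery
    ("warm site", 24),   # generally back within 24 hours
    ("hot site", 1),     # assumed: near-immediate switch-over
]


def cheapest_tier(max_downtime_hours: float) -> str:
    """Return the least expensive tier that recovers within the limit."""
    for name, recovery_hours in TIERS:
        if recovery_hours <= max_downtime_hours:
            return name
    # Even the fastest tier listed here is too slow for this limit.
    raise ValueError("no tier meets the stated downtime limit")
```

For example, a business that can tolerate two days of downtime would be pointed to a warm site, while one that can tolerate only a couple of hours would need a hot site. A real assessment would also weigh cost, data loss tolerance and staffing.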

TRT's specialised data protection strategies and solutions are all about finding the most efficient and cost-effective way for your IT department to achieve the appropriate recovery objectives.

If you would like to arrange a free and confidential discussion on our Disaster Recovery and Data Protection services for your organisation please contact us.

Challenging Times

Today's IT teams are under pressure from all sides. They're asked to squeeze more from increasingly complex and mission-critical IT platforms, in an environment of tightening budgets, strict regulation and an acute ongoing skills shortage.

Much has been written about the IT skills shortage globally and how it is limiting business growth. Its impact has been felt across all industries, and is showing little sign of abating. However, the sharpest impact is being felt by those businesses, small and large, running critical applications on proprietary server and storage platforms. Supporting these technologies requires personnel with in-depth vendor-specific training and many years of experience in niche areas.

The gradual tightening of supply and a recent spike in demand have led to significant salary increases to keep these specialists in house. The result is that organisations need to spend ever-larger portions of an already stretched IT operating budget to support these critical systems. If not attended to by management, the situation will slowly worsen until it reaches a critical tipping point such as a resignation of the key person, a company merger or a major new project.

 

The Risks

The strategic importance of IT within businesses along with the IT skills shortage has highlighted an array of potentially serious and growing business risks faced by all businesses:

  • Key Person Risk
  • Infrastructure Complexity Risk
  • Financial Risk
  • Delivery Risk
  • Infrastructure Failure Risk

 

Key person risks

The risk with the potential to cause the most serious consequences in the shortest time is the sudden loss of key people. Many IT departments have one or two people managing their business-critical infrastructure. The knowledge these people acquire over time represents a single point of failure, which can easily slip off a CIO's or CFO's radar. The unexpected loss of such staff can have disastrous consequences for the uptime of the IT infrastructure that the business operates on.

With the loss of a staff member, you also lose a wealth of intellectual property about your IT systems that has often been built up over a number of years and is rarely adequately documented. In the current environment, losing your top IT talent is realistically a matter of when, rather than if. No longer should CFOs expect that their valuable IT team members will stay with the company for five to 10 years; in many cases two to three years is closer to the norm.

It is interesting to note that while businesses commonly spend tens of thousands of dollars on highly improbable events such as fire or a plane crashing onto the data centre, they are ready to rely upon one or two individuals to keep their business systems running day to day without recognising and eliminating this single point of failure. The reality is that these same people are in high demand by global IT companies willing to lure them with interesting projects, training and, most importantly, higher salaries.
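A simple way to make this single point of failure visible is to list who can support each critical system and flag any system with fewer than two people. The sketch below assumes such a list exists. The system and staff names in the usage note are hypothetical.

```python
def single_points_of_failure(
    coverage: dict[str, list[str]], minimum: int = 2
) -> list[str]:
    """Return systems supported by fewer than `minimum` distinct people."""
    return sorted(
        system
        for system, people in coverage.items()
        if len(set(people)) < minimum
    )
```

For instance, given `{"billing database": ["Alex"], "storage array": ["Alex", "Sam"]}`, the function returns `["billing database"]`, the system that depends on one person alone.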

 

Infrastructure complexity risk

Over time, core business applications become increasingly layered, complex and customised, as well as more fundamental to the operations of the business. This occurs, for example, as businesses incorporate functional databases such as Oracle, add comprehensive storage and backup consolidation or include clustering/high availability software. While the result is a more robust mission-critical system, as the number of layers grows so does the expertise required of the personnel who support this infrastructure. This becomes a single point of failure, as support and administration of the systems are rarely provisioned for. This is risky enough in normal business conditions, but with the added urgency of a disaster or system crash the resulting downtime can be significant.

 

Financial risk

The greater the scarcity of these specialists, the greater the risk IT departments will need to spend above their operating and capital budgets to achieve their objectives. Should you lose a key staff member, finding someone to manage this mission critical architecture and complete projects in progress can be a difficult and costly exercise. The result is often a compromise - either pay more to support the same systems or settle for a less skilled person at the same salary. The latter is an acceptance of reduced service levels or project outcomes.

Irrespective of economic conditions, finding, hiring and keeping experienced specialist IT resources to maintain this enterprise IT infrastructure is becoming more expensive. On top of this comes the additional cost of meeting compliance requirements, particularly in regulated sectors such as financial institutions, government departments and utilities. More regulation means more complexity in IT, which makes it more expensive to support such critical systems. The bottom line is that costs are increasing, even if your requirements remain the same.

 

Delivery risk

Opportunity costs are incurred whenever strategic revenue generating business initiatives are compromised and when important project deadlines are put at risk. Strategic initiatives are vital for taking advantage of new market opportunities which generate revenue for the business. When these initiatives involve changes to core enterprise systems, this can quickly become a major bottleneck for a project.

Whether it's a major programming requirement or a simple set of configuration changes, if there are only one or two people authorised and skilled up to make these changes, critical projects can be held up for weeks if not months waiting for the required specialists to become available. Not only can this make projects more expensive if external resources are required, but a greater cost can be the lost revenue and competitive advantage that could have been gained if the project was completed on time.

 

Infrastructure failure risk

With tightening IT budgets, IT departments will focus primarily on supporting core applications and databases, without setting systems to proactively manage the underlying server and storage infrastructure layer of an environment. Over time these foundations can weaken, exposing them to a higher likelihood of failures which compromise service level agreements set between IT and business units. Such risks will usually develop gradually or with little visible impact and then become urgent when a mission-critical system failure compromises the business. This is particularly acute if the failures result in a breach of customer service commitments, data security or loss of sales or reputation.

In an environment where IT resources are over-stretched, IT departments will inevitably, and understandably, prioritise their work to look after customer service level agreements, regulatory requirements, and the administration of business applications and critical databases before general application and database maintenance. In this regard IT infrastructure is often treated as 'the poor cousin' of the IT operating budget.

While the complex underpinning infrastructure can tick along for a time 'under the radar' of overstretched IT departments, without a system of proactive monitoring, assessment and management, these mission critical foundations will decay and eventually fail.
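Proactive monitoring does not have to start with a large tool set. As a minimal sketch, the check below warns when a storage volume passes a usage threshold. The 80 percent default and the function names are assumptions for illustration. A production set-up would cover many more indicators and would send alerts rather than return strings.

```python
import shutil
from typing import Optional


def percent_used(total_bytes: int, used_bytes: int) -> float:
    """Return used space as a percentage of total space."""
    if total_bytes == 0:
        return 0.0
    return 100.0 * used_bytes / total_bytes


def check_volume(path: str, warn_at_percent: float = 80.0) -> Optional[str]:
    """Return a warning message if the volume holding `path` is too full."""
    usage = shutil.disk_usage(path)
    used = percent_used(usage.total, usage.used)
    if used >= warn_at_percent:
        return f"{path}: {used:.1f}% used"
    return None
```

Run on a schedule against each critical volume, a check like this surfaces gradual decay long before it turns into an outage.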

 

Adding Complexity to a Risky Situation

The growth and increased competition in the global economy have resulted in previously less vital applications such as business intelligence and data warehousing becoming absolutely business-critical. These applications are now required to operate on enterprise IT, which has substantially increased demand for implementers and administrators of such systems.

The sharp rises in demand for enterprise IT personnel and the small increases in supply in recent years have resulted in increasing salaries, higher staff turnover and many IT projects being delayed, going over budget or being scrapped altogether.

The supply side is also of concern for Australian businesses. Gaining the necessary skills to run mission-critical enterprise IT environments takes many years of experience and a high investment in training so supply cannot grow with spikes in demand. Plus, many of the current group of experienced enterprise IT administrators are on the path into business management roles or retirement.

For many IT managers the pace and size of this shift has caught them unawares. This enterprise IT skill shortage, combined with enterprise IT running an ever more important core suite of applications, has created an array of serious and growing risks for many businesses. The reality is that the demand for skilled and experienced IT professionals will continue to outstrip supply for the foreseeable future, driving up salary costs and increasing turnover rates. Add to this the lure of positions at cashed up global IT companies, the retirement of experienced baby boomers and the brain drain to larger overseas markets and the challenge to business is clear.

This situation has developed gradually over the last decade, and it is clear that many businesses with enterprise IT are either not aware of the growing risks or are simply choosing to take the attitude that they can 'wing it' in the event that something should happen or key staff move on.

The question is: How do businesses future-proof and cost-effectively manage their core enterprise IT systems in the face of the ongoing skills shortage?

The answer lies in building a sustainable support structure around this proprietary technology which:

  1. Meets the agreed service levels on these business critical systems;
  2. Maintains the budget framework given to IT by the business; and
  3. Allows the business to take full advantage of future market opportunities.

While that sounds great in theory, how does this work in action?

 

Taking Action

The first thing to do is to shift your mindset: the value of IT support lies not in the technical competency of a single IT 'guru', but in the commitment to constantly plan, execute and review systemised processes.

STEP 1: UNDERTAKE A REVIEW

The first step is to inform both management and the systems administrators that you are commencing an IT operations review. It is important to emphasise that the goal is to eliminate single points of failure in supporting your mission-critical systems and to increase productivity by engaging staff in strategic high-payoff activities.

Be aware that, as with all change, this process may be met with resistance from some staff. Ultimately, however, the goal is to make them more valuable to the organisation and give them more challenging and interesting work within the business.

 

STEP 2: CATEGORISE, DETAIL AND PRIORITISE IT TASKS

The next step is to document the range of tasks carried out by your IT staff and categorise these into specific activity groups before defining each as a high payoff or low payoff activity in order of importance. In any role, including the administration of enterprise IT systems, tasks can be split up into two main areas.

  1. Low-payoff tasks: These are tasks which bring an incremental value to the business which is lower than the hourly rate of employing that person. These are often operational tasks such as IT maintenance, which is important because it maintains the uptime of critical systems but is limited in the business benefit or value it delivers on its own.
  2. High-payoff tasks: These are tasks which bring an incremental value to the business which is greater than the hourly rate of employing that person. These typically include strategic tasks such as developing new processes or products. They can deliver a high return, but are less time critical.
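The two definitions above amount to a simple comparison, sketched below. The function name and the figures in the usage note are hypothetical, and treating a task whose value exactly equals its cost as low-payoff is an assumption, since the definitions do not cover that case.

```python
def classify_task(value_per_hour: float, cost_per_hour: float) -> str:
    """Label a task by comparing the value it adds with the cost of doing it.

    A task whose value exactly equals its cost is treated as low-payoff;
    that tie-break is an assumption made for this sketch.
    """
    if value_per_hour > cost_per_hour:
        return "high-payoff"
    return "low-payoff"
```

For example, with a hypothetical hourly cost of 90, a task estimated to add 60 per hour is classed as low-payoff and one estimated to add 250 per hour as high-payoff. In practice the value estimates are the hard part and need agreement between IT and the business.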

A balance needs to be struck between smoothly managing the day to day operations while proactively tackling new strategic initiatives in an environment where good people are scarce. Gaining a consensus on this list is important as it enables you to focus on strategies that can minimise the amount of time Systems Administrators spend on low payoff tasks and maximise the time spent on high payoff tasks which are more rewarding for employees and deliver a higher value to the company.

 

STEP 3: DECISION TIME

Now it's time to determine how you will resource your low payoff activities. There are three options:

  1. Do nothing - Under this option the risks outlined earlier are not eliminated and there are no further improvements in efficiencies.
  2. Defer / Dump it - This option covers tasks that IT personnel decide are neither important nor urgent. A decision should be made either not to do them or to lower their priority.
  3. Delegate it - With correct documentation, handover and training, you can eliminate single points of failure by assigning low payoff activities to a third party. This ensures your mission-critical systems are maintained by a dedicated resource and your people are free to focus on high payoff activities.

 

STEP 4: DOCUMENT A KNOWLEDGE BASE

In any complex IT environment, documenting processes, procedures, hardware and software configurations towards industry best practice is essential to ensure the effective administration of these systems within the organisation's budget framework. The urgent tasks of today need to be weighed up against the structural exposures of tomorrow. Documenting configurations, processes and other best practice activities in a knowledge base (KB) is critical; however, in too many cases there is no formal KB set up.

What you will find is commonly referred to as the Pareto Principle: 80 percent of the tasks which key personnel perform in supporting a critical environment can be achieved with only 20 percent of their skill set. This means that you will initially only need to document 20 percent of the key tasks to make a significant difference.

By doing this you may be able to release 80 percent of senior staff's time for more strategic tasks, as these documented low payoff tasks can be outsourced to a third party IT service provider or junior IT staff. With quality IT service providers you can even have them assist you in the process of developing your own internal knowledge base by having them do the documentation for you. Such transparency removes much of the mystery around the knowledge of supporting the environment which is otherwise only known by one or two people in a small or medium business.
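Whether an 80/20 split holds in your own team can be checked against time records. The sketch below is a Pareto-style analysis under stated assumptions: it takes hours spent per task over some period (hypothetical data in the usage note) and picks the most time-consuming tasks until they account for a target share of the total. Those are the tasks to document first.

```python
def tasks_to_document(
    hours_by_task: dict[str, float], target_share: float = 0.8
) -> list[str]:
    """Pick the most time-consuming tasks until they cover the target share."""
    total = sum(hours_by_task.values())
    chosen: list[str] = []
    covered = 0.0
    ranked = sorted(hours_by_task.items(), key=lambda item: item[1], reverse=True)
    for task, hours in ranked:
        if covered >= target_share * total:
            break
        chosen.append(task)
        covered += hours
    return chosen
```

With hypothetical monthly figures of 50 hours on backups, 35 on patching, 8 on user accounts, 4 on capacity planning and 3 on firmware, the function returns backups and patching: two of five tasks covering 85 percent of the time.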

A comprehensive knowledge base of your administration tasks 'future-proofs' you against important skills walking out the door when key IT staff leave. This eliminates the single point of failure in your support of the mission-critical systems and releases your senior staff for tasks that are more valuable to the company. At the same time, you will find that the better the documentation, the easier it is for more junior, lower-cost staff to take on tasks, or for documented tasks to be outsourced to a third party.

Outsourcing to a third party becomes relevant when there is no one to delegate these tasks to and management is not keen to employ more staff.

 

STEP 5: TRANSITION AND ONGOING REVIEW

No process improvement project is ever fully complete. There needs to be a constant process of review to make sure that you stay on top of changing external conditions and so that you continue to identify opportunities to further streamline your systems.

Such reviews should be held routinely as part of an annual review of support processes or whenever there has been a major impact on the support infrastructure such as a new application coming online or staff turnover. It is important to assess the success of the process change by answering questions such as:

  1. Are we gaining the business benefits which were expected?
  2. Are my people delivering more value by focusing on strategic rather than operational tasks?
  3. Are the third party providers we delegated to still meeting customer, business and regulatory service level agreements?

 

Selective Outsourcing

If you prioritise and document your tasks into high payoff and low payoff activities, you have the information you need to identify your areas of exposure and productivity gains. Decisions can start being made on how to best allocate your in-house staff and where delegating tasks to external partners might make sense.

Traditionally CIOs, CFOs and CEOs place a greater emphasis on the operational, low-payoff tasks of supporting and administering their mission-critical systems. After all, this is generally what IT staff are hired to do day-to-day - this is their 'job'. However, during the lifecycle of a business, external factors such as market changes, new regulations, mergers and acquisitions, internal restructuring or business growth mean that IT infrastructure demands strategic, high-payoff projects to upgrade it.

The result of this traditional attitude is that one-off new strategic projects are often outsourced at considerable expense to third parties, while the internal staff look after the existing systems and tasks that offer little opportunity to improve their productivity for the company. However, this has two implications for your staff and business. First, internal staff can feel that they are left doing the routine work while consultants come in for the challenging, interesting and strategic IT work. Second, the routine maintenance is the lower-value work, so your existing salary expense goes to this while you pay more to consultants for the high-value and more costly strategic projects.

Turning the traditional approach around can deliver significant benefits. Allowing the organisation's in-house team to sink their teeth into new strategic projects helps them to remain challenged.

In many cases outsourcing important technical yet routine and time-consuming tasks such as systems administration, proactive monitoring, backups and disaster recovery can mitigate your risks and free up your team to focus on higher payoff strategic activities. This provides a greater return on your people and keeps them engaged and loyal as they are working on significant projects.

On top of freeing up your internal staff to do high payoff tasks, an outsourcing partner can be set binding service level agreements for maintenance that include guaranteed response times, guaranteed onsite service and guaranteed resolution times. These levels of service can be customised to your organisation, delivered for an agreed ongoing fee and penalties can be built into the contracts.
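To show how such guarantees can be checked in practice, the sketch below measures recorded response times against a guaranteed limit and totals any penalties. The flat penalty per breach, the parameter names and the figures in the usage note are assumptions for illustration. Real contracts define their own measures and penalty structures.

```python
def sla_report(
    response_minutes: list[float],
    guaranteed_minutes: float,
    penalty_per_breach: float,
) -> dict[str, float]:
    """Summarise how recorded response times compare with the guarantee."""
    breaches = sum(1 for minutes in response_minutes if minutes > guaranteed_minutes)
    total = len(response_minutes)
    # With no incidents recorded, the guarantee has trivially been met.
    compliance = 100.0 * (total - breaches) / total if total else 100.0
    return {
        "breaches": breaches,
        "compliance_percent": compliance,
        "penalty": breaches * penalty_per_breach,
    }
```

For example, four incidents answered in 10, 25, 40 and 70 minutes against a 60-minute guarantee give one breach and 75 percent compliance.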

By guaranteeing the performance of your business-critical systems in this way, you minimise the risks, costs and long-term damage caused by downtime and enjoy peace of mind.

 

Conclusion: Taming the Gorilla

As businesses rely more and more on business-critical proprietary IT systems for their day to day operations, the risks and consequences of infrastructure failure, loss of key staff and financial blowouts are getting more real by the day. With talented and experienced staff becoming harder and more expensive to attract and retain, outsourcing infrastructure management is becoming more and more attractive as a way to future-proof your business against the IT skills crunch.

By systemising the support of their server and storage technology, businesses can gain a greater return from their existing IT budget, eliminate single points of failure, increase the productivity of IT staff, control costs and develop a long-term IT solution.

With growing pressure to control costs yet increase performance, it's all about finding strategies that enable you to do more with less.

If you would like to arrange a free and confidential discussion on the potential efficiencies of an Outsourced Infrastructure Management Solution for your organisation please contact us.

TRT's Global Break Fix Service

With TRT's global Break Fix service, clients can outsource the difficulty of supplier management and enjoy the efficiencies and customer service of one highly focused supplier.

A single global agreement with one multi-vendor party to maintain and support your global infrastructure offers:

  • A single set of global SLAs
  • Applied to any platform
  • Anywhere in the world
  • Centrally driven

TRT "Makes the Complex Simple" with the end result being "Value for Money".

Increasingly business leaders are faced with balancing costs with making smart investments to position the business for consistent and renewable growth. This demand has been driven by the market's rapid change of focus from growth to efficiency as the top priority. IT leaders are being asked to maintain service levels and improve IT performance all in an environment with fewer resources and shrinking budgets. That's why TRT's Global Break Fix services are so attractive. They are designed to reduce capital expenditure and overheads, yet carry strict service level agreements to ensure that your security, performance and business objectives are absolutely met.

"Multinational organisations with infrastructure assets scattered throughout the globe traditionally have maintenance and service agreements which are fragmented, regional and misaligned to the overall service levels required to support the business in a 24/7 global environment," says Domenic Romanelli, Managing Director of TRT.

The organic growth of the infrastructure over time and limited available resources and skills globally mean that IT departments inherit:

  • A multitude of complex and disparate regional contracts and suppliers;
  • Service level agreements which are fragmented and misaligned to the overall business's requirements;
  • An impaired ability to track IT assets globally; and
  • Onerous vendor management by internal resources.

This can result in excessive, unexpected and unbudgeted expenses.

If you would like to arrange a free and confidential discussion on the potential efficiencies of a Global Break Fix service solution for your organisation please contact us.

SAN

With the volume of digital information flowing through businesses growing exponentially, it's time to consider a centralised storage network that can meet your changing needs today and into the future.

While storage area networks (SANs) have long been popular among larger companies, today small and medium businesses are also reaping the benefits of such storage architecture.

Put simply, a SAN is a network of shared storage devices. Its primary purpose is to facilitate the data transfer between computer systems and storage devices, typically a disk array or tape drive/library. A SAN provides a framework for IT managers to attach remote computer storage devices to servers in such a way that the devices appear to be locally attached.

Despite the growth of SANs, many businesses still have older architecture where storage disks are attached to a single server. While this can be a cost-effective short-term solution for businesses with a basic set-up and minimal IT resources, it can be a false economy as data and storage demands increase. Many problems arise as the business grows including:

  • As you add staff, users who have no access to this server will not be able to retrieve the data.
  • As you add more servers, direct-attached storage architecture can become an issue due to the complexity of accessing all the storage devices for server maintenance and data backup.
  • Direct-attached storage does not allow the sharing of storage capacity with devices attached to other servers. Think of the complicated process of having to access one computer for customer information, another for billing and yet another for inventory.
  • The added maintenance requirements, unscheduled downtime and cumbersome backup processes can ultimately create more unbudgeted additional expense.

The major benefit of SAN storage is that any server can use any available storage device. SAN storage provides opportunities for centralised backup, disaster recovery via replication and improved security for your critical data. All of this can be achieved with a SAN at a much lower total cost of ownership than direct-attached storage.
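The capacity-sharing point can be illustrated with a simplified model. In the sketch below, which uses hypothetical function names and figures, a new workload fits on direct-attached storage only if a single server has enough free space, whereas pooled storage can draw on the free space of the whole network. Real SANs add considerations such as performance and zoning that this model ignores.

```python
def fits_direct_attached(free_gb_by_server: dict[str, int], needed_gb: int) -> bool:
    """True if any single server has enough free local storage."""
    return any(free >= needed_gb for free in free_gb_by_server.values())


def fits_pooled(free_gb_by_server: dict[str, int], needed_gb: int) -> bool:
    """True if the combined free storage is enough when capacity is shared."""
    return sum(free_gb_by_server.values()) >= needed_gb
```

With three servers holding 200, 150 and 300 GB of free space, a 500 GB requirement cannot be met by any one server on its own, yet fits comfortably once the same 650 GB is pooled.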

In today's environment of exploding data growth, strict privacy laws and mission-critical IT systems, direct-attached storage will no longer cut it for growing businesses.

If you would like to arrange a free and confidential discussion on our Data Storage Management Services for your organisation please contact us.