System Manageability: The Underlying Resource Framework


Manageability of the enterprise is dependent upon the underlying resource framework which supports the other disciplines. It consists of:

  • Service Level Agreements
  • Organizational Structure
  • Staffing
  • Facilities
  • Sustaining Operations
  • Business Systems Management
  • Infrastructure Support

The following section defines each of the key areas required to ensure successful management of the enterprise.


Service Level Agreements

To ensure that reliability, availability and serviceability objectives are met, overall service levels must be defined, implemented and maintained. Service level agreements are generally instituted between a department providing a service and the user who is being provided the service.

Service levels not only give the user a definition of what to expect regarding such things as system response time and mean time to repair; they also support those providing the service by defining measurable objectives which can be referenced in the event of a service disruption.

Documented Service Levels

For service levels to be meaningful they must be documented. The documenting of service levels provides a reference point for all parties involved. Service levels should not only be documented; the documentation should also be distributed, agreed to and maintained.

 

Service Level Review

Review of service levels should occur on a regular basis. If an area is not meeting a particular service level the following should be defined:

  • What caused the service level miss?
  • How can the service level miss be prevented from occurring again?
  • Is the service level valid (i.e., is it an attainable service level)?

Although the first two points are obvious, the third item may not be. For a service level to be effective it must be attainable. If the service level cannot be met under ordinary circumstances, changes to that service level may be required. An unachievable service level, however, should not be confused with a regularly missed service level which is the result of poor performance.
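As a rough illustration of such a review, the Python sketch below computes availability for a review period and classifies a service level by how often it was missed. The names (Outage, availability_percent, classify) and the sample figures are assumptions made for this example, not part of any particular tool.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    """One service disruption; cause is free text recorded by the provider."""
    minutes: float
    cause: str


def availability_percent(period_minutes: float, outages: list[Outage]) -> float:
    """Share of the review period during which the service was up."""
    downtime = sum(o.minutes for o in outages)
    return 100.0 * (period_minutes - downtime) / period_minutes


def classify(target_percent: float, history: list[float]) -> str:
    """Classify a service level from one availability figure per past period."""
    misses = [a for a in history if a < target_percent]
    if not misses:
        return "met"
    if len(misses) == len(history):
        # Missed in every period: the review must decide whether the
        # target is unattainable or performance is simply poor.
        return "persistent miss"
    return "occasional miss"


if __name__ == "__main__":
    # Hypothetical 30-day period with two outages.
    month = 30 * 24 * 60
    outages = [Outage(90, "disk failure"), Outage(126, "network")]
    print(availability_percent(month, outages))
    print(classify(99.9, [99.95, 99.5, 99.92]))
```

A persistent miss only flags the service level for discussion; deciding whether the target is unattainable or performance is poor remains a judgment for the review.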

 

Customer Satisfaction

Customer satisfaction is the ultimate goal of service levels. For this reason, levels of reliability, serviceability and availability must meet the needs of the customer. However, service levels must also be realistic from the provider's standpoint. While a user may believe that a system should be available 100% of the time or, if down, should be fixed in a minimal amount of time, this level of service from the provider may not be possible. An effective service level is not only regularly attainable, but when met, satisfies the customer.


Organizational Structure

Effective systems management requires an organizational structure to be defined which supports the needs of the enterprise. Ideally, the organization should be created along functional, rather than organizational, boundaries. This structure minimizes redundancy and provides users with a more clearly defined organization.

To ensure individuals within the organization and users outside the organization understand who performs each system management function, the following must be defined:

  • Organizational Charts
  • Job Descriptions
  • Charters

Organizational Charts

Providing the enterprise with organizational charts is critical to ensure users know who is responsible for each area of the systems environment. The organizational chart should list each individual within the organization and define what they do and who they report to.

 

Job Descriptions

Job descriptions are used internally by the systems management organization to define what each individual is responsible for. The job description at a minimum should define overall responsibilities, objectives and performance measures.

 

Charters

It is important for the organization and its components to have a clearly defined charter. The charter must contain the vision of the organization, the mission and strategies. The vision defines what each area is striving to achieve, the mission defines the objective of the organization, and the strategy indicates how the organization will achieve its mission and ultimate vision.

The vision, mission, and strategies must be understood by those inside and outside the organization. Each individual within the organization should understand their role in contributing to the vision and mission. Each department should have a clear and concise mission which supports the overall organizational mission.


Staffing

Staffing plays a critical role in successful system management by allocating appropriate individuals to specific tasks and ensuring that those individuals have the skills necessary to perform their roles within the organization.

Within a changing environment, staffing becomes a tenuous process at best. Management is required to support projects utilizing new technology where skills may or may not be available. To compound the issue, many projects require a specific skill set for a short period of time, after which the skill may no longer be required.

Training/Education

In an environment where new technology is introduced on a daily basis, proper training and education are critical to successfully managing the enterprise.

Training must be provided in a manner which supports an individual's ability to do their job. As new technology is unveiled, training must be provided as proximate to the implementation date as possible. If training is provided too soon, newly acquired skills may not be utilized on a day-to-day basis and may be forgotten before the new technology is implemented.

Training and education need not be formal; brown-bag discussions and one-to-one training are often more beneficial than formal training sessions.

 

Mentoring

Mentoring plays an important role in developing an effective systems management infrastructure by providing a mechanism for individuals to learn from the experiences of others. With mentoring, individuals with certain skill sets are used as "role models" to promote those skills. Within changing environments, mentoring is key since it provides a mechanism for individuals to quickly learn the skills of another.

 

Career Pathing

Career pathing provides longer-term benefits by supporting an individual's need for personal fulfillment. A career path provides an employee with the road map necessary to achieve success. A career path need not lead only to a management position; many individuals may seek a more technical path. The goal is to provide individuals with realistic expectations regarding personal growth within the organization and a method for realizing that growth.

 

Motivation/Morale

Although change may motivate certain individuals, others are intimidated by change. Consideration must be given to defining what is required to motivate individuals to excel within the changing environment. Morale within the organization must also be considered since high morale levels are necessary to ensure continued success.


Facilities


In days past, facilities were centered around the needs of the mainframe and supporting peripherals. In today's world, decentralization of the processing environment requires that consideration be given to key components of the facilities infrastructure, including:

  • Location
  • Power
  • Security
  • Expansion
  • Network
  • Command Center

The following section provides definition for each area and the special circumstances which may exist as a result of the distributed environment.

Location

Location relates to the placement of systems within the enterprise. When mainframe systems were the only concern, location considerations were based upon finding space within the data center. In today's environment, while space is still an issue, midrange systems and servers may reside outside the data center.

In contrast to the mainframe environment, where dedicated communications channels ensure adequate response even across large distances, today's smaller systems must contend with networks that cannot keep up with ever-increasing bandwidth demands. Often, placing a server in the data center may not be feasible since response time may be adversely affected.

 

Power/Air Conditioning

Uninterrupted power and proper cooling are paramount to minimizing system downtime. Systems are no longer restricted to the data center floor. The result is the need for procedures to define acceptable system locations and methods for providing adequate facilities to support them.

 

Security

Security in today's environment is also complicated by the fact that systems are distributed. Security in this instance does not involve data security, but rather system security. Mission critical systems must be protected with regard to access regardless of being midrange or mainframe.

 

Expansion

Facilities must address the need for growth within the data center. As systems come on line, space must be available to support these systems. Concerns regarding proximity and other facility components (e.g., power, security) must be addressed and planned for.

 

Network Facilities

An adequate network infrastructure is the cornerstone of today's processing environment. Facilities must be created in such a manner as to support both internal and external network requirements.

 

Command Center

Providing a central point of control for today's complicated environment has many benefits including:

  • Centralized operations
  • Minimal points of contact
  • Consolidated systems management

A command center for the mainframe environment was easily constructed since network and system issues were minimal. In today's environment, complicated systems are managed with an ever-growing number of tools. Although the need for a command center remains, creating such a facility is a large undertaking.


Sustaining Operations

Continuous operations are required to stay competitive. As with most other areas of systems management, sustained operations has been made more difficult with the implementation of distributed systems. The key components critical to uninterrupted operations are:

  • Operational Standards
  • Documented Procedures
  • Automated Operations

Operational Standards

Operating standards are essential to a successful Operations environment and are based on the ability to reproduce consistent actions in every area of the environment.

 

Documented Procedures

Documented procedures provide the framework for successful Operations. Procedures should be documented and, when changed, updated. They should be consistent in their presentation and distributed throughout the organization so that they are understood.

 

Automated Operations

To achieve centralized, exception-based management of the Operations environment, a continuous evolution towards automated operations is necessary. There are two basic forms of automated operations: lights dimmed and lights out.

Lights dimmed is the implementation of an Operations environment where minimal on-site staff is utilized to run the systems as required. Lights out operations requires no staff on site and can be remotely managed.
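Exception-based management can be sketched in a few lines of Python: readings within normal limits are dropped automatically, and only the exceptions reach the reduced (or remote) staff. The metric names, threshold values and the Event type here are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Event:
    system: str
    metric: str
    value: float


# Hypothetical limits; a real site would derive these from its service levels.
THRESHOLDS = {"cpu_percent": 90.0, "disk_percent": 85.0}


def exceptions(events: list[Event]) -> list[Event]:
    """Keep only readings that exceed the limit defined for their metric."""
    return [
        e for e in events
        if e.metric in THRESHOLDS and e.value > THRESHOLDS[e.metric]
    ]


if __name__ == "__main__":
    readings = [
        Event("payroll-db", "cpu_percent", 42.0),
        Event("payroll-db", "disk_percent", 91.5),
        Event("mail-01", "cpu_percent", 97.0),
    ]
    for e in exceptions(readings):
        print(f"{e.system}: {e.metric} = {e.value}")
```

Readings for metrics without a defined threshold are ignored in this sketch; whether those should instead be escalated is a policy choice.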


Business Systems Management

To support business changes dictated by the more competitive environment of a global marketplace where low cost and superior performance are key to business success, business processes need to drive product development and deployment. Business process re-engineering is defined as a rethinking of the role technology plays in support of business functions. With the price/performance benefits of client/server computing, many systems applications are being architected and deployed on alternative computing platforms to the traditional central host-based MVS architectures of the past.

The primary obstacle in the deployment of these alternative platforms is the lack of a unified management system with which to support them. In the MVS host-based world, system management has evolved over the past two decades resulting in a cost breakdown of 20% for support and 80% for hardware and software. Within the client/server based computing environment, this relationship is reversed with 20% comprising the costs of hardware and software and 80% for management of the environment.

Besides the standard system development practices of full user-based acceptance testing to ensure quality systems meet functional requirements, standardization of the operating environment becomes even more critical. The elements integral to this standardization include:

  1. Production acceptance to ensure operational compliance
  2. Configuration management to ensure supportability
  3. Version control to ensure functional integrity
  4. Software distribution to ensure standardization

Production Acceptance

Information Technology organizations can support an operating environment in a cost effective manner only by limiting complexity and reducing the number of nonstandard actions needed to provide service. With business drivers requiring multiple technology platforms, operational acceptance criteria must be defined for all components of this extended enterprise. This means that technology decisions must include supportability considerations and that standards be defined and enforced as part of the process of accepting systems into the production environment. Use of standardized middleware and system management tools should also be part of the acceptance criteria.

 

Configuration Management

In meeting the challenges of distributed systems management, users face a multitude of requirements that fall into two major categories: Event Management and Configuration Management. Configuration Management disciplines will increasingly take center stage in systems management as a means of controlling environmental quality, enforcing policies and reducing complexity.

Without a focus on Configuration Management, Event Management will be fighting a losing battle for availability, efficiency and service quality, because the sources of events and user calls are never addressed. Addressing them requires configuration management tools and processes; one approach is a framework-based configuration management solution that uses pre-tested components to reduce problems at their source.

 

Version Control

As in the host system environment where modules are checked in and out of a source library and changes to production proceed from testing through a change management system to ensure system integrity, the client/server development and production environment needs similar tools and procedures for version control.

 

Software Distribution

In the host-based environment, software distribution is a centralized process controlled by change management procedures and staging tools. However, in the far more complex world of client/server computing, there are three key reasons why software distribution will continue to grow as a strategic management issue:

  1. The number of desktops deployed across enterprises continues to grow
  2. Major vendors are entering the market with enterprise offerings
  3. The storage and processing capabilities of desktop workstations continue to advance

These factors enable and/or "force" more organizations to load productivity applications on workstations, therefore requiring updates, maintenance and general configuration management. Software distribution tools which simplify the process of updating software components and ensure enterprise compliance with approved configurations are necessary to avoid service outage and resultant problem resolution resource impacts.
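A compliance check against approved configurations might look like the following Python sketch. It assumes the approved configuration and each workstation's installed software are available as simple name-to-version mappings; the function name, host names and package names are made up for the example.

```python
def find_noncompliant(
    approved: dict[str, str],
    installed: dict[str, dict[str, str]],
) -> dict[str, list[str]]:
    """Map each workstation to the approved packages it lacks or runs at
    the wrong version. Compliant workstations are left out of the result."""
    report: dict[str, list[str]] = {}
    for host, packages in installed.items():
        drift = sorted(
            name
            for name, version in approved.items()
            if packages.get(name) != version
        )
        if drift:
            report[host] = drift
    return report


if __name__ == "__main__":
    # Hypothetical approved configuration and inventory.
    approved = {"office": "4.2", "mail": "1.7"}
    installed = {
        "ws-01": {"office": "4.2", "mail": "1.7"},
        "ws-02": {"office": "4.1", "mail": "1.7"},
        "ws-03": {"office": "4.2"},
    }
    print(find_noncompliant(approved, installed))
```

The sketch only reports drift from the approved list; software installed outside that list would need a separate check.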


Infrastructure Support

The software industry is both caught in and leading an architectural change that is sweeping the entire technology scene. Corporate users want to balance and optimize platforms so that application tasks can reside where they are most cost and technology effective, and where they can communicate and share information across centralized, distributed and personal systems. The cost and style of computing vary across platforms, and users are trying to achieve architectural balance that exploits these differences. The critical missing pieces in this networking puzzle are system software and middleware, which provide the software "glue" that gives applications levels of portability and connectivity.

A systems and network management framework that makes use of middleware objects simplifies this otherwise complex undertaking. A framework sits between the operating system and distributed system management functions, providing a common set of services to each. Without frameworks, each distributed system management function would have to provide this set of services itself (which is the problem with implementing point solutions), and any change made to one would need to be propagated to the others, creating an update currency issue and significant procedural complications. Without synchronized information available in a framework, routine tasks such as software update distribution become complicated and failure prone, because the system is not alerted that a network link is down or that a node does not have the RAM, disk or processing power to accommodate the update.
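The kind of pre-distribution check that shared framework information makes possible can be sketched in Python as follows. The Node and Update records, their field names and the sample values are assumptions for illustration; processing power is left out to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    link_up: bool
    free_ram_mb: int
    free_disk_mb: int


@dataclass
class Update:
    name: str
    ram_mb: int
    disk_mb: int


def blockers(node: Node, update: Update) -> list[str]:
    """List the reasons this update should not be sent to this node yet."""
    problems = []
    if not node.link_up:
        problems.append("network link down")
    if node.free_ram_mb < update.ram_mb:
        problems.append("insufficient RAM")
    if node.free_disk_mb < update.disk_mb:
        problems.append("insufficient disk")
    return problems


if __name__ == "__main__":
    # Hypothetical node state as a framework might report it.
    node = Node("branch-01", link_up=True, free_ram_mb=512, free_disk_mb=200)
    update = Update("office-suite", ram_mb=256, disk_mb=500)
    print(blockers(node, update))
```

An empty list means the node is ready; anything else can be reported centrally before the distribution is attempted rather than discovered as a failure afterwards.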

Infrastructure support is an iterative process which defines technology trends and the support structure necessary to make them operable, then implements through procedures for software and hardware installation and maintenance and administration of the systems. While system versions are in operation, the next wave of technology to support upcoming business requirements is in the evaluation process, thus creating a dynamic and market-leading technological environment.

Software Installation and Maintenance

As systems become more complex, they risk becoming unmanageable. This can happen by either increasing the user population or increasing the number of managed elements. Environments that are manageable when running a small number of users risk becoming unmanageable as the user population grows or as the number of components grows. In an environment where the number of users must, to support business, increase significantly, the only way to maintain a controllable number of elements is to control the number of separate software components in use. To support this goal, system software installation and maintenance tools and procedures are necessary.

 

Hardware Installation and Maintenance

As with software components, the complexity of the supporting hardware environment determines its manageability. Standardization of the hardware environment is necessary to maintain high levels of system availability. To assist this effort, an up-to-date asset inventory is essential. With a systems management framework where automated component discovery is supported, the processes of upgrading system hardware will evolve into standard support procedures.

 

Application Research and Development

Research into the next wave of deployable technology must be driven by strategic business requirements with a constant focus on supportability, with the understanding that services that are too complex to be adequately supported will not provide the high level of quality and low cost of service that the global business market will reward. The support infrastructure and tools must be part of the application research testing process for cost effective technology options to be defined.

 

System Administration

In complex environments, system administration comprises the ongoing monitoring of multiple interlinked components and, in order to be handled in a cost effective manner, requires an automated toolset deriving its policy information from a single source and producing management information in a holistic fashion. Component failure identification is necessary for recovery and prevention; however, from a system availability perspective, a service outage, no matter what the cause, is a service level miss for the customer and should be centrally tracked toward the goals of customer satisfaction and continual service and product improvement.


Copyright © JJ Kuhl 2002