Complete Mainframe to AWS Migration with Candid Partners

Migrating a mainframe is a complex project, and we at Candid Partners are focused on solving the toughest business problems using Amazon Web Services (AWS). Working closely with three professional services partners, we successfully migrated a legacy mainframe to AWS for a global enterprise client, and we are now actively working on similar initiatives.

In this post, I will describe the approach, architecture, and lessons learned from this real-life mainframe migration experience.

Candid Partners is an AWS Partner Network (APN) Advanced Consulting Partner with the AWS DevOps Competency. Despite the growth of cloud computing, we see that many enterprises are still hesitant to undertake large-scale projects involving legacy systems. Mainframes typically run an organization's most vital systems and have likely run with stability for decades.

However, these systems now face risk due to attrition of tribal knowledge, missing source code, or general lack of knowledge of how some of the systems are used. Migrating mainframes to AWS provides agility with a pay-as-you-go cost model.

Program Approach

Initially, we approached the mainframe migration with a focus on cloud-native solutions and cloud-based capabilities, but we quickly realized that we had to become reacquainted with mainframes. In this case, it was a 40-year-old mainframe that had been out of support for over a decade. Some of its data processing components were first built in 1974 and were not recompiled until 2017, in preparation for this AWS migration.

To prepare, we leaned on our strong program management practice. Any large-scale project will impact critical business processes across the enterprise, and it’s particularly important to not limit the scope to IT operations and application teams. The full project plan included establishing policies and procedures for project oversight and planning, defining leadership roles, developing change management and problem escalation procedures, and determining responsibilities for budgeting and license tracking.

Scoping and Analysis

Early on, we encountered some unique challenges that resulted from working with a legacy system. One of the most striking was the attrition of tribal knowledge. Due to the age of the system, many core individuals with institutional knowledge of the mainframe had retired or moved to other positions. This resulted in challenges obtaining usage information such as I/O rates and identifying source code.

Mainframe environments often contain batch processes and applications that are important but poorly understood, as well as code modules that are rarely or never used. To ensure that nothing important was left out, the application support team needed to discover source code for all mainframe applications in use. For this stage, we relied heavily on one partner's specialized mainframe expertise and software tools to:

Comb through code repositories and other data stores to discover as much source code as possible.

Document all types of processes, including batch processes, transactional processes, and operational processes such as backups.

Map application dependencies.

Identify all integration points with external services and applications.

Extract usage information related to CPUs, storage, and networks that can be used to estimate requirements for cloud resources.

Non-functional requirements were also gathered during the analysis phase, including current-state baselines. Key non-functional requirements ranged from availability and scalability to compliance with industry standards and user experience, and most of them could be improved by hosting on AWS. Time to test these requirements was built into the project plan ahead of implementation, so that testing would not be deferred to the end of the project.

The information gathered during the analysis stage is critical for designing the target environment on the cloud platform and planning the migration. In this case, the entire discovery and analysis phase took three months to complete.

Architecture

Before any engineering began, the foundational cloud architecture was defined. This architecture matched platform features and available tools against the needs for performance, scalability, high availability, and networking. We also worked with the client to consider issues like process automation, security and compliance, user account management and access control, operations, archiving, and disaster recovery.

We followed the AWS Well-Architected Framework and used DevOps automation to define the actual infrastructure as code, allowing controlled changes throughout the process. Interestingly, we found several integrations that went directly to the database rather than through the online or batch interfaces. To control these integrations, we converted mainframe security server access control policies into database schema permissions.
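As a hedged illustration of that policy conversion (the access levels, privilege mapping, and group and schema names here are hypothetical, not the client's actual rules), mapping mainframe-style access control entries to database grants can be sketched as:

```python
# Hypothetical sketch: translate mainframe security-server style access
# rules into database GRANT statements. The access-level-to-privilege
# mapping and the group/schema names are illustrative.
ACCESS_TO_PRIVS = {
    "READ": ["SELECT"],
    "UPDATE": ["SELECT", "INSERT", "UPDATE", "DELETE"],
}

def to_grants(rules):
    """Convert (group, schema, access_level) tuples into GRANT statements."""
    grants = []
    for group, schema, level in rules:
        for priv in ACCESS_TO_PRIVS[level]:
            grants.append(f"GRANT {priv} ON SCHEMA {schema} TO ROLE {group};")
    return grants

rules = [
    ("BATCH_OPS", "claims", "UPDATE"),
    ("REPORTING", "claims", "READ"),
]
for stmt in to_grants(rules):
    print(stmt)
```

The generated statements can then be reviewed and applied as part of database provisioning, keeping the access model under the same change control as the rest of the infrastructure.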

From a sizing perspective, we found that MIPS, MSU, and ITR were not the best indicators of capacity on the mainframe because of the mix of simple versus complex instructions, processor optimizations, and multiprocessor overhead. Consequently, we used the System z Processor Capacity Reference to determine how much capacity was actually being consumed. In terms of complexity, it was a medium-sized mainframe with 28 business applications, 266 integration points, and 1.6 TB of live data.
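The shift from MIPS-based ratings to measured utilization can be sketched as a rough sizing calculation; the engine count, equivalence ratio, and headroom below are illustrative assumptions, not figures from this project:

```python
import math

def estimate_vcpus(engines, peak_util, x86_per_engine, headroom=0.3):
    """Roughly estimate x86 vCPUs for a mainframe workload.

    engines:        number of mainframe processor engines
    peak_util:      observed peak utilization (0..1), from measurement
    x86_per_engine: vCPUs judged equivalent to one engine for this workload
                    mix (derived from benchmarking, not from MIPS ratings)
    headroom:       fractional capacity reserve for growth and spikes
    """
    busy_engines = engines * peak_util
    return math.ceil(busy_engines * x86_per_engine * (1 + headroom))

# e.g. 4 engines at 70% peak, 6 vCPUs per engine, 30% headroom
print(estimate_vcpus(4, 0.70, 6))  # -> 22
```

The key design point is that the equivalence ratio comes from benchmarking the actual workload mix, since simple and complex instruction streams translate very differently.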

For the migration methodology, we chose interface and subsystem emulation, using one partner's proprietary technology, rather than instruction emulation. While it's possible to translate the mainframe instruction set to x86 at execution time, we achieved better performance from interface and subsystem emulation for online and batch processing. We used R4 instances to run the application and database servers, and we found they were more than capable of processing billions of transactions through the system.

Figure 1 – AWS architecture for the migrated mainframe workload.

To operationalize the migration, the Candid team defined the foundational cloud infrastructure onto which the current source code would be migrated. The migration team designed the usage of Amazon Elastic Compute Cloud (Amazon EC2), security groups, routing, enterprise Active Directory authorization controls, and network access, and designed internal processes for operational support spanning key functions such as networking, security, logging, and application support.

A CI/CD pipeline was created to ensure that no infrastructure changes could be made to the solution without proper automated analysis of the infrastructure as code, as well as manual approval gates for changes to the infrastructure environment. We also supported the solution through hypercare and performed knowledge transfer to OS, infrastructure, database, and application support teams.
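A minimal sketch of one such automated analysis gate, assuming a CloudFormation-style JSON template and a single illustrative rule (reject security groups open to the internet); the template shape and resource names are invented for illustration:

```python
import json

def violations(template_json):
    """Flag security group resources with ingress rules open to 0.0.0.0/0."""
    doc = json.loads(template_json)
    bad = []
    for name, res in doc.get("Resources", {}).items():
        if res.get("Type") == "AWS::EC2::SecurityGroup":
            for rule in res["Properties"].get("SecurityGroupIngress", []):
                if rule.get("CidrIp") == "0.0.0.0/0":
                    bad.append(name)
    return bad

# Illustrative template with one violating resource.
template = json.dumps({
    "Resources": {
        "AppSg": {
            "Type": "AWS::EC2::SecurityGroup",
            "Properties": {
                "SecurityGroupIngress": [
                    {"IpProtocol": "tcp", "FromPort": 22,
                     "ToPort": 22, "CidrIp": "0.0.0.0/0"}
                ]
            },
        }
    }
})
print(violations(template))  # -> ['AppSg']
```

In a pipeline, a non-empty result would fail the automated analysis stage before the change ever reaches the manual approval gate.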

As in many IT projects, system dependencies are where the complexity lives. Since batch processing involves far more dependencies than online processing, the team concluded the best practice is to start converting batch-based systems at the same time as migrating online applications. Being able to show stakeholders real progress by presenting the early results of converting online applications to the cloud was essential to the project's success.

Rather than waiting until all online applications were converted before migrating batch applications, we started both in parallel. Batch applications depend on different datasets, different jobs, and integrations with other systems, so starting those conversions early reduces the overall risk of the project.

Licensing

Software licensing had a significant impact on the technical architecture. Initially, we assumed that to replicate the performance of the mainframe we would need to use AWS X1 instances. However, we discovered that costs would add up quickly if software licenses cost in the range of $10 to $100,000 per core across the 128 vCPUs of an X1 instance. To minimize this impact, we worked closely with procurement to negotiate favorable licensing terms.
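Illustrative arithmetic (using a $10,000-per-core price from within the range quoted above, not an actual vendor quote) shows why per-core licensing made 128-vCPU instances unattractive:

```python
def license_cost(vcpus, per_core_price, products=1):
    """Total license cost if each product is licensed on every vCPU."""
    return vcpus * per_core_price * products

# One product at an assumed $10,000/core on a 128-vCPU X1 instance:
print(license_cost(128, 10_000))  # -> 1280000
# The same product on a 16-vCPU instance:
print(license_cost(16, 10_000))   # -> 160000
```

Since the cost scales linearly with core count, right-sizing instances to measured demand directly reduces the licensing bill, not just the infrastructure bill.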

We found that 30 percent of the operating costs were software licenses supporting the mainframe software tools.

Change Management

Mainframes can be used by hundreds or even thousands of people within an enterprise. Many are accustomed to applications, often with green-screen terminal interfaces, that haven't changed in more than 10 years. Because of this, we realized that change management would be integral to gaining user acceptance of the migration.

We identified key influencers in each business unit and recruited them as change agents and brought the users front and center in the process. We furthered change management with an ongoing communications strategy to inform all key stakeholders about the process. Because of this change management strategy, by the end of the project our users were big supporters of the migration.

Project Plan

The options for the roll-out strategy were either a "big bang" or a staged approach. With the big bang approach, all applications are rolled out at once. This is the approach we selected, as it allowed the datacenter to be decommissioned soonest. In the end, the time to get our first app in the cloud was six months, with work divided among four core professional services firms. That work included:

Architecture and program management: Cloud architecture and operational integration including security governance.

A high-level project plan follows in Figure 2. Within that project plan, approximately six months of preparation, including analysis and setup, was conducted before labor was brought on to start migrations.

To reduce unnecessary work and minimize risk, our team rationalized the applications that resided on the mainframe. We prioritized the applications using a four-quadrant scale of business value, application complexity, cost, and performance. The sequencing for application migration was then developed from the prioritization results, with the highest-value, lowest-complexity applications moving first. Ultimately, two low-business-value applications were not migrated.
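A minimal sketch of that sequencing logic, with invented application names and scores:

```python
# Hypothetical sketch of value/complexity prioritization: score each
# application, then sequence highest-value, lowest-complexity first.
# Names and scores are invented for illustration.
def sequence(apps):
    """Sort apps by descending business value, then ascending complexity."""
    return sorted(apps, key=lambda a: (-a["value"], a["complexity"]))

apps = [
    {"name": "claims",         "value": 9, "complexity": 7},
    {"name": "billing",        "value": 9, "complexity": 3},
    {"name": "archive-viewer", "value": 2, "complexity": 2},
]
print([a["name"] for a in sequence(apps)])
# -> ['billing', 'claims', 'archive-viewer']
```

Applications that land in the low-value quadrant regardless of complexity become candidates for retirement rather than migration, as happened with two applications here.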

Execution

A large part of a mainframe migration project is testing functional equivalence between the new AWS environment and the legacy mainframe. The migration team performed detailed testing at multiple levels and leveraged automated testing to accelerate the process. The automated testing covered components, data assets, data obfuscation, integrations, and application functionality, yielding more accurate performance testing of both expected and worst-case scenarios. We built minimal scripting around the commercial application tools, but mostly open source tools were used for testing.
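A minimal sketch of a functional-equivalence check under assumed record shapes (the field names are invented): run the same input through both systems and diff the normalized outputs record by record:

```python
def normalize(record):
    """Strip formatting differences that don't affect equivalence."""
    return {k: str(v).strip().upper() for k, v in record.items()}

def diff_outputs(legacy_records, aws_records):
    """Return indices of records that differ after normalization."""
    mismatches = []
    for i, (old, new) in enumerate(zip(legacy_records, aws_records)):
        if normalize(old) != normalize(new):
            mismatches.append(i)
    return mismatches

# Trailing whitespace is a formatting difference, not a functional one.
legacy = [{"acct": "001", "balance": "100.00 "}]
aws    = [{"acct": "001", "balance": "100.00"}]
print(diff_outputs(legacy, aws))  # -> []
```

Deciding what the normalization step may ignore (padding, case, encoding) and what it must never ignore (amounts, keys, dates) is the heart of this kind of test design.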

Candid Partners managed the logistics and execution of the move to AWS, transforming applications that processed data in scheduled batches to a modern architecture. The plan's blueprint required the migration team to:

Recompile the software, originally built on IBM mainframe technology, for x86 processors.

Include access control capability for user authorization based on Active Directory.

We made arrangements to train end users, application support personnel, and developers on new application usage patterns, and how to take advantage of the agile capabilities of the cloud platform and tools.

Moving 40 years of integrations, processing, and interfaces to a modern environment ultimately came down to managing a big project. The largest success factors for us were close management of dependencies between systems and teams, establishing and tracking timelines, and meeting deliverables.

As a result of the migration, we have found that new technology exists to provide incredibly intelligent source code analysis, enabling faster and less risky hybrid migrations.

Customer Benefits

Migrating the customer mainframe to AWS with Candid Partners offered many benefits:

The customer is seeing a reduction in annual cost for the mainframe workload applications of approximately 72 percent, including licensed software, application support, and infrastructure costs.

Performance was significantly better on AWS than previously realized on the mainframe.

Applications running on a modern, flexible infrastructure, with capacity that can be scaled on demand.

Opportunities to continually modernize technology, and to improve business agility with agile software development.

Learn More About Candid Partners

Candid Partners is an IT consulting firm that combines enterprise-class scale and process with born-in-the-cloud domain expertise. To ensure the success of mainframe-to-cloud migrations, we provide extensive knowledge of mainframe environments combined with AWS expertise. More than 80 percent of Candid's employees hold AWS accreditations or certifications.

Candid owns unique tools and frameworks to facilitate the migration of applications, services, and infrastructure from on-premises datacenters to AWS: