The State of DevOps in Banking – Report from DOES London 2018

Key Takeaways

DevOps moves across the enterprise as technologists and business people increasingly operate together in cross-functional teams across a value stream to get work done

CI/CD is mostly complete although further proliferation of these established practices across the banking enterprise remains in progress

Careful consideration of the PMO and portfolio management function is being undertaken and new models are being devised, understood and shared

System and inter-team dependencies are identified as a key constraint in the banking industry and work is frequently focused on breaking them and making systems loosely-coupled and using internal open source models

Outsourcing presents a number of challenges when evolving to practice DevOps principles around geography, communication, budgeting and skills

At the 2018 DevOps Enterprise Summit in London, a number of banks presented talks that shared their experience and learning around the principles and practice of embracing DevOps: CapitalOne, Barclays, Lloyds Banking Group, KeyBank, Standard Bank, ABN AMRO, UBS and RBS. Here, we summarise the key points of their talks and identify the correlations and crossovers in the messages.

DevOps Evolving Across the Banking Enterprise

Aimee Bechtle and John Schmidt were on stage from CapitalOne with a keynote titled ‘When the Business Partners with Tech and They Do A Dojo’ describing how they devised a DevOps Acceleration Service that undertook the provision of technicians across the whole business landscape in response to the business people struggling to get commitment from the tech leads and reportedly frequently hearing the phrase: “We don’t have time right now. You need to talk to the Product Owner.”

Bechtle explained how initially they just had development and IT operations together, but now have technology and the business sat together too. They also brought DevOps, data science and machine learning together for the first time. She said:

We just weren’t going fast enough; DevOps was foreign but interesting and I realised we were going to need more than one or two NFRs in the backlog to make a difference.

Bechtle and Schmidt advised the audience to “Look for the innovative, forward thinking Product Manager” and concluded that “building a relationship between business and technology is the most important thing of all.”

Shaun Norris of Standard Chartered, a one-hundred and sixty-five year old bank headquartered in South Africa, described their current state:

Five years ago everything was waterfall. Now we are around two-thirds agile and the rest is getting there. Operations have increasingly become a bottleneck and it can still take weeks of a manual process to provision an environment. Our change process has thirty-five steps; it’s manual, in Remedy, and prone to delays. You need to have a good relationship network to get through it – someone’s mood can change the outcome. We are seeing Conway’s Law in action; we can see thirteen teams involved in an incident. We’re taking a bet on site reliability engineering. Greatness has no finish line – so we’ll never settle for good enough.

Stefan Simenon, head of IT tooling and software development at ABN AMRO Bank, started his talk, ‘Scale Your DevOps Initiative Beyond its Awkward Teenage Years’, with a snapshot of their DevOps state:

We used to be waterfall. We had lots of waiting time, lots of time to start a project, lots of time to build, test and deploy and software quality issues were found at a late stage. We had many manual handovers and approvals, code merging happening at a late stage and we had big, non-frequent deployments to production.

Following a successful CI pilot, ABN AMRO deployed CI/CD across the organisation whilst transforming into an agile organisation. It took them three to four years, and they completed in 2017 when everyone had to apply for their own, agile job. Simenon said the new organisation has much less project management and project overhead, and is much more focused on delivering software.

John Rzeszotarski, Senior Vice President & Director of Continuous Delivery & Feedback at KeyBank, said that around 10% of work is agile at the moment and that IT operations has been pushing teams to be more agile. They recognised that getting continuous integration into waterfall would mean retraining the PMO to do things in a different way. They have high capability in small pockets of high value, high change, customer-focused areas.

In ‘The PMO is Dead, Long Live the PMO’, Jonathan Smart, head of ways of working and Morag McCall from international portfolio management at Barclays described how they had islands of agile prior to 2014 and then created communities of practice. These were entirely voluntary with over two thousand members. In 2015 they launched an enterprise-wide agility programme with the aim to deliver “better products faster, safer and happier”. In particular, they addressed lean portfolio management and the role of the PMO in an agile organisation.

Now, their key unit of work is a business outcome written as a hypothesis (if it’s not regulatory or mandatory). Business outcomes are expected to be delivered every three months. They have a portfolio epic as a twelve month collection of four business outcomes, and then portfolio objectives, rolled up to strategic objectives, which are aligned to the top five priorities from the CEO.
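The hierarchy described above can be pictured as a small nested data structure. The following is a minimal sketch in Python, not Barclays' actual tooling: every class, field and example name is hypothetical and chosen only for illustration. The only details taken from the talk are the three-month outcome cadence, the four outcomes in a twelve-month portfolio epic, and the framing of an outcome as a hypothesis unless it is regulatory or mandatory.

```python
from dataclasses import dataclass, field
from typing import List

OUTCOME_MONTHS = 3  # outcomes are expected every three months (from the talk)


@dataclass
class BusinessOutcome:
    # Hypothetical fields: the talk only says an outcome is written as a
    # hypothesis unless it is regulatory or mandatory.
    hypothesis: str
    mandatory: bool = False


@dataclass
class PortfolioEpic:
    name: str
    outcomes: List[BusinessOutcome] = field(default_factory=list)

    def duration_months(self) -> int:
        # One outcome per quarter, so four outcomes span twelve months.
        return len(self.outcomes) * OUTCOME_MONTHS


@dataclass
class PortfolioObjective:
    name: str
    epics: List[PortfolioEpic] = field(default_factory=list)


@dataclass
class StrategicObjective:
    name: str
    objectives: List[PortfolioObjective] = field(default_factory=list)

    def all_outcomes(self) -> List[BusinessOutcome]:
        # Flatten the roll-up so every outcome traces to a strategic objective.
        return [o for po in self.objectives for e in po.epics for o in e.outcomes]


# Illustrative data only.
epic = PortfolioEpic(
    "example epic",
    [BusinessOutcome(f"If we do X{i}, then Y{i} improves") for i in range(4)],
)
strategic = StrategicObjective(
    "example CEO priority",
    [PortfolioObjective("example objective", [epic])],
)
```

With this illustrative data, `epic.duration_months()` returns 12 and `strategic.all_outcomes()` returns the four outcomes, which mirrors the idea that each piece of work traces upward to one of the CEO's priorities.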

From a finance perspective, they are working with quarterly rolling wave outcomes and have some value stream capacity funding. They have developed a lightweight business case in order to start experimenting sooner, and procurement have adopted agile ways of working. Their focus is on reducing lead time.

CI/CD is Complete, In Principle

Universally, the banks described CI/CD as complete, although perhaps not fully adopted across the enterprise.

Simenon believes that they need the right tools to make the necessary organisational change happen, and they have aimed for “zero touch” deployment. They have multiple CI/CD pipelines for mid-range, Java, Python etc. and their DevOps toolchain includes Jira, XebiaLabs’ XL Release and XL Deploy and, for security, Sonarqube, Fortify and Sonatype Nexus Lifecycle.

Initially, they experienced some instability in Jira related to non-standard ways of working, but now there is growing awareness of the key principles alongside the pipelines: run pipeline locally, integrate quickly and often, practice test-driven development, keep changes small, get continuous feedback, embrace decomposition of functionality, have a fast pipeline, automate unit testing, and practice trunk based development.

The IT4IT organisation has teams for Jira, application deployment support, software logistics, test tooling, change and config management, pipelines, CI/CD metrics, portfolio management, application monitoring, application logging and mainframe modernisation. They offer standard pipelines; people can build their own, but are encouraged to follow practices such as version control, and to develop and enhance their DevOps toolchain along with the bank’s community.

At Barclays, Jonathan Smart said that they had taken an approach of ‘focus on the biggest constraint’ and so had initially focused on the CD lifecycle and were now looking at portfolio management and PMO.

Royal Bank of Scotland’s (RBS) head of performance and business management, Jennifer Wood, said:

One team has taken a 14 week deployment cycle and made it hours and have blue/green deployments.

Psychological Safety and the Human Emotional Journey

By bringing the people across their value streams closer together, physically, and by using the dojo environment to experiment with new ways of working, Bechtle and Schmidt claimed they have achieved a state of “no fear change”. Their advice is to recognise that:

When uncertainty is high, so is anxiety. This is normal and understandable – instead of quelling it, embrace it.

Both times they set up the dojo, Schmidt said, “This is going to be messy”; they were creating a psychologically safe environment in which the team could experiment. They were clear on the challenge to make IT ‘humane’ and not create unrealistic expectations or workloads that would cost them later in terms of cultural debt.

Bechtle and Schmidt recognised that they were on an emotional journey too, and identified two dips they went through: the ‘faith/fear’ dip, and the ‘are we there yet’ dip/mojo dip (the latter partly exacerbated by holidays). They emphasised how important it was that the message “it’s a leap of faith” was heard and believed.

An agile coach also did some ‘mind mining’: providing proof, metrics and evidence of emotional state so that the data scientists could understand more deeply what was happening as time in the dojo progressed.

RBS’ Wood made it clear that her key goal for evolving her organisation, using DevOps principles, is that she wants to make the technology teams’ lives and jobs easier and free them up to do what they want to do. At RBS they have the value: ‘determined to lead’, and every leader in the organisation knows their physician’s model for retrospective. She said:

Our staff are telling us it’s hard to work here. Some of those symptoms are caused by too much work in progress. We didn’t blame process or technology but identified it was the management that wasn’t allowing for the change. The treatment was to test and learn and to learn about tolerating failure.

Wood said their leaders needed to focus on their “bedside manner”, and learn to lead in a collaborative way. She defined an agile culture as: a mindset embodied in values, implemented by practices and enabled by tools. She said:

The greatest leverage is the mindset change; this is really uncomfortable. We set a loose set of guide rails; we basically told them what we don’t want them to do. We asked them to measure and work out what the next thing is they need to learn. They said we weren’t doing anything and asked us what to do, and we said: “No. You tell us what you want to do.” This is a journey of learning and unlearning. You cannot force people to join an empowerment programme.

KeyBank’s key goal was to eliminate release weekends and the expectation that staff should work out of hours.

The Dojo and Accelerated DevOps Adoption

Capital One’s Schmidt had visited the Target Dojo whilst Bechtle had also been following a modified version of the dojo. They said that six weeks is the minimum viable time for a dojo, i.e. the minimum length of time to embed a new habit. They claim that:

The dojo is the fastest and most effective way to adapt an engineering culture.

Many of the banking organisations mentioned outsourcing as a reality and a problem in terms of having the talent they wanted to hand. DevOps and outsourcing are hard to do together for a number of reasons:

Colocation and access to the right skills, at the right time

Timezone and language differences

Differing cultures and staff continuity

SLAs, contracts and budgeting mechanisms

Security, Regulation and Safety Remain Key Concerns for Banks

CapitalOne shared that, for them, speed doesn’t exist as a “pure, simple goal” but that it’s imperative it is paired with safety in order to avoid “the inevitable crash”. Focusing on safety, stability and resilience together reduces what they refer to as the “Oops Tax”, and security, for them, is always a priority.

Barclays’ Nick Funnell described how the bank has brought functional expertise into the delivery of a product and how, instead of delivering something secured, they are focusing on delivering it in a secure manner. They are doing this by embedding security consultants in the teams. Prior to this approach, as in many banks, infrastructure and security were two large parallel organisations and when agreement couldn’t be obtained, things had to be escalated very high.

Banking regulations demand segregation of duty, which means that you can’t have development teams deploying into production; IT operations has to be the team pressing the deploy button, and also the team receiving change requests. Accordingly, they embedded an IT operations representative into the team and, as a result, saw more successful outcomes.
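The segregation-of-duty rule lends itself to an automated gate. As a rough sketch only (the talks did not describe how any bank enforces the rule, and the `ChangeRequest` type, the `"it-operations"` role name and the `may_deploy` function are all invented for illustration), a pipeline could refuse a production deployment unless it is triggered by someone in IT operations who did not author the change:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeRequest:
    # Hypothetical record of a change awaiting production deployment.
    change_id: str
    author: str  # the developer who wrote the change


def may_deploy(change: ChangeRequest, deployer: str, deployer_role: str) -> bool:
    """Return True only if the deployer is in IT operations and is not the author.

    This encodes the rule as summarised in the article: development teams do
    not deploy to production; IT operations presses the deploy button.
    """
    return deployer_role == "it-operations" and deployer != change.author


# Illustrative usage with made-up names.
change = ChangeRequest(change_id="CHG-1", author="dev_alice")
allowed = may_deploy(change, deployer="ops_bob", deployer_role="it-operations")
blocked_role = may_deploy(change, deployer="dev_carol", deployer_role="developer")
blocked_author = may_deploy(change, deployer="dev_alice", deployer_role="it-operations")
```

Embedding an IT operations representative in the team, as described above, keeps a check like this satisfiable without a hand-off to a separate department.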

Standard Chartered’s Norris described his challenge around financial regulation and security:

If you are of a pessimistic mindset the task can seem unfathomably huge. Every country has at least one regulator and there is agreement and disagreement around things like data sovereignty; this effectively represents sixty relationships for us to manage. Any new environments that go live have two-hundred and fifty security controls that need to be mapped and recorded before it goes live. Our processes have been optimised for compliance, not speed, [and so] we have two releases per year.

He continued to explain how the regulators influence his cloud strategy:

We’re expecting to have relationships with all of the big three cloud providers. The regulators want this because they want to reduce operating risk, and we like it because it makes us portable, and we can use the right platform for the right thing.

KeyBank emphasised the importance of security as code. They described how, in many company structures, security is a completely separate organisation that reports to the CIO, while infrastructure reports to the CIO separately; they said that the only thing that will ever work is doing it as early and often as possible – they must “shift left”. Whilst they don’t have a dedicated resource in their team, they do have a ‘go to guy’ who has made that all possible. They also pointed out how XL Release allows all the steps to be automated and allows them to do risk assessment in the tool down to the release level, which will define the workflow; some of the scans are in the deployment process, and some of the calls are to the tools in the release process.

Nick Wadge, CTO at Nomura, a bank headquartered in Japan, gave a talk titled ‘Securing Open Source in Nomura’s Software Supply Chain’:

I wanted to know if we really understood our risk from an open source perspective. I wanted to know how we control the risk that’s coming in, whilst still giving the developers the speed they need; speed with control.

Following a successful proof of concept, Wadge identified the tool that could answer his questions, Sonatype’s Nexus Lifecycle, but he needed to apply for budget:

The last thing I wanted to do was spend two weeks writing a thirty-five page business case for something which was so obviously the right thing to do – so I used a one page lean business case. The key is the analysis. You define your outcomes and your outcomes must be measurable.

Barclays’ Funnell described how their AWS environments have allowed them to scatter incognito security throughout. He made the point that:

If the job is done properly, developers will be unaware of what’s happening with the security.

DevOps Principles Breaking Dependencies

The talk by James McLeod of Lloyds Banking Group revolved around the concept of ‘Enabling Innersource’. He explained that, at the bank, it’s been difficult for engineers to share code, but that GitHub is now driving community engineering. The platform allows them to collaborate on the outside, and share open source projects, but collaboration on the inside is important, as is bringing in the right talent to the organisation.

He described ‘Innersource’ as internal open source that allows sharing of work so that people can read and improve each other’s documentation, fork off others’ repos and review pull requests. Nexus is the shared registry or repository to which all can contribute. They are moving from a centralised to decentralised model, where federation is promoted and teams can choose their tools. IBM are providing and operating GitHub and they also participate in the working group. They can’t just open up the system though and there are still many engineers who want to join and don’t currently have access.

KeyBank identified their known constraints as cost, complexity, legacy and capacity.

At Standard Chartered, Norris said:

Very few people have the full end-to-end view on how things work, and none of the teams have APIs; the protocols are emails and meetings – we could get very discouraged. We already have one failed (bimodal) attempt at transformation under our belt – communications were opaque, delivery was poor, we didn’t realise much value and it was very high cost.

Norris did give plenty of reasons to be optimistic about the future though – in particular their toolchain, cloud and Rundeck. Standard Bank have over two-hundred teams using their DevOps toolchain and have seen a 90% reduction in turnaround time whilst producing six thousand builds per week. He went on to explain:

Only one team can currently go from idea to live. It’s been a steep, uphill struggle to put things in the cloud; a private cloud is a tactical stop gap until we can go into the public cloud easily but providing a cloud-like experience brings some of the benefits and can go faster through using things like containers and Kubernetes.

Norris told a story about managers having to watch videos of their team’s work (production access and change reviews) for the auditors, and how Rundeck took that “hideous” requirement away, and is now a key part of the next generation DevOps toolchain. Like many DevOps initiatives, the Rundeck implementation began under the radar in one small part of the organisation and was organic, underfunded and non-strategic before it proved value.

CapitalOne stated that “embracing open source, microservices and RESTful APIs are always a priority”; these architectural approaches are all designed to allow for loosely coupled systems which allow for smaller changes to be designed, built, tested and deployed.

Simenon at ABN AMRO Bank explained how they had a team dedicated to SOA to loosely couple the systems of record and systems of engagement and, like Standard Bank, had procured a private cloud and are also working towards public cloud.

KeyBank explained how they used XebiaLabs' technology as an accelerator to migrate away from legacy Micro Focus platforms and provide better compliant pipelines. In their environment they have IT operations people providing the capability for the developers to self-serve:

We’re much more hands on and provide more education. We have provided the pipeline and can embed IT Operations people in the teams to help build the toolset. Guardrails are provided so we have a “paved road” rather than “off road” experience. Our goal is empowerment.

Funnell described ‘Barclays Lean Control’, an initiative focused on long-lived services (aligned to value streams) and “continuous everything”. They have a ‘control tribe’ who engage with the team tightly to avoid ‘governance theatre’, and they raise constraints very early in Jira (as constraint types) that have to be dealt with. He said:

We have to push the conversation to the left; technology isn’t the hardest bit and it isn’t the bit that’s holding us up. Empowering the smart people and getting them to own and understand what’s going on was the challenge.

Smart at Barclays explained how around two years ago and some way into their agile and DevOps journey they had reduced lead time but were still waiting for integration testing (monthly) because of lots of dependencies, and then waiting for UAT (quarterly) and then waiting for release scheduling (releasing was hard and done quarterly).

He also identified a large amount of waiting time between business cases, the product backlog and the development backlog – around eighteen to twenty-four months in what he called ‘the fuzzy front-end’. But, he said, as soon as it arrived with the development team it’s marked urgent. He said:

This ‘urgency paradox’ leads to unsustainable working - even when the teams are using modern ways of working. We identified some anti-patterns: anti-pattern 1 is overcommitting, where we keep on saying yes to the customer without really knowing what the throughput looks like, and as a consequence the backlog builds rapidly. Anti-pattern 2 is starvation and overproduction, and the creation of ‘fake-work’; a ‘feature factory’ where development is disconnected from the strategic efforts of the firm. Anti-pattern 3 is “feast and famine”, with masses of requirements coming in a single batch, and then nothing coming in, and this doesn’t result in sustainable flow.
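The arithmetic behind two of these anti-patterns can be sketched with a toy model. This is an illustration under simplifying assumptions, not something presented in the talk: throughput is treated as a fixed number of items per period, and the function name and all figures are made up.

```python
from typing import List


def simulate_backlog(arrivals: List[int], throughput: int) -> List[int]:
    """Backlog size at the end of each period.

    arrivals:   items accepted into the backlog in each period
    throughput: items the team can finish per period (assumed constant)
    """
    backlog = 0
    history = []
    for accepted in arrivals:
        # Work in, minus work finished; the backlog cannot go below zero.
        backlog = max(0, backlog + accepted - throughput)
        history.append(backlog)
    return history


# Overcommitting: saying yes to 8 items per period with a throughput of 5.
overcommitted = simulate_backlog([8, 8, 8, 8], throughput=5)
# → [3, 6, 9, 12]

# Feast and famine: 20 items arrive in one batch, then nothing comes in.
feast_and_famine = simulate_backlog([20, 0, 0, 0, 0, 0], throughput=5)
# → [15, 10, 5, 0, 0, 0]
```

In the first case the backlog grows every period because intake exceeds throughput. In the second, the team is overloaded for several periods and then has nothing queued at all, which is the uneven flow the third anti-pattern refers to.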

Smart emphasised the need for high cohesion and low coupling, and recommended organisations aim to break dependencies rather than manage them.

From an organisational dependency perspective, we can reflect on the creation of cross-functional teams described by all of the banking contributors at DOES:

Having testers in the team means we aren’t dependent on a testing team

Having User Acceptance Testing capabilities in the team means we don’t need to wait for ‘the business’ for approval to deploy

Having IT operations embedded in the teams means requirements for infrastructure are understood and satisfied early and that non-functional requirements such as performance, load and security are embedded into the design

Having security expertise in the team means we can focus on secure approaches in the application and aren’t waiting for infrastructure to be made available, decisions to be made or penetration tests to be performed

Having the customer/business in the team means we have the best understanding of the changing requirements at the earliest opportunity

When we have cross-functional teams, we are more likely to design cross-functional systems (i.e. Conway’s Law) thereby improving end-to-end flow

Lloyds Banking Group’s James McLeod highlighted how they watched starter banks like Monzo go from concept to cash in one month for debit cards, and how, for them, prior to their current engineering transformation programme, it would take three-hundred and sixty-five days to get a line of code into production. He said that the bank sees that “engineering is the future” and the key to satisfying their need to accelerate and catch up with the curve. He stated that “we are an engineering company”, and described how having the right tooling and messages is essential in order to help them attract the right talent as they continue to double down on technology.

Barclays is a bank with over three-hundred years of history and £20 billion in annual revenues. They have forty-eight million customers and eighty thousand employees, of which twenty-thousand are in technology. Nick Funnell, GTO development practice lead, said in his talk, ‘From Racks to Cloud’:

We can’t do anything in finance without technology, AND we are having to do more and more with less as margins are squeezed, profits tightening and regulations increase. We are a technology company, and we are going all in on public cloud.

RBS’s Wood described the conditions for change in banking as: regulation, increasing customer expectations, continued economic underperformance, new and disruptive technology. She said that the foundation of banking and the way that banking is done is changing due to open banking, and that as a result, RBS have no choice but to become more agile as an organisation.

KeyBank’s vision of their future ready workforce is one augmented with automation and intelligent machines with microlearning embedded across the enterprise.

At Barclays, McCall recalled how she previously worked for a financial services organisation ‘going agile’ where project managers were viewed as impediments, and accordingly she was made redundant. At Barclays she described three chapters they have been through as they have evolved:

Chapter 1: The toddler years – focus on constant improvement and building agility in using the support of an agile coach, physical then tooling, self-optimising team

Chapter 2: The age of enlightenment – pivoting to the concept of the long-lived product and the concept of the value stream

Chapter 3: The Yoda years – have limited the portfolio of work in progress and prioritised the strategic work – started finishing and stopping starting – focus on outcomes not outputs

McCall said:

The PMOs are more advisory and consultative with a focus on dependency and release management. We have pivoted to be a learning organisation.

Almost all of the speakers highlighted coaching as a key learning enabler.

Chaos, Colocation, Communication and Communities

McLeod focused his talk on community engineering and collaboration in code, and how these approaches drive openness and education through promoting the use of code as the fundamental building block of discussion – rather than PowerPoint. He described Lloyds Banking Group’s use of internal open source (his ‘innersource’, which was covered previously in this article), but he also spoke about how they are starting to have conversations about collaborating outside of their organisation, and doing enterprise code sharing; this is challenging from a security perspective, but is something that they feel is key to attracting the talent they need to satisfy their renewed focus on engineering and being a technology company.

In addition to this initiative, they are also contributing to and participating in conferences and meetups, and have held code camps for children in their Halifax branch in Oxford Street in London.

Barclays’ Funnell described how they began using Jira to help drive collaboration, particularly between sites, and try to gain some insights into work that felt chaotic. They were not immediately happy – partly because using the tool showed that they were effectively kidding themselves. This was later viewed as positive as it allowed for healthy tension and drove a lot of interesting conversations.

They also set up webcams so that they could see into other offices, which they found really powerful. They did daily standups around video boards. Funnell said:

Colocating is the best thing you can do but this is the next best. We could see what we had to do but then could see it was too much, particularly for the testers; a Kanban board showed what work needed to be done, but then showed that it wasn’t being worked on and the testers didn’t know what it was. We used the three amigos, the power of three, and brought the testers in earlier and ploughed through the backlog. We found that quality became much improved.

Jelena Laketić, Head of Asset Management SWAT at UBS, talked about their UBS Global hackathon, which takes place across four regions and fifteen locations, as a key DevOps enabler.

Another organisation that has found hackathons an effective way to raise digital awareness and bring people together to innovate is Nomura bank, headquartered in Japan. Their annual, global hackathon event brings their global developers together for twenty-four hours; the theme of their most recent hackathon was around building chatbots on the messaging platform. They also have their ‘Tech Fayre’: an internal trade show that provides an opportunity for internal developers to share their work through a theme; last year it was ‘innovation’.

ABN AMRO ran a CI/CD awareness programme with particular focus on the management. They had a summer programme incorporating all kinds of events including leadership and external speakers. Simenon said:

It’s important to have senior management on board; they need to not just say it but show it and spread the message continuously into the organisation.

KeyBank’s approach to community education was around using lunch and learns, ‘brown bag’ sessions and presentation days.

About the Author

Helen Beal is a DevOps consultant, coach, trainer, speaker and writer. Her focus is on helping organisations optimise the flow from idea to value realisation through behavioural, interaction based and technology improvements. Fond of llamas. Once she saw a flamingo lay an egg. You can reach her at helen.beal@infoq.com - especially if there's something you want her to write about.

Community comments

Problem in understanding "feast and famine" anti-pattern

> Anti-pattern 3 is “feast and famine”, with masses of requirements coming in a single batch, and then nothing coming in, and this doesn’t result in sustainable flow.

I am having a problem understanding the above sentence. Does it mean that nothing comes out after masses of requirements come in as a single batch?