DHS reveals year of IT outages

news New Federal Government super-department the Department of Human Services has revealed it suffered 137 IT outages throughout the year to the end of September 2012, with dozens of instances where customers of services such as Centrelink were unable to access online services through Centrelink’s web site.

Boyce had asked DHS whether there had been any recent significant problems with its IT reliability, what problems it had had over the past 12 months and whether they could be detailed, and lastly what was being done to improve the IT systems reliability of the department. “In the period 1 October 2011 to 30 September 2012, DHS experienced a total of 137 ICT Reliability Outages,” the department responded. “28 of which were experienced in the outsourced Vendor Management Environment.”

A document attached to the department’s responses details dozens of examples where external and internal users of DHS’ IT systems were denied access to those systems because of outages. For example, a typical record states: “Customers with records in Environment S were unable to access Centrelink Online Services or Phone Self Service.” Other records stated that staff could not access intranet records.

There were also examples where staff were simply not able to log in to their desktop environments or could not use the Lotus Notes communications platform in use in some divisions of DHS, or Microsoft Office.

The inability of customers to access Centrelink online services has the potential to be a major issue for those customers. Centrelink is Australia’s main welfare agency tasked with supporting Australians who are out of work or receive other forms of welfare payments such as child support. Many of those customers are required to file forms on a regular basis with Centrelink or face having their payments postponed or cancelled. If those customers could not file their forms electronically or via telephone, they may have been required to visit a Centrelink office to file their forms, which would require a much more significant time investment than handing in the information remotely.

DHS wrote in its answers to Boyce that systems stability was “a key priority” for the organisation. “Improvement programs undertaken over the last 12 months have realised a 27 percent reduction in ICT reliability outages during the July-September quarter compared to the previous April-June quarterly period,” the department wrote. In addition, DHS noted that it was currently undertaking a number of major IT improvement initiatives to fix the problems.

The first one involves a substantial datacentre rationalisation, which will see DHS reduce its datacentre footprint from seven to two over the next few years. “This is expected to improve reliability of services through the use of highly resilient modern Tier 3+ facilities,” wrote DHS. “The datacentres will ensure DHS can provide core services with high availability and redundancy.”

Another project will see the group’s “Customer Portal” — presumably including the Centrelink online services functionality — migrated to a 64-bit environment. “The new environment will increase the number of concurrent sessions that are able to be handled, thus increasing capacity and reliability for online services,” wrote DHS.

A third project will see ICT services previously outsourced to IBM and HP moved to in-house managed services, affording DHS “greater visibility and control over all ICT components that contribute towards a service” and removing dependency on external factors, thus enabling what DHS described as proactive service management and increasing service reliability as a result.

Lastly, DHS noted that it was upgrading its monitoring and reporting capabilities. “DHS ICT teams are undertaking ongoing configuration of Agentless Monitoring Devices throughout the datacentres to monitor data packets,” the department wrote. “As configuration progresses, the devices will provide greater visibility into ICT service performance by allowing more comprehensive monitoring of ICT transactions. This will allow further visibility into the underpinning ICT service performance of the expanding portfolio of DHS business services offered to customers.”

opinion/analysis
I have to say, I am quite surprised by the news that DHS has been having this large series of outages in its online services. I’ve said before on Delimiter that the department’s IT division, which was largely formed from Centrelink’s IT team under John Wadeson, has a good reputation for IT service and project delivery within the Federal Government.

What this week’s questions on notice show is that that reputation is at risk. DHS really needs to get these outages under control. The department effectively runs an IT operation the size of a major bank, with a similar number of external and internal users of its IT infrastructure. Australia’s major banks have been engaged in a huge effort over the past few years to reduce and ultimately cut out what they call “Severity 1 outages” — outages which stop them from fulfilling their core mission of providing banking services.

I know that both Westpac and CommBank have hugely reduced the numbers of these incidents, and I’m sure NAB and ANZ have been working on this situation as well. It seems like now it’s DHS’s turn to place a huge focus on this. The organisation’s dependents are some of the most needy in Australia. Let’s hope the department can keep its online services up longer in 2013 to ensure they are well served.

6 COMMENTS

I wonder how many of those 137 would be the equivalent of a “Severity 1 outage” though? If they were whole of department, then yeah, I agree it may impact on their reputation, but otherwise all organisations have IT issues from time to time.

“Centrelink’s IT team under John Wadeson, has a good reputation for IT service and project delivery within the Federal Government.”
Where is this reputation coming from? Maybe it’s just on comparison to other government IT that Centrelink looks good, because I can tell you Centrelink online services are appalling. It’s as slow as tar, and that’s when it actually works. I wonder if these 137 outages include the planned outages – I doubt it, they happen every week.

I’d like to echo Karl’s question, Ren – what’s the basis behind this ‘good reputation’? My wife worked for Centrelink for just shy of 10 years and I can tell you their whole infrastructure dropped out regularly, leaving staff to have to ask ‘customers’ to call back at another time. Their systems are so slow at peak times they routinely disconnect whole queues of people waiting to talk to someone, that’s if they were able to even get into a phone queue… They spend far more time and dollars on efficiency tracking and monitoring software so they can micro-manage staff call turnaround times than they do ensuring their basic infrastructure is reliable – several of their recent internal ‘upgrades’ were designed specifically for the purpose of monitoring staff, while actually complicating their work and making their systems less efficient, causing them to use workarounds to manhandle the new system to do what older systems could do on the one screen.

Interestingly it is also the most mismanaged, negative, misogynistic and short-sighted organisation I have ever had the displeasure of coming into contact with. The best thing my wife ever did was leave. Interesting that all staff are made to sign what effectively amount to gag orders as part of their employment contract – it’s designed to stop them talking about customer details and circumstances, but it is used by management to silence them from complaining outside the organisation (yes, explicitly). So I guess it’s not too surprising that you haven’t heard what is actually going on in there – everyone that works there is too scared to talk about how bad it is.

My brother has been hit with this shit. He’s young, been out of school a few years but working steadily, but was released by the company he worked for, for a reason he’s taking to an unfair dismissal case.

He called up to go on newstart/youth allowance.

Wait time. 1 and a half hours. Of course, he didn’t get through. Called the complaints line, they put him through to a slightly quicker stream. 30 minutes later the COB rolled around and he got hung up on.

Throughout this story there are gaps of days because he gets told he has to get an appointment for a certain person first, and has to wait 3/4/5 days before they are available.

When he got through, they said he had to apply online.

So he goes to apply online. It says he needs ‘high access’ and to go into a Centrelink office and do that.

He does that. He waited an hour and a half to see someone.

Then they setup his ‘online services’ & told him to go home and apply online.

Goes home, and his password is broken. Can’t log in. Has to call tech support who tell him to go back into the office.

Went back into their office, wait another hour and a half, only to be told the system can’t sign him up for some reason, and they give him an appointment earlier than the last week of this month.

So he still hasn’t even signed up properly. Meanwhile our family is supporting him. By the time he ends up at one of their useless job search agencies like Mission Australia (so much for Secular Australia) he’ll have waited 2 months. And it took my mother to make sure they didn’t stuff up his ‘date contacted’, meaning they’d have gotten away with not backdating his newstart/youth allowance.

10 years ago he’d have waited in a queue for 5 minutes at reception, seen one of the 5 receptionists on the desk, been directed to one of the two floors of staff (our local office is now one floor, with some of the medicare people taking what was a vacant top floor before the merge), where they would have staff to see people with no booking, and been signed up and sorted out by that same person they seen in 15/20 minutes.

None of the staff care anyway, they are all too busy trying not to get retrenched from their cushy office jobs or being forced to switch to working in the call centre (hellhole nightmare conditions for people who are used to the government office white collar working good life).

And the call centre in Perth is notorious as the worst in the country. It would be interesting for a FoI request to uncover statistics like staff turnover – nearly everyone in the Perth call centre has been there less than 5 years. As soon as female staff get pregnant there is serious pressure applied to get them to go part-time, so that they get paid less by the time they go on maternity leave. Women are far less likely to be promoted to permanent senior positions, too, again probably to reduce the impact of maternity leave and retiring mothers.

Sorry Ren, I realise all this is beyond the technology focused scope of your site, it’s just frustrating that this kind of thing still goes on in 2013 and continues to be a dirty secret that news agencies refuse to touch.

Concur with the above comments.
My first experience with Centrelink recently trying to register a young ‘un for the childcare rebate was atrocious.
60-90 second load time between pages on an online form, which crashed every 2-3 pages reporting “proxy error”. Since the whole process was around 20 pages in length, this was indeed a time consuming process.
