Aliyun's three-hour outage scare: how much do the many paralyzed apps stand to lose?

The outage paralyzed the apps and websites of many Internet companies in North China, and a wave of programmers and operations staff had to get out of bed to work.

More importantly, this is not the first time Aliyun has failed.

One netizen, Xia Licheng of Shanghai Blue League Network, joked that Aliyun goes down once a year and that this year's outage came particularly early.

After the outage, though, two questions remain: why do such failures keep happening, and how should customers be compensated?

Three hours of panic

Commenting on Aliyun's outage, Shen Jian, a senior architect at 58, said the incident lasted about three hours, followed by two hours of observation.

The most direct impact of the outage was that the websites and apps of companies that buy Aliyun services could not be used normally.

If "unusable" still sounds abstract, the affected companies offer a more concrete picture.

Confucius Old Book Network issued a statement on the 3rd saying that, because of a large-scale Aliyun failure, the site was temporarily unavailable. In other words, users could not buy anything on the site during the outage.

In another statement issued the same day, Hi score (a live football-match app) said the Aliyun outage caused some of its modules to lag, degrading the user experience.

By extension, the wider the area of an Aliyun failure, the more companies and users are affected.

About an hour after the outage began, Aliyun responded that some ECS servers in Availability Zone C of the North China 2 region had experienced IO hang, and that service was gradually recovering after an emergency investigation.

According to Aliyun's official website, as checked by China News Agency, Aliyun's service regions fall into three groups: Asia-Pacific, Europe and the Americas, and the Middle East and India. The Asia-Pacific group includes 13 sub-regions such as North China, East China, South China and Hong Kong.

A screenshot of Aliyun's official website

Availability Zone C of North China 2 is one part of the North China area.

To reduce network latency and speed up customer access, companies usually buy service in the region closest to their customers.

So after the outage, North China was thrown into chaos.

With more and more companies and applications moving their data to the cloud, even a brief server outage can cause a catastrophe.

Aliyun's repeated outages

Aliyun is the largest cloud service provider in China, and this is not the first time it has gone down.

In June 2018, Aliyun suffered large-scale access anomalies: products such as its photo services could not be used properly, and its official account could not be logged in to. Officially, the failure was attributed to an operations and maintenance error. Afterwards, Aliyun said it would revere every line of code and every commitment.

In October 2016, ECS servers in Availability Zone B of Aliyun's East China 1 region suffered an IO hang incident.

Further back, in September 2015, an upgrade to an Aliyun Yundun security product triggered a bug that caused some normal files on users' ECS instances to be mistakenly quarantined. The cause was a faulty line of code written by a programmer. That same year, Aliyun launched its Hundred Times Compensation Plan.

In addition, according to media tallies, Aliyun also experienced failures of varying severity in 2012, 2013 and 2014.

According to market research firm IDC, Aliyun ranks first in China with a 43% share, equal to the combined share of the second through ninth places. Those are Tencent Cloud, China Telecom, AWS, Jinshan Cloud, Ucloud, Microsoft, Baidu Cloud and Huawei Cloud.

With a share that large, every Aliyun outage has a major impact on customers.

In contrast to the harm its outages do to customers, Aliyun has ridden its large Chinese market to become a global leader in cloud services.

According to Alibaba's January 30 earnings report, Aliyun's revenue reached 21.36 billion yuan, roughly a 20-fold increase in four years, making it the largest cloud service company in Asia. The figure a year earlier was 11.17 billion yuan.

How to compensate for the downtime?

After the outage, Aliyun said it would handle compensation as soon as possible in accordance with its SLA.

SLA stands for Service Level Agreement. According to Aliyun's official website, for a single ECS instance, if service availability falls below 99.95%, users can receive compensation of 10%, 25% or 100% of the monthly service fee.
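To make the 99.95% threshold concrete, here is a minimal Python sketch of how an outage's length translates into monthly availability and a compensation tier. Only the 99.95% threshold and the 10%, 25% and 100% rates come from the text above. The 30-day month, the tier boundaries at 99% and 95%, and all function names are assumptions made for illustration and are not taken from Aliyun's SLA.

```python
# Assumption: a 30-day billing month (43,200 minutes).
MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60

# (lower bound of availability in percent, fraction of monthly fee refunded).
# Only the 99.95% threshold and the 10% / 25% / 100% rates appear in the article;
# the 99.0 and 95.0 boundaries are hypothetical placeholders.
COMPENSATION_TIERS = [
    (99.95, 0.00),
    (99.0, 0.10),
    (95.0, 0.25),
    (0.0, 1.00),
]


def availability_percent(downtime_minutes, period_minutes=MINUTES_PER_30_DAY_MONTH):
    """Availability over the period, in percent, given total downtime."""
    return 100.0 * (1 - downtime_minutes / period_minutes)


def allowed_downtime_minutes(target_percent, period_minutes=MINUTES_PER_30_DAY_MONTH):
    """Downtime budget, in minutes, that still meets the availability target."""
    return period_minutes * (1 - target_percent / 100.0)


def compensation_rate(availability):
    """Fraction of the monthly fee refunded under the assumed tiers."""
    for lower_bound, rate in COMPENSATION_TIERS:
        if availability >= lower_bound:
            return rate
    return 1.00  # availability below every bound (e.g. negative input)


if __name__ == "__main__":
    outage = 3 * 60  # the roughly three-hour outage, in minutes
    avail = availability_percent(outage)
    print(f"availability: {avail:.2f}%")
    print(f"budget at 99.95%: {allowed_downtime_minutes(99.95):.1f} min")
    print(f"assumed refund rate: {compensation_rate(avail):.0%}")
```

Under these assumptions, a three-hour outage in a 30-day month works out to about 99.58% availability, far beyond the roughly 21.6 minutes of downtime that a 99.95% target allows. Which refund tier that triggers depends on the real tier boundaries, which the article does not give.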

A screenshot of Aliyun's official website

In addition, Huawei Cloud and Tencent Cloud have similar compensation standards.

An engineer at a cloud computing company told China News Agency's Guoshi Through Train that compensation for cloud service failures is basically paid in service time. For example, Aliyun has previously compensated customers with 100 times the outage duration.

A screenshot of Aliyun's official website

But sometimes there is a huge gap between this compensation and a company's losses. For example, if Taobao or Jingdong could not be logged in to for five minutes, how much would that cost?

In response to the outage, some netizens proposed that, beyond compensation in service time and vouchers, Aliyun should also pay for overtime, since many operations engineers and programmers had to climb out of bed to work.

What companies care about most is how to avoid failures.

Some analysts note that although cloud service providers promise 99.99% security and reliability, any customer can end up in the unlucky 0.01%. There are therefore usually two ways to guard against failure: one is to back up data and update the backups regularly; the other is not to put all the eggs in one basket, that is, to use more than one cloud service provider.

But this will undoubtedly increase costs for companies. How to make cloud service providers more reliable remains a problem to be solved.

Source: Guoshi Through Train. Responsible editor: Wang Fengzhi
