

Huawei engineers mistakenly deleted user data, leaving 800,000 Guangxi Mobile phones unable to make calls

Posted on 9/12/2017 2:23:35 PM

A picture seen from the QQ group

On September 8 (this Friday), many China Mobile users in Nanning, Guangxi suddenly found that they could not make or receive calls or use mobile data.
China Mobile Guangxi's official Weibo account posted an announcement: "To improve the quality of our communication services, our company has recently been carrying out network upgrades, which may leave some mobile phone users temporarily unable to use voice, SMS, and mobile Internet services. We are working to resolve this as quickly as possible."
Is this really the case?


However, as of this morning, many users still reported no signal on their phones. The company has not issued any announcement about recovery or the latest progress.
Yuntoutiao reported that at 5:00 a.m. that day, after the expansion and cutover of HSS09 (Huawei equipment) in Nanning, Guangxi, some subscriber numbers were found to be unable to originate calls or use data services. The preliminary judgment was that user data had been lost through human error during the cutover.
The accident affected some users of the local networks in Qinzhou, Beihai, Fangchenggang, Guilin, Wuzhou and Hezhou; a preliminary estimate put the number of affected users at 800,000.
According to Guangxi Mobile insiders, Huawei's on-site implementation staff mistakenly formatted one pair of DSU boards in each of the NNHSS09BE01/NNHSS09BE02 disaster-recovery pair (the HSS has 8 pairs of DSU boards in total). This deleted the user data stored in the HSS and left about 800,000 users in Qinzhou, Beihai, Fangchenggang, Guilin, Hezhou, and Wuzhou unable to use any 2G/3G/4G services. As many as 1 million users were reportedly without signal for almost 24 hours, and as of publication some user data was still abnormal.
Failure process:
5:00 Checks following Huawei's NNHSS09BE01/NNHSS09BE02 disk array expansion sub-project found that, because of a misoperation by the vendor's implementation staff during the work, one pair of DSU boards in each of NNHSS09BE01 and NNHSS09BE02 had been formatted (the HSS has 8 pairs of DSU boards in total). This deleted the user data stored in the HSS and left about 800,000 users in Qinzhou, Beihai, Fangchenggang, Guilin, Hezhou, and Wuzhou without any 2G/3G/4G services.
8:15 Authentication was switched off on all SGSN pools and MSC pools.
10:00 To get users re-registered on the network as quickly as possible, the periodic location update cycle was changed to 6 minutes, forcing each phone to register with the network within 6 minutes. 2G services therefore began recovering one after another from 10:00.
11:40 2G/3G services had basically returned to normal.
11:40 The real authentication data was obtained from the BOSS.
11:40-13:40 All authentication data and user data were restored in three batches.
By 13:30, all faults had been cleared.
The 10086 customer service hotline received a total of 20,727 complaints about this failure. It counts as a major failure at the group level, and it is hoped that Guangxi Mobile will publish an up-to-date progress announcement as soon as possible to keep users informed.
Employ people again...
I suspect many readers did not follow the fault timeline above, so here is a plain-language explanation from the official account "Fresh Jujube Classroom":
In the small hours of the night of September 7, the vendor's staff carried out an expansion cutover (that is, increasing the system's capacity; this is routine work, commonly known in the industry as an "operation"). During the cutover, an engineer accidentally formatted the storage and deleted the user data in the HSS equipment.
At 5:00 a.m., just before dawn, Guangxi Mobile staff noticed that something was wrong and realized the data had been deleted. Everyone present at that moment was presumably devastated.
If the user data is gone, then as far as the system is concerned the user does not exist. Naturally, such a user cannot make calls, which is why many users reported that "when you call, it says it's an empty number" (a number that does not exist).
Guangxi Mobile quickly did two things. The first was to create temporary user data for these 800,000 users (the equivalent of opening emergency accounts). Because the real authentication data cannot be made up, it filled in placeholder authentication data and then turned off the authentication function for the entire system.
What is authentication data? Put simply, there is a secret stored in your phone, and the same secret is stored in the operator's system. Once the operator has lost its copy, it can no longer tell whether you are a genuine user, so it simply turned authentication off for the time being. While authentication is off, a fake user can also get onto the network, make calls, and use data. The risk is considerable, but the operator could not afford to worry about that at this point: it could hardly leave real users unable to make calls, could it? If something goes wrong, the responsibility is even greater.
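The shared-secret idea described above can be sketched as a challenge-response check. This is only an illustration: real mobile networks use standardized algorithms for this, and HMAC-SHA256 is swapped in here as a stand-in. The function names `response_for` and `network_accepts` are hypothetical and do not come from any operator system.

```python
import hashlib
import hmac
import os
from typing import Optional


def response_for(key: bytes, challenge: bytes) -> bytes:
    """Answer a random challenge using the shared secret (HMAC as a stand-in)."""
    return hmac.new(key, challenge, hashlib.sha256).digest()


def network_accepts(stored_key: Optional[bytes], challenge: bytes,
                    response: bytes, auth_enabled: bool = True) -> bool:
    """Decide whether the network side lets this phone on."""
    if not auth_enabled:
        # Authentication switched off: nothing is checked, anyone gets in.
        return True
    if stored_key is None:
        # The network has lost its copy of the secret and cannot verify anyone.
        return False
    expected = response_for(stored_key, challenge)
    return hmac.compare_digest(expected, response)


sim_key = os.urandom(16)      # secret held by both the SIM and the network
challenge = os.urandom(16)    # random value sent by the network

genuine = response_for(sim_key, challenge)
forged = response_for(os.urandom(16), challenge)  # attacker guessing a key

print(network_accepts(sim_key, challenge, genuine))   # genuine phone, key intact
print(network_accepts(sim_key, challenge, forged))    # fake phone, key intact
print(network_accepts(None, challenge, genuine))      # key lost on network side
print(network_accepts(None, challenge, forged, auth_enabled=False))  # auth off
```

The last call shows the risk the explanation mentions: with authentication disabled, a forged response is accepted just like a genuine one.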
The second thing: those 800,000 users had become "lonely ghosts" (they did not exist in the system, so they could not attach to the network). Once the temporary accounts were in place, Guangxi Mobile therefore had to trigger a forced registration (the equivalent of the network shouting "report to me within 6 minutes"), and all the users' phones promptly registered with the network.
Why 6 minutes rather than 6 seconds? Because this is the periodic registration cycle: with a 6-minute cycle, every phone must contact the network once every 6 minutes. If it were 6 seconds, the phones of the province's 10 million users would each check in with their "father" every 6 seconds, and the father would be exhausted (the load would be too high and the system would crash).
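A rough back-of-the-envelope calculation makes the load difference concrete. It assumes, purely for illustration, that the 10 million phones spread their registrations evenly over the cycle; the function name is hypothetical.

```python
def registrations_per_second(subscribers: int, period_seconds: float) -> float:
    """Average registration rate if every phone registers once per period."""
    return subscribers / period_seconds


SUBSCRIBERS = 10_000_000  # the province-wide figure quoted in the text

six_minutes = registrations_per_second(SUBSCRIBERS, 6 * 60)
six_seconds = registrations_per_second(SUBSCRIBERS, 6)

print(round(six_minutes))  # 27778 registrations per second
print(round(six_seconds))  # 1666667 registrations per second
```

Under that assumption, a 6-second cycle means 60 times the signaling load of a 6-minute cycle.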
These two steps were meant to restore service temporarily. (The first principle of emergency fault handling: restore service first.)
In addition to these two things, Guangxi Mobile hurried to fetch the real user data from the BOSS.
Note that BOSS here does not mean "boss"; it stands for Business & Operation Support System. It is usually divided into four parts: billing and settlement system, sales and accounting system, customer service system and decision support system. Put simply, the operator's service halls are connected to the BOSS, and your number information, balance, and subscribed services are all stored there.
By 11:40, service had been temporarily restored and the real user data had also been obtained.
Guangxi Mobile then quickly wrote the real user data into the system. Once it was written, the data was considered recovered, and the authentication function was re-enabled. Everything was back to its original state...
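The whole sequence (data deleted, emergency accounts with authentication off, then real data restored from the BOSS) can be walked through with a toy model. This is a simplified sketch, not how an HSS is implemented: the `ToyHSS` class, the phone numbers, and the keys are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ToyHSS:
    # number -> authentication key; None stands for placeholder data
    subscribers: Dict[str, Optional[str]] = field(default_factory=dict)
    auth_enabled: bool = True

    def can_attach(self, number: str, key: str) -> bool:
        if number not in self.subscribers:
            return False  # unknown user: callers hear "empty number"
        if not self.auth_enabled:
            return True   # authentication off: the key is not checked
        return self.subscribers[number] == key


# Made-up master copy of the user data, standing in for the BOSS.
boss = {"13800000001": "key-a", "13800000002": "key-b"}
hss = ToyHSS(subscribers=dict(boss))

# 1. Misoperation: the user data is wiped.
hss.subscribers.clear()
after_wipe = hss.can_attach("13800000001", "key-a")

# 2. Emergency: temporary records with placeholder keys, authentication off.
for number in boss:
    hss.subscribers[number] = None
hss.auth_enabled = False
temp_real = hss.can_attach("13800000001", "key-a")
temp_fake = hss.can_attach("13800000001", "wrong-key")

# 3. Recovery: real data written back from the BOSS, authentication on again.
hss.subscribers.update(boss)
hss.auth_enabled = True
final_real = hss.can_attach("13800000001", "key-a")
final_fake = hss.can_attach("13800000001", "wrong-key")

print(after_wipe, temp_real, temp_fake, final_real, final_fake)
# False True True True False
```

The middle stage is where a user with the wrong key is still accepted, which is the temporary risk the operator took in order to restore service first.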





Posted on 9/12/2017 3:21:26 PM
Wow, I finally understood
Posted on 9/12/2017 4:58:50 PM
I don't understand it, but I think it's amazing