CASE STUDY OF THE MASSIVE POWER OUTAGE IN MUMBAI
Analyzing an outage can provide valuable insights into vulnerabilities, root causes, and areas for improvement, helping to prevent future incidents. This knowledge also helps C-level executives make informed decisions about infrastructure, risk management, and disaster recovery. Such an analysis can also highlight proactive measures, such as redundancy planning and cybersecurity protocols, so that organizations are better prepared. The massive power outage in Mumbai on October 12, 2020, was an unprecedented failure. It occurred around 10:00 local time due to issues with the incoming supply to the main grid. While power cuts are common in many parts of India, they are rare in major cities, and especially in Mumbai. Restoration took time because of the complexity of diagnosing and resolving the underlying technical issues, although officials worked diligently to restore power to the affected areas.

BACKGROUND
ShadowPad and Red Echo are significant cyber threats targeting critical infrastructure, particularly in India. ShadowPad is a sophisticated backdoor Trojan that establishes a covert channel to its command-and-control servers, enabling attackers to extract information or deploy malicious code. Red Echo, a cyber threat group linked to China, has been actively involved in cyber campaigns against India’s power sector, especially during periods of heightened border tension between India and China. The U.S. security firm Recorded Future has warned that Red Echo is targeting Indian critical infrastructure entities, including many organizations in the power and maritime sectors. Indian OT security experts had repeatedly warned, with grave concern, that the implementation agencies of R-APDRP SCADA/DMS were inadvertently introducing vulnerabilities, particularly through flawed remote terminal units (RTUs) and Chinese communication devices. The automation architecture was inherently vulnerable because it lacked a reliable firewall ruleset and because Network Intrusion Detection Systems (NIDS) and Host Intrusion Prevention Systems (HIPS) were deployed inadequately and improperly.
Law enforcement has registered an FIR regarding a breach involving attack vectors targeting Load Dispatch Centres and vulnerability assessment and penetration testing (VAPT) details related to the R-APDRP SCADA/DMS implementation in India by various foreign implementation agencies. The FIR was registered under Section 66F of the IT Act, which pertains to cyber terrorism, based on findings from an industrial cybersecurity researcher. Because this is a scheduled offense, it is mandatory to inform CERT-In (under the six-hour rule) and to seek permission from the Union Home Secretary for further action. Despite the severity of the offense and its non-bailable nature, the accused was not arrested and was permitted to leave the country. A delayed and concocted report was presented to the court to refer the case. Subsequently, the investigating officer was dismissed from the Kerala Police for colluding with criminals in other crimes. Instead of addressing the gravity of the situation, law enforcement attempted to whitewash the crime.
INCIDENT DETAILS (SOE)
1. TATA’s Incoming Supply Failure:
   o The power outage was initially attributed to a failure of TATA’s incoming supply. TATA Power is a significant player in power generation and distribution in Mumbai.
2. MSETCL’s 400 kV Transmission Line Tripping:
   o Around 10:00 a.m., the Maharashtra State Electricity Transmission Company Limited (MSETCL) experienced a fault in its 400 kV transmission line.
   o This line supplied power to Mumbai and its surrounding areas, including Mumbai Central, Thane, Jogeshwari, Wadala, Chembur, Dadar, and Kandivali.
3. Pune-Kalwa Line Forced Shutdown:
   o Simultaneously, the Pune-Kalwa line was forced to shut down by a line-to-line fault (specifically, an R-Y phase fault).
   o This fault affected the stability of the system.
4. Emergency Shutdown of Kalwa-Padghe Line:
   o MSETCL initiated an emergency shutdown of the second power line (400 kV Kalwa-Padghe).
   o Maintenance work began on this line with the intention of resolving the fault and restoring the line to normal service.
5. Tripping in Kalwa-Padghe Line:
   o Despite these efforts, the Kalwa-Padghe line tripped, leading to further complications.
   o This line carried a substantial load of 634 MW.
6. Impact on Kalwa and Kharghar Power Plants:
   o The tripping of the Kalwa-Padghe line had a cascading effect.
   o Both the Kalwa and Kharghar power plants also tripped.
   o Intense sparking occurred, resulting in a complete trip of the Pune-Kharghar line.
   o The sudden load drop in the Mumbai power system network caused an estimated total load loss of 2,600 MW.

TIMING: The cyber-attack on Mumbai’s power grid in October 2020, which coincided with heightened tensions between India and China at the Galwan Valley, is often cited as an example of hybrid warfare. Reports from cybersecurity firms and media sources suggest that the malware attack, which caused a significant power outage in Mumbai, was linked to Chinese state-sponsored groups. The timing of the incident highlights the evolving nature of modern conflicts, where cyber and physical domains are increasingly intertwined.

IMPACT ASSESSMENT
The Mumbai power outage caused significant disruptions, affecting daily life and critical services. The outage lasted over eight hours, plunging India’s financial capital into darkness and disrupting normal activities. Local train services were severely impacted, which in turn disrupted road traffic. The Bhandup-based water purification plant experienced a temporary power supply interruption, affecting water pressure in several areas. Banks, trains, flights, and markets were also affected. The outage highlighted vulnerabilities in the city’s infrastructure and underscored the need for robust disaster preparedness, reliable energy systems, sound grid management, and timely fault resolution to prevent widespread blackouts.
TECHNICAL ANALYSIS
A domain expert-led team completed the technical analysis.

The Mumbai Power Islanding System
The Mumbai power islanding system, developed in 1981 by Tata Power, is a distinctive and robust safeguard against power outages in Mumbai. It is designed to ensure uninterrupted power supply to the entire city and has saved the city from nearly 27 blackouts and related events. The main objectives of the system are load-generation balance, avoiding generator tripping, and quick restoration of the system with instantaneous isolation of all identified tie points. Frequency is a critical parameter in maintaining a reliable islanding system. If the system frequency drops, the system recovers and sustains itself through a combination of under-frequency control and power flow into the grid. If the system frequency increases, the auto-restoration scheme stabilizes the frequency and maintains normal conditions. Backup islanding schemes and redundancy are provided to keep power flowing smoothly if the islanding scheme fails to restore or isolate the system. In case of grid failure, the Mumbai islanding system is capable of ensuring continuity of power supply to the city: the Tata Power supply system is automatically isolated from the rest of the grid, protecting the power system network and limiting the spread of faults.

The Islanding Mystery: Unraveling the Causes of Failure
Any of the following factors could have contributed to the failure of the islanding scheme, which allowed the disturbance to propagate.
1. Latency and relay time settings: Latency and relay time settings significantly affect the performance of protective relays and fault clearing. Longer latency might have delayed fault detection and tripping, affecting system stability. Trip circuit latency includes relay decision time, physical output contact assertion time, and additional time added by auxiliary trip circuits and digital communication paths.
2. Communication Speed: Propagation times might have affected relay operation; faster communication reduces fault-clearing time.
NB: Measuring trip circuit latency is essential, especially for nondeterministic technologies such as Ethernet-based devices. Ultra-High-Speed (UHS) line protective relays depend heavily on relay processing and communication times. System conditions and fault types also influence tripping. Faster relay responses enhance system resilience. Minimizing latency, optimizing relay settings, and ensuring efficient communication all contribute to reliable fault detection and faster fault clearance.
3. Cyber Security: An islanding module might have failed to function properly during the outage for several cybersecurity-related reasons:
   o Incorrect or compromised SCADA data can affect the module’s decision-making.
   o Faulty communication between components, such as relays, switches, and controllers, can disrupt the islanding process.
   o Fault detection relays, which play a critical role in identifying grid disturbances, can malfunction or fail to detect faults accurately, so that the module does not activate when needed.
   o Malicious actors could launch denial-of-service (DoS) attacks against the islanding module, delaying its response or preventing proper isolation.
Robust islanding detection methods are essential to prevent safety hazards, equipment damage, and grid instability. These factors are central to both cyber-terrorism risk and grid resilience. The following are potential attack strategies that cybercriminals may employ:
   a. DoS Attacks and Fault Detection Relays: A DoS attack aims to degrade or block the availability of targeted resources by overwhelming them with malicious traffic. Fault detection relays play a critical role in power systems by identifying faults (such as short circuits) and triggering protective actions. Remote access Trojans (RATs) typically target endpoints (computers, servers), so direct attacks on fault detection relays are less common. Communication between components (such as relays, switches, and controllers) is nevertheless crucial for effective islanding.
   b. RATs and Communication: RATs can indirectly affect communication channels by flooding network connections. However, using RATs to directly block communication between relays or with the islanding module is less common.
   c. Sensor Targeting: While RATs typically focus on software and network vulnerabilities, they could potentially target sensors (such as frequency sensors, current transformers (CTs), and potential transformers (PTs)). Disrupting sensor data could indirectly affect the islanding module’s functionality if it relies on accurate sensor readings.
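To illustrate the sensor-targeting concern in item c, the sketch below shows one way an islanding controller could cross-check redundant frequency readings before acting on them. It is a minimal Python sketch under stated assumptions, not the logic used in the Mumbai scheme: the function names, the return labels, the 0.05 Hz agreement tolerance, and the 48.0 Hz under-frequency threshold are all hypothetical values chosen for illustration.

```python
from statistics import median

# Hypothetical settings for illustration only; real schemes use utility-specific values.
MAX_DEVIATION_HZ = 0.05   # how far a sensor may sit from the median and still "agree"
UNDER_FREQ_HZ = 48.0      # below this voted frequency, the sketch requests islanding


def vote_frequency(readings_hz, max_deviation_hz=MAX_DEVIATION_HZ):
    """Median-vote redundant frequency readings.

    Returns (voted_hz, outliers, trusted). The vote is trusted only when a
    strict majority of sensors lie within max_deviation_hz of the median.
    """
    if len(readings_hz) < 3:
        raise ValueError("need at least three redundant readings to vote")
    voted_hz = median(readings_hz)
    outliers = [r for r in readings_hz if abs(r - voted_hz) > max_deviation_hz]
    agreeing = len(readings_hz) - len(outliers)
    trusted = agreeing > len(readings_hz) / 2
    return voted_hz, outliers, trusted


def islanding_decision(readings_hz, under_freq_hz=UNDER_FREQ_HZ):
    """Map voted sensor data to one of three hypothetical actions."""
    voted_hz, _outliers, trusted = vote_frequency(readings_hz)
    if not trusted:
        # Sensors disagree too much: raise an alarm instead of acting on bad data.
        return "ALARM_UNTRUSTED_DATA"
    if voted_hz < under_freq_hz:
        return "ISLAND"
    return "STAY_CONNECTED"


if __name__ == "__main__":
    print(islanding_decision([49.98, 50.01, 50.00]))  # healthy grid, sensors agree
    print(islanding_decision([47.50, 47.52, 50.00]))  # one sensor reports "normal" during a real dip
    print(islanding_decision([47.5, 49.0, 50.0]))     # no majority agreement
```

With three sensors, a single manipulated reading is outvoted by the other two, so a spoofed "normal" value cannot on its own suppress islanding. If no majority agrees, the sketch raises an alarm rather than guessing. An attacker who controls a majority of the sensors, or the communication path itself, would still defeat this check.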