In a groundbreaking fusion of acoustics and artificial intelligence, engineers at State Grid Anhui Electric Power have pioneered a novel, non-invasive method for monitoring the health of critical power grid infrastructure. By treating the hum and vibration of a massive 220 kV power transformer not as mere background noise, but as a rich, data-laden “voiceprint,” the team has unlocked a new frontier in predictive maintenance. This approach, detailed in a recent study, leverages deep learning to transform ambient sound into a precise diagnostic tool, capable not only of detecting anomalies but also of accurately estimating the transformer’s real-time electrical load. The implications are profound: a future where the subtle acoustic signature of a transformer can preempt catastrophic failures, optimize grid performance, and usher in a new era of truly intelligent, self-aware power systems.
The core insight driving this innovation is elegantly simple yet profoundly powerful: sound and vibration are direct, unfiltered carriers of mechanical information. Every clank, every hum, every subtle shift in frequency emanating from a transformer is a physical manifestation of its internal state. For decades, utilities have relied on scheduled outages and intrusive electrical tests to assess transformer health, a process that is costly, disruptive, and often reactive rather than proactive. The advent of vibro-acoustic detection technology promised a solution—continuous, external monitoring without any electrical connection to the high-voltage equipment. However, the raw data from these acoustic sensors was a cacophony of noise, polluted by environmental sounds like bird calls, wind, human speech, and the clatter of nearby machinery. This is where artificial intelligence, specifically deep learning, steps in as the indispensable translator, turning this chaotic symphony into a clear, actionable diagnostic report.
The research team, led by senior engineers Zhang Chenchen, Ding Guocheng, and Li Jianlin, focused their efforts on a specific 220 kV main transformer, model OSSZ11-180000/220, which had been in service since June 2016. Their first challenge was data acquisition. They designed a robust, weatherproof acoustic sensor array using high-precision electret microphones, calibrated to capture the critical frequency range of 20 Hz to 20 kHz. This ensured fidelity in recording the low-frequency rumbles that are most indicative of core and winding conditions. Deployed in the harsh, uncontrolled environment of a live substation, these sensors began collecting a continuous stream of audio data, laying the foundation for what would become a comprehensive “voiceprint” database.
Building this database was not a simple task of passive recording. It required a sophisticated, multi-stage data curation process. Initial recordings were a messy blend of the transformer’s true operational sound and a barrage of external acoustic events. To create a usable dataset for machine learning, the team employed a powerful suite of data preprocessing and cleaning techniques. The first line of defense was wavelet transform-based denoising. Wavelet analysis, with its unique ability to provide high time resolution for high-frequency components and high frequency resolution for low-frequency components, proved ideal for isolating the transient, anomalous sounds buried within the steady hum of the transformer. This step significantly suppressed background noise and enhanced the core acoustic features of the transformer itself.
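The study does not specify which wavelet family or thresholding rule the team used, but the general recipe — decompose, soft-threshold the detail coefficients, reconstruct — can be sketched in plain NumPy with a Haar wavelet and the classic universal threshold (both are illustrative assumptions, not the authors' exact choices):

```python
import numpy as np

def haar_dwt(x):
    # One level of the Haar wavelet transform: approximation + detail
    approx = (x[0::2] + x[1::2]) / np.sqrt(2)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)
    return approx, detail

def haar_idwt(approx, detail):
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2)
    x[1::2] = (approx - detail) / np.sqrt(2)
    return x

def wavelet_denoise(signal, levels=3):
    # Decompose, soft-threshold the detail bands, reconstruct
    details, approx = [], signal
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        details.append(detail)
    sigma = np.median(np.abs(details[0])) / 0.6745       # noise level estimate
    thresh = sigma * np.sqrt(2 * np.log(len(signal)))    # universal threshold
    details = [np.sign(d) * np.maximum(np.abs(d) - thresh, 0) for d in details]
    for detail in reversed(details):
        approx = haar_idwt(approx, detail)
    return approx

# Illustrative signal: a steady 100 Hz tone (twice the 50 Hz mains frequency,
# the dominant component of transformer hum) buried in broadband noise
fs = 4096
t = np.arange(fs) / fs
rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 100 * t)
noisy = clean + 0.5 * rng.standard_normal(fs)
denoised = wavelet_denoise(noisy)
```

Because the low-frequency hum concentrates in the coarse approximation band while broadband noise spreads across the detail bands, thresholding the details suppresses the noise with little damage to the tone.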
The next phase involved untangling the remaining signal. Even after denoising, the audio contained both steady-state sounds (like the constant drone of cooling fans) and non-steady-state events (like a sudden tap or a burst of conversation). The researchers applied a cosine similarity matrix median replacement method to effectively decompose these overlapping signals. This mathematical technique allowed them to cleanly separate the persistent, rhythmic sounds of the transformer’s normal operation from the sporadic, unpredictable noises of its environment. The result was a purified dataset where the transformer’s true “voice” could be clearly heard and analyzed.
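The paper names the technique but does not spell out the algorithm; one plausible reading — score each frame by its median cosine similarity to every other frame's spectrum, then replace atypical (transient) frames with the median steady-state frame — can be sketched as follows, with the similarity threshold and frame length chosen purely for illustration:

```python
import numpy as np

def separate_steady_frames(frames, sim_threshold=0.9):
    # Unit-normalize each frame's magnitude spectrum
    mags = np.abs(np.fft.rfft(frames, axis=1))
    norms = mags / np.linalg.norm(mags, axis=1, keepdims=True)
    # Pairwise cosine similarity between all frames
    sim = norms @ norms.T
    # A frame's median similarity to the other frames measures how "typical"
    # it is: the steady transformer hum scores high, sporadic noises score low
    typicality = np.median(sim, axis=1)
    steady = typicality >= sim_threshold
    # Replace transient frames with the median steady-state frame
    median_frame = np.median(frames[steady], axis=0)
    cleaned = np.where(steady[:, None], frames, median_frame)
    return cleaned, steady

# 20 frames of a steady tone, with two frames corrupted by transient bursts
rng = np.random.default_rng(0)
frames = np.tile(np.sin(2 * np.pi * 5 * np.arange(256) / 256), (20, 1))
frames[[7, 13]] += 3.0 * rng.standard_normal((2, 256))
cleaned, steady = separate_steady_frames(frames)
```

The steady frames pass through untouched, while the two burst frames are overwritten by the median of the surviving steady frames.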
The final and most critical step in data preparation was multi-event detection. The team took approximately 1.7 hours of pre-processed audio and segmented it into 5,958 one-second clips. These clips were meticulously labeled into six distinct categories: bird calls, fan noise, human speech, wind, tapping sounds, and, crucially, the “normal operating sound” of the transformer. This labeled dataset was then split into training, validation, and test sets for a deep neural network. Using 80-dimensional Filter-bank features derived from short-time Fourier transforms, the model was trained to recognize and classify these six sound events. The results were impressive: the model achieved a 98% accuracy rate in identifying the transformer’s normal operating sound, a 97% accuracy for fan noise, and a 91% accuracy for wind. While accuracy for more transient events like bird calls (76%) and tapping (74%) was lower, the system’s ability to reliably isolate the transformer’s core acoustic signature was more than sufficient for its primary diagnostic purpose. This rigorous cleaning process transformed raw, unusable audio into a high-fidelity dataset, setting the stage for advanced predictive modeling.
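The 80-dimensional Filter-bank front end described above can be reproduced in plain NumPy: frame a one-second clip, take short-time Fourier power spectra, and project them onto 80 triangular mel-spaced filters. The FFT size and hop length below are assumptions, as the paper does not state them:

```python
import numpy as np

def mel_filterbank(n_mels, n_fft, fs):
    # Triangular filters evenly spaced on the mel scale
    mel = lambda f: 2595 * np.log10(1 + f / 700)
    inv = lambda m: 700 * (10 ** (m / 2595) - 1)
    pts = inv(np.linspace(mel(0), mel(fs / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * pts / fs).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fb[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)   # rising slope
        fb[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)   # falling slope
    return fb

def fbank_features(clip, fs=48000, n_fft=1024, hop=512, n_mels=80):
    # Frame the clip, window it, take the STFT power spectrum
    n_frames = 1 + (len(clip) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = clip[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Project onto 80 mel bands and take the log
    return np.log(power @ mel_filterbank(n_mels, n_fft, fs).T + 1e-10)

# One second of 48 kHz audio -> (frames, 80) feature matrix for the classifier
feats = fbank_features(np.random.default_rng(1).standard_normal(48000))
```

Each one-second clip thus becomes a time-by-80 matrix of log filter-bank energies, the input the six-way sound-event classifier would train on.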
With a pristine dataset in hand, the team turned to the most ambitious part of their project: using the transformer’s voiceprint to predict its operational state, specifically its electrical load. In a normal, healthy transformer, the primary factor causing changes in its acoustic signature is its load—the amount of electrical power it is currently handling. As the load increases, the magnetic forces within the core intensify, leading to greater magnetostriction and, consequently, louder and more complex vibrations. The team hypothesized that if they could train an AI model to recognize these subtle acoustic patterns, they could use sound alone to estimate the transformer’s real-time load, a parameter traditionally measured by electrical sensors.
To test this, they collected a month’s worth of continuous audio data from the transformer. They used a single channel from one of their four microphones, sampled at 48 kHz, to build their predictive model. Recognizing that electrical load changes relatively slowly, they treated the load as constant over 10-second intervals. The audio, captured in 15-minute segments, was processed using the same wavelet and similarity-matrix techniques to isolate the “normal operating sound.” The resulting clean 10-second clips were then fed into a neural network, with their corresponding, time-synchronized active power values (the load) serving as the training labels. The model used 256-dimensional Mel-log energy features, creating a rich, multi-dimensional representation of the sound for the AI to learn from.
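The paper's regressor is a neural network; as a minimal stand-in that still illustrates the supervised setup — per-clip 256-dimensional features mapped to a synchronized power label — a ridge regression on synthetic data suffices. Everything below is fabricated for illustration, including the assumed coupling in which higher load raises overall acoustic energy:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in: 500 clips x 256 mel-log-energy features, where the
# true load shifts every feature slightly (louder hum at higher load)
load = rng.uniform(60, 180, size=500)                 # active power in MW
feats = rng.standard_normal((500, 256)) + 0.05 * load[:, None]

train, test = slice(0, 400), slice(400, 500)
X, y = feats[train] - feats[train].mean(0), load[train] - load[train].mean()
# Ridge regression: w = (X^T X + lam*I)^-1 X^T y, bias handled by centering
w = np.linalg.solve(X.T @ X + 1.0 * np.eye(256), X.T @ y)
pred = (feats[test] - feats[train].mean(0)) @ w + load[train].mean()
mae = np.abs(pred - load[test]).mean()                # mean absolute error, MW
```

Swapping the ridge model for a deep network, and the synthetic features for real clip features with SCADA-synchronized power labels, recovers the paper's actual pipeline.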
The validation results were striking. When tested against a 62-hour continuous audio sample from March 2019, the AI model’s acoustic-based load estimates tracked almost perfectly with the transformer’s actual, electrically measured load. The two curves—actual load and acoustically predicted load—moved in near-perfect unison, demonstrating that the transformer’s sound is not just a symptom of its operation, but a precise, quantifiable indicator of its workload. This finding is revolutionary. It means that by simply “listening” to a transformer, grid operators can obtain a real-time, non-invasive measurement of its most critical operational parameter. This capability opens the door to a host of applications, from dynamic load balancing to early detection of abnormal operating conditions that might precede a failure.
The success of this project represents a significant leap forward in the field of power equipment diagnostics. It moves the industry away from reactive, schedule-based maintenance towards a proactive, condition-based paradigm. Instead of waiting for a failure or a scheduled outage, engineers can now continuously monitor a transformer’s health through its sound, identifying subtle changes that signal emerging problems long before they become critical. For instance, an unusual harmonic introduced into the acoustic signature could indicate a developing issue with the core laminations. A change in the pattern of cooling fan noise, when correlated with temperature data, could signal a failing bearing. The AI model, trained on a growing library of “normal” and “abnormal” voiceprints, becomes an ever-more sophisticated diagnostician.
Moreover, the implications extend beyond simple fault detection. The ability to accurately estimate load from sound has profound implications for grid management. In a modern, dynamic grid with fluctuating renewable energy inputs, real-time load data is crucial for stability. Acoustic monitoring provides a redundant, independent source of this data, enhancing the grid’s resilience. It can also be used for energy efficiency audits, identifying transformers that are operating under excessive load or are otherwise inefficient, allowing utilities to optimize their asset utilization.
The research team is quick to acknowledge that this is just the beginning. As noted in their conclusion, the application of voiceprint and AI-driven data analysis in transformer state assessment is still in its infancy. Future work will focus on expanding the database to include a wider variety of fault conditions, from incipient partial discharges to the distinctive hum caused by DC bias. They also plan to refine their data cleaning algorithms to handle even more complex acoustic environments, making the technology robust enough for deployment across diverse substations worldwide. The ultimate goal is to create a comprehensive, intelligent sensing system that can not only diagnose problems but also predict them, transforming the power grid from a collection of dumb, reactive components into a self-monitoring, self-optimizing organism.
This work stands as a testament to the transformative power of interdisciplinary research. By bridging the worlds of high-voltage engineering, acoustics, and artificial intelligence, Zhang Chenchen, Ding Guocheng, Li Jianlin, and their colleagues have not just solved a technical problem; they have redefined the very way we interact with and understand our critical infrastructure. Their approach is a perfect embodiment of the data-driven future, where every sound, every vibration, is a potential source of insight, waiting to be unlocked by the power of machine learning. It’s a future where the grid doesn’t just deliver power—it speaks to us, telling us exactly what it needs to keep running smoothly, safely, and efficiently.
Zhang Chenchen, Ding Guocheng, Li Jianlin, Zhen Chao, Zhao Haoran, Huang Wenli. Exploration and Application of Voiceprint Recognition of Power Transformer Based on Artificial Intelligence and Data Driving. Electrical & Energy Efficiency Management Technology, 2021, No.11. DOI: 10.16628/j.cnki.2095-8188.2021.11.013