These authors contributed equally to this work.

Identifying changes in the properties of acoustical sources from a small number of measured samples has been a challenge for decades. Typical problems include increasing sound power from a vibrating source, decreasing transmission loss of a structure, and decreasing insertion loss of vibration mounts. Limited access to structural and acoustical data from complex acoustical systems makes it difficult to extract complete information about the system, and in practice only a small amount of test data is often available for detecting changes. Although sample expansion via interpolation can be implemented based on a priori knowledge of the system, the size of the expanded sample set also affects identification performance. In this paper, a generative adversarial network (GAN) is employed to expand acoustic fault vibration signals, and an Acoustic Fault Generative Adversarial Network (AFGAN) model is proposed. Moreover, a size-controlled AFGAN is designed, comprising two sub-models: the generator generates expanded samples and also determines the optimal sample size based on the information entropy equivalence principle, while the discriminator outputs the probabilities that the input samples belong to the real samples and provides the generator with information to guide sample size selection. Experiments on real measured data verify the effectiveness of the method.

Monitoring the condition of ship structures has been a challenge for decades [

However, as shown by previous studies, practical acoustic fault identification in ships is considered a small-sample recognition problem because of the difficulty of obtaining representative fault samples, the high cost of experimentation, and so on [

The generative adversarial network (GAN) provides a method to directly generate the acoustic signal under a fault condition. In 2014, Goodfellow et al. proposed the GAN and successfully applied it to the field of computer vision by generating a large number of highly realistic images [

The GAN approach has been applied to image recognition [

The rest of this paper is organized as follows. In

The basic concept of the GAN is to learn the underlying distribution of real samples from training data through adversarial learning and then generate a large number of new training samples. It comprises two components: the generator and the discriminator. The generator captures the underlying distribution of the real samples and generates expanded samples (i.e., fake samples). The discriminator is a binary classifier that predicts whether an input is a real sample or a fake sample. The GAN is trained through an adversarial process: the generator produces fake samples that resemble real samples in order to mislead the discriminator, while the discriminator tries to identify the fake samples as reliably as possible.
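As an illustrative sketch (not the authors' implementation), the two adversarial objectives can be written as losses over the discriminator's probability outputs; the function names here are hypothetical:

```python
import numpy as np

def discriminator_loss(d_real, d_fake):
    # The discriminator maximizes log D(x) + log(1 - D(G(z))):
    # it should score real samples near 1 and fake samples near 0.
    return -(np.log(d_real) + np.log(1.0 - d_fake)).mean()

def generator_loss(d_fake):
    # Non-saturating generator loss: the generator maximizes
    # log D(G(z)) so that the discriminator misjudges fake samples.
    return -np.log(d_fake).mean()
```

Training alternates between minimizing `discriminator_loss` with respect to the discriminator's parameters and `generator_loss` with respect to the generator's parameters.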

The performance of the GAN depends heavily on the network structure of the generator and discriminator. Early GANs adopted fully connected layers and maxout output layers [

The AFGAN is modelled by employing the DCGAN structure. Unlike a conventional GAN which generates two-dimensional images, the AFGAN generates one-dimensional signals. A schematic diagram of the AFGAN architecture is shown in

The generator performs signal expansion. Its inputs are the latent representation

An important feature of the AFGAN model is its end-to-end structure. The original signal does not need special pre-processing, such as feature extraction, and can be directly input into the AFGAN model to generate expanded samples.

Using a GAN to expand acoustic fault samples under small-sample conditions is useful for improving the fault recognition rate. However, a larger number of generated samples does not necessarily lead to better results. Too much expanded-sample information may overwhelm the real-sample information and cause the “information hedge” problem, which can degrade recognition performance. Therefore, the expanded sample size must be controlled.

The GAN model generates expanded samples by capturing the underlying distribution of the real samples. From the point of view of information theory, the information contained in the expanded samples should equal the information provided by the real samples. The “information hedge” problem during acoustic fault sample expansion arises when the expanded samples carry more or less information than the real samples, which distorts the expanded set. Therefore, to avoid the “information hedge”, both sample sets must contain the same amount of information, and the size of the expanded sample set can be optimized accordingly.
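A minimal sketch of this size-selection idea follows. The histogram-based entropy estimate and the incremental subset search are illustrative assumptions, not the authors' exact procedure:

```python
import numpy as np

def shannon_entropy(samples, bins=32):
    # Histogram-based estimate of the information entropy (in bits)
    # of a 1-D sample set; the binning scheme is an assumption.
    hist, _ = np.histogram(samples, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def matched_subset_size(real, expanded, bins=32):
    # Grow the expanded subset until its entropy first reaches that
    # of the real set (information entropy equivalence, sketched).
    target = shannon_entropy(real, bins)
    for n in range(1, len(expanded) + 1):
        if shannon_entropy(expanded[:n], bins) >= target:
            return n
    return len(expanded)
```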

Information entropy is employed to measure the amount of information contained in the fault samples. Assume the real sample set and the generated sample set are

For the real samples, the information entropy [

Thus, the amount of information of the real samples set

Equation (

For the expanded sample set, the samples are rearranged in descending order of

According to the principle of information entropy equivalence, we need to find an expansion subset with size

Equation (

The objective function of the traditional GAN is
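For reference, the traditional GAN objective takes the standard minimax form introduced by Goodfellow et al.:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]$$

Here $D(x)$ is the discriminator's probability that $x$ is real, and $G(z)$ is the generator's output for latent input $z$.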

If the size constraint is taken into consideration, the task of the discriminator remains the same, which is to calculate the probabilities that the generated samples belong to the real samples. In addition to generating samples, the generator also needs to control the sample size. The objective function can be expressed as

Measured data obtained from a 1:1 ship module model are used to validate the proposed method. The sampling frequency is 2048 Hz. The primary noise sources are a motor, a pump, and high-frequency exciters, representing three typical fault sources with main vibration frequencies of 90 Hz, 296 Hz, and 360 Hz, respectively. PCB 352CC accelerometers were placed on these three devices to collect the vibration signals, which were recorded by an 8-channel B&K 3560D PULSE system. Segments of 1024 time-domain sampling points are chosen as observation samples. The positive frequency bands of the signal power spectrum are taken as feature vectors of dimension 512. For each class, 900 mechanical noise samples were obtained, from which 800 are chosen as training samples and the remaining 100 as testing samples.
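The feature extraction described above can be sketched as follows; which spectral bins are kept (here, the 512 positive-frequency bins, dropping the DC component) is an assumption for illustration:

```python
import numpy as np

def power_spectrum_features(frame):
    # frame: 1024 time-domain samples at fs = 2048 Hz.
    # Returns a 512-dimensional feature vector from the positive
    # frequency bands of the power spectrum, as described in the text.
    assert frame.shape == (1024,)
    spectrum = np.fft.rfft(frame)        # 513 bins: DC .. Nyquist
    power = np.abs(spectrum) ** 2
    return power[1:513]                  # drop DC; keep 512 positive bins
```

With this setup the frequency resolution is 2048/1024 = 2 Hz, so a 90 Hz tone falls exactly on one bin.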

The size-controlled AFGAN model is designed based on the measured data set. The input to the generator is a 100-dimensional noise vector z drawn from a uniform distribution, and z is projected onto a four-dimensional convolutional representation that is processed by eight fractional-strided convolution layers. The convolution kernel size is 5 × 1. For each layer, the convolution kernel numbers × signal lengths are 2048 × 8, 1024 × 16, 512 × 32, 256 × 64, 128 × 128, 64 × 512, 32 × 1024, and 16 × 1024; finally, 1024 × 1-dimensional single-channel signals are output through these characterization transformations. The output layer of the generator uses the Tanh function, and the other layers use the ReLU activation function [

The discriminator is also a fully convolutional network, mirroring the generator. The input signal is a 1024 × 1-dimensional single-channel vector, processed by eight strided convolution layers. Compared with the previous layer, the channel number doubles and the signal length halves at each layer (except in the second layer, whose signal length remains the same as the first layer's). The final output is a scalar denoting the probability that the input sample belongs to the real training data. The Leaky ReLU activation function [
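The halving/doubling rule above can be sketched as a simple shape calculation; starting from a single input channel that doubles at every layer is an illustrative assumption, since the exact channel counts mirror the generator:

```python
def discriminator_shapes(length=1024, channels=1, layers=8):
    # Per the text: the channel count doubles and the signal length
    # halves at every strided convolution layer, except the second
    # layer, where the length stays the same as after the first layer.
    shapes = [(channels, length)]
    for i in range(1, layers + 1):
        channels *= 2
        if i != 2:
            length //= 2
        shapes.append((channels, length))
    return shapes
```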

To ensure training stability, the input signals are pre-processed to zero mean and normalized to [−1, 1]. All network weights are initialized from a zero-mean normal distribution with a standard deviation of 0.02. In the Leaky ReLU, the slope of the leak is set to 0.2 in all models. The hyperparameters are tuned using the Adam optimizer with a learning rate of 0.0002. In addition, 800 real samples are used to train the size-controlled AFGAN for sample expansion.

The generated samples and the real samples are further combined into a new training sample set and then applied to acoustic fault identification. A Multi-layer Perceptron (MLP) artificial neural network [

In

Furthermore, the relationships among classifier performance, the expanded sample arrangement, and the sample size are investigated. Three arrangement schemes for the expanded samples are evaluated: random, descending, and ascending order.

To test the generalization of the expanded samples generated by the size-controlled AFGAN, models are built on different machine learning algorithms, and their classification performance before and after expansion at the optimal sample size is compared. The algorithms used in this task include common machine learning methods such as the MLP neural network, the passive aggressive classifier (PAC) [

To account for randomness, 11 different seeds (0, 50, ..., 500) are set for the learning processes to obtain different models. In total, 46 models are developed for validating the generalization of the expanded data, since the data shuffling process does not influence Xgboost or the ridge classifier. To quantify the performance differences before and after expansion, the absolute accuracy increase (AAI) and relative error reduction (RER) are calculated. RER normalizes out the original performance of the model: when the original model already achieves high accuracy, the AAI is necessarily smaller than for a model with lower accuracy, so the RER is used to eliminate this influence. RER is calculated with
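Consistent with the description above, the two metrics can be sketched as follows; the exact form of RER (fraction of the original error eliminated) is a standard definition and should be treated as an assumption here:

```python
def absolute_accuracy_increase(acc_before, acc_after):
    # AAI: raw gain in accuracy after sample expansion.
    return acc_after - acc_before

def relative_error_reduction(acc_before, acc_after):
    # RER: fraction of the original error eliminated, which removes
    # the influence of the model's original accuracy level.
    return (acc_after - acc_before) / (1.0 - acc_before)
```

For the same AAI, a model that started with higher accuracy receives a larger RER, which is exactly the normalization the text motivates.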

To present the details of the models, the recognition accuracies of 11 GBDT models and MLP before and after sample expansion are given in

The models whose RER improved the most are the GBDT, Xgboost, and MLP classifiers, which are better at discovering nonlinear structure in the data than the other algorithms. Since the expanded samples were generated by two convolutional neural networks, they are likely to have a more nonlinear relationship with the original samples. It is therefore reasonable that models built on algorithms suited to nonlinear structure tend to show more significant improvement. From

In this paper, a GAN was applied to the expansion of acoustic fault samples under small-sample conditions. The original signal does not need special pre-processing, such as feature extraction, and can be directly input into the AFGAN model to generate expanded samples. The information entropy equivalence principle was then employed in the AFGAN model to control the sample size while producing high-quality expanded samples. The proposed sample expansion method is better aligned with the nature of the problem than traditional sample expansion algorithms. In the application of mechanical noise source recognition, the results showed that the samples obtained by the size-controlled AFGAN had the advantages of high quality and optimal size, and classifier performance was best at the optimal sample size. Generalization was demonstrated by applying the expanded data to other typical machine learning algorithms: forty-six models were trained, achieving an 11.6% average accuracy increase and a 33.0% relative error reduction.

Conceptualization, L.Z. and X.D.; methodology, L.Z., N.W. and X.D.; software, L.Z., N.W. and X.D.; validation, L.Z., N.W., X.D. and S.W.; formal analysis, L.Z., N.W. and X.D.; investigation, L.Z., N.W. and X.D.; resources, L.Z., N.W. and X.D.; data curation, L.Z. and N.W.; writing—original draft preparation, N.W.; writing—review and editing, X.D. and S.W.; visualization, L.Z., N.W. and X.D.; supervision, L.Z.; project administration, L.Z.; funding acquisition, L.Z.

This research was funded by the National Natural Science Foundation of China under Grant Nos. 51205404 and 51709216.


The authors declare no conflict of interest.

The following abbreviations are used in this manuscript:

The Acoustic Fault Generative Adversarial Network (AFGAN) architecture.

Frequency domain plots of the measured signals with main vibration frequencies (

Accuracy versus expanded sample size for different arrangement schemes.

The accuracies (

The performance increase after sample expansion of 11 (

Accuracies of different expanded sample sizes.

| Experiment | Sample Size (90 Hz) | Sample Size (296 Hz) | Sample Size (360 Hz) | Accuracy |
|---|---|---|---|---|
| E | 831 | 538 | 282 | 83.00% |
| E | 416 | 269 | 141 | 82.67% |
| E | 1000 | 1000 | 564 | 81.67% |
| E | 1000 | 1000 | 1000 | 83.00% |
| E | 0 | 0 | 0 | 61.76% |

Average performances of the models from each algorithm.

| Algorithm | Mean Absolute Accuracy Increase (Relative Error Reduction) | Improved Models |
|---|---|---|
| MLP | 19.4% (43.8%) | 100% |
| Passive Aggressive Classifier | 12.2% (26.6%) | 100% |
| Ridge Classifier | 14.3% (31.2%) | 100% |
| Extreme Gradient Boosting Classifier | 17.3% (74.3%) | 100% |
| Random Forest | 7.1% (15.2%) | 72.7% |
| Gradient Boosting Decision Tree | 6.8% (42.8%) | 100% |