Broadband sound absorption with subwavelength bubble metascreens: Realization of an anechoic water tank
Underwater acoustic experiments, particularly those investigating marine animal behavior or bioacoustic signals, are frequently conducted in laboratory water tanks to overcome logistical challenges of field measurements. However, these controlled environments introduce significant acoustic distortions due to reflections from tank walls, leading to modified frequency content and resonance effects. While solutions exist for ultrasonic acoustics, reducing these effects at audible frequencies remains difficult. This paper introduces a bubble-based metasurface, a thin flexible coating consisting of air cavities embedded in a rubber elastomer, that achieves acoustic absorption down to the kilohertz range. We demonstrate this by characterizing a standard commercial aquarium, identifying its resonances, and designing an optimized bubble screen configuration that effectively reduces reflections in the 3-6 kHz range, approaching the acoustic behavior of an infinite open-water environment. Time-domain analysis confirms that applying this coating significantly reduces tank reflections, offering a practical method for improving the accuracy of underwater acoustic research.
Research on active magnetic field compensation for high-efficiency Terfenol-D magnetic circuit in magnetostrictive transducers
A high-efficiency Terfenol-D magnetic circuit with large-section permanent magnets is proposed to improve the electroacoustic efficiency of magnetostrictive transducers. This design adopts long Terfenol-D rods with an integral drive instead of the conventional multiple short Terfenol-D rods with segmented drive, eliminating the influence of the high magnetic resistance of permanent magnets in the magnetic circuit. The static magnetic field distribution of the long Terfenol-D rod is reconstructed through an increase in the cross-sectional area of the permanent magnet using active magnetic field compensation technology, effectively compensating for the weak magnetic field in the middle of the long Terfenol-D rod. The electroacoustic efficiency of the transducer improves by increasing the output sound power without reducing the input electrical power. Then, a high-efficiency Terfenol-D magnetic circuit with large-section permanent magnets and a transducer driven by it are designed and fabricated and compared to a segmented driving magnetic circuit. The test results show that the response and electroacoustic efficiency of the transducer driven by the high-efficiency magnetic circuit at 950 Hz are 187.6 dB and 38.6%, respectively, which are approximately 2 dB and 4% higher than those of the conventional segmented magnetic circuit.
Optimizing the spatial extent of long-duration broadband noise signals using time reversal
The use of audible sound for acoustic excitation is commonly employed to assess and monitor structural health, as well as to replicate the acoustic environmental conditions that a structure might experience in use. Achieving the required amplitude and specified spectral shape is essential to meet industry standards. This study aims to implement a sound focusing method called time reversal (TR) to achieve higher amplitude levels compared to simply broadcasting noise. The paper seeks to understand the spatial dependence of focusing long-duration noise signals using TR to increase the spatial extent of the focus. Both one- and two-dimensional measurements are performed and analyzed using TR with noise, alongside traditional noise broadcasting without TR. The variables explored include the density of foci for a given length/area, the density of foci for varying length with a fixed number of foci, and the frequency content and bandwidth of the noise. A use case scenario is presented that utilizes a single-point focus with an upper frequency limit to maintain the desired spectral shape while achieving higher focusing amplitudes.
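The basic principle behind TR focusing can be sketched numerically. The example below is a minimal illustration, not the study's measurement procedure: it assumes a reciprocal, time-invariant channel, and the impulse response `h` is a hypothetical decaying-noise stand-in for a reverberant room. Broadcasting the time-reversed impulse response through the same channel yields the autocorrelation of `h`, which peaks at the focal time.

```python
import numpy as np

def time_reversal_focus(impulse_response):
    """Signal at the focus when the time-reversed impulse response is
    sent back through the same channel (i.e. the autocorrelation of h)."""
    h = np.asarray(impulse_response, dtype=float)
    reversed_h = h[::-1]               # time-reversed impulse response
    return np.convolve(reversed_h, h)  # propagate through the channel again

rng = np.random.default_rng(0)
n = 2048
# Hypothetical reverberant impulse response: exponentially decaying noise.
h = rng.standard_normal(n) * np.exp(-np.arange(n) / 400.0)

focus = time_reversal_focus(h)
peak_index = int(np.argmax(np.abs(focus)))  # focal time, expected at n - 1
peak_value = float(focus[peak_index])       # equals the energy of h
```

The peak amplitude equals the energy of the impulse response, which is why a longer usable reverberation tail tends to give a stronger focus than broadcasting the same signal without TR.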
Underwater acoustic Luneburg lens based on a square-lattice isotropic truss structure
The Luneburg lens plays a vital role in areas such as antennas, radar, and imaging systems owing to its unique omnidirectionality. This focusing behavior is intrinsically related to the isotropy of the materials used to construct the lens. However, achieving the desired refractive index profile and isotropy simultaneously is challenging for underwater acoustic metamaterials, especially for solid-based materials due to their strong multiple scattering effects. Here, we design a two-dimensional (2D) compound lattice truss structure based on a square lattice and realize an acoustic Luneburg lens with excellent omnidirectional performance. The effective stiffness matrix is derived using the strain energy equivalence principle. The derivation indicates that when the slenderness ratios of all beams are identical, the proposed structure can achieve isotropy. The refractive index of the structure can be adjusted by changing the beam thickness. Based on this structure, an underwater acoustic Luneburg lens is designed and demonstrated by numerical simulations. The results indicate that the designed Luneburg lens exhibits excellent focusing performance and omnidirectionality. Furthermore, the designed 2D structure lends itself to a direct extension into a three-dimensional configuration without compromising its isotropic properties. This work paves the way for future potential applications in acoustic localization and imaging systems.
Probing cochlear compressive nonlinearity using signal-to-noise ratio-optimized distortion product otoacoustic emission input/output functions
Cochlear compression, a nonlinear response critical for encoding sound dynamics, can be non-invasively assessed using distortion product otoacoustic emission (DPOAE) input/output (i/o) functions. A common limitation in DPOAE i/o measurement is low signal-to-noise ratio (SNR), especially at low primary levels. We hypothesized that a peak-based method identifying a high-SNR f2 near the conventional frequency would improve the reliability and robustness of cochlear compression estimates by leveraging higher SNR at these peaks. This study compared two measurement approaches for deriving DPOAE i/o functions: (1) the conventional method using standard f2 frequencies (e.g., 1000 Hz), and (2) a peak-based method targeting individually identified maxima near the same frequency. DPOAE i/o functions were modeled using a two-segment piecewise linear fit to estimate low-level and compression slopes, and the associated change point (compression threshold). Repeatability was quantified using the intraclass correlation coefficient and Bland-Altman method. Compared to the conventional method, the peak-based approach yielded significantly higher SNR and DPOAE levels across frequencies and levels. Reliability analyses indicated that the peak-based method yielded more consistent and repeatable estimates of cochlear compression. The findings provide evidence supporting the use of fine-structure peak targeting to enhance the measurement quality of DPOAE-based cochlear compression estimates.
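A two-segment piecewise linear fit of the kind described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the change point is found by a grid search over hypothetical candidate levels, the two segments are forced to be continuous at the change point, and the data are synthetic.

```python
import numpy as np

def fit_two_segment(levels, dpoae, candidates):
    """Continuous two-segment linear fit; returns slopes and change point."""
    x = np.asarray(levels, dtype=float)
    y = np.asarray(dpoae, dtype=float)
    best = None
    for c in candidates:
        # Basis: intercept, low-level slope, and slope change above c.
        A = np.column_stack([np.ones_like(x), x, np.maximum(x - c, 0.0)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        sse = float(np.sum((A @ coef - y) ** 2))
        if best is None or sse < best[0]:
            best = (sse, float(c), coef)
    _, c, (_, low_slope, slope_change) = best
    return {
        "low_slope": float(low_slope),
        "compression_slope": float(low_slope + slope_change),
        "threshold": c,
    }

# Synthetic i/o function: slope 1 dB/dB below 40 dB, 0.2 dB/dB above.
levels = np.arange(20.0, 71.0, 5.0)
dpoae = np.where(levels <= 40.0, levels - 30.0, 10.0 + 0.2 * (levels - 40.0))
fit = fit_two_segment(levels, dpoae, candidates=np.arange(25.0, 66.0, 1.0))
```

On measured data the grid resolution and the allowed range of candidate change points would need to be chosen with the noise floor in mind.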
Similarity spectra analysis of a lab-scale afterburning jet noise rig
This paper presents the first study comparing the spectra of a lab-scale afterburning rig, operating at a total temperature ratio of ∼6 that is typical of full-scale (FS) afterburning jets, against Tam's similarity model. The spectral characteristics of FS afterburning jets were successfully reproduced at lab scale. Far-field acoustic data at 63 diameters relative to the nozzle exit were used to fit the similarity spectra, with a priority placed on achieving the best fit for the overall shape of the measured spectra while ensuring a smooth growth or decay of the peak frequencies. The transition region, which is delineated by a narrow range of microphone locations from 90° to 107.5°, required a combination of fine-scale similarity spectra (FSS) and large-scale similarity spectra (LSS) to better model both the peaks and roll-offs of the measured spectra. Only LSS was needed to model the spectra near the region of maximum overall sound pressure level radiation, whereas sideline angles only needed FSS. The similarity model was unable to accurately predict the double peaks observed at select angles. Additionally, a mismatch in the high-frequency slope between the similarity model and the measured spectra became apparent outside the region of peak radiation.
Between- and within-speaker variability of voiceless fricatives in Persian
Fricatives vary acoustically across languages and individuals, with speaker variability shaped by both phonetic and non-phonetic factors. This study examined between- and within-speaker variability in Persian voiceless fricatives (/f/, /s/, /ʃ/, /x/) and how linguistic environments, such as syllable position and lexical stress, affect this variability. A gender-balanced sample of 24 Persian speakers was recorded in two sessions, 1-2 weeks apart. Acoustic analysis targeted the first four spectral moments and duration. Results showed that center of gravity captured the greatest between-speaker variability, followed by standard deviation, skewness, duration, and kurtosis. Across segments, the alveolar /s/ exhibited the highest speaker-specificity, followed by /ʃ/, /f/, and /x/. Gender-based patterns emerged: for males, the center of gravity and skewness of /s/ were most discriminative, whereas for females, the center of gravity and standard deviation of /ʃ/ were most effective. The labiodental /f/ showed some speaker-specific characteristics only in the male group. Voiceless fricatives in syllable-initial positions demonstrated more speaker-specificity, while lexical stress did not impact between-speaker variability. Results also highlight cross-linguistic differences in the acoustic cues most effective for speaker differentiation and demonstrate that optimal features can vary across speaker populations. Adaptive algorithms are therefore crucial for improving forensic speaker comparison.
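The four spectral moments named above (center of gravity, standard deviation, skewness, kurtosis) can be computed by treating a normalized power spectrum as a probability distribution over frequency. The sketch below assumes that convention and reports excess kurtosis (3 subtracted); the study's exact settings are not given in the abstract, and the test spectrum is synthetic.

```python
import numpy as np

def spectral_moments(freqs_hz, power):
    """Return (center of gravity, SD, skewness, excess kurtosis)."""
    f = np.asarray(freqs_hz, dtype=float)
    p = np.asarray(power, dtype=float)
    w = p / p.sum()                       # normalize to unit total weight
    cog = np.sum(w * f)                   # first moment
    sd = np.sqrt(np.sum(w * (f - cog) ** 2))
    skew = np.sum(w * (f - cog) ** 3) / sd ** 3
    kurt = np.sum(w * (f - cog) ** 4) / sd ** 4 - 3.0  # excess kurtosis
    return float(cog), float(sd), float(skew), float(kurt)

# Synthetic symmetric spectrum: Gaussian bump at 4 kHz, 500 Hz wide.
freqs = np.arange(0.0, 8001.0, 100.0)
power = np.exp(-0.5 * ((freqs - 4000.0) / 500.0) ** 2)
cog, sd, skew, kurt = spectral_moments(freqs, power)
```

For this symmetric test spectrum the skewness and excess kurtosis are close to zero; a sibilant such as /s/ would typically show a higher center of gravity than /x/.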
Annoyance due to high-speed train noise: Index testing and annoyance model based on noise sensitivity
In the context of climate change and transportation decarbonization efforts, railway noise annoyance has emerged as a significant concern, particularly when trains operate at high speed. Despite the growing importance of this issue, few studies specifically address noise annoyance due to high-speed rail traffic. The literature also reveals a lack of consensus regarding factors influencing noise annoyance due to high-speed trains (e.g., frequency content, number of trains), as well as specific indices to complement the long-term index, the day-evening-night level, LDEN. The current study aimed to contribute to the identification of acoustical factors associated with high-speed train noise annoyance. To achieve this objective, short-term noise annoyance was assessed in laboratory conditions by subjecting 32 participants to high-speed train pass-by noises previously recorded in the field. Short-term A-weighted sound pressure level indices appear to be the most relevant indicators of noise annoyance due to high-speed trains, but with no significant differences among them when tested in multilevel regression annoyance models. Noise sensitivity emerged as a more substantial contributor to noise annoyance models than these indices. The results highlight the need for studies that specifically assess noise annoyance due to high-speed trains.
Detection of ossicular chain pathologies using sweep frequency impedance with short-time stimulation and adaptive noise reduction
Conductive hearing loss typically results from ossicular chain abnormalities, commonly ossicular fixation or separation. While a precise diagnosis is useful for surgeons, distinguishing between fixation and separation before surgery is challenging. In our previous studies, we reported that sweep frequency impedance (SFI) effectively detects such middle-ear pathologies. However, due to the prolonged sound stimuli, SFI exhibited weaker resistance to noise. In this study, we introduce a novel method using short-time stimulation and adaptive noise reduction to improve SFI performance. The method was applied to both healthy individuals and patients, and a support vector machine was employed to evaluate its accuracy in distinguishing fixation and separation in clinical practice. The proposed SFI yielded results consistent with the original SFI meter but significantly shortened the evaluation time to within 200 ms. Classification results indicate that the SFI achieved accuracies of 98% and 83% for detecting ossicular separation and fixation, respectively. In contrast, such accuracies of traditional tympanometry were 70% and 49% for the separation and fixation. Additionally, the study indicates that gentle lullabies can serve as effective acoustic stimuli. These results suggest that our new SFI has potential for middle-ear testing across all age groups, from newborns to the elderly.
Mechanisms of auditory enhancement in younger and older adults
Auditory enhancement (AE), a perceptual phenomenon reflecting listeners' ability to use sequential spectral contrasts to improve detection and identification of target sounds, is reduced by hearing loss. Effects of advancing age on AE are less clear, mainly because the mechanisms underlying the phenomenon are not well understood. One of the most common explanations for AE, in terms of adaptation of inhibition, predicts age-related reductions of AE due to reductions of neural inhibition. In this study, AE was measured for younger and older adults using a target-detection task for unmodulated and amplitude-modulated pure-tone targets surrounded by an inharmonic-complex masker. The masker-target complex followed a precursor comprising either masker components alone or both masker and target components. The results show that a release from informational masking related to grouping between simultaneous target and masker components is an important, but not the sole, contributor to AE. For an amplitude-modulated target, AE was reduced in older compared to the younger group and its magnitude was not correlated with hearing thresholds. The findings are consistent with the idea that age-related reductions of AE magnitude are due to reduced neural inhibition and may contribute to age-related difficulties with speech-in-noise perception.
Principal component analysis of the interaction between spectral band power and pitch in classical singing voices
This study explores how the spectral information of vowel production targets changes by pitch in classical singing. Although it is established that singers engage in pitch-dependent vocal tract adjustments, less is known about how spectral envelope contrasts between vowel targets change by pitch and how consistent the patterns are across singers. Seven professional classical singers sang seven vowels across their pitch range. The energy of 16 spectral bands at every 500 Hz interval up to 8000 Hz was measured. Principal component analyses were performed to describe spectral variation. Results show consistent changes in spectral energy distribution as pitch increases despite individual differences in pitch range. A separate set of analyses further uses the center of gravity, standard deviation, skewness, and kurtosis of the spectra as a proxy for spectral shape variation, showing that the summarized spectral envelopes of vowel production targets systematically converge at higher pitches. Overall results suggest that pitch-dependent vocal tract adjustments may be shaped not only by singers' acoustic targets, but also by physiological constraints relative to each singer's overall pitch range. More broadly, this study demonstrates the possibility of using dimensionality reduction methods to characterize spectral patterns in high-pitched singing when formant tracking fails.
Cues for vertical localization in the upper median plane: Integrating directional band theory and parametric notch-peak model
A number of studies have examined cues for the perception of the vertical angle of a sound image. Sound localization tests using narrow-band signals revealed that there exist directional bands that were perceived in specific directions (front, above, and rear), regardless of the direction of the actual sound source. On the other hand, sound localization tests using wideband signals revealed that spectral notches and peaks above 5 kHz in the head-related transfer function (HRTF) contribute to vertical angle perception. Based on this finding, a parametric notch-peak (PNP) HRTF model, which is reconstructed using the minimum number of notches and peaks required for vertical angle perception, has been proposed. Thus, the directional band theory, which claims that the presence of specific frequency components is important, and the PNP model, which suggests the importance of the absence of specific frequency components (notches), appear to be contradictory. In the present study, we first focus on the front, above, and rear directions and attempt to integrate the PNP model and the directional band theory. Then, expanding the scope to the entire upper median plane, a hypothesis regarding the cues for vertical angle perception and the extended PNP model is proposed.
Acoustic parameter combinations underlying mapping of pseudoword sounds to multiple domains of meaning: Representational similarity analyses and machine-learning models
In spoken language, iconicity, referring to the resemblance between the sound structure of words and their meaning, is often studied using pseudowords. Previously, we showed that representational dissimilarity matrices (RDMs) of the shape ratings of pseudowords correlated significantly with RDMs of acoustic parameters reflecting spectro-temporal variations; the ratings also correlated significantly with voice quality parameters. Here, we examined how perceptual ratings relate to these parameters of pseudowords across eight meaning domains. We largely replicated our previous findings for shape, while observing different patterns for other domains. Using a k-nearest-neighbor (KNN) machine-learning algorithm, we compared 4095 combinations of 12 acoustic parameters (three spectro-temporal and nine characterizing vocal quality) to determine the optimal combination associated with iconicity ratings in each domain. We found that iconic mappings were linked to domain-specific combinations of acoustic parameters. One spectro-temporal parameter, the fast Fourier transform, contributed to all domains, indicating the importance of time-varying spectral properties for iconicity judgments. We applied the KNN approach to generate shape ratings for 160 real words. These generated ratings strongly correlated with perceptual ratings of real words, indicating the value of the KNN approach to assess iconic mapping in natural languages. Our findings support the relevance of iconicity to language.
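The figure of 4095 combinations is consistent with every non-empty subset of 12 parameters, since 2^12 - 1 = 4095. The sketch below enumerates such subsets, assuming that is how the combinations were formed; the parameter names are hypothetical placeholders, not the parameters used in the study.

```python
from itertools import combinations

def nonempty_subsets(items):
    """Yield every non-empty combination of the given items."""
    for size in range(1, len(items) + 1):
        yield from combinations(items, size)

# Hypothetical placeholder names for the 12 acoustic parameters.
parameter_names = [f"param_{i}" for i in range(1, 13)]
subsets = list(nonempty_subsets(parameter_names))
n_subsets = len(subsets)  # 2**12 - 1
```

Each subset could then be passed to a model-selection loop, for example one KNN fit per subset with the best cross-validated score retained.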
Subwavelength monopole resonance of a cylindrical void in a soft material (L)
A soft material embedded with resonant inclusions can be tailored for targeted acoustic performance in underwater applications. Voids that are distributed in the soft material exhibit strong monopolar resonance similar to the well-known Minnaert resonance of air bubbles in water. While the resonance of a spherical void has been extensively studied and reported, the analytical treatment of a cylindrical void has been less frequently investigated. Such a void has a lower frequency response than the spherical case and its analysis is more challenging. The aim of this Letter to the Editor is to compare closed form estimations and asymptotic results available in the literature for the monopole resonance frequency of a vacuous cylindrical-shaped void in a soft material and establish the conditions for which the expressions are valid.
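For orientation, the Minnaert resonance mentioned above is given for a spherical air bubble in a liquid by f0 = (1/(2πa))·sqrt(3γP0/ρ), with bubble radius a, polytropic index γ, ambient pressure P0, and liquid density ρ. The sketch below evaluates this standard expression only; it does not implement the cylindrical-void estimates compared in the Letter, and the default property values are assumptions for air in water near the surface.

```python
import math

def minnaert_frequency(radius_m, ambient_pressure_pa=101325.0,
                       gamma=1.4, density_kg_m3=1000.0):
    """Minnaert resonance frequency (Hz) of a spherical gas bubble.

    Defaults are assumed values: atmospheric pressure, adiabatic air,
    and fresh water. Surface tension and viscosity are neglected.
    """
    stiffness_term = 3.0 * gamma * ambient_pressure_pa / density_kg_m3
    return math.sqrt(stiffness_term) / (2.0 * math.pi * radius_m)

f_1mm = minnaert_frequency(1e-3)  # roughly 3.3 kHz for a 1 mm radius
f_2mm = minnaert_frequency(2e-3)  # halves when the radius doubles
```

At about 3.3 kHz the wavelength in water is roughly 0.45 m, far larger than the 1 mm bubble radius, which is what makes the resonance subwavelength.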
Moderate levels of high-frequency noise mask harbor porpoise hearing, but do not cause temporary threshold shift
The potential for masking and temporary threshold shift (TTS) of a harbor porpoise exposed to high-frequency noise was investigated using levels and a duration that match likely vessel noise exposures at sea. An auditory evoked potential (AEP) technique allowed immediate assessment of hearing sensitivity during and after 20 s noise exposures centered on the 125 kHz 1/3 octave band. When the noise was delivered concomitantly with the stimuli, a 125 kHz 1/3 octave level of 85 dB re 1 μPa root mean square (rms) was enough to mask the hearing of click energy levels of 83 and 97 dB re 1 μPa² s, and no AEPs could be measured when the noise reached an rms level of 120 dB re 1 μPa. These masking levels in the 100-150 kHz echolocation and communication band of porpoises are realized at ranges of several hundred meters from vessels with screws causing cavitation. After a period of more intense noise exposure, at levels up to 147 dB re 1 μPa² s at 125 kHz, responses to the click stimuli were not lower than at baseline levels. Since exposure levels this high are rarely encountered at high frequencies, it is unlikely that high-frequency components of vessel noise can cause TTS, even in harbor porpoises within tens of meters of passing vessels. The AEP responses observed after exposures support the hypothesis that harbor porpoises can actively reduce their hearing sensitivity during noise exposure to maintain high hearing acuity immediately after exposure.
Pseudo-spectral model of elastic-wave propagation through toothed-whale head anatomy, and implications for biosonar
The sound localization and biosonar system of toothed whales is exceptionally performant. What enables such precision, however, remains unclear, given that (i) toothed whales have no pinnae, and (ii) although their auditory pathways have been studied in detail, no specific feature that could functionally replace the pinna has been identified. We employ a pseudo-spectral time domain (PSTD) numerical scheme to model three-dimensional elastic wave propagation through a toothed-whale head including soft tissues. Computed tomography scans were used to build a velocity-density model of a bottlenose dolphin's head, parametrized on 1.11 mm voxels. We validate our wave propagation solver, identifying a range of frequencies and scale lengths where the PSTD scheme captures the complexities of wave propagation through anatomy. We next focus on the toothed whale's ability to determine the elevation of sound sources, where anatomy plays a crucial role. Sinusoidal bursts with 45 kHz central frequency, emitted by far-field sources at elevations from -90° to +90°, were recorded at the locations of the left and right inner ears. We find that the source elevation can be established, via correlation, solely based on the "coda" of the incoming signal, whose waveform is controlled by refraction through and reflection off multiple anatomical structures.
The roles of attention and L2 proficiency in L2 perceptual cue weighting
This study investigated how availability of attentional resources and listeners' inhibitory control (IC) affect cue weighting in L2 vowel categorization and whether the effects vary with L2 proficiency. Thirty-five Chinese L2 learners of English categorized sounds from the English /ɛ/-/æ/ and /i/-/ɪ/ continua under low and high load. IC and L2 proficiency were measured using the Stroop task and LexTALE. Group results showed that learners used primarily spectral quality and secondarily duration for both contrasts, with spectral reliance increasing with L2 proficiency. /i/-/ɪ/ categorization exhibited stronger warping effects of short (vs long) duration on spectral use, particularly for high proficiency learners. High load induced lower spectral reliance across both contrasts, with duration warping effects in /i/-/ɪ/ categorization being more pronounced under low (vs high) load. Learners with weaker IC were more susceptible to load-induced reduction in spectral use and exhibited stronger duration warping effects. Neither attentional variable interacted with L2 proficiency in affecting L2 cue weighting. Our findings support a hierarchical model of attentional modulation in L2 cue weighting where cognitive load directly impairs reliance on the strong cue, while IC operates as a protective factor that mitigates the severity of the load-induced impairment in difficult L2 contrasts.
Study on longitudinal-bending coupling in slotted cylindrical piezoelectric ultrasonic transducer
A longitudinal-bending coupled piezoelectric ultrasonic transducer (LBCP) is proposed to address the limitations of conventional sandwiched piezoelectric transducers, which predominantly exhibit unidirectional vibration modes. The LBCP transducer integrates a longitudinal sandwiched piezoelectric transducer with a slotted metal hollow cylinder, enabling synergistic longitudinal-bending coupled vibrations. Based on the theory of bending vibration and the equivalent circuit of the coupled vibration of the metallic tube, the electromechanical equivalent-circuit model is established to predict the coupled vibration characteristics and resonant frequencies. Finite element method simulations and impedance analysis experiments are employed to validate the theoretical predictions, demonstrating strong agreement between simulated and calculated resonant frequencies. The slotted configuration demonstrates strong longitudinal-bending coupling modes, offering a paradigm for advanced coupled-mode transducer design.
Sound properties and shallow water propagation for acoustic enrichment in coral reefs
Acoustic enrichment can facilitate coral and fish larval settlement, offering a promising method to rebuild degraded reefs. Yet it is critical to understand sound propagation in complex shallow-water coral reefs to effectively apply this method over large restoration-scale areas. In this field-based study, we quantified propagation features of multiple sound types emitted through a custom playback system over varying coral reef habitat. Sound levels were computed at different distances from the source in both pressure and particle motion, the latter being detected by marine invertebrates. Detection distances were primarily determined by source levels and depth-dependent transmission losses. Transmission losses and detection distances were similar for sound pressure and particle acceleration measurements. Importantly, broadband particle acceleration levels could be closely estimated at distances >10 m using a single hydrophone and a plane wave approximation. Using empirically determined coral larvae sound detection thresholds, we found that low frequency sounds (<1 kHz) such as fish calls from healthy coral reef soundscapes may be detectable by larvae hundreds of meters away. These results provide key data to help design standardized methods and protocols for scientists, managers and restoration practitioners aiming to rebuild coral reef ecosystems over reasonably large spatial scales using acoustic enrichment.
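The plane wave approximation mentioned above relates particle velocity to pressure through the characteristic impedance, u = p/(ρc), so that for a single tone the acceleration amplitude is a = 2πf·p/(ρc). The sketch below applies this to one frequency; it is an illustration under assumed seawater properties, not the study's processing chain, and a broadband estimate would need the same conversion applied per frequency band before summing.

```python
import math

def plane_wave_acceleration(p_rms_pa, freq_hz,
                            density_kg_m3=1025.0, sound_speed_m_s=1500.0):
    """RMS particle acceleration (m/s^2) of a plane wave at one frequency.

    Density and sound speed defaults are assumed nominal seawater values.
    """
    velocity_rms = p_rms_pa / (density_kg_m3 * sound_speed_m_s)  # u = p/(rho*c)
    return 2.0 * math.pi * freq_hz * velocity_rms                # a = omega*u

def level_db(value, reference):
    """Level in dB of an rms quantity relative to a reference value."""
    return 20.0 * math.log10(value / reference)

# Example: 1 Pa rms (120 dB re 1 uPa) at 500 Hz, with rho = 1000 kg/m^3.
accel = plane_wave_acceleration(1.0, 500.0, density_kg_m3=1000.0)
spl = level_db(1.0, 1e-6)      # dB re 1 uPa
pal = level_db(accel, 1e-6)    # dB re 1 um/s^2
```

The approximation is expected to degrade close to the source and near boundaries, which is consistent with the abstract restricting it to distances greater than 10 m.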
Non-dimensional modal analysis of musical instrument plates
This work presents a framework for assessing whether a given set of measured modal shapes and frequencies can be reliably interpreted using a thin-plate model, in the context of inverse parameter estimation for musical instruments' plate-like components. While inverse modelling of vibroacoustic systems is commonly performed on a per-mode basis, little attention has been given to whether the underlying modal data conform to the assumptions of the selected model, most often based on Kirchhoff-Love plate theory. Here, we propose a non-dimensional, mode-by-mode comparison between computed and reference modal data and show that this provides a robust means of identifying deviations from thin-plate behaviour. Apart from the rectangular plate, the work is conducted on plates shaped like typical musical instrument soundboards. Modal shapes and frequencies are computed using thick-plate finite element methods, allowing for the inclusion of shear and rotatory inertia effects. A comparison with traditional dispersion relation-based criteria demonstrates that cutoff frequency arguments significantly overestimate the frequency range over which thin-plate theory remains predictive in the modal domain. A case study involving plates of varying thickness illustrates the utility of the method, particularly when used in conjunction with inverse identification routines.
