Respiration rate (RR) is a key physiological indicator of human health, typically expressed in breaths per minute (BPM). Accurate measurement of respiratory rate is crucial in fields such as medical monitoring, sports health, and sleep analysis. Traditional detection methods (such as chest-strap sensors and nasal airflow monitoring), however, rely on specialized hardware that is often inconvenient to wear and uncomfortable for long-term use.
On May 1, 2025, Apple's patent application "Determining breathing rate based on audio stream and user status" (US20250134408A1) was published; it aims to use the existing hardware architecture of head-mounted devices such as AirPods to achieve respiratory monitoring.
Technical Overview
The patent describes a system and method for determining breathing rate from audio streams and user status. The system receives an input indicating the user's status, along with audio streams from the device's built-in in-ear microphone (internal) and external microphone. Depending on the user's status, it selectively uses the internal and/or external audio stream to determine the breathing rate, and in some cases invokes a machine learning model to extract the breathing signal from the audio. This approach adapts to different scenarios (such as meditation or exercise) and improves detection accuracy by adjusting the relative weights of the internal and external audio streams.
Core Content
1. Dual-microphone signal fusion: The breathing signal is captured by both the in-ear microphone (internal audio stream) and the external microphone (external audio stream) to improve the signal-to-noise ratio (a minimal fusion sketch follows this list).
2. User-state awareness: Signal-processing strategies are dynamically adjusted to the user's current activity (such as exercise, meditation, or rest) to optimize the breathing-rate calculation.
3. Machine-learning-assisted enhancement: Blind source separation and deep learning models extract the respiratory signal and suppress environmental noise.
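As a toy illustration of dual-stream fusion, the sketch below mixes the two microphone streams in proportion to a crude per-stream SNR estimate. The function names and the SNR heuristic are hypothetical; the patent does not publish a concrete fusion formula.

```python
import numpy as np

def estimate_snr(x, eps=1e-8):
    # Crude SNR proxy: mean power over a noise-floor estimate taken as the
    # 10th percentile of instantaneous power. Real systems track noise adaptively.
    power = x ** 2
    return np.mean(power) / (np.percentile(power, 10) + eps)

def fuse_streams(internal, external):
    # Weight each synchronized mono stream by its estimated SNR, then mix.
    w_int, w_ext = estimate_snr(internal), estimate_snr(external)
    return (w_int * internal + w_ext * external) / (w_int + w_ext)
```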
System Architecture
1. Hardware composition
(1) Head-mounted devices
- In-ear microphone: captures breathing sounds (such as exhalation/inhalation airflow) inside the ear canal.
- External microphone: captures ambient sound to assist with noise reduction, or with breathing detection during exercise.
- Speaker: plays audio (such as music and calls), which can introduce interference that must be removed by echo cancellation.
(2) Companion devices
Smartphones, smart watches, etc., which receive user input (such as selecting an exercise mode) or provide additional sensor data (such as heart rate and motion acceleration).
2. Signal processing flow
(1) Signal acquisition
In-ear microphone → internal audio stream
External microphone → external audio stream
(2) Preprocessing (Filtering & Enhancement)
Low-pass filtering (LPF): preserves the breathing-related frequency band (usually 0.1–2 kHz); a filter sketch follows this list.
Blind source separation (BSS): separates the breathing signal from noise (such as music and ambient sound).
Acoustic echo cancellation (AEC): removes interference when the device is playing audio (such as music).
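The article states that filtering preserves roughly 0.1–2 kHz. A minimal band-pass sketch with those illustrative cutoffs (the patent's actual filter design is not disclosed):

```python
from scipy.signal import butter, sosfiltfilt

def breathing_bandpass(audio, fs=16_000, low_hz=100.0, high_hz=2_000.0, order=4):
    # Zero-phase Butterworth band-pass keeping the stated breathing-related band.
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, audio)
```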
(3) Feature extraction
Time-frequency analysis (spectrogram/MFCC): detects breathing cycles (inhalation/exhalation).
Machine learning model: a two-layer GRU recurrent network followed by a fully connected layer extracts respiratory-cycle features from the time series (a model sketch follows this list).
- Input: audio features + user status (such as exercise or meditation).
- Output: respiration-rate estimate.
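A minimal PyTorch sketch of the described architecture (two GRU layers plus a fully connected head); the feature dimension, state encoding, and hidden size are assumptions, not values from the patent:

```python
import torch
import torch.nn as nn

class BreathingRateGRU(nn.Module):
    def __init__(self, n_audio_feats=40, n_states=4, hidden=64):
        super().__init__()
        self.state_emb = nn.Embedding(n_states, 8)   # user state (exercise, meditation, ...)
        self.gru = nn.GRU(n_audio_feats + 8, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, 1)             # respiration-rate estimate (BPM)

    def forward(self, audio_feats, state_id):
        # audio_feats: (batch, time, n_audio_feats), e.g., MFCC frames
        # state_id:    (batch,) integer user-state labels
        s = self.state_emb(state_id).unsqueeze(1).expand(-1, audio_feats.size(1), -1)
        out, _ = self.gru(torch.cat([audio_feats, s], dim=-1))
        return self.head(out[:, -1])                 # regress BPM from the last time step
```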
(4) Dynamic weight adjustment
The contribution weights of the internal/external audio streams are adjusted according to the user's status (an illustrative mapping follows this list).
- Meditation mode: relies primarily on the internal audio stream (the in-ear signal is cleaner).
- Exercise mode: relies primarily on the external audio stream (motion noise degrades the in-ear signal).
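An illustrative state-to-weight mapping; the numbers are invented for the sketch, since the patent does not publish a weighting table:

```python
def stream_weights(user_state):
    # (w_internal, w_external) per user state; values are illustrative only.
    table = {
        "meditation": (0.9, 0.1),  # quiet scene: trust the in-ear signal
        "exercise":   (0.2, 0.8),  # motion noise degrades the in-ear signal
        "rest":       (0.7, 0.3),
    }
    return table.get(user_state, (0.5, 0.5))
```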
(5) Post-processing & verification
Confidence scoring: assesses the reliability of the respiration-rate estimate.
Multi-sensor fusion: combines heart-rate and motion data (e.g., from a smart watch) for cross-validation (a gating sketch follows this list).
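A hedged sketch of such gating: report the audio-derived rate only when its confidence clears a threshold and, when a second estimate is available, the two agree. The thresholds are illustrative:

```python
def validated_rr(rr_audio, confidence, rr_reference=None,
                 min_conf=0.6, tol_bpm=5.0):
    # rr_reference: optional rate inferred from heart-rate/motion sensors.
    if confidence < min_conf:
        return None              # too unreliable to report
    if rr_reference is not None and abs(rr_audio - rr_reference) > tol_bpm:
        return None              # sensors disagree; withhold the estimate
    return rr_audio
```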
Technological Innovations
1. Dual-microphone collaboration and blind source separation
AirPods use the internal microphone to capture breathing sounds in the ear canal, while the external microphone captures ambient noise. Blind source separation then filters out interference such as music and human voices to extract the breathing signal directly (a generic BSS sketch follows below). This design maintains accuracy in complex scenes such as subways and offices.
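The patent does not disclose its specific separation algorithm; as a generic stand-in, the sketch below unmixes the two microphone channels with FastICA, which assumes an instantaneous linear mixture (real acoustic mixing is convolutive, so this is only a conceptual illustration):

```python
import numpy as np
from sklearn.decomposition import FastICA

def separate_sources(internal, external):
    # Stack the two synchronized streams as (samples, 2) mixtures and unmix.
    mixed = np.stack([internal, external], axis=1)
    sources = FastICA(n_components=2, random_state=0).fit_transform(mixed)
    return sources  # the caller must identify which component is breathing
```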
2. Multimodal data fusion
The system dynamically adjusts its algorithms based on exercise type and heart-rate data, for example raising the plausible respiratory-rate range during exercise to avoid false positives (an illustrative check follows below). The machine learning model adapts to the breathing characteristics of different users, minimizing the impact of individual differences.
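An activity-dependent plausibility window might look like the following; the BPM ranges are rough physiological ballparks (resting adults typically breathe about 12–20 BPM), not values from the patent:

```python
def plausible_rr_range(activity):
    # Illustrative breaths-per-minute windows per activity.
    ranges = {"rest": (8, 20), "meditation": (4, 15), "exercise": (15, 45)}
    return ranges.get(activity, (8, 30))

def accept_estimate(rr_bpm, activity):
    lo, hi = plausible_rr_range(activity)
    return lo <= rr_bpm <= hi
```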
3. Scenario-based application expansion
Respiratory rate is closely linked to cardiopulmonary function and metabolic status. For example, shortness of breath may indicate an asthma attack or altitude sickness. The portability of AirPods makes them suitable for daily health management, such as providing breathing rhythm guidance during exercise and providing early warning of sleep apnea. Furthermore, when combined with blood oxygen and heart rate data from Apple Watch, a more comprehensive health profile can be constructed.
Summary
Currently, no consumer product utilizes the exact same technical approach as this patent. Manufacturers like Huawei and Samsung still primarily focus on HRV and body movement monitoring, leaving a significant gap between medical devices and consumer products. Technical challenges include signal separation in complex noisy environments, power consumption control for long-term monitoring, and algorithm adaptability to varying breathing patterns. If Apple can translate the patent into practical functionality, it could redefine the health monitoring capabilities of headphones.
This patent reflects the trend of wearable devices toward "non-invasive monitoring": collecting physiological indicators unobtrusively through everyday devices such as headphones and glasses. In the future, audio analysis may expand further into areas such as voice emotion recognition and stress monitoring, deepening the integration of health technology and consumer electronics.
Note: This article is reprinted from 21dB Acoustics