This guide explains what vocal biomarkers are in plain English. It is not medical advice. A vocal biomarker is a signal and an estimate, not a clinical finding, and no consumer app, Sonora included, can detect, screen for, or treat any condition from your voice. If you have a health concern, please speak to a qualified professional.
What is a vocal biomarker?
A vocal biomarker is a measurable feature of how you sound, rather than what you say, that can correlate with something about your body or your state. Your voice carries more than words. Its pitch, steadiness, pace, and energy all shift with how you feel, and software can put numbers on those shifts. A vocal biomarker is one of those numbers: a measurement pulled from a short voice clip, such as how high or how shaky the voice is, that tends to move with stress, tiredness, or emotion. The key idea for a general reader is that it is a correlate, not a verdict. It points in a direction; it does not deliver a result.
This page is about the signals themselves: which features exist, what each one tends to reveal, and what they honestly cannot do. Its companion guide covers the process, the step-by-step way software turns a recording into an estimate. If you want the pipeline, how a voice clip becomes a state inference, read How AI Voice Analysis Works. Here we stay with the nouns: the acoustic features, grouped into a few plain families, and what the research does and does not support about each.
To see how these signals feed personalised audio rather than sit on their own, read how Sonora's AI sound therapy works, which sets out the whole approach and the evidence behind it.
The four families of vocal signals
There are dozens of named acoustic features in the research literature, but for a general reader they sort into four plain families, and you do not need the maths to follow any of them.
Frequency is the basic pitch of your voice, how high or low it sounds. The technical name is the fundamental frequency, often written F0. It tends to rise when people are stressed.1 Close cousins in this family are the formant frequencies: the resonant tones that the shape of your mouth and throat add on top of the basic pitch. Formants are mostly what makes one vowel sound different from another; they are part of why your voice sounds like yours.
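For readers who would like to see the idea in code, here is a minimal sketch of one textbook way to estimate F0, autocorrelation, run on a synthetic tone rather than a real voice. The function name `estimate_f0`, the 75 to 400 Hz search range, and the 8 kHz sample rate are assumptions chosen for illustration; this is not a description of how Sonora or any particular product measures pitch.

```python
import math


def estimate_f0(samples, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate fundamental frequency in Hz from the autocorrelation peak.

    fmin and fmax are illustrative bounds on the pitch search range.
    """
    min_lag = int(sample_rate / fmax)  # shortest period considered
    max_lag = int(sample_rate / fmin)  # longest period considered
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # How similar is the signal to a copy of itself shifted by `lag` samples?
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Synthetic 200 Hz tone standing in for a short voiced sound (an assumption).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(400)]
f0 = estimate_f0(tone, sample_rate)
print(f"estimated F0: {f0:.1f} Hz")
```

Real speech is far messier than a pure tone, which is one reason practical pitch trackers add smoothing and voicing checks on top of this basic idea.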
Perturbation is the family of tiny, rapid wobbles in the voice from one cycle to the next. Jitter is the small variation in pitch, and shimmer is the small variation in loudness. You can think of both as a measure of how steady or unsteady the voice is: a very even voice has low jitter and shimmer, while a rougher or more strained one has more. They are among the standard voice-quality measures researchers track.2
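As a rough illustration, the sketch below uses a common "local" definition of both measures: the average absolute change from one cycle to the next, divided by the average value. The function name `relative_perturbation` and the example numbers are invented for illustration, and real tools first have to locate each vocal cycle, which is the hard part and is not shown here.

```python
def relative_perturbation(values):
    """Mean absolute change between neighbouring cycles, divided by the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


# Hypothetical per-cycle measurements from a short voiced sound.
periods_ms = [5.0, 5.1, 4.9, 5.0]            # length of each cycle
peak_amplitudes = [0.80, 0.78, 0.82, 0.80]   # loudness peak of each cycle

jitter = relative_perturbation(periods_ms)        # cycle-to-cycle pitch wobble
shimmer = relative_perturbation(peak_amplitudes)  # cycle-to-cycle loudness wobble
print(f"jitter: {jitter:.2%}, shimmer: {shimmer:.2%}")
```

A perfectly even voice would score zero on both; the more the cycles differ from their neighbours, the higher the numbers climb.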
Intensity is simply loudness, the energy in the sound. How loudly you speak, and how that loudness rises and falls, shifts with arousal and effort. Along with pitch, it is one of the features that researchers report as a reasonably useful indicator of stress and cognitive load.2
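One simple way to put a number on loudness is the root-mean-square (RMS) level of the samples, expressed in decibels. The sketch below assumes samples scaled between -1 and 1 and uses synthetic tones in place of speech; the helper names `rms_level_db` and `make_tone` are hypothetical.

```python
import math


def rms_level_db(samples):
    """Return the RMS level in decibels relative to full scale (samples in -1..1)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")


def make_tone(amplitude, n=800, freq=200, sample_rate=8000):
    """Synthetic tone used here as a stand-in for a voice clip."""
    return [amplitude * math.sin(2 * math.pi * freq * i / sample_rate)
            for i in range(n)]


loud_db = rms_level_db(make_tone(0.5))
quiet_db = rms_level_db(make_tone(0.25))
# Halving the amplitude lowers the level by about 6 dB.
print(f"loud: {loud_db:.1f} dB, quiet: {quiet_db:.1f} dB")
```

In practice the reading also depends on the microphone and how far away it is, so changes within one person's recordings are more meaningful than the absolute figure.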
Prosody is the music of speech: its rhythm, pace, and stress pattern, the part that makes a sentence sound flat and tired or lively and alert. Prosody is not a single number but a family of timing and melody measures, including how fast you speak and how long your pauses are. It is often the most intuitive family, because we all hear prosody every day when someone sounds bored, anxious, or upbeat.
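Pauses are the easiest prosody measure to sketch. The example below assumes a clip has already been cut into short frames with a loudness value for each, and counts a frame as silent when it falls below a threshold. The function name `pause_stats`, the 0.1 second frame length, and the 0.05 threshold are all illustrative assumptions, not values from the research cited here.

```python
def pause_stats(frame_levels, frame_seconds, silence_threshold):
    """Summarise pauses from per-frame loudness values."""
    longest = current = silent = 0
    for level in frame_levels:
        if level < silence_threshold:
            silent += 1
            current += 1              # extend the current run of silence
            longest = max(longest, current)
        else:
            current = 0               # speech resumes, so the run ends
    return {
        "pause_fraction": silent / len(frame_levels),
        "longest_pause_seconds": longest * frame_seconds,
    }


# Hypothetical loudness values for ten consecutive frames.
levels = [0.4, 0.5, 0.02, 0.01, 0.03, 0.6, 0.5, 0.01, 0.4, 0.5]
stats = pause_stats(levels, frame_seconds=0.1, silence_threshold=0.05)
print(stats)
```

Speaking rate and melody need more machinery, such as detecting syllables and tracking pitch over time, but they follow the same pattern of turning a heard quality into a count or an average.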
What each family reveals
The honest headline is that each family reveals a tendency, not a fact. Across the research, pitch (F0) and intensity emerge as some of the more dependable indicators of stress, anger, and cognitive load, while the perturbation measures (jitter and shimmer) and prosody add useful texture about voice quality and effort.2 In plain terms: a voice that climbs in pitch, gets louder, and grows less steady is, on average, more likely to belong to someone under stress than a calm, even one. That is a correlation a computer can read, and it is the basis for treating voice as a wellness signal.
It is just as important to be clear about how loosely these signals hold. The same systematic review that found pitch and intensity useful is candid that results are heterogeneous: the features that flag one emotion do not always flag another, and individual differences between people are large.2 Two people under the same stress can show different vocal changes, and one person's stressed voice can resemble another's relaxed one. So "what each family reveals" is best read as a rough tendency across many people, not a precise dial on any single speaker. The signal is real; it is also noisy.
How vocal biomarkers compare to other physiological signals
Vocal biomarkers belong to the same broad family as other everyday body signals, things like heart-rate variability from a smartwatch, the stress hormone cortisol from a saliva test, or sleep data from a tracker. What they share is that each is an indirect, measurable proxy for state rather than a direct readout of how you feel. The appeal of the voice is that it is genuinely non-invasive: there is nothing to wear, swab, or strap on, just a few seconds of speech.
How well does the voice stack up against a harder physiological measure? A 2025 study put speech features alongside salivary cortisol, the standard biochemical marker of a stress response, and found that some vocal measures tracked the cortisol changes after an induced stressor, supporting the idea of voice as a non-invasive stress signal.3 That is a meaningful result, because it ties an easy-to-capture voice measure to a genuine bodily change rather than to self-report alone. The fair comparison, though, is that the voice is more convenient but less established than the older measures: heart-rate variability and cortisol have decades of validation behind them, while the voice is an emerging signal still being characterised.
The clinical research
Because the voice carries state, researchers have asked whether it could one day help with clinical questions such as depression, anxiety, fatigue, and cognitive load. The direction is real and worth taking seriously, but it sits firmly in research, not in consumer products. An early and much-cited study of depression found that depressed speech showed a slower pace, longer pauses, and reduced pitch variability, and that these markers eased as people responded to treatment.4 That is a striking finding: the same prosody and frequency families described above appear to shift with mood over the course of treatment.
A 2025 scoping review of speech analysis in mental health gives the honest state of play. It describes a fast-moving, promising field, while being explicit that the studies are still small and heterogeneous, often not longitudinal, and far from routine clinical use.5 The reasonable reading is that vocal biomarkers may become a useful, accessible signal in clinical settings in time, under proper validation and with a clinician interpreting them. None of that describes what a relaxation app does, and none of it means a phone can assess your mental health today.
What vocal biomarkers can NOT do
This is the section that matters most, because the credibility of the whole idea rests on being honest about its limits. First and most important: a vocal biomarker is not a diagnosis. It is a signal and an estimate, not a clinical finding. When an app reads your voice for stress or fatigue, it is picking up an everyday wellness signal, the kind of thing a friend might hear when you sound worn out, not making a medical assessment. The clinical research above lives in research settings, not in a consumer app, and it does not mean any app can diagnose, screen for, or treat a condition. If you are worried about your health, the right step is a qualified professional, never a soundscape.
Second, the signals are imperfect and approximate, even on their own terms. The clearest illustration is voice pitch as a stress marker: a 2025 systematic review and meta-analysis found that F0 does tend to rise after stress, but once the analysis was corrected for publication bias the effect was no longer statistically reliable, and the authors called for validation in large, prospective studies before voice pitch is treated as a standalone biomarker.1 In other words, even the most-studied single biomarker is not yet dependable by itself. The sensible expectation is a rough, useful read of how you sound, not a precise measurement of how you are. Treat any product that implies pinpoint accuracy with caution.
Privacy: how Sonora handles voice data
Anything that listens to you raises a fair privacy question, so it deserves a plain answer. Sonora's developer has declared in Google Play's data-safety section that the app does not collect or share user data. On iOS, the App Store privacy labels carry the equivalent declaration. These store disclosures are the developer's own self-reported statements rather than independently audited facts, so if how your voice data is handled matters to you, review the in-app privacy settings and the current store listings before you rely on them. The practical point is that the voice read exists to shape your audio in the moment, not to build a record of you.
The future of vocal biomarker research
Where is this heading? The most likely path is steady, unglamorous progress: larger and longer studies, better handling of the variability between people, and combinations of features rather than any single magic marker. The reviews above all point the same way, that the science is genuinely emerging, the early signals are promising, and the honest bottleneck is rigorous, large-scale validation.2,5 For the broader context that music and sound can genuinely affect how we feel, a plain-English overview from the United States National Center for Complementary and Integrative Health, part of the National Institutes of Health, concludes that music-based approaches show promise for anxiety, pain, and sleep, while cautioning that many studies are small and more rigorous work is needed.6
For now, the honest framing for a reader is the modest one. Vocal biomarkers are a real, measurable, non-invasive signal of everyday state, useful enough to help an app match audio to your mood, and not yet precise enough to read your health. That is exactly how Sonora treats them: as a wellness signal that personalises sound, never as a diagnosis. You can see the full citation list behind Sonora's wider claims on Sonora's evidence base.
Related articles
This is one of two AI sound therapy guides in the Sonora Learn library. Vocal biomarkers are the signals; how those signals get extracted and turned into a state inference is the process, covered in our companion piece How AI Voice Analysis Works. For the broader context of how vocal biomarkers feed Sonora's adaptive soundscapes, read Sonora's practical AI sound therapy guide. More AI sound therapy guides publish through 2026, including voice analysis in mental health, adaptive soundscapes, and AI versus traditional music therapy.
You can find all articles in the Learn library, or try Sonora free to hear how the voice-aware, adaptive approach feels in practice.