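The Awakening of the Voice (Part 4): The Natural Gap Between the Formal Vocal Training System and Self-Taught Singing (Differences in Language, Method, and Structure)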

1. Introduction

Now that I've recently achieved initial success in "larynx control," I've been reflecting on the detours I took over the past two years of researching "singing." It's no exaggeration to call it a winding, circuitous road. The fundamental reason for all those detours was that I was initially misled by various online "vocal teacher" instructional videos. On top of that, much of the content was essentially designed to sell courses and deliberately steered people down the "high-note route" (already discussed in the first part of this series, "When singing is misunderstood as a 'high-note game'").

But the high-note mistakes are only part of the problem. The more crucial point is something I didn't even realize when I first started learning: are the vocal exercises, theory, and professional terminology of a formal vocal system suitable as a "theoretical starting point" for an ordinary person with no prior experience who just wants to sing well at karaoke? I hadn't thought about this question at all at the time; I simply assumed that professional, systematic, formal training methods must be the best approach.

Only after I truly dissected the training logic, curriculum structure, vocal exercises, and goal system of formal education did I gradually realize that there might be a huge misunderstanding hidden within it, a misunderstanding that has trapped the vast majority of ordinary people who learn to sing on their own. In fact, it could be said that without dismantling this misunderstanding, subsequent practice is not only inefficient but also particularly prone to leading to increasing confusion.

Therefore, this article aims to trace this "misunderstanding" back to its source: What exactly led me down so many detours? What framework should an ordinary person with no prior experience use to view "singing" at the beginning? Of those professional training programs that often last for several years, which parts are relevant to "singing well in karaoke" and which are completely irrelevant?

In short, let's stop the introduction here and not rush to the conclusion. Once we've clearly explained what the formal education system actually does and what the goals of ordinary people are, you will naturally see the answer.

2. Formal Vocal Training: Modules, Content, and Graduate Competency Profiles

2.1 Main Contents of Professional Vocal Training

If you look at the vocal training programs of music conservatories across the country, you will find that although the details differ between schools, the overall structure is extremely consistent: it is not as simple as "teaching you to sing a few songs", but a whole training system that treats the human voice as a "professional instrument".

Its content can be roughly categorized into five areas, but in actual learning, they occur in parallel and influence each other:

(1) Voice production and technical main line

This is the absolute core of formal vocal training: from breath control, vocal register transitions, vocal cord closure, and larynx control, to the adjustment of resonance cavities, timbre unification, and dynamic control. The entire four years of undergraduate study revolve around this line of refinement. Whether one can achieve a "stable," "controllable," and "durable" voice depends almost entirely on it.

(2) Musical language ability (music theory/sight singing and ear training/piano)

Vocal performance isn't just about singing by ear. Students need to read music, listen to harmonies, understand the structure of a piece, and even practice on the piano themselves. In short, formal training aims to ensure students "understand music like a musician," rather than simply imitating recordings.

(3) Language and articulation

In systematic vocal training, "the way a language is pronounced" and "the clarity of articulation" are important components. Whether it's Chinese, a foreign language, or a dialect, the core goal when singing is the same: clear, coherent, and singable pronunciation.

Students with formal training are typically exposed to the pronunciation rules of multiple languages (such as Italian, French, and German) because these languages are common in art songs and operas. However, for the vast majority of ordinary people, especially those who sing pop or Mandarin songs, control of Chinese pronunciation itself (such as front/back nasal sounds, medials, neutral tones, and word/sentence coherence) is the more direct and important training content.

(4) Collaborative training in chorus/ensemble/chamber music, etc.

This is something many people don't understand: classically trained students can't just stand on stage and sing solos; they must be able to control, balance, and blend their timbre in choral, duet, and collaborative settings with accompaniment. This ability is closely related to stage collaboration.

(5) Stage performance and interpretation of the work

From their freshman year, students will gradually be exposed to opera workshops, art song rehearsals, public performances, and graduation recitals. In addition to vocalization, they also have to learn how to "act," how to use their bodies and stage space, and how to tell stories.

In other words, formal training aims to produce "well-rounded singers," not just those who excel in vocal ability.


Although I've divided this content into five categories, students with formal training study almost all of it together: technique lets you produce the sound, musical language lets you understand the piece, diction lets you sing the words correctly, collaboration lets you blend into an ensemble, and the stage is where you finally present the whole thing.

The essence of this training system is to cultivate students into "professional singers" who can stand on a professional stage, face complex repertoires, and perform for long periods of time.


2.2 A profile of the abilities of formally trained vocal music graduates: What exactly have they "learned"?

After several years of systematic training, a typical undergraduate vocal music student, whether or not he becomes a professional singer, will develop a relatively stable set of abilities.

These abilities are not a "list of skills," but a complete presentation of "how the voice is trained into a professional tool."

(1) Sound body: stable, controllable, and durable (the core features are free larynx position and resonance optimization)

After several years of training, most classically trained students develop a distinct sense of "vocal control." This stability isn't innate; it is built on two core mechanisms: a free larynx position and resonance optimization.

Free larynx position means that the larynx no longer moves around due to tension, emotions, or pitch changes, and remains in a relatively stable and relaxed position when phonating; while resonance optimization makes the vocal tract space more consistent, the sound is easier to amplify, and it can maintain a consistent timbre across different vocal registers. Once these two mechanisms are established, the sound will be more stable and durable, without the wavering feeling of sudden highs and lows, and the relationship between vocal cord closure, breath, and resonance will be more easily fixed.

Therefore, the "vocal stability" of classically trained students is not actually about "singing louder" or "singing higher." Rather, the voice has become a reliable, predictable, and trainable tool. It offers more stable pitch, more controllable dynamics, and greater endurance. Even after singing continuously for half an hour, an hour, or longer, the voice is less likely to break down.

In other words, the voice of a classically trained student is no longer a "natural voice," but a "trained organ." Even with average natural talent, as long as the larynx position and resonance mechanism are well-developed, a resilient, stable, and controllable voice will naturally emerge.

(2) Musical comprehension ability: Singing without relying on intuition

In formal musical training, there's a significant shift: students no longer rely on experiential judgments like "I think it should be sung this way" or "That's how others sing it." By graduation, most have developed relatively mature musical analytical skills—they can see the structure and patterns in sheet music, understand why the harmony, melody line, and rhythm are written the way they are, and comprehend what the accompaniment supports and propels. They have relatively clear judgments about where the emphasis of a melody is and how to understand the style and structure of a piece. When encountering a new work, they can quickly break it down and grasp its key points.

This ability doesn't necessarily make someone "sing better," but it gives them more direction when reading songs, preparing pieces, and rehearsing on stage. Strictly speaking, formal musical understanding serves more the "stage and the piece itself" than simply "I want to sing this song smoothly."

(3) Multilingual articulation and vocal adaptation ability

There is one ability that is consistently emphasized in formal training: multilingual diction. Chinese is the foundation, followed by common vocal languages such as Italian, German, French, and Latin. But the focus is never merely on "pronouncing it accurately." It is on making articulation, tongue position, oral cavity space, soft palate height, and resonance position form a harmonious whole, so that the sound is both clear and stable.

Different languages directly influence pronunciation. Chinese relies on a smooth connection between the initial consonant, vowel, and final consonant; the initial consonant must be crisp, and the final vowel full, without compressing the oral cavity space in pursuit of clarity. Italian is known for its open and bright vowels; as long as the vowel placement is correct, the sound will naturally be clear and stable. German has a high consonant density, requiring precise coordination between the tongue root and surface; French nasalized vowels alter the resonance path; Latin emphasizes clean and consistent vowel quality, demanding a high degree of tonal balance.

Although these languages are different, their training goal is the same: to coordinate articulation with the shape of the vocal tract, so that the words are clear while the tone stays consistent and is not dragged around by the structure of the language. This isn't a foreign language lesson; it is vocal diction training in the true sense. For opera and art songs, poor pronunciation is not just a matter of being "unintelligible"; it also affects timbre, legato, and emotional delivery. Therefore, it is an indispensable foundational skill in a formal academic system.

(4) Collaboration and auditory ability: In multi-part settings, one can "hear others and also hear oneself"

Through training in choral singing, duet singing, and various vocal parts, students with formal training quietly accumulate a set of "ear skills." They can distinguish the layers and lines of different vocal parts, and can maintain their pitch and vocal balance when multiple people sing at the same time; in accompaniment, orchestra, or other vocal parts, they can quickly find their own vocal line, know when to blend their voice in, and when to "stand out" more clearly.

These abilities are fundamental qualities for stage musicians. They are not as obvious as "singing high notes" or "breath control," but they often determine whether a person can deliver a truly mature and well-coordinated performance in a chorus, duet, or with a band.

(5) Stage presentation: It's not about "standing there and singing," it's about "acting"

By graduation, students with formal training usually have developed a basic stage presence. Posture, breathing, and body language are all fundamental; they also learn how to evoke emotions, portray characters, and bring a song to life on stage. Facing an audience, stage lighting, and spatial awareness are all second nature to them.

These skills are not essential for ordinary people to sing pop songs, but they do make people with formal training appear more composed and stable in formal occasions, and more like "people who are used to standing on stage".

2.3 What happens when formally trained skills are applied to pop singing?

2.3.1 Overview

If we "project" the entire set of skills acquired by formally trained vocal graduates onto pop songs, a very intriguing phenomenon emerges: some of these abilities are extremely useful and can directly enhance the singing experience, while others may conflict with pop aesthetics, technical approaches, or even modes of expression.

We can break this down into two parts—

(1) Which abilities are bonus points?

(2) Which abilities actually become deductions? Why is this?

2.3.2 The "bonus" of formal training in popular fields

(1) Freedom of the larynx: a golden ability that is almost universal.

Whether it's pop, folk, R&B, jazz, or rock, as long as your larynx stays in place and your vocal mechanism is stable, you'll sound more relaxed and natural, without any strain or tension. High notes will be much easier to achieve, emotional expression will be smoother, and you're less likely to feel stuck in your throat, tire your voice, crack, or break your voice.

In this respect, those with formal training do have a clear advantage.

(2) Resonance optimization: making the sound "more listenable"

A professionally trained voice typically possesses a certain density and spaciousness, sounding stable, mellow, and neither dry nor harsh. The pressure distribution during vocalization is also more reasonable, so the voice does not sound forced. Although pop music doesn't emphasize "uniform timbre," all styles share a common goal: to sound comfortable, natural, and effortless.

In this respect, people with formal training often have an advantage.

(3) Articulation ability: clear, smooth, and without strained resonance

Formal training emphasizes that the connection between initials and finals in Chinese should be crisp and clear, but without compressing the oral cavity space or affecting resonance. This articulation method, which is "clear but not tight," makes the voice more stable and transparent, and also makes it easier for listeners to understand emotions and tone.

Especially in styles that emphasize narrative and emotional detail, such as love songs, folk songs, and R&B, a "clean but not pretentious" way of pronouncing words is a big plus.

(4) Rhythm sense / Pitch stability / Music reading ability

In terms of rhythmic stability, pitch control, and melody interpretation, classically trained singers are generally more consistent than the average person. Many pop singers also have rhythmic issues, but those with formal training are less likely to sing off-beat, and can quickly grasp the melody even for a song they're not familiar with; they also don't easily get flustered with pieces that have many sections or variations. These abilities are more effective than you might imagine in improving your "KTV performance," and while not the deciding factor, they are definitely very useful foundational tools.

(5) Stage presence/atmosphere expression

This depends to some extent on individual personality, but those with formal training will at least not panic when holding a microphone, nor will they become so nervous that they tremble or stand stiffly when they go on stage, nor will they be at a loss thinking, "How can I get the audience to accept me?" In situations where they need to face an audience, these experiences and habits are a hidden bonus.

2.3.3 The "minus factor" of formal academic training in popular fields

Here's the key point: many of the skills developed through formal academic training directly and structurally contradict pop singing. The problem isn't with the students; the training system simply wasn't designed for pop music.


(1) Unified aesthetics of timbre → conflict with popular "texture, graininess, and looseness".

Formal training has a clear overarching goal: regardless of the piece being sung or the pitch, the timbre must remain consistent, uniform, and full. However, the requirements of pop music are often different. It pays more attention to variation in timbre. Some songs need a sense of airiness, some need a cracked or hoarse voice, and some need to be sung casually, colloquially, and softly.

Therefore, classically trained students often encounter problems when singing pop songs: they sound "too formal" from the start, losing their naturalness; their relaxed parts sound too strained; and when trying to imitate pop singers, they always carry the habit of using a classical timbre. This is not due to a lack of technique, but rather a conflict between training goals and popular aesthetics.


(2) The articulation logic of foreign language singing methods contradicts the popular "colloquial" style in Chinese.

Professional articulation training places great emphasis on clarity: vowels must be stable, consonants must be accurate, and vowel shapes must not be distorted. But Chinese pop songs express emotion almost entirely through a spoken, conversational delivery.

The result is often that classically trained singers' pronunciation sounds too "square," their vowels too rounded, and their tone lacks lightness, fluency, and relaxation, easily giving the impression of being affected and pretentious when they start singing Chinese songs. This isn't due to a lack of ability; rather, the aesthetics of language are completely different.


(3) Technical "clean" → Emotional "too conventional"

Formal performance training emphasizes vocal stability, atmosphere creation, emotional control, and the prohibition of vocal falters. However, the charm of pop singing often lies precisely in its instability: moments of emotional outburst or vocal cracking feel more authentic, and a slightly rough voice can be quite captivating.

Therefore, students with formal training often appear "too stable" and "too clean" when singing pop songs, lacking the unique "human touch" of pop songs.


(4) Aesthetics in large spaces → Microphone culture incompatible in small spaces

Professional sound design primarily targets large, unplugged stages, emphasizing spatial sound field and the ability to "throw" sound out. However, pop songs are almost always sung right next to the microphone, requiring singers to be close to the microphone to control their breath, thus incorporating the microphone's spatial characteristics into the "sound design."

Therefore, when classically trained singers sing pop songs, their voices tend to be too loud, too dense, and too spacious, lacking that "face-to-face" feel with the microphone. This is another fundamental aesthetic contradiction.


2.4 Summary: Formal training and popular trends are actually two completely different systems.

If we were to condense the content of this chapter into a single sentence, it would be: the traditional vocal training system only provides one main line, and this main line is always designed for "bel canto/folk singing" rather than for popular music.

The goal of formal training is clear: to shape the voice into a unified, rounded, focused stage timbre with strong projection, allowing singers to complete their works independently in grand theaters or microphone-free spaces, relying on a set of strict technical specifications to maintain the stability of the sound.

The world of pop singing is completely different; it places more emphasis on style, personality, texture, colloquial expression, vocal register variation, and the "human touch" of the timbre. It also relies on the microphone's space to create sound effects.

Therefore, their technical directions do not essentially overlap, and in many ways they are even contradictory. Bel canto should be unified, while pop should be diverse; bel canto should be thick and stable, while pop should be light, soft, broken, or ethereal; bel canto should de-nasalize, while pop sometimes should have a nasal tone; bel canto pursues stable closure, while pop allows for and even requires breathy, raspy, and rough sounds; bel canto emphasizes "unified position," while pop emphasizes "emotional style brought about by positional changes," and so on.


Here, I'd like to mention a phenomenon that's very typical in pop music but has long been avoided in the formal academic system: the smoky voice.

In traditional bel canto or folk vocal training, a smoky voice is usually considered a vocal condition that needs to be guarded against: it means that the way the vocal cords are involved has changed, stability has decreased, controllability has weakened, and it may also bring long-term risks. Therefore, in the context of formal training, it is often not regarded as a "developable timbre," but rather as a byproduct that needs to be corrected or avoided during the training process.

But in pop music, the situation is quite the opposite. A smoky voice is often seen as a highly emotional and distinctive vocal signature. The roughness and imperfection it brings are instead interpreted as a realistic, fragile, overly forceful, or emotionally overflowing vocal state. This kind of voice does not strive for long-term stability, but rather serves a specific song, a specific emotion, and a specific style.

This is not to say that one side is "more scientific," but rather to illustrate that the formal music education system and pop music differ fundamentally in how they judge the value of a sound. What one system strives to avoid may be precisely the expressive resource that the other needs and cherishes most.


Therefore, only three fundamental skills from formal training can be seamlessly transferred to pop singing: a free larynx position, optimized resonance, and clear, natural articulation.

The remaining principles—strong projection, spacious aesthetics, dramatic emotional management, positional articulation, and a unified timbre—often have the opposite effect in front of a microphone, making the voice too bright, too harsh, too stable, and too "formal." This isn't a problem with formal training or the students themselves, but rather that the two systems initially pursued different goals.

More importantly, the vocal exercise system in traditional formal training was not designed simply "for the purpose of singing well." It is part of the entire formal training system, designed to establish a vocal pattern suitable for the theater space, unify timbre, and support the performance system of opera or folk singing, rather than to adapt to microphones, pop aesthetics, or the usage scenarios of ordinary people.

In other words, those traditional vocal exercises that last for tens of minutes or even hours every day are essentially designed to serve the entire structure of formal vocal training, rather than being an "independent singing technique." When applied directly to a pop music environment, their direct effect may be greatly diminished because of the differences in goals and styles, and they may even need to be re-understood and reapplied.


Many colleges and universities now offer majors like "Popular Singing" and "Contemporary Music Singing," seemingly tailor-made for popular singing styles. However, if you truly examine their underlying training logic, most still use the technical framework of the bel canto or folk singing systems, only leaning more towards pop in textbook selection, ear training content, and repertoire. In other words, although the course title has changed, the core methods of voice training have not completely departed from the traditional system. This is why many graduates of these formal pop programs still carry a strong trace of their formal training, rather than the natural, relaxed, and stylized voices we are used to hearing in pop music.


3. The Networked Transformation of the Professional Vocal Training System and the Learning Dilemmas of Ordinary People

3.1 Overview

If you've ever watched vocal instruction videos online, you've probably noticed an interesting phenomenon: as soon as the video starts, the comment section immediately fills with a bunch of technical terms, as if everyone knows a little bit about vocal theory, and each host speaks fluently and knowledgeably.

However, reality is often the opposite: there aren't many people who can truly sing smoothly, steadily, and naturally.

Why is this? Why is it that everyone has mastered a set of "professional-sounding" rhetoric, yet often fails to produce a relaxed, high-quality singing voice? Where do these terms come from, and why do they become the most common pitfalls for online self-learners?

To answer these questions, we must first go back to the source: what are the backgrounds of these online "vocal teachers"? What systems did they study? Why is that system effective for them but not necessarily for ordinary people?

3.2 Why does vocal terminology confuse ordinary people more the more they learn it?

If you've ever watched singing tutorials on Bilibili, Xiaohongshu, or Douyin, you'll notice a strange phenomenon: regardless of whether it's pop, R&B, folk, or karaoke improvement, the terminology used by the teachers is strikingly consistent—"chest resonance," "head resonance," "mixed voice ratio," "unified vowels," "breath support," "front resonance," and so on. Change the teacher, and you'll still hear the same familiar phrases.

This is not a coincidence, but rather because most online vocal teachers in China have highly consistent training backgrounds: they are almost all from a formal academic system, which has long been centered on bel canto and supplemented by folk singing. Regardless of whether these teachers later claim to teach "popular singing," their underlying technical thinking, training framework, and language system are almost entirely derived from the bel canto system.

A key characteristic of the bel canto system is that it relies primarily on "experience-based metaphorical language" rather than precise scientific language.

The terms "chest resonance," "raising the voice," "opening the throat," "unifying vowels," and "proportionally adjusting mixed voices" are not objective descriptions of body structure, nor are they strictly quantifiable action models. Rather, they are vague prompts developed by experienced teachers to guide students to produce certain vocal reflexes. They are "directions," not "actions"; they are meant to guide students to try towards a certain feeling, not to tell you how to "operate" your body.

So here's the question: why are students with formal training not harmed by these vague statements, while ordinary people become more anxious and confused the more they hear them?

The reason is simple: students with formal training are in an environment where "metaphorical language can be translated instantly"—if a student sings a line in the practice room and goes in the wrong direction, the teacher will immediately correct them; the teacher demonstrates once, and the student immediately tries it; if they stray from the mark, the teacher will promptly bring them back; if their sense of sound is vague, the teacher will change the language, the metaphor, or the method until the student "sings it correctly." In this intensive, error-correcting training environment, even the most ambiguous experiential terms will be "calibrated" by the teacher in real time, preventing ambiguity. Over time, students no longer care about the terminology itself, but instead develop their own vocal feel through continuous small adjustments.

But the situation is completely the opposite for ordinary people. Once these already vague metaphorical terms leave the classroom, they are further "literalized" as they spread across the internet: "looking up" becomes "looking up and pushing"; "opening up space" becomes "opening the mouth wide"; "mixing voices proportionally" becomes "squeezing chest voice and head voice at the same time"; and "front resonance" is understood as "pushing the voice into the nasal cavity."

These terms were originally just meant to indicate direction, but ordinary people mistakenly take them as direct physical commands. As a result, the more they practice, the more strained, constricted, and off-center their singing becomes, and the more unnatural their voice becomes.

This is the scene you often see in the comments section: everyone talks about resonance position, vocal tract space, and mixed voice ratio, but when they open their mouths, their voices are still tight, squeezed, flat, and strained.

Ultimately, experiential metaphorical language was never designed for an environment of "solitary practice, no demonstration, and no error correction."

The reason why formally trained students can grow with these terms is because they have a training structure that can continuously explain, demonstrate, and correct; but once ordinary people leave this structure, the terms quickly become traps that lead them astray.


Using traditional martial arts as an example can help us understand more intuitively why this "experience system + one-on-one error correction" cannot be separated from the master's on-site guidance.

The transmission methods of traditional martial arts are very similar to the formal vocal music training system: standing posture, power generation, breathing, and body movement all depend on the master watching over your posture and correcting it little by little. Beginners who practice on their own or in secret will find it difficult to understand the intention behind the movements. They can only imitate the appearance, resulting in stiff postures and incorrect force application, and in the end they may even damage their bodies. Therefore, true traditional martial arts emphasize skill that comes from being taught hand in hand by a master, not something that can be learned by self-study from a secret manual or a few terms.

Jin Yong's novels frequently emphasize this principle. For example, in *The Heaven Sword and Dragon Saber*, Zhou Zhiruo, lacking proper instruction, tries to learn the Nine Yin White Bone Claw technique quickly but ends up practicing it incorrectly. The woman in yellow, however, has the authentic lineage (formally trained) and can immediately see Zhou Zhiruo's problems, even demonstrating the correct technique herself—just as a vocal coach can immediately point out a student's vocal errors and demonstrate the correct pronunciation.

Although it's literary exaggeration, the underlying logic is quite realistic: without a master to correct your mistakes, even the best skills will be practiced incorrectly.

Vocal music follows the same structure. Bel canto terminology is originally intended for students who practice in front of a teacher. Once removed from the professional environment and lacking one-on-one demonstrations and immediate corrections, experiential language ceases to be a "guiding light" and becomes a "mystical riddle that leads astray." Vocal training is no longer about "adjusting the body" but becomes a self-destructive process of "becoming more and more tense and more and more confused the more one practices."

This is why formal training is effective, while ordinary people often find it much more difficult to learn online by relying on terminology, and may even get worse with practice.


3.3 The inherent mismatch between the learning structure of ordinary people and formal training

As I mentioned in the previous section, it's easy for ordinary people to go astray when they try to learn directly from formal theoretical knowledge. But even when they adopt what seems to be the best learning method—paying for one-on-one private lessons—the results aren't always as obvious as expected. This isn't because the teachers are unprofessional or the students aren't hardworking; rather, it's because the learning environment for ordinary people and the formal training system have fundamentally different structures from the ground up.

First, there's the huge difference in training intensity. Students in formal vocal training spend long hours in the practice room every day. If they sing a line incorrectly, their teacher corrects them immediately, and if their vocal technique goes astray, they're instantly brought back on track. Through this constant cycle of singing, correction, adjustment, and consolidation, correct muscle memory and vocal reflexes are truly established. In contrast, ordinary people typically have one private lesson a week. They might seem to understand it in class, but once they get home, the environment changes, their attention wanders, there's no one to correct them, and they quickly revert to their old ways. The core of vocal training is never just understanding what you hear, but rather "being repeatedly corrected until the body memorizes it." Without high-density feedback, the reflex mechanism cannot take root.

Secondly, there's the difference in training methods. Formal training is immersive and systematic; all vocal exercises, body awareness, resonance, breath control, and repertoire serve the same long-term goal. Private coaching for the average person, however, is more like "point-based teaching"—the teacher needs to address the most obvious problems within a limited time, helping you sing a small part less incorrectly, but rarely rebuilding the entire vocal system from the ground up. Students may feel "somewhat better," but this improvement is often unstable and difficult to sustain.

The end result is a familiar pattern: you can sing well in class, but gradually drift off course over the following week; your voice gets closer and closer to the correct sound, but you never really "get into" it; you see some improvement when practicing songs, but overall you still lack stability and freedom.

This isn't a problem with the teacher or the student; it's determined by the very nature of vocal skills. It requires frequent error correction, continuous stimulation, and long-term accumulation, an environment that is difficult for the average person to provide.

In summary, one-on-one personal training can indeed prevent directional errors and save you a lot of time and effort; however, it cannot replicate the deep structural changes brought about by the immersive training of formal academic programs. For most people, personal training is more like an "optimal solution under limited conditions" rather than a "replacement for formal training."


Certainly, one-on-one private coaching remains the most direct and effective way to accelerate progress. Teachers can immediately point out deviations that you cannot hear, demonstrate the "correct feeling" directly to you, and help you avoid many detours in a short period of time.

However, there are several realities that need to be addressed: First, private tutors cannot replicate the long-term, high-frequency, and immersive training of formal training for every student—limited class time and varying post-class practice environments and frequencies significantly impact results; second, students have different starting points (vocal quality, body coordination, auditory sensitivity, etc.), so even with the same methods, their progress will differ; finally, the differences in the teacher's teaching methods and expression also affect the outcome—a good teacher can transform metaphors into actionable exercises, while a less skilled teacher may only be able to "sing a few songs a little more smoothly."

Therefore, the conclusion is that private coaching is very worthwhile because it can significantly shorten the learning period and reduce incorrect practice. However, it is not a panacea. To turn it into a truly effective driving force, it still requires a reasonable frequency of lessons, clear practice tasks, high-quality self-practice outside of lessons, and careful selection of the teacher's teaching style.


4. How can ordinary people learn to sing?

4.1 Why is "self-awareness practice" most suitable for ordinary people?

For ordinary people, the biggest obstacle to learning to sing is never their vocal condition, but rather the lack of what formally trained students have: a high-frequency, intensive training environment with constant error correction. Students with formal training practice daily with teachers' demonstrations and immediate feedback, and over several years they gradually develop an extremely sharp ability: if there is any deviation in vocalization, the body can detect it and adjust immediately. This ability is not innate, but rather shaped by the environment.

Ordinary people cannot replicate this environment. Therefore, if they follow the formal vocal training method or rely solely on video tutorials for self-study, they can easily fall into the predicament of "understanding what they see but not being able to do it correctly, and practicing for a long time but not seeing results." It's not that the method is wrong, but rather that the learning conditions and the logic behind the system are mismatched.

Given this reality, what's most worthwhile for ordinary people to invest in isn't complex skills or a pile of high-sounding jargon, but developing body awareness. Self-awareness here refers to your ability to clearly perceive whether your throat is suddenly straining, whether your neck and shoulders are starting to stiffen, whether your breath is being cut off, and whether your voice is being pushed, blocked, or suppressed.

When you can keenly capture these signals, you can stop the damage in time during practice and bring your voice back to a natural and relaxed track.

In this sense, "self-awareness practice" is not a technique, but the fundamental ability that lets ordinary people improve steadily when practicing without a teacher. Only by mastering awareness can you know whether you are practicing correctly, and only then can you make sense of the three core training goals described in the next section.

4.2 The core training objectives are "free larynx position, optimized resonance, and articulation".

Ordinary people naturally lack a training environment with high-frequency feedback, so singing practice must "focus on the key points." In pop singing, the real key to making a person's voice natural, stable, effortless, and pleasant to listen to is not formal courses like vocal register switching, stage presence, or dramatic lines, but rather the three most basic and often overlooked things:

  • Free larynx position: the larynx is neither suppressed nor strained; this is the most crucial condition for achieving relaxation.
  • Resonance optimization: make the sound more transparent, round, and natural, rather than relying on pushing.
  • Articulation: make your voice "forward, bright, and clear," and support your expression.

These three items are best suited as the main training path for ordinary people because they all rely on a common judgment standard: is the body relaxed, and is the breath flowing naturally? In other words, as long as you can perceive changes in your body's tension and relaxation, you can determine whether your training is progressing in the right direction. This is also entirely consistent with the core idea of the previous section: for ordinary people practicing singing, the most important thing is not piling on techniques, but whether they can feel the changes in themselves and take feedback from their own bodies.

Once you've identified these three core objectives, you won't be overwhelmed by a deluge of terminology, schools of thought, and techniques, nor will you fall into ineffective practice. Next, you simply need to build a stable and executable training process suitable for the average person.

4.3 What is "relaxation" in singing? (The sole criterion for judging the correctness of all training)

If you ask me, "What is the most important skill for an average person learning to sing?" I wouldn't answer pitch, nor would I answer breath control or resonance. I would say it's relaxation.

The sense of relaxation mentioned here is not about lying flat, feeling weak, or being limp and disorganized; rather, it's a stable, natural, and smooth physical state. It's both a state and a criterion for judgment, the "master switch" of the entire learning system. Many people think singing is difficult because of its complex techniques; in reality, the real difficulty lies in the fact that once you enter "singing mode," your body instinctively tenses up: your neck involuntarily tightens, your throat rises, your chin stiffens, and your breath gets stuck in your chest. This tension isn't a technical issue, but rather an instinctive bodily reaction. And once tension sets in, the voice becomes narrow, constricted, and strained, making singing increasingly difficult. The opposite of this state is relaxation.

In fact, the physical state we naturally adopt when we speak is the closest model to the correct singing state—when we speak normally, our throat is natural, our breath is smooth, our muscles are relaxed, and there is no unnecessary control; you won't hold your breath to say "I'm back," nor will you lock your throat to increase the volume, and you won't stiffen your face to express emotions.

This is also the core concept emphasized in the first article of this series (see: Spiritual Nourishment Series: Awakening of the Voice (Part 1): The Instrument Within the Body, Rediscovering the Essence of Singing): singing is not about "creating sound," but about maintaining a natural state similar to speaking, so that the body does not disrupt it. In other words, the key to singing is not "doing it right," but "not doing it wrong."

Beginners often wonder: "Did I sing well today? Can I hit that high note? Does it sound like I have enough breath support?" These judgments mostly rely on hearing, which is often subjective and easily deceived. The body doesn't lie: once you tense up, your throat lifts, your breath catches, your voice thins, your neck and shoulders stiffen, and your face tenses—these are all signs that your body is on the wrong track. It's far more real than any metaphor, terminology, or online teacher's demonstration. By paying attention to your body's reactions, you can quickly determine if you've deviated from natural vocalization. Simply put, just ask yourself, "Is my body relaxed right now?" to know if you're on the right track.

High notes, in particular, highlight the importance of relaxation. Many people try to forcefully push and squeeze out high notes, but those who sing with ease find that high notes aren't "shouted out," "squeezed out," or "pushed up," but rather float up naturally, lifted by the breath. If your body remains relaxed while singing high notes, you're almost certainly on the right track; conversely, the more tense you are, the further off track you are. Relaxation is essentially the underlying operating system for high notes.

Learning to be aware of your body is key to moving from "external seeking" to "internal cultivation." Most people learning to sing tend to focus on the external: techniques, terminology, the teacher's analogies, and what others say they should do. But those who truly break through will draw their attention back to their body, observing whether their throat is relaxed, whether their breath is flowing smoothly, and whether they are using excessive force. Once this "observer" awakens within you, you can correct yourself and adjust what's wrong, without relying on a teacher to correct every phrase. This is the most precious ability in singing: understanding the language of the body.

Relaxation is the fundamental basis of all training. It is the only reliable and unwavering criterion. As long as the body remains relaxed, it's impossible to go astray; once relaxation is lost, even if the technique appears correct, the body will tell you "it's wrong." Therefore, in learning to sing, relaxation is not a result of vocal exercises, but rather the fundamental standard for measuring whether you are singing correctly. As long as you consistently feel a natural larynx position, free breathing, relaxed muscles, and a relaxed body, your singing is definitely developing in the right direction. In short, relaxation is the compass for all practice, the most reliable internal reference for ordinary people learning to sing.

4.4 How to achieve "laryngeal freedom"

If "relaxation" is the steering wheel of singing, then "laryngeal freedom" is the chassis of that car—all stable, natural, and smooth sounds are built on a state where the throat is not disturbed. Laryngeal position is not a "technique," nor is it something that needs to be deliberately controlled; it is more like a natural state where "it works automatically if you don't touch it."

The biggest problem for many singers is that they focus on the "voice" from the very beginning: trying to make it brighter, thicker, more stable, and stronger. And as soon as these thoughts arise, the first part of the body to tense up is the throat. Once the throat tightens, all high notes, resonance, and breathing immediately collapse. So they start trying to "press their throat down," "open their throat," and "make the sound rise"... These efforts only make their throats more tired.

To understand what "laryngeal freedom" is, we must first understand one fact: there was nothing wrong with the throat to begin with; it is our "wrong efforts" that create the problem.

When speaking, the throat neither descends nor rises; you don't consciously think, "Where is my throat when I speak?" Free larynx position essentially means keeping the throat during singing as close as possible to the "undisturbed" state of speaking. For truly skilled singers, the throat feels almost exactly the same as when they speak: it moves when it should move, releases when it should release, and is supported by the breath when it needs support. That naturalness is key.

Free larynx position is not about "fixing the larynx position," but rather a dynamic balance where "the throat is not under extra tension, not manually controlled, and not emotionally controlled." The more you try to "adjust it," the more it will resist you; the less you deliberately touch it, and the more you keep your whole body relaxed and stable, the more it will find its own position.

In other words: freedom is not about "doing," but about "not hindering."

So how should ordinary people practice? The most crucial step is actually just one: make your body remember that "your throat should be as light as when you're speaking." If your throat feels more tired while singing a melody than it does when you're speaking, then you're almost certainly doing it wrong. The method for judging whether you're singing freely is very simple: after singing a line of a song, does your throat feel tighter than when you speak? If so, then you're doing something wrong. Any technique must pass this check.

To achieve this, the most effective way is not to "specifically train the throat position," but to reduce damage in three ways: 1. Let the breath do more work and the throat do less work; 2. Open up the resonance position and prevent the sound from being squeezed at the throat exit; 3. Prevent the articulation action from spreading to the lower throat and pulling on it—when these three things happen at the same time, the throat will naturally be free.

You'll find that there are no "specific movements" at this point, nothing like "pushing your throat down 3 millimeters," because true laryngeal freedom isn't about movement, but about state. This is something ordinary people can do: you don't need to practice intensive vocal training for three hours a day like students with formal training. You just need to be constantly aware while singing: "Am I using too much force? Am I pushing the sound into my throat? Am I tensing up?" If you find something wrong, stop, let your body return to a relaxed speaking state, and then continue.

This is how larynx freedom is achieved: not through brute force training, but through one "unhurried moment" after another, accumulating over time. As long as the direction is right, it will gradually transform from an experience into a habit, and from a habit into a fundamental setting. Ultimately, you will find your voice becomes fluid, stable, and clear, which is the foundation of all advanced techniques.

I found the feeling of "laryngeal freedom" when I got stuck during a chorus of a song. I noticed the muscles on both sides of my neck were tense. After consciously trying to relax them, I suddenly found the feeling. So, ultimately, it was because I hadn't maintained the relaxation of my muscles. Out of curiosity, I did some research and found that the muscles on both sides of my neck are the sternocleidomastoid muscles, as shown in the picture below:

[Image: the sternocleidomastoid muscles on both sides of the neck]

When the sternocleidomastoid muscle exerts force, the head will be unconsciously pulled forward and upward, and your larynx is "suspended" by the soft tissues in the front of your neck. When the position of the head changes, these tissues that suspend the larynx will be tightened, and the larynx will feel like it is being pulled and cannot move up and down freely; when singing, you will feel that your throat is stuck and your voice becomes tight. You can also try to feel this when you encounter similar problems while singing.


When you truly achieve "larynx freedom," you'll discover a very noticeable side effect: your natural vocal range becomes almost entirely usable. Low notes that you previously couldn't get down to now descend naturally; high notes that were previously unreachable can now be sung easily. In other words, vocal registers that were previously "locked" due to throat tension or improper technique can now be produced freely and naturally. This isn't about extending your range itself, but about the vocal mechanism no longer being restricted, so that the body's inherent vocal potential is fully released. You can smoothly explore every note from low to high without getting stuck or straining.


4.5 Resonance Optimization: It's not about "finding the vocal cavity," but about guiding the sound along the most efficient route in the body.

For many singing learners, "resonance" has always seemed like a mystical concept. People say it determines timbre, and that it makes the voice brighter, more stable, and less strenuous, but when it comes to "finding resonance," it becomes a chaotic jumble of sounds between the nasal cavity, head cavity, and mask, leaving you confused after practicing for ages. In reality, the true meaning of resonance training isn't about finding a specific "cavity," but about guiding the voice along the most efficient route in the body—once the route is clear, the voice will naturally become clearer, more relaxed, and farther-reaching, closer to the "smoothness" of a professional singer.

What truly hinders resonance is never "not finding the right vocal cavity," but rather the various unnecessary tensions and blockages within the body: a slight tightening of the tongue root immediately causes the voice to drop; a contraction of the pharynx instantly muffles the tone; insufficient mouth space prevents sound waves from escaping; and raising the larynx traps the sound in the neck, making it neither bright nor stable. The essence of resonance is to clear these "blockages" one by one—not to make the voice "louder," but to make it "more flexible." When the body no longer obstructs the sound, it will naturally move towards a brighter, clearer, and more relaxed direction, instead of simply remaining confined to the throat.

For ordinary people who sing pop songs, the most obvious and direct change in resonance optimization comes from a very simple sense of direction: the sound travels forward, rather than falling backward, dropping, or being squeezed into the neck. You already know this feeling of "forward" when you speak in everyday life—when you speak to someone, your voice is directed towards them, not "shrinking" into your neck or chest. Singing requires this "natural forward-moving quality." When the sound begins to travel forward, the timbre immediately becomes cleaner, the sound is less muffled, and it's easier to penetrate, making it "present" even when not loud.

Many people believe that resonance requires "opening your mouth wide," but a true sense of space isn't achieved by stretching the mouth wide; it comes from "opening up" the space inside. The upper teeth, upper lip, pharynx, and soft palate each give a little natural "room," allowing the sound to flow smoothly outward. This space is subtle, yet crucial in determining whether the sound becomes brighter, more fluid, and more natural. It doesn't require exaggerated movements or forcing your mouth into a specific shape; it's like the feeling of "natural space in your mouth" when you speak, only a little more, and not the exaggerated, artificial "ah."

To determine if resonance is being optimized, a very simple method is to say a sentence casually, such as "The weather is nice today." Then observe whether the sound is forward, clear, whether the mouth opens naturally, and whether there is any strain in the throat. If even speaking tends to sound backward, sink, or be squeezed into the throat, then singing will inevitably follow the same "wrong path." Almost all stable singers start with a "sense of direction in speaking," gradually guiding their voice towards a more effortless path, rather than starting with abstract, theoretical vocal resonance.

Resonance isn't some mystical concept because it changes not your intention, but the path of your voice. When you smooth out this path, you'll immediately feel your tone become brighter, farther, and more stable, and the whole process happens almost naturally. Low notes become less muffled, mid notes less strained, high notes less overpowering, the tone of a song becomes more consistent from beginning to end, and your throat is no longer under unnecessary strain. For the average person, you don't need to pursue professional systems like "bel canto resonance," "head resonance," or "nasal resonance." Simply let your voice follow the most natural and effortless direction—that's the most practical and authentic way to optimize resonance.

When you begin to feel your voice no longer trapped in your throat, but flowing smoothly out from the front of your mouth; not strained, but gently lifted by your breath; not imagined, but with every part of your body making way—that's when you've truly entered the core stage of resonance training. Resonance isn't about piling on techniques, but a process of "removing resistance." As your voice begins to flow smoothly, you'll naturally understand that "brightness," "resonance," and "clarity" aren't the names of any particular technique, but rather the restoration of the natural vocal pathways your body inherently possesses in its most natural state.

4.6 Pronunciation

In singing, the core goal of articulation is to convey the lyrics clearly, naturally, and smoothly. Many beginners unconsciously focus all their attention on the sound itself when practicing, ignoring the impact of the pronunciation process on timbre and expression. In fact, correct articulation is not just about pronouncing the sounds accurately, but more importantly, about making the initials and finals connect naturally and smoothly, while keeping the lips, tongue, and jaw relaxed.

One of the key principles is a rapid transition from the initial consonant to the final vowel. The initial consonant is the starting point of pronunciation, while the final vowel carries pitch and timbre. If the initial consonant is held for too long or pronounced with too much force, or the arrival of the vowel is delayed, the sound will come across as stiff or strained. Beginners can imagine the initial consonant as the opening of a water pipe, and the final vowel as the flow within the pipe. Only when the initial consonant transitions smoothly can the sound flow freely without any obstruction or abruptness.

During practice, special attention needs to be paid to relaxing the mouth and tongue. Tight lips, a stiff tongue tip, or a hard jaw will directly interfere with the free flow of sound, resulting in difficult pronunciation or a harsh tone. A way to determine if you are doing it correctly is to ask: when pronouncing each word, do the muscles of the mouth and tongue remain naturally relaxed? Is there any extra force used? If you feel tense or stuck, it means your movements need to be adjusted.

In addition, it is important to note the lightness of consonants and the stability of vowels. Consonants don't need to be forcefully "pushed out," but rather act like switches guiding the flow of sound, allowing the sound to smoothly transition into the vowel. The vowel, in turn, should remain continuous and stable, unaffected by the consonant. A simple practice method is to slowly break down each syllable, first gently pronouncing the consonant, then immediately transitioning to the vowel, repeatedly feeling whether the sound flow is smooth. Then gradually increase the speed until you can maintain a natural transition even at normal singing speed.

In general, the essence of articulation training lies not in deliberately controlling every mouth shape, but in finding a natural, relaxed, and smooth vocalization path through body awareness. Similar to the "relaxation" mentioned earlier, the key reference signals here are: whether the mouth, tongue, and jaw remain relaxed during pronunciation; whether the initial consonant transitions quickly and naturally to the final vowel; and whether the final vowel is stable and fluent. By following these signals, one can continuously self-correct during practice without relying on subjective listening or complex theories, and gradually master the clear pronunciation and natural expression required for popular songs.

4.7 The Relationship Between Vocal Training and Singing Practice

In vocal training, the relationship between "vocal exercises" and "singing practice" has always been a concern for many. Many textbooks and teaching methods emphasize the importance of vocal exercises, seemingly treating it as a prerequisite that must be completed before moving on to specific pieces. However, this logic primarily applies to formal training programs; ordinary people don't need to adhere to this order.

In traditional formal vocal training, vocal exercises are considered the core of fundamental training—through systematic vocal exercises, students can solidify their vocal mechanisms and improve vocal control. Students in formal programs have ample opportunities for one-on-one instruction, repeated practice, and continuous demonstrations daily, thus achieving two goals simultaneously in vocal exercises: first, mastering correct vocal techniques and breath support; and second, developing muscle memory and bodily sensitivity through repeated practice. Only after vocal exercises have achieved a certain level of effectiveness do students formally enter the singing practice stage, applying the learned techniques to specific pieces. In other words, vocal exercises and singing practice are strictly separated in the formal training system, and their order is scientifically arranged to ensure efficiency and effectiveness.

For ordinary people, the situation is entirely different. They lack the high-density time and continuous professional supervision of formal training, so there's no need to strictly distinguish between vocal exercises and singing practice. In fact, singing practice is the most natural and efficient way to train one's voice: by applying breathing, larynx position, resonance, and articulation techniques in specific songs, the body gradually develops muscle memory through repeated practice and masters the correct vocal mechanism. At the same time, singing practice provides immediate feedback, helping learners adjust their vocal state in a timely manner—precisely the "self-awareness" that ordinary people lack most.

In other words, for ordinary people, vocal training is no longer a necessary prerequisite, but can be integrated into the singing practice process. As long as you pay attention to maintaining a free larynx position, natural resonance, and accurate articulation while practicing singing, and are always aware of the relaxation in your body, every performance can serve as vocal training. This saves time and is more practical, allowing skill improvement to be directly linked to song performance, without having to spend a lot of time on separate vocal exercises like students with formal training.

Understanding this is crucial for ordinary people to develop cost-effective self-practice plans.

5 Conclusion

In this article, we attempt to rethink the concept of "learning to sing" from the perspective of an ordinary person: you don't need lengthy formal training, nor do you need to master those seemingly sophisticated terms, and you certainly don't need to practice your voice for hours every day. What you really need is sensitivity to your own body, plus focused practice on the three most crucial things in pop singing: free larynx position, optimized resonance, and improved Chinese pronunciation.

I hope to clarify a few key concepts through this article:

First, the core of singing is not "producing sound", but maintaining the body's natural vocalization.

When you learn to listen to your body instead of words, tension, smoothness, and the path of power will become clear and identifiable. The body is the most honest feedback system; it can tell you more directly than any terminology where you are doing things right and where you are going wrong.

Secondly, achieving freedom of throat position, resonance space, and clear articulation is the most effective way for ordinary people to improve.

These three things are specific and practical enough to minimize the chances of going astray without a teacher's guidance. They are not "all-encompassing skills," but they are the core foundation for achieving a natural, stable, and pleasant singing style in pop singing.

Third, vocal training and singing practice are a system, not separate actions.

For the average person, vocal exercises are meant to aid singing practice, making singing easier and less strenuous, not to work through some kind of "technique list." Formal training emphasizes extensive vocal exercises because its students have ample time, a suitable environment, constant error correction, and clear goals, while the practical path and needs of the average person are completely different.

Finally, I would like to emphasize that this article is about direction, not instructions; it's about understanding, not formulas.

Once you understand these underlying principles, you'll no longer get lost amidst a sea of terminology, nor will you waste time and energy following the wrong tutorials or practicing in the wrong direction. You'll be able to refocus your attention on your body, your voice, and the song you truly want to sing well.


📚 Series: Awakening of the Voice (4 / 6)



All blog content is original; please indicate the source when reprinting! The blog's RSS address is https://blog.tangwudi.com/feed, and you are welcome to subscribe; if necessary, you can join the Telegram Group to discuss problems together.