The Awakening of Sound: Fundamentals (Part 3): Visualization of Interval Structure and Historical Conventions

1. Starting with the black and white keys: the smallest structural unit of the music system

In a previous article (see The Awakening of Sound: Fundamentals (Part 1): Pitch Structure and Auditory Stability), I mentioned an important fact: the interval of a fifth describes how close or distant pitches sound to one another. For example, C and G, or G and D, naturally sound closely related; the piano keyboard, however, does not arrange the pitches in this auditory order.

Instead, the keyboard projects the note names (C, D, E, F, G, A, B) onto a linear row of keys in a way that suits two-handed playing and memorization. The seven most frequently used and most easily integrated note names are retained as the basic framework of the white keys; the remaining pitches are inserted as black keys, balancing playability and scale integrity within a limited physical space.

However, once you actually look closely at the keyboard, a question inevitably arises: why do the white keys keep repeating in the order C–D–E–F–G–A–B–C′? And why is a black key always missing between E and F, and between B and C′ (the C of the next group)? If we only consider "ease of playing," all white keys are visually uniform in width and spacing, and on that premise these two "gaps" do seem somewhat counterintuitive. But when we return to actual hearing, we discover something else: the notes in the pairs E–F and B–C′ are noticeably closer together than any other pair of adjacent white keys. Only a semitone is needed to pass from one note to the next, so there is no room for a black key in between. This implies a crucial fact: the white keys are not equidistant.

Although the piano keyboard is a linear physical structure, the pitch relationships it carries are unevenly distributed among the white keys from the very start. If we treat the white keys from C up to the next C′ as one complete span of observation, a stable, recurring pattern appears:

  • C–D: whole tone
  • D–E: whole tone
  • E–F: semitone
  • F–G: whole tone
  • G–A: whole tone
  • A–B: whole tone
  • B–C′: semitone

In other words, in this structure of seven pitches that closes with a return to the octave, the semitones always fall after the third and seventh degrees. This is not a clever design feature of the keyboard, nor a compromise made to "look neat," but the direct result of a deeper structure: the white keys of a piano do not present an arbitrary combination of seven notes, but a complete, closed, and self-consistent scale structure.
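The pattern can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes the common convention of numbering the twelve pitches of an octave from 0 (C) to 11 (B), and the names WHITE_KEYS and steps_between are invented for this example.

```python
# Semitone position of each white key within one octave, counting from C = 0.
# This numbering is a convention assumed here for illustration.
WHITE_KEYS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def steps_between(keys):
    """Return the semitone distance between each pair of adjacent white keys,
    including the final step from B up to the next C."""
    positions = list(keys.values()) + [12]  # 12 = the C one octave higher
    return [b - a for a, b in zip(positions, positions[1:])]

steps = steps_between(WHITE_KEYS)
names = ["whole tone" if s == 2 else "semitone" for s in steps]

for (note, _), name in zip(WHITE_KEYS.items(), names):
    print(f"step above {note}: {name}")

# Scale degrees (1-based) that are followed by a semitone.
semitone_after = [i + 1 for i, s in enumerate(steps) if s == 1]
print(semitone_after)  # [3, 7]
```

The two semitones fall out of the key positions themselves, after the third degree (E) and the seventh degree (B), with no special rule needed.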

From this perspective, the white keys are not "the first notes selected," nor are the black keys "the notes added later." The actual order is exactly the opposite—first there is a specific way of organizing intervals; the white keys are simply the simplest visual representation of this organization on the keyboard; the black keys appear to fill in the remaining possible pitch positions without disrupting this framework.

That is precisely why the notes C–D–E–F–G–A–B–C′, without relying on any sharps or flats, naturally form a structurally complete and audibly stable scale model. This is the core point I emphasized repeatedly in "The Awakening of Sound: Fundamentals (Part 1)" and "Fundamentals (Part 2)": a stable listening experience does not come from a single note, but from the smallest closed structural unit formed by the fixed whole-tone and semitone relationships between notes.

2. From Structure to Examples: The Development of C Major and Minor Scales

2.1 Natural Major Scale: The Cleanest Structural Example

In the previous chapter, we saw that the white keys of a piano do not present seven randomly selected notes, but rather a highly stable and recurring structural model of internal interval relationships. It is this structure that makes us naturally perceive a tendency towards "stability," "returning to the starting point," and "ending" in our ears.

In "The Awakening of Sound: Fundamentals (Part 1)" and "The Awakening of Sound: Fundamentals (Part 2)," I mentioned the two most basic scale forms, "major" and "minor," several times. Here I will not repeat their definitions, but look at them from a more intuitive angle: if one had to choose the example on the keyboard that depends least on sharps and flats, is the most intuitive, and best represents this interval structure, then C major would be an almost unavoidable starting point.

When the interval pattern summarized in the previous chapter (whole, whole, half, whole, whole, whole, half) is mapped directly onto the white keys of a piano, the resulting set of notes is C – D – E – F – G – A – B – C′. There are no sharps or flats, and every pitch falls exactly on a white key.

This is not because the note C is "special," but because here the interval structure unfolds in full with minimal modification.

This is precisely why C major is so often used as a "reference model": not because it is more advanced, but because it reveals the structure itself with the least reliance on external modifications.
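As a rough illustration of "minimal modification," the sketch below walks the whole-whole-half-whole-whole-whole-half pattern upward from a chosen tonic and spells each note with the next letter name. The function build_scale and its helper tables are hypothetical names made up for this example, and the tonic G appears only as a contrast case that the text itself does not discuss.

```python
LETTERS = ["C", "D", "E", "F", "G", "A", "B"]
NATURAL = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]  # whole-whole-half-whole-whole-whole-half

def build_scale(tonic, steps):
    """Walk up from a natural-note tonic, using each letter name exactly once.
    Accidentals are written as '#' (sharp) or 'b' (flat)."""
    start = LETTERS.index(tonic)
    pitch = NATURAL[tonic]
    scale = [tonic]
    # The last step only returns to the tonic, so it adds no new note.
    for i, step in enumerate(steps[:-1], start=1):
        pitch += step
        letter = LETTERS[(start + i) % 7]
        # How far the required pitch lies from the letter's white key,
        # folded into -6..5 so that -1 means one flat and +1 one sharp.
        offset = (pitch - NATURAL[letter] + 6) % 12 - 6
        accidental = "#" * offset if offset > 0 else "b" * (-offset)
        scale.append(letter + accidental)
    return scale

print(build_scale("C", MAJOR_STEPS))  # ['C', 'D', 'E', 'F', 'G', 'A', 'B']
print(build_scale("G", MAJOR_STEPS))  # ['G', 'A', 'B', 'C', 'D', 'E', 'F#']
```

Starting on C, the pattern lands on white keys only; starting on G, the same pattern already needs one altered note. The structure is identical in both cases, and C simply shows it without any accidentals.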

From an auditory perspective, the stability of this scale does not come from any particular note, but from the following facts: the distance between the tonic and the other notes is uneven; the semitones are fixed after the third and seventh degrees; this unevenness, on the contrary, constitutes the strongest sense of direction and cessation.

In other words, the reason we "want to stop" on C is not because C itself has any magic, but because the entire set of intervals constantly pushes the auditory center of gravity back to this point.

2.2 Another form of parallel stability: C minor

If we still start on C but reorganize the intervals, we get another structure that is equally complete yet sounds significantly different: C – D – E♭ – F – G – A♭ – B♭ – C′, the C minor scale.

In C minor the scale still starts from C, but the distribution of whole tones and semitones has changed to whole, half, whole, whole, half, whole, whole. The result: the tonic remains the same, but the pitch set has changed, and the form of stability has changed with it. This is precisely the core difference between the "parallel major" and the "parallel minor."

It is important to emphasize that this difference is not a matter of emotional labels such as "bright" or "melancholy," but a more fundamental fact: the distance between certain notes and the tonic is shortened; certain stable points are weakened; and auditory tension is redistributed within the scale. From the keyboard's perspective, the piano remains unchanged; from an auditory perspective, the entire "route home" has become completely different.
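To see which distances are shortened, the two step patterns can be turned into distances from the tonic. This is a minimal sketch under the assumption that a whole tone counts as 2 semitones and a half step as 1; the function name distances_from_tonic is chosen here purely for illustration.

```python
from itertools import accumulate

MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]  # whole-whole-half-whole-whole-whole-half
MINOR_STEPS = [2, 1, 2, 2, 1, 2, 2]  # whole-half-whole-whole-half-whole-whole

def distances_from_tonic(steps):
    """Semitone distance of each of the seven scale degrees above the tonic."""
    return [0] + list(accumulate(steps[:-1]))

major = distances_from_tonic(MAJOR_STEPS)
minor = distances_from_tonic(MINOR_STEPS)
print(major)  # [0, 2, 4, 5, 7, 9, 11]
print(minor)  # [0, 2, 3, 5, 7, 8, 10]

# Degrees (1-based) that sit closer to the tonic in the minor pattern.
lowered = [degree for degree, (a, b) in enumerate(zip(major, minor), start=1) if b < a]
print(lowered)  # [3, 6, 7]
```

With C as the tonic, the third, sixth, and seventh degrees are exactly the notes that change from E, A, B to E♭, A♭, B♭; every other degree keeps its distance from C.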

This also confirms the conclusion of the previous chapter: the keyboard is merely the outer shell that supports the structure; what truly determines the listening experience is always the interval relationships themselves.

2.3 Another reading of the same set of white keys: A minor

Things get more interesting if we leave every pitch unchanged for now and simply shift the "auditory focus." We still use the seven white keys, C – D – E – F – G – A – B, but instead of taking C as the auditory starting and ending point, we let the sense of stability fall naturally on A. A new scale form then emerges: A – B – C – D – E – F – G. This is precisely the relative minor of C major, A natural minor. (For ease of understanding, we will not distinguish between specific octaves here, but only focus on "which notes are used.")

Here, the pitch set is completely consistent, but the auditory experience has changed significantly. The reason is not complicated: the point of stability has shifted; the direction of tension within the scale has been reinterpreted; and "where it ends" is no longer determined by the note name, but by the auditory organization.
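The shift of center can be sketched as a simple rotation of the same seven notes. The helper names rotate_to and step_pattern below are invented for this illustration, and the semitone numbers again assume C = 0.

```python
C_MAJOR = ["C", "D", "E", "F", "G", "A", "B"]
SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def rotate_to(scale, center):
    """Reorder the same notes so that `center` comes first."""
    i = scale.index(center)
    return scale[i:] + scale[:i]

def step_pattern(scale):
    """Semitone steps between neighbours, closing the loop back to the first note."""
    closed = scale + [scale[0]]
    return [(SEMITONE[b] - SEMITONE[a]) % 12 for a, b in zip(closed, closed[1:])]

a_minor = rotate_to(C_MAJOR, "A")
print(a_minor)                        # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
print(set(a_minor) == set(C_MAJOR))   # True: the pitch set is unchanged
print(step_pattern(C_MAJOR))          # [2, 2, 1, 2, 2, 2, 1]
print(step_pattern(a_minor))          # [2, 1, 2, 2, 1, 2, 2]
```

No note has been altered, yet the step pattern measured from A is the whole-half-whole-whole-half-whole-whole pattern of the natural minor. Only the choice of center differs.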

From a structural perspective, this is a crucial expansion. It demonstrates that a musical scale is not just about "which notes are used" but, more importantly, about "how these notes are understood around a center." It is in this sense that tonality, harmony, and the sense of belonging of a voice part begin to have a basis for discussion.

Note: Since "relative minor keys" exist, there are naturally corresponding "relative major keys." By definition, the two simply draw on the same set of pitch materials but choose different auditory centers. In the context of this article, however, introducing "relative major keys" would bring no new insight and might instead pull attention toward the naming system itself. Rather than the symmetry of the terminology, this article is concerned with how the music changes in stability, tension, and direction when the auditory focus shifts. This concept will therefore not be elaborated for now, and will be taken up again later in a more appropriate context.

2.4 Summary: A musical scale is not a set of note names, but rather an auditory order

Looking back at this chapter, we have consistently focused on a more fundamental question: when a set of sounds is organized into a scale, how exactly does hearing arrive at a sense of stability?

Regarding this issue, we observed three seemingly similar situations that yielded drastically different auditory results:

1. C major: In the most intuitive physical layout of the white keys, the interval relationship of whole-whole-half-whole-whole-whole-half unfolds naturally, with C becoming a stable starting and ending point, and the scale presents a complete and clear sense of center.

2. C natural minor (parallel minor): With the tonic unchanged, the interval structure is adjusted, and the distribution of stability and tension changes accordingly. The same tonic C can have a significantly different auditory character.

3. A Natural Minor (Relative Minor): The pitch set remains unchanged, still consisting of the seven white keys, but when the auditory focus shifts from C to A, the original order is reinterpreted, and a new stable point naturally emerges.

These three examples reveal a key fact: a musical scale is not simply a "collection of note names," but an ordered structure built around a center of auditory perception. The arrangement of the white keys is merely a projection of this structure onto the keyboard. What truly determines "where we stand" and "where we go back" is not physical position, but the relationship between the notes.

That is why, when we talk about scales, we are never just talking about "which notes are used", but how these notes are organized in the auditory sense, how they form a center, and how they create expectations.

Only after understanding this level can we move forward: on top of these scale structures we will introduce vertical pitch combinations and see how the auditory order changes when multiple notes sound simultaneously.


As a side note, here is some historical background: before the modern keyboard system was developed, the cyclic order of note names was A–B–C–D–E–F–G. This arrangement is more in line with language habits and was also easier for early notation and singing.

The reason the modern piano keyboard arranges its white keys starting from C is not that the note-naming system itself changed. Rather, in the course of later development, people reorganized the physical arrangement of the keyboard in order to satisfy convenience of performance, visual regularity, and intuitive presentation of scale structure all at once. In this process, C major, which presents the core interval structure of "whole-whole-half-whole-whole-whole-half" in full without any sharps or flats, became the most intuitive and stable reference model, and played a key role in establishing the final form of the piano's white keys.

However, it needs to be emphasized that the white keys themselves are not "serving C major." They are better seen as a basic pitch framework built around the cycle of note names, independent of the choice of tonic. Therefore, when A minor is mapped onto the same set of white keys, the auditory focus falls on A and the modal color changes accordingly, yet the cyclic order of the note names does not conflict with the keyboard layout. The two simply represent different trade-offs, made in response to historical evolution and practical needs, at the "structural layer" and the "auditory function layer" respectively.

Similarly, other church modes such as Dorian and Phrygian can also be built on the same set of note names and the same keyboard structure; the differences lie mainly in the choice of tonic and in how the intervals are consequently distributed around it. Since the underlying principles are largely the same, they will not be elaborated upon here.


3. Chords: The formation of vertical structure

3.1 Where do chords come from?

In the previous two chapters, we repeatedly discussed scales. Whether it is the arrangement of the white keys or the structural differences between major and minor keys, they all essentially answer the same question: in a series of consecutive pitches, which notes are selected to form a stable auditory order around the tonic? But up to this point, the discussion has remained under one premise: the sounds appear in sequence.

Melody, whether simple or complex, is the unfolding of sounds along a timeline: one note after another, with the relationships between them determining direction, tension, and sense of belonging. However, by slightly altering this premise, a new phenomenon immediately emerges. When two, three, or even more notes no longer appear one after another but are played at the same moment, the auditory experience changes significantly.

Some combinations of notes sound exceptionally natural, stable, and harmonious; even without analysis, their presence is accepted subconsciously. Other combinations, even if all the notes come from the same scale, sound tense and uneasy, as if suspended in mid-air, awaiting a "resolution" in some direction. This difference does not depend on the complexity of the melody; even when simply pressing a few notes together, most people can immediately distinguish which combinations "stand firm" and which "need to continue."

This shows that when the relationship between notes changes from a "sequential relationship" to a "simultaneous relationship", music enters a new structural level. More importantly, this stability and instability are not accidental.

If we return to the familiar scale structure, taking the C major scale as an example, the seven notes formed by the white keys already contain a clear but uneven distribution of intervals. When certain notes are played simultaneously, the distances between them naturally create a sense of fusion, tension, or tendency in the ear. In other words, not all simultaneous sounds are equivalent.

In practice, people quickly discovered that some combinations were used and accepted again and again, while others were rarely used as focal points. On the basis of this practice, the combinations within a scale that are stable in sound, clear in structure, and easily perceived as a whole were gradually singled out and became an independent and important structural unit in music. The "chord" we talk about today is a summary of, and a name for, this phenomenon.

From this perspective, chords are not a concept "invented" out of thin air, nor are they rules artificially imposed on musical scales. They originate from a very simple fact: when multiple notes of a scale are used simultaneously, the auditory system actively filters them.

Therefore, chords are not the opposite of scales, but rather the natural unfolding of scales in another dimension—if scales describe how notes are arranged in time, then chords discuss how these notes cooperate with each other at the same moment. Understanding this, subsequent discussions about the construction, functional tendencies, and even more complex harmonic progressions of triads are merely further refinements and extensions of this fundamental fact.

3.2 Why "three notes": The most stable vertical structure

In the previous section, we observed a crucial phenomenon: when multiple notes are played simultaneously, the auditory system does not perceive them as a "chaotic superposition," but actively judges which combinations are more stable and which carry tension. This naturally leads to the following question: how many notes are needed to form a relatively complete vertical structure that hearing can recognize on its own?

Let's start with the simplest case: when two notes sound simultaneously, the core auditory focus remains on the interval relation between them. Whether it is a perfect fifth, a perfect fourth, a major third, or a minor third, it is essentially still "the distance between one note and another." Such a combination can be stable or unstable, but it remains at the level of "relationship," not "structure." In other words, two notes describe tension more than they establish a sense of belonging. They can serve melody or counterpoint well, but they can hardly stand alone as a "landing point" or "center."
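The four intervals named here can be expressed as plain semitone counts. The sketch below assumes the standard sizes (minor third 3, major third 4, perfect fourth 5, perfect fifth 7 semitones) and the numbering C = 0; the function interval_up is a hypothetical helper, and the note pairs are picked only as examples.

```python
SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# Only the four intervals mentioned in the text are named here.
INTERVAL_NAMES = {
    3: "minor third",
    4: "major third",
    5: "perfect fourth",
    7: "perfect fifth",
}

def interval_up(low, high):
    """Semitones from `low` up to the nearest `high` above it (within one octave)."""
    return (SEMITONE[high] - SEMITONE[low]) % 12

for low, high in [("C", "G"), ("C", "F"), ("C", "E"), ("A", "C")]:
    size = interval_up(low, high)
    print(f"{low}-{high}: {size} semitones, {INTERVAL_NAMES.get(size, 'other')}")
```

Each result is a single number describing one distance, which is the sense in which two notes remain a "relationship" rather than a "structure."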

When a third note is added, the situation changes fundamentally. The three notes are no longer simply the sum of their intervals, but begin to form an internally self-consistent whole. At this point, hearing no longer tracks each note individually, but automatically integrates them into a unified auditory object. Taking the C major scale as an example, when the notes C, E, and G sound simultaneously, we hardly perceive their individual pitches separately any more; instead we hear a "solid, unified whole" that clearly points to C without forcing our hearing to move further forward. This is crucial: for the first time, three notes allow the ear to experience a "sense of center" and a "sense of completeness" at the same time.

Structurally, this is because a three-note combination can satisfy three conditions at once: there is a definite reference note (the root), there is a note that determines the color (the major or minor third), and there is a note for support and extension (the fifth). Even without any theoretical terminology, hearing naturally makes this judgment: "This combination is complete and can stop here."

In contrast, if the number of notes is increased further, such as by adding the seventh or ninth note, the auditory perception will not become more stable. Instead, it often introduces new tension and directionality. These notes are not useless, but they are more like adding decoration, propelling the direction, or creating suspense on an already established structure.

Therefore, from the perspective of long-term musical practice, three notes constitute a minimum, yet sufficiently complete, vertical structural unit. It is neither a hastily constructed temporary combination nor an overly complex stacking, but rather achieves a very natural balance between stability, clarity, and recognizability. Precisely because this structure is audibly complete, the combination of three notes is repeatedly used and has gradually become the most basic building block in vertical sound organization.

However, it is important to note that "complete" is not the same as "identical". Even if they are composed of the same number of notes, these three-note structures will not be treated equally in actual hearing.

When they are placed back into the context of a specific scale, some are naturally heard as "a place where you can stop," while others seem to be born with a sense of direction, pointing to a point that has not yet appeared. It is this difference that makes a structure of three notes no longer just an abstract, stable combination, but something that begins to assume different musical roles.

Next, we need to take a closer look at how these roles came about.

3.3 Why do three notes sound different to each other?

In the previous section, we established that a three-note structure is the smallest, yet sufficiently complete, vertical unit in terms of auditory perception. However, this does not mean that all three-note structures are auditorily equivalent.

If we still use C major as the background and stack the notes in the scale upwards in thirds, we get multiple combinations of three notes. Structurally, they all meet the condition of "three notes," but in actual listening, they exhibit distinctly different tendencies.

For example, when the notes C, E, and G appear simultaneously, most people have a very clear feeling: the sound can stop here and doesn't need to continue. Even if the melody ends here, there is no suspense or discomfort in the ear. This sense of stability is not because there are more or louder notes, but because this combination highly coincides with the center of gravity of the scale.

However, if the same approach is applied to another group of notes, such as G, B, and D, the situation changes. Although these are still three notes from the same scale and structurally sound, the auditory perception often creates a feeling that "it's not over yet." The combination seems to be pointing to a point that has not yet appeared, rather than becoming the end itself.

If you change the set of notes, such as F, A, C, the listening experience will be different again. It doesn't have the strong sense of termination like C–E–G, nor does it urgently demand a resolution like G–B–D. Instead, it presents a state in between, which can pause temporarily while retaining room for further development.

These differences are not coincidences of subjective perception, but rather stem directly from their positional relationship within the scale. Although the three-note structure is complete vertically, their distance from the tonic and their connection to the scale's center of gravity still subtly influence auditory judgment.

This is why hearing doesn't treat all three-note combinations the same; instead, it naturally assigns them different "roles." Some are more like points of belonging, some more like points of advancement, and others serve as transitions and connections.

It is precisely because of the differences in the auditory perception of these three-tone structures that hearing does not regard them as equivalent units that can be arbitrarily interchanged. Even if they are equally complete in structure, they will naturally exhibit different tendencies in actual perception: some are more easily regarded as the point of belonging, while others are more like pushing hearing forward.

It is important to emphasize that this difference is not an acquired rule, but a spontaneous judgment formed by hearing when faced with different pitch positions. It is precisely at this level that we can distinguish, for the first time, that although both are structures composed of three notes, they do not play the same auditory role.

Up to this point, we are still at the level of "individual structures," discussing their individual tendencies and characteristics. Only when these structures with different tendencies are placed within time and connected to each other will they further form a more complex overall trend.

3.4 Intra-key chords: The development of three-note structures in the scale

In the previous sections, we established that chords are not simply a random stack of notes; rather, they are built upon a scale, forming a stable vertical structure centered on a "three-note framework." The next question then becomes: if we consider a specific key, how do these three-note structures appear?

To avoid introducing additional complexity, we will use the most intuitive and commonly used reference, C major, as the example.

The scale of C major is C – D – E – F – G – A – B. If we strictly follow the rule already discussed, stacking three notes within the scale by skipping one scale note each time, we obtain a set of triads derived entirely from the key. Starting from each degree of the scale, we can construct the following:

  • With C as the root note: C–E–G
  • With D as the root note: D–F–A
  • With E as the root note: E–G–B
  • With F as the root note: F–A–C
  • With G as the root note: G–B–D
  • With A as the root note: A–C–E
  • With B as the root note: B–D–F

As you can see, these chords are not "a set of combinations designed by humans," but a result that follows naturally from the structure of the scale itself. As long as you stay in one key and consistently stack notes in thirds, these seven chords will almost inevitably appear.

Further observation reveals that while they appear similar in form, their stability is not entirely the same—some sound bright and stable, some are softer, and some are noticeably tense and uneasy. This difference is not accidental, but stems from the different interval relationships contained within each chord. It is precisely for this reason that, in tonal music, these chords have gradually been given different "positions" and "roles," rather than simply existing as isolated vertical structures.
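The seven triads and their internal intervals can be derived mechanically. In the sketch below, the helper names triad_on and quality are invented for illustration, and the labels "major," "minor," and "diminished" are the conventional names for these three interval combinations, which the text has not yet introduced.

```python
SCALE = ["C", "D", "E", "F", "G", "A", "B"]
SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def triad_on(degree_index):
    """Stack three scale notes, skipping one scale note between each."""
    return [SCALE[(degree_index + k) % 7] for k in (0, 2, 4)]

def quality(triad):
    """Classify a triad by its two stacked thirds (sizes in semitones)."""
    root, third, fifth = (SEMITONE[n] for n in triad)
    lower = (third - root) % 12   # root up to third
    upper = (fifth - third) % 12  # third up to fifth
    labels = {(4, 3): "major", (3, 4): "minor", (3, 3): "diminished"}
    return labels.get((lower, upper), "other")

for i in range(7):
    triad = triad_on(i)
    print("-".join(triad), quality(triad))
# C-E-G major
# D-F-A minor
# E-G-B minor
# F-A-C major
# G-B-D major
# A-C-E minor
# B-D-F diminished
```

The same stacking rule yields three different interval combinations, which is one way to account for why the chords do not all sound equally stable: the triads on C, F, and G share one shape, those on D, E, and A another, and the triad on B stands alone.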

However, we won't rush to discuss "how to use them" or "where they should go" here. The focus of this section is simply to show the reader clearly that when a three-note structure is placed fully within a scale, chords are no longer fragmented concepts, but unfold naturally along the scale in a systematic way. From this moment on, "chords" are no longer just a matter of individual stable structures, but formally become part of the tonal system. And when these chords within the key begin to relate to each other, forming a sense of direction and progression, we truly enter the realm of "harmony."

4. Harmony: When the structure begins to move

4.1 When chords begin to point to each other

In the previous chapter, we saw that even though they are all composed of three notes, different three-note structures are not auditorily equivalent. They naturally exhibit different tendencies, such as stability, advancement, or transition, depending on their relationship with the tonic.

But so far, we have still been discussing "what each of these structures looks like individually." In other words, as long as these structures are understood only as "individual existences," what we perceive is merely a fleeting moment, not the movement of the music itself. What truly gives music a sense of direction is not a single three-note structure, but the overall relationship formed when such structures appear one after another in time and connect with each other. This is where harmony begins to play its role.

Harmony can be understood as follows: along the time axis, the continuous changes of the vertical structures (chords) and their interrelationships constitute harmony itself. In other words, harmony is not concerned with what a single chord is, but with how these chords unfold in time, point toward each other, generate and release tension, and ultimately create a sense of direction in the music.

When a stable three-note structure appears, the ear tends to assume it can serve as a temporary resting point; conversely, when a structure with a tendency to advance appears, the ear instinctively anticipates the arrival of the next structure. This anticipation does not originate from the melody itself, but rather from the implicit directional relationship between the vertical structures.

For example, when a structure centered on G–B–D appears, even if the melody has not yet clearly fallen to C, the auditory sense has already begun to "wait" for some kind of return to occur. This sense of waiting does not depend on the number or loudness of the notes, but rather on the position of the structure in the scale and the unclosed relationship between it and the tonic.

Therefore, harmony does not describe "what is heard at this moment," but rather what space the current structure leaves for the structures that may follow.

From this perspective, chords are more like static components, while harmony is how these components are organized in time. The former answers "what it is," while the latter answers "where it goes." When multiple chords are placed in different positions and appear in a specific order, a series of relationships such as tension, relief, return, and deviation are formed between them, and these relationships are precisely the musical movement perceived by hearing.

This is why, even if two pieces of music use the exact same chords, as long as the order of arrangement or the timing of pauses and transitions are different, the overall feeling they ultimately present can be completely opposite. Harmony is never concerned with isolated materials, but with the paths formed between materials.

Once we realize this, we will find that the so-called "sense of harmony" is essentially the ability to judge the direction of structure. It does not require the listener to be able to accurately say the chord names, but it allows people to clearly perceive "this is not over yet", "this has come back", or "this is turning elsewhere".


In this article, "harmony" refers to the concept used in modern vocal music and music theory to describe chords and their vertical relationships; that is, starting from pitch, interval, and three-note structure, analyzing how they interact in time to generate tension and a sense of return. This is not entirely the same as the everyday understanding of "harmony": when singing, the lead singer sings the melody while the singers alongside sing different notes, and we also call that harmony.

The connection between the two lies in the fact that the different notes in the backing vocals are often the very chord tones analyzed in theoretical harmony. Their arrangement and progression present the stability, tension, and sense of direction described in harmony theory. In other words, backing-vocal harmony can be seen as a direct, audible manifestation of theoretical harmony, but theoretical harmony focuses on structure and function, not just on the superposition of multiple voices that is heard.


4.2 What does harmony do as the melody progresses?

In our daily auditory experience, we tend to naturally focus our attention on the melody. The melody is "in the foreground": it is sung, remembered, and is the line that is easiest for people to hum. Therefore, when we talk about the direction of a piece of music, we intuitively attribute feelings such as "progress," "pause," and "return" entirely to the melody itself.

However, if you look closely, you will find an intriguing phenomenon: the same melody often gives people different feelings under different backgrounds—sometimes it sounds like it lands steadily in one place, while at other times it seems to be suspended in the air; sometimes a note can be naturally regarded as the end point, while in other cases, the note seems to just "pass by".

The problem lies not in the melody notes themselves, but in the vertical environment in which they are placed. A melody is essentially a horizontally unfolding line. It presents only one note at a time, and a single note does not inherently carry the information "this is the end" or "this is the beginning." A note is perceived as stable or unstable not so much because of its absolute pitch as because of its relationship with the current vertical structure.

It is at this level that harmony begins to function. When a chord (or more precisely, a vertical structure) is in the background, it provides an implicit frame of reference for the present moment. Every note in the melody is automatically placed within this frame for judgment: Is this note part of the structure, or an offset from it? Is it reinforcing the current stability, or creating tension, pointing to the next change?

Therefore, harmony does not need to be fully "heard" to produce an effect. Even if it is only faintly present, or even if it is only presented in a simplified form in the accompaniment, it is already subtly influencing how the ear interprets the melody.

To give a concrete example: when the note G appears in a melody and the background structure clearly points towards C as the center, this G will often be perceived as stable but not fully closed; if the background structure is itself in an unstable position, the same G may instead be perceived as a further driving force. The note has not changed, and the direction of the melody has not changed either, but the auditory sense of "what will happen next" has changed significantly.
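The G example can be roughly sketched in code. The snippet below is only an illustration: the pitch-class numbering, the function name `role_in_chord`, and the three chords are assumptions made here, and "chord tone versus non-chord tone" is a crude stand-in for the much richer judgment the ear actually makes.

```python
# Pitch classes: C=0, C#=1, ..., B=11 (an assumed encoding, not from the article).
NOTE_TO_PC = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def role_in_chord(melody_note, chord_notes):
    """Describe how a melody note relates to the chord sounding beneath it.

    chord_notes lists the chord from the root upward, e.g. ["C", "E", "G"].
    Returns "root", "chord tone", or "non-chord tone".
    """
    pc = NOTE_TO_PC[melody_note]
    chord_pcs = [NOTE_TO_PC[name] for name in chord_notes]
    if pc == chord_pcs[0]:
        return "root"
    if pc in chord_pcs:
        return "chord tone"
    return "non-chord tone"


# The same melody note G against three hypothetical backgrounds.
for chord in (["C", "E", "G"], ["G", "B", "D"], ["D", "F", "A"]):
    print(chord, "->", role_in_chord("G", chord))
```

The melody note is identical in all three calls; only the vertical structure changes, and with it the role the note plays.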

From this perspective, the melody is not moving forward in isolation, "carrying the music forward." It is more like moving within a pre-laid force field, and this tension field is constructed by harmony.

This is why, when listening to music, we often have a very clear yet indescribable feeling: in some parts, "that's enough," while in others, "it just doesn't feel over yet." This judgment is not derived step by step by analyzing the intervals of the melody, but rather as a holistic intuition formed by the auditory system simultaneously integrating horizontal lines and vertical structures.

It can be said that melody is responsible for speaking, while harmony is responsible for defining whether what is said is a statement, a question, or a sentence still awaiting a response. The melody tells us what happened, while the harmony quietly determines behind the scenes whether these sounds have landed, whether they are still suspended in mid-air, or whether they are pointing to a place that has not yet appeared.

Once we realize this, we discover that the so-called "sense of melody" and "sense of harmony" are not two separate abilities. The former is hearing lines in time, while the latter is perceiving direction in structure. A truly mature auditory experience occurs precisely in the continuous interaction between these two.

4.3 Harmonic Progression: How Structure Forms Paths in Time

In the first two sections, we gradually shifted our focus from "individual structures" to the interrelationships of these structures over time. As the melody progresses, different vertical structures appear and disappear in the background, subtly altering the auditory focus and expectations. However, viewing these changes merely as a series of "passive accompaniments" is insufficient to explain a more crucial phenomenon: why do some sequences of events seem logical and natural, while others feel awkward, abrupt, or even unsettling? This is precisely the level that "harmonic progression" aims to describe.

Harmonic progression is not about listing "which chords can be used", but about asking: when different vertical structures appear in sequence, how do they connect to form a directional path in auditory perception? This path does not arise out of thin air. It rests on a core fact that has appeared repeatedly before: different structures already have different tendencies toward stability.

When a relatively stable structure appears, hearing tends to naturally perceive it as a temporary pause; however, when a structure with a sense of urgency or incompleteness appears, hearing begins to anticipate the next development. Importantly, this anticipation doesn't arise after the structure disappears, but is activated the moment it appears. In other words, the very appearance of certain structures already "hints" at what is more likely, and what is less likely, to follow them.

It is through the continuous accumulation of these suggestions that harmony begins to exhibit a "direction." When the directional relationships between structures connect with each other, hearing no longer merely perceives one instantaneous state after another, but begins to perceive a continuous process of movement: from relative stability to instability, from tension to relief, from deviation to return.

It's important to note that the "path" here is not an absolutely fixed rule, much less a mechanical formula. It's more like a consensus gradually formed through long-term auditory experience: some directions feel natural not because they are the only correct ones, but because they follow auditory expectations. The tension that arises when music deliberately goes against this expectation rests on the premise of "it should be this way, but it isn't".

From this perspective, harmonic progressions don't dictate "how music should be written," but rather describe "how music is typically understood." The focus is not on the name or makeup of any particular structure, but on two questions: where will this structure lead our hearing? And when the next structure appears, will it continue in this direction, or will it deliberately change it?

This is why, even if two pieces of music use the exact same structural set, as long as the order of appearance, the duration of their pauses, and the points that are avoided or emphasized differ, the overall direction they ultimately present may be completely opposite. What the ear perceives is not "what materials were used," but "what kind of route these materials form in time."

Understanding this, we can view harmonic progressions as a kind of "path design": some paths are clear and direct, while others involve detours, delays, and the creation of suspense. But regardless of the method chosen, they all rely on a common premise: the auditory system is constantly tracking the directional relationships between structures and adjusting its expectations for the music to come accordingly.

4.4 Why does deviation from the path create tension?

In the previous section, we understood harmonic progression as a kind of "path": different vertical structures appear sequentially in time, forming an audibly traceable path through their respective directional relationships. When this path unfolds according to pre-existing expectations, the auditory experience feels natural and smooth, and even before it ends, one can "know" roughly what will happen next.

So, where does tension come from? The answer is not complicated: tension arises precisely when the path does not continue as expected.

When a structure appears, it is not merely stating its own state, but also sending an implicit directional suggestion to the auditory system. Through long-term experience, the auditory system has become accustomed to which structures are more easily followed by others, and to the fact that certain incomplete states eventually need to be "completed." It is in this context that if subsequent developments do not unfold along this implicit path, the auditory system will immediately perceive the deviation.

This deviation is not the same as "error" or "confusion." On the contrary, it is precisely based on the premise that the path has already been understood. If the auditory system has not yet formed any expectations, then there is no such thing as deviation. Deviation is perceived as tension only when the path is clear enough.

From an auditory perspective, this tension often manifests as a state of "suspense": the music seems to have paused at a point where it shouldn't have stopped, or continues moving in a direction that isn't the final destination. At this moment, the auditory sense doesn't lose its sense of direction; instead, it focuses more intently on the upcoming changes, trying to determine whether the path will be pulled back or veer further away.

Importantly, the tension is not generated by the individual structure itself, but by the fact that the relationships between structures are temporarily suspended or reversed. The same vertical structure, if it appears on a logical path, may go almost unnoticed; but if it appears in a position where it "shouldn't be," its instability will be immediately amplified by hearing.

This is precisely why deviating from the path doesn't create tension indefinitely. If the music continues to wander in directions that are completely untraceable, the auditory system gradually abandons its expectation of a path, and the tension dissipates. In other words, tension depends on order, not chaos. It needs an established sense of direction as a reference point for the deviation.

When a deviation occurs, music typically unfolds between two possibilities: one is to re-establish a new path through further shifts; the other is to bring the auditory experience back to its original, expected position through regression or correction. In either case, the deviation itself is not the end point, but merely a stage in the evolution of the path.

From this perspective, tension is not an "extra effect" in music, but a natural byproduct of path awareness. When hearing has learned to track the directional relationships between structures, any delay, avoidance, or reversal of this relationship will be perceived as a state that needs to be resolved.

Once we understand this, we can easily see that the so-called "tension" and "release" are not abstract emotional labels, but rather a dynamic judgment experienced by the auditory system as the path is deviated from and then reconfirmed.

4.5 When the path is redefined: mode change/modulation

In the previous section, we discussed a situation where the path still exists, but the music temporarily deviates from it, thus creating tension; subsequently, through return or correction, the path is reconfirmed. No matter how far the music deviates, the ear always assumes by default that this road is still the same one.

But music doesn't always have to go back. Sometimes it doesn't just "deviate and then return," but takes a more thorough step: it redefines the path itself. When this happens, the ear no longer waits to return to its original center, but is guided to accept a new stable point. This is precisely the experience brought about by a change of mode or a modulation.

From an auditory perspective, modulation is not simply "changing a set of note names," but rather something more fundamental happens: the original frame of reference used to determine direction is completely replaced. The structures that were previously used to create tension and propel progress no longer hold; new stable relationships begin to take effect, and new "routes home" are quietly established.

This process works precisely because hearing is sufficiently sensitive to path. When music begins to repeatedly emphasize a new, stable point and develops a series of self-consistent directional relationships around it, hearing gradually abandons the old center and shifts the "sense of destination" to the new location. This shift does not need to be explicitly announced, but is slowly confirmed over time.

This is precisely why successful modulations are often not abrupt breaks, but rather a kind of shift of focus. Before the transition is complete, music often goes through a period of ambiguity: the pull of the old path has not completely faded, and the stability of the new path has not yet been fully established. During this time, the auditory system holds two sets of judgment criteria simultaneously, and this "swaying" experience is precisely the most tension-filled part of the modulation process.

Structurally, modulation is fundamentally different from the "deviation" discussed earlier. Deviation still creates suspense within the same path, assuming the endpoint remains unchanged; modulation, by contrast, tells the ear: what was once the destination is no longer the destination. Once this is confirmed, all the previous unresolved tensions will be reinterpreted, rather than "resolved".

This is why, after a key change, music often gives the listener a feeling of "opening up their horizons." It's not because the number of notes increases or the difficulty increases, but because the auditory frame of reference is refreshed. Familiar structures may acquire completely different meanings under the new path; and previously inconspicuous combinations may suddenly become new, stable cores.

It is important to emphasize that modal changes do not necessarily have to deviate from the original pitch material. Even when using a large number of the same notes, as long as the stable point shifts, the auditory system will still perceive a reconstruction of the path. This is highly consistent with our experience when discussing relative major and minor keys earlier: the set of notes may remain unchanged, but once the question of "where it goes" changes, the overall perception will be reversed.
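The relative major/minor case can be checked with a few lines of code. The following is a minimal sketch under stated assumptions: the sharp-only note names, the whole-step/half-step patterns written as semitone counts, and the helper `scale` are introduced here for illustration, with C major and A natural minor chosen as the example pair.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Step patterns in semitones (2 = whole step, 1 = half step).
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]
NATURAL_MINOR_STEPS = [2, 1, 2, 2, 1, 2, 2]


def scale(tonic, steps):
    """Build a seven-note scale upward from the tonic using a step pattern."""
    pc = NOTE_NAMES.index(tonic)
    notes = [NOTE_NAMES[pc]]
    for step in steps[:-1]:  # the final step only leads back to the tonic
        pc = (pc + step) % 12
        notes.append(NOTE_NAMES[pc])
    return notes


c_major = scale("C", MAJOR_STEPS)
a_minor = scale("A", NATURAL_MINOR_STEPS)

print(c_major)
print(a_minor)
print("same notes:", set(c_major) == set(a_minor))
print("same stable point:", c_major[0] == a_minor[0])
```

The two scales contain exactly the same notes; the only thing the code can find that differs is which note is treated as the starting and resting point, which is the shift this section describes.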

From this perspective, modulation is not a decorative technique, but rather an act of replanning the path. It doesn't determine whether there is tension at a particular moment, but rather what the rest of the music will revolve around.

5. Postscript: Things this article deliberately omitted

At this point, I can actually say quite frankly: this article deliberately keeps the discussion of chords and harmony at a very basic level. Throughout the process, we focused almost exclusively on a few of the simplest yet most core concepts: the structure of scales, the vertical combination of three notes, the auditory tendencies of different structures, and how they point to each other and form paths in time. To this end, we avoided many of the professional divisions that are common and indeed exist in music theory.

For example, in a more complete theoretical system, chords are not limited to the "triad" consisting of three notes. Above the three-note structure, chords can be stacked up by thirds to form seventh chords with four notes, ninth chords with five notes, and even more complex extended forms.


A triad is named for its number of notes: three notes stacked together form a basic chord. However, when we continue to stack more notes, as in seventh, ninth, and eleventh chords, the naming is no longer based on the number of notes. The names of these chords reflect the interval from the root to the top note (for example, a seventh chord spans a seventh from the root to its top note), not the number of notes.

This difference in naming is mainly due to historical conventions, so there's no need to worry about it.
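The two naming conventions can be placed side by side with a small sketch. It assumes chords built on C from the notes of the C major scale, and the helper names `stack_thirds` and `span_degree` are hypothetical, introduced only for this illustration.

```python
C_MAJOR = ["C", "D", "E", "F", "G", "A", "B"]


def stack_thirds(root_index, count):
    """Take `count` scale notes from the root, skipping every other note."""
    return [C_MAJOR[(root_index + 2 * i) % 7] for i in range(count)]


def span_degree(count):
    """Interval number from the root to the top note, counting both ends."""
    return 2 * (count - 1) + 1


for count in (3, 4, 5, 6):
    notes = stack_thirds(0, count)
    print(f"{count} notes {notes}: root to top spans a {span_degree(count)}th")
```

With three notes the span is a fifth, yet the chord is called a triad after its note count; with four, five, and six notes the spans are a seventh, a ninth, and an eleventh, and those are the numbers that end up in the chord names.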


Even triads themselves can be further divided into major, minor, diminished, and augmented triads based on their internal interval relationships. In harmonic analysis, more detailed tools such as function, inversion, and voice leading are often introduced to describe how these structures behave in time.
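As a minimal sketch of what "internal interval relationships" means here, the snippet below classifies a root-position triad by the sizes of its two stacked thirds, measured in semitones. The sharp-only note spelling and the function name `triad_quality` are assumptions made for illustration; enharmonic spelling (for example G# versus A-flat) is ignored.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# (lower third, upper third) in semitones: 4 = major third, 3 = minor third.
TRIAD_QUALITIES = {
    (4, 3): "major",
    (3, 4): "minor",
    (3, 3): "diminished",
    (4, 4): "augmented",
}


def triad_quality(root, third, fifth):
    """Classify a triad given its three notes from the root upward."""
    pcs = [NOTE_NAMES.index(name) for name in (root, third, fifth)]
    lower = (pcs[1] - pcs[0]) % 12
    upper = (pcs[2] - pcs[1]) % 12
    return TRIAD_QUALITIES.get((lower, upper), "not a root-position triad")


for notes in (("C", "E", "G"), ("A", "C", "E"), ("B", "D", "F"), ("C", "E", "G#")):
    print(notes, "->", triad_quality(*notes))
```

All four types are made of two stacked thirds; the only difference is which third is major and which is minor.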

These topics are not unimportant; on the contrary, they are an indispensable part of a complete system of harmony theory. In this article, however, they are intentionally set aside for the time being. The reason is simple: the goal of this article is not to teach readers how to "analyze," but to first address a more fundamental and easily overlooked question: what exactly is it about music that allows us to feel its stability, tension, and sense of return?

Introducing a large number of terms and classifications before establishing auditory intuition can easily create the illusion that these sensations in music exist simply because the rules have been "memorized." In fact, quite the opposite is true. It is precisely because these sensations are repeatedly heard and verified that theories are gradually summarized to describe them. If, after reading this, you begin to realize that when listening to music, you can more clearly sense "this isn't over yet," "this is turning," and "this is finally coming back," then the purpose of this article has already been achieved.

As for the aspects that have been temporarily omitted (more complex chord forms, more refined functional divisions), they are not negated, but rather reserved for other contexts. Perhaps they will emerge gradually in later articles, or perhaps they will gradually surface when you begin to actively analyze the music you enjoy. Music theory is not a set of knowledge that must be learned all at once. Often, understanding never begins with "completeness," but with "just enough."

If this article can help you develop a little more of this understanding in the face of music, then its choice of restraint is already worthwhile.


📚 Series: The Awakening of Sound · Fundamentals (3 / 3)


