Mathematics of Music
Sound consists of a physical wave: a sound wave. A musical note corresponds to a periodic sound wave with a specific frequency. And, as we discuss here, musical relations between notes correspond to mathematical relations between their frequencies.
Absolute Pitch, Frequency, and Wavelength
The absolute pitch of a note describes how high or low it is. We denote absolute pitch with letters like C, D, E, ... Historically, different tunings (assignments of frequencies to pitches) have been used; and even today, different instruments have different tunings. (For instance, a trumpeter calls "D" what a flutist calls "C".) However, a commonly adopted standard ("concert pitch") is that on non-transposing instruments, the note \(\text{A}_4\) corresponds to a sound wave with a frequency of 440 Hz. This choice also determines the frequencies of the other notes.
Principle 1: Higher pitch corresponds to higher frequency.
Recall that the frequency \(f\) of a sound wave describes how many vibrations there are per second. This is the most important quantity here, because the frequency remains constant as the sound wave travels from its source through the medium into our ears. The wavelength \(\lambda\) describes how the sound wave is spread out in space. The relationship between the two quantities is \[f\lambda = v,\] where \(v\) is the speed of sound in a medium. In air at normal temperature, \(v \approx 340\ \text{m/s}\).
When \(\text{A}_4\) at concert pitch is played, what is the wavelength of the sound wave as it travels through air?
Solution: From the discussion above, we know \(f = 440\ \text{Hz}\) and \(v = 340\ \text{m/s}\). Thus \[\lambda = \frac v f = \frac{340\ \text{m/s}}{440\ \text{/s}} = 0.77\ \text{m}.\]
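The same calculation can be sketched in a few lines of Python. The function name `wavelength` and the constant `SPEED_OF_SOUND_AIR` are illustrative choices, and the speed of sound is the approximate value quoted above.

```python
SPEED_OF_SOUND_AIR = 340.0  # m/s, approximate value for air used in the text


def wavelength(frequency_hz, speed=SPEED_OF_SOUND_AIR):
    """Return the wavelength in metres, from f * lambda = v."""
    return speed / frequency_hz


# A4 at concert pitch (440 Hz) travelling through air
print(round(wavelength(440.0), 2))  # 0.77
```

Passing a different `speed` gives the wavelength in another medium; the frequency itself does not change.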
Relative Pitch, Interval, and Frequency Ratio
In practice, relative pitch is more important in music than absolute pitch. The impression of a melody or harmony depends on the pitch of notes compared with notes played previously or simultaneously. For instance, if an orchestra plays a piece with \(\text{A}_4\) at 450 Hz instead of 440 Hz, and all other notes adjusted accordingly, most people will not even notice the difference.
The relative pitch of one note to another is also called the interval between two notes. On a keyboard instrument, an interval may be characterized by the number of keys that separate two notes.
Principle 2: A musical interval corresponds to a frequency ratio.
For instance, in the song "Are You Sleeping" ("Frère Jacques") the melody starts with an upward interval called "major second". The frequencies of these two notes, \(f_1\) and \(f_2\), have the ratio \[\frac{f_2}{f_1} \approx 1.123.\] This frequency ratio makes the melody recognizable. In principle, one could start the melody at any frequency \(f_1\).
The Dutch anthem "Wilhelmus" traditionally begins with a note C at 264 Hz, followed by F at 352 Hz.
If the melody is transposed to a different pitch, starting at 330 Hz instead, what should be the pitch of the second note?
Solution: After transposition, the frequency ratio should still be the same. Thus we write the equation \[\frac{f}{330\ \text{Hz}} = \frac{352\ \text{Hz}}{264\ \text{Hz}}\ \ \ \ \therefore\ \ \ \ f = 330\cdot \frac{352}{264} = 440\ \text{Hz}.\]
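Transposition amounts to multiplying every frequency by one common factor, so that all ratios are preserved. A minimal sketch, assuming a melody is given as a list of frequencies in Hz (the helper name `transpose` is hypothetical):

```python
from fractions import Fraction


def transpose(frequencies_hz, new_start_hz):
    """Rescale a melody so it starts at new_start_hz, keeping every ratio."""
    factor = Fraction(new_start_hz) / Fraction(frequencies_hz[0])
    return [float(f * factor) for f in frequencies_hz]


original = [264, 352]            # C and F, the opening of "Wilhelmus"
print(transpose(original, 330))  # [330.0, 440.0]
```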
Intervals and Simple Ratios
Some intervals are more pleasant to listen to (consonant) than others. While there is a strong cultural component in this assessment, there are universal trends in the choice of basic intervals. One of the remarkable discoveries of the ancient Greeks was:
Principle 3: Consonant intervals correspond to simple frequency ratios.
The most important interval is the octave. Notes that differ by precisely one octave have a clear difference in pitch, yet sound remarkably similar. An octave is the distance between the first "do" and the second "do" in the musical scale, for instance between \(\text{A}_3\) and \(\text{A}_4\). On a keyboard instrument, an octave corresponds to the repetition of the pattern of white and black keys.
Principle 4: An interval of one octave corresponds to a frequency ratio of 2:1.
For instance, in concert pitch the note \(\text{A}_4\) has a frequency of 440 Hz. The note \(\text{A}_5\), which is one octave higher, has a frequency of \(2\cdot 440 = 880\) Hz. The note \(\text{A}_3\) is one octave lower, and therefore has a frequency of \(\tfrac12 \cdot 440 = 220\) Hz.
Insert image: Wave patterns of \(\text{A}_3\), \(\text{A}_4\), \(\text{A}_5\), with frequencies.
A typical organ keyboard has a range of about five octaves. If the lowest note it can play has a frequency of \(66\ \text{Hz}\), what is the highest frequency, approximately?
Solution: Each octave higher corresponds to multiplication of the frequency by two. Thus we get \[f = 2^5\cdot 66\ \text{Hz} \approx 2100\ \text{Hz}.\]
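Moving by octaves is repeated doubling or halving, which can be written as a power of two. A small sketch (the function name `shift_octaves` is an illustrative choice):

```python
def shift_octaves(frequency_hz, octaves):
    """Shift a frequency up (positive) or down (negative) by whole octaves."""
    return frequency_hz * 2 ** octaves


a5 = shift_octaves(440, 1)        # one octave above A4: 880 Hz
a3 = shift_octaves(440, -1)       # one octave below A4: 220 Hz
organ_top = shift_octaves(66, 5)  # five octaves above 66 Hz: 2112 Hz
print(a5, a3, organ_top)
```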
Another interval that is considered consonant in all music traditions is the so-called "perfect fifth". This interval is found, for instance, between the note C and the following G.
Insert: Image of keyboard showing interval C-G.
Principle 5: An interval of a natural perfect fifth corresponds to a frequency ratio 3:2.
Many other intervals can be defined in terms of fifths and octaves. For instance, a major second interval (e.g. from C to D) can be constructed by going up a perfect fifth, then another perfect fifth, and then down one octave. What is the frequency ratio of this major second interval?
Solution: Simply multiply the ratios. When going down an interval, write the ratio with the smaller value in the numerator. \[\frac{3}{2} \cdot \frac{3}{2} \cdot \frac{1}{2} = \frac{9}{8}.\] Thus, a major second interval corresponds to a frequency ratio of 9:8.
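Combining intervals by multiplying their ratios is easy to check with exact fractions. The sketch below uses Python's standard `fractions` module; the helper `stack` and the constant names are hypothetical.

```python
from fractions import Fraction

FIFTH = Fraction(3, 2)
OCTAVE = Fraction(2, 1)


def stack(*steps):
    """Multiply the ratios of successive intervals; pass 1/ratio to go down."""
    result = Fraction(1)
    for ratio in steps:
        result *= ratio
    return result


# Up a fifth, up another fifth, down an octave
major_second = stack(FIFTH, FIFTH, 1 / OCTAVE)
print(major_second)  # 9/8
```

Using `Fraction` rather than floating-point numbers keeps the result as an exact ratio of integers.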
Pythagorean and Just Intonation
In Western music, the standard collection of musical notes originally consisted of seven notes per octave; later, five more notes were added, making for a total of twelve. (This corresponds to 7 white keys and 5 black keys per octave on keyboard instruments.) The frequencies of these notes are chosen in such a way that many consonant combinations, that is, simple frequency ratios, can be made.
Starting with the note C, we can define G to be one perfect fifth higher; F to be one perfect fifth lower. The note D is constructed by stacking two fifths together. But what about the notes A, E and B? Musicians make different choices here, resulting in different intonations for instruments.
- The Pythagorean intonation continues to pile fifths on top of each other, as follows:
\[\begin{array}{|ccccccccccccc|} \hline \text{F}_4 & \\ \downarrow \tiny \div 2 & \\ \text{F}_3 & \xrightarrow{\times\ 3/2} & \text{C}_4 & \xrightarrow{\times\ 3/2} & \text{G}_4 & \xrightarrow{\times\ 3/2} & \text{D}_5 \\ & & & & & & \downarrow \tiny \div 2 & \\ & & & & & & \text{D}_4 & \xrightarrow{\times\ 3/2} & \text{A}_4 & \xrightarrow{\times\ 3/2} & \text{E}_5 \\ & & & & & & & & & & \downarrow \tiny \div 2 & \\ & & & & & & & & & & \text{E}_4 & \xrightarrow{\times\ 3/2} & \text{B}_4 \\ \hline \end{array}\]
The interval C-E is called "major third". What is the frequency ratio for this interval in the Pythagorean intonation?
Solution: Following the diagram above, we get from C to E by going up four perfect fifths, and going down two octaves. Thus we get \[\left(\frac{3}{2}\right)^4\cdot \left(\frac{1}{2}\right)^2 = \frac{3^4}{2^6} = \frac{81}{64}.\]
This is hardly a "simple ratio" anymore, which is the main reason why most musicians prefer the just intonation instead.
- In the just intonation, priority is given to simple ratios. Thus we define the interval called major third to have a ratio of 5:4, and define
the note E to be a major third higher than C;
the note A to be a major third higher than F;
the note B to be a major third higher than G.
The following table shows the frequencies in both intonations when we start with C = 264 Hz.
\[\begin{array}{rcccccccc} \hline f\ (\text{Hz}) & \text{C} & \text{D} & \text{E} & \text{F} & \text{G} & \text{A} & \text{B} & \text{C'} \\ \hline \text{Pythagorean} & 264 & 297 & 334.1 & 352 & 396 & 445.5 & 501.2 & 528 \\ \text{just} & 264 & 297 & 330 & 352 & 396 & 440 & 495 & 528 \\ \hline \end{array}\]
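The table can be reproduced from the two construction rules. The following is a sketch under the assumptions stated in the text (fifths of 3:2, major thirds of 5:4, C = 264 Hz); the function names `fold`, `pythagorean`, and `just` are illustrative.

```python
from fractions import Fraction

C_HZ = 264  # starting frequency used in the table
FIFTH, THIRD = Fraction(3, 2), Fraction(5, 4)


def fold(ratio):
    """Move a ratio by octaves until it lies between 1 (inclusive) and 2."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio


def pythagorean():
    # Chain of fifths F-C-G-D-A-E-B, with C as the reference (ratio 1).
    chain = ["F", "C", "G", "D", "A", "E", "B"]
    return {name: fold(FIFTH ** (i - 1)) for i, name in enumerate(chain)}


def just():
    scale = pythagorean()
    # E, A, B are redefined as major thirds above C, F, G.
    for upper, lower in [("E", "C"), ("A", "F"), ("B", "G")]:
        scale[upper] = fold(scale[lower] * THIRD)
    return scale


for label, scale in [("Pythagorean", pythagorean()), ("just", just())]:
    row = [round(float(C_HZ * scale[n]), 1) for n in "CDEFGAB"]
    print(label, row)
```

The ratios are kept as exact fractions and only converted to rounded decimals for display, which is why the Pythagorean E and B appear as 334.1 and 501.2.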
The interval D-A is usually considered to be a perfect fifth. Is this the case in either of the intonations described above?
Solution: Yes, for the Pythagorean intonation; no, for the just intonation.
In the Pythagorean intonation we defined A to be a perfect fifth above D, and indeed we get \[\frac{f_A}{f_D} = \frac{445.5}{297} = \frac{3}{2}.\ \ \ \ (\text{Pythagorean}) \] But in the just intonation \[\frac{f_A}{f_D} = \frac{440}{297} = \frac{40}{27}.\ \ \ \ (\text{just}) \] This frequency ratio is about 1.2% less than 3:2 (the two differ by a factor of 80:81); in the just intonation, the interval D-A sounds slightly lower ("flat") compared to a natural perfect fifth.
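The shortfall relative to a natural perfect fifth can be computed directly from the table frequencies. A minimal sketch, with a hypothetical helper `fifth_shortfall_percent`:

```python
from fractions import Fraction

FIFTH = Fraction(3, 2)


def fifth_shortfall_percent(lower_hz, upper_hz):
    """Percentage by which upper/lower falls short of the ratio 3:2."""
    ratio = Fraction(upper_hz) / Fraction(lower_hz)
    return float(1 - ratio / FIFTH) * 100


# D = 297 Hz in both intonations; A = 445.5 Hz (Pythagorean) or 440 Hz (just)
print(f"{fifth_shortfall_percent(297, Fraction(891, 2)):.2f}")  # 0.00
print(f"{fifth_shortfall_percent(297, 440):.2f}")               # 1.23
```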
In the Pythagorean intonation, the intervals C-D, D-E, F-G, G-A, and A-B all correspond to the same frequency ratio of 9:8.
In the just intonation it is more complicated. Classify the five intervals as "major tones" and "minor tones" based on their frequency ratios.
Solution: We find \[\begin{array}{ll} \text{frequency ratio C-D, F-G, A-B} = \dfrac{9}{8} & (\text{major tone}); \\ \text{frequency ratio D-E, G-A} = \dfrac{10}{9} & (\text{minor tone}). \end{array}\]
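This classification can be checked mechanically from the just-intonation frequencies in the table. In the sketch below, the function name `classify_steps` and the label "other" (for steps that are not whole tones, such as E-F) are assumptions made for illustration.

```python
from fractions import Fraction

# Just-intonation frequencies in Hz, taken from the table above
JUST_HZ = {"C": 264, "D": 297, "E": 330, "F": 352, "G": 396, "A": 440, "B": 495}

LABELS = {Fraction(9, 8): "major tone", Fraction(10, 9): "minor tone"}


def classify_steps(freqs):
    """Label each step between neighbouring notes by its frequency ratio."""
    names = list(freqs)
    result = {}
    for low, high in zip(names, names[1:]):
        ratio = Fraction(freqs[high], freqs[low])
        result[f"{low}-{high}"] = LABELS.get(ratio, "other")
    return result


for step, label in classify_steps(JUST_HZ).items():
    print(step, label)
```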
The Pythagorean approach has the advantage that all frequency ratios between notes can be written in powers of 3 and 2. In particular, the "tones" (whole-tone steps) in the scale all have the same ratio \(9:8\). A disadvantage is that intervals such as the major third (C-E) are not simple-number ratios.
The just intonation has the advantage that the intervals are mostly simple-number ratios. However, certain intervals are not quite what one would expect; for instance, D-A falls short of being a natural perfect fifth. Moreover, whole-tone steps such as C-D and D-E do not have exactly the same frequency ratio.
How different are these intonations? Consider the interval C-E: in the Pythagorean intonation it has a ratio of 81:64; in the just intonation it is 5:4, or 80:64. Thus the discrepancy between the intonations is a factor of \(81:80 = 1.0125\). The frequencies can therefore be expected to differ by about 1 percent.
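As a quick numerical check of this discrepancy, the two major thirds can be compared with exact fractions. The variable names are illustrative.

```python
from fractions import Fraction

# Four fifths up, two octaves down
pythagorean_third = Fraction(3, 2) ** 4 / 2 ** 2
just_third = Fraction(5, 4)

discrepancy = pythagorean_third / just_third
print(pythagorean_third)                # 81/64
print(discrepancy, float(discrepancy))  # 81/80 1.0125
```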