However, the system could use a small set of patterns, and it is also possible (and reasonable) that the pattern would differ in the version of the system deployed in real life.
We will now show that our attack still works. More precisely, we will show how to recover an unknown pattern from the marked song, without the original song.
Let us first assume, to simplify the exposition, that the mark starts
at the first sample of the song.
We use the same notation as in the rest of the article:
denote the
chunk of
samples of the unmarked song,
and
the corresponding chunks of the marked song,
denotes the pattern (unknown here),
and
denote the unknown function.
Let us also assume, again for simplicity, that the song is exactly
chunks long.
So, we have, for every
in
,
and every
in
,
![]() (3)
Let us divide by
and sum over
.
We will use the following notations:
We have, for every :
![]() (4)
The multiplicative term
is not very problematic. First, it turned out that it was extremely close
to one for every
(actually, it would almost disappear if we knew
),
second, it is not a real problem to recover the mark times a multiplicative
constant. It would have been a problem if this sum happened to be very small,
but that was not the case.
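
The averaging step can be sketched as follows. This is only an illustration under assumed notation, since the inline symbols are not reproduced here: $s_{i,j}$ and $m_{i,j}$ stand for sample $j$ of chunk $i$ of the unmarked and marked song, $p_j$ for the pattern, $g_{i,j}$ for the effect of the unknown function, and $N$ for the number of chunks. The additive form is an assumption chosen to match the description (one multiplicative term close to one, one term equal to the average of the unmarked song); the article's exact equations (3) and (4) may differ.

```latex
% Assumed marking model (hypothetical notation, not taken from the article):
\[
  m_{i,j} = s_{i,j} + g_{i,j}\, p_j
\]
% Dividing by N and summing over the chunks i:
\[
  \frac{1}{N}\sum_{i=1}^{N} m_{i,j}
  = \underbrace{\frac{1}{N}\sum_{i=1}^{N} s_{i,j}}_{\text{average of the unmarked song}}
  + p_j \,\underbrace{\frac{1}{N}\sum_{i=1}^{N} g_{i,j}}_{\text{multiplicative term}}
\]
```

Under this reading, the chunk-wise average of the marked song yields the pattern up to a factor near one, plus a residual that depends only on the music.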
The more problematic term is . We would like it to be small.
However, it is very difficult to estimate the typical value of
.
Naturally, if
is large enough, it should be very small.
Having longer songs (the songs included in the challenge were only
two minutes long) would help. Also, if the same (now unknown) mark is used for
several songs (which seems to be the case), we can actually perform the
averaging on all the songs we can obtain, thus largely improving the
chances for
to be negligible.
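
A minimal Python sketch of this averaging, on a single song and pooled over several songs, is given below. Everything in it is an assumption made for illustration: the function names are hypothetical, the "music" is zero-mean Gaussian noise, and the mark is added with unit gain. Real music is not white noise, so the residual may behave much worse than in this toy setting.

```python
import math
import random

PERIOD = 147  # mark period in samples, as stated in the text


def average_chunks(samples, period=PERIOD):
    """Average a signal over consecutive chunks of `period` samples."""
    n_chunks = len(samples) // period
    if n_chunks == 0:
        raise ValueError("signal is shorter than one period")
    sums = [0.0] * period
    for i in range(n_chunks):
        base = i * period
        for j in range(period):
            sums[j] += samples[base + j]
    return [s / n_chunks for s in sums], n_chunks


def average_over_songs(songs, period=PERIOD):
    """Pool the chunks of several songs assumed to carry the same mark."""
    totals = [0.0] * period
    total_chunks = 0
    for song in songs:
        avg, n_chunks = average_chunks(song, period)
        for j in range(period):
            totals[j] += avg[j] * n_chunks  # weight each song by its chunk count
        total_chunks += n_chunks
    return [t / total_chunks for t in totals]


def rms_error(estimate, reference):
    """Root-mean-square distance between an estimate and a reference."""
    return math.sqrt(
        sum((e - r) ** 2 for e, r in zip(estimate, reference)) / len(reference)
    )


# Synthetic stand-in data (hypothetical): zero-mean noise plays the role of
# the music, and the same pattern is added with unit gain in every chunk.
rng = random.Random(0)
pattern = [rng.uniform(-1.0, 1.0) for _ in range(PERIOD)]


def make_marked_song(n_chunks):
    return [
        rng.gauss(0.0, 10.0) + pattern[k % PERIOD]
        for k in range(n_chunks * PERIOD)
    ]


songs = [make_marked_song(2000) for _ in range(3)]
single_estimate, _ = average_chunks(songs[0])
pooled_estimate = average_over_songs(songs)
single_error = rms_error(single_estimate, pattern)
pooled_error = rms_error(pooled_estimate, pattern)
print("single song:", single_error, "pooled:", pooled_error)
```

In this toy model the residual shrinks roughly as the inverse square root of the number of chunks averaged, which is why pooling songs helps.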
The problem is that the structure of the music plays an important part in the
value of . If, for example, a drum beat happens with a period which
is synchronized with the period of the mark,
might be very large
on some specific points.
We have not had time to perform an analysis of the value of on
a large number of songs, and we are not aware of a general statistical
model for music. What we know, however, is that in the case
of
, our technique works surprisingly well. It turned out that the average
of the unmarked version of this song was especially small. We could recover
the mark from
(and from only
) with a very good precision.
Figure 7 shows a part of the mark recovered from
and the
corresponding mark extracted from the difference
.
Once we have recovered the mark, the rest of the attack works as previously.
Note that, especially when is not negligible, it is possible
to improve the precision on
by filtering
to attenuate
in a
very significant way. As a matter of fact,
and
are very
different in nature.
is obtained by averaging the signal over
periods of 147 samples, and such a process is well known to be equivalent to
low-pass filtering. By contrast, the watermark
has the most important part of its information in the higher frequencies.
Figure 8 illustrates this phenomenon very well on a song other
than
.
Therefore, applying a high-pass filter with an adequate cutoff frequency
(
in the case of Figure 8) allows the
extraction of
with a higher precision.
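
The filtering idea can be sketched in Python as below. This is an illustration under assumptions, not the filter actually used: the high-pass is implemented by subtracting a circular moving average from the one-period estimate, the window width is a hypothetical choice (the cutoff frequency used for Figure 8 is not reproduced here), and the test signals are synthetic sinusoids.

```python
import math

PERIOD = 147  # mark period in samples, as stated in the text


def circular_moving_average(values, half_width):
    """Smooth one period of a periodic signal with a wrap-around window."""
    n = len(values)
    width = 2 * half_width + 1
    return [
        sum(values[(k + d) % n] for d in range(-half_width, half_width + 1)) / width
        for k in range(n)
    ]


def high_pass(values, half_width=7):
    """Crude high-pass: remove the slowly varying part of the estimate."""
    smooth = circular_moving_average(values, half_width)
    return [v - s for v, s in zip(values, smooth)]


def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))


# Hypothetical test signals: a high-frequency "mark" (60 cycles per period)
# and a low-frequency residual left by the music (1 cycle per period).
mark = [0.5 * math.cos(2 * math.pi * 60 * k / PERIOD) for k in range(PERIOD)]
residual = [2.0 * math.sin(2 * math.pi * k / PERIOD) for k in range(PERIOD)]
estimate = [m + r for m, r in zip(mark, residual)]

filtered = high_pass(estimate)
error_before = rms([e - m for e, m in zip(estimate, mark)])
error_after = rms([f - m for f, m in zip(filtered, mark)])
print("error before filtering:", error_before, "after:", error_after)
```

The wrap-around window is used because the estimate represents one period of a periodic pattern. A properly designed filter with a chosen cutoff would be preferable in practice.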
As a final note, recall that we assumed, for simplicity,
that the mark started at the first sample of the song.
This was not the case in . However, we simply needed to perform
the averaging attack for each of the
possible starting positions.
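
Trying every starting position could look like the Python sketch below. It is an assumption-laden toy: the function names are hypothetical, the period is shortened to 7 samples to keep the example small, the music is Gaussian noise, and the mark is added with unit gain from a known start. The text does not say how the correct start was identified, so the sketch only enumerates the candidates.

```python
import math
import random


def average_from(samples, period, start):
    """Average over chunks of `period` samples, beginning at index `start`."""
    n_chunks = (len(samples) - start) // period
    return [
        sum(samples[start + i * period + j] for i in range(n_chunks)) / n_chunks
        for j in range(period)
    ]


def candidates_for_every_start(samples, period):
    """One candidate pattern per possible starting position (0 .. period-1)."""
    return [average_from(samples, period, start) for start in range(period)]


# Toy data (hypothetical): the mark begins at sample `true_start`, not at 0.
rng = random.Random(1)
period = 7
true_start = 3
pattern = [rng.uniform(-1.0, 1.0) for _ in range(period)]
n_samples = true_start + 5000 * period
samples = [
    rng.gauss(0.0, 1.0)
    + (pattern[(k - true_start) % period] if k >= true_start else 0.0)
    for k in range(n_samples)
]

candidates = candidates_for_every_start(samples, period)
error_at_true_start = math.sqrt(
    sum((c - p) ** 2 for c, p in zip(candidates[true_start], pattern)) / period
)
print("error at the true start:", error_at_true_start)
```

In this purely periodic toy model the other candidates are close to rotated copies of the pattern; with a position-dependent marking function, as in the article, that need not hold.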