Written by: Rip Rowan, Monday, May 31, 1999 6:00 PM
The Internet is exploding with music. Currently, MP3-formatted audio is the most popular Internet attraction, offering high-fidelity digital audio at 1/12th the file size of a standard 16/44.1 stereo digital file. As Internet bandwidth grows, we will also have the option of quickly downloading 16/44.1 uncompressed audio.
There is a misconception that the Internet is going to usher in the New Age of free music for all. Any song you want to hear will become free for the listening. There are even other, more misinformed people who point to FM radio and say, "see, music is already free on the radio - MP3 just gives us better-sounding free music." Ridiculous. Music on FM isn't free. You have to pay for it. You pay by listening to advertisements and buying products. Money flows from you to advertisers to radio stations to publishing companies and eventually to artists.
Others point to the record labels who are the chief beneficiaries of the current system and remark that the Internet and MP3 will offer "power to the people." Artists will be able to publish their music for free on the Internet and people can download it free. Sure, I don't mind giving away some of my music for free. But if my music was the most popular thing since the Beatles, and I wasn't making a cent off of it, you bet I'd get a little upset. When you're an artist, and you're required to write, record, rehearse and perform for a living, you had better believe that copyrights and royalties become pretty damned important. We're talking about food on the table here, folks.
The fact is, artists need record companies and publishers. It's a true love / hate relationship. You can't live with them, and you can't live without them. The best thing that the Internet can offer the working artist is a lower-overhead means of generating income. Perhaps, by eliminating some paper flows, by automating some processes, by maybe even eliminating some middlemen, we can increase the cash flow to the artist.
The key barrier to music distribution over the Internet remains: how can I (as either an indie artist or a record label), the owner of the rights to a musical recording, collect sales and royalty revenue from my creation? It's easy to set up a web site where people can pay a buck to download a song, but it's damn near impossible to make "the next buck" after the first guy posts the song on his website for free downloading!
What is needed is some kind of technology to easily identify a piece of work. This has been done before. Early technologies, such as SCMS (used to copy protect DATs) utilized a code stored in a particular location on the tape. Thus it was easy to find the code and remove or nullify it. Clearly, any protection mechanism must be more integral, so that removing it or nullifying it is more difficult.
That's where data embedding - digital watermarking - comes into play. If we can encode some kind of inaudible unique identifier into a piece of music, then perhaps we can get control over the problem. Since the "code" would be integrated into the material, it would be impossible to remove the code without damaging or losing the content. Possible schemes include:
1. Create a pay-per-use playback system - kind of like a DIVX player - where every time the audio is played a charge is incurred
2. Hire a bunch of lawyers to monitor Internet transactions and sue people who are distributing music which is clearly identified as mine
3. Create a new, more esoteric technology to "invisibly" charge the user for a download
We'll talk more about these options - and others - later. But it becomes clear that the foundation technology lies with a digital identification stamp that can uniquely identify a song - a stamp which is:
1. Quickly recoverable from the playback media to enable quick transactions in an automated environment
2. Robust enough to withstand compression, production, broadcast, and multiple lossy generations
3. 100% failsafe to prevent misidentification
4. Sufficiently inaudible to be undetectable by the human ear
5. Embedded into the data itself and protected to eliminate or substantially reduce the ability for others to detect and remove the stamp
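To make the idea of a stamp "embedded into the data itself" concrete, here is a toy sketch in Python. It uses a generic spread-spectrum approach: each bit is added as a faint, keyed noise pattern spread across many samples, and is recovered by correlating against the same pattern. This is an illustration only, not Cognicity's method (which is not described here); the function names, the key, and the `strength` value are all assumptions, and the scheme ignores masking and every kind of real-world abuse.

```python
import math
import random


def pn_sequence(key, length):
    """Keyed pseudo-random sequence of +1/-1 values (the hidden carrier)."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def embed(samples, bits, key, strength=0.02):
    """Add each bit as a faint keyed noise pattern spread over many samples."""
    span = len(samples) // len(bits)  # number of samples carrying one bit
    carrier = pn_sequence(key, span * len(bits))
    marked = list(samples)
    for i, bit in enumerate(bits):
        sign = 1.0 if bit else -1.0
        for j in range(i * span, (i + 1) * span):
            marked[j] += strength * sign * carrier[j]
    return marked


def extract(samples, n_bits, key):
    """Recover the bits by correlating each span with the same keyed carrier."""
    span = len(samples) // n_bits
    carrier = pn_sequence(key, span * n_bits)
    bits = []
    for i in range(n_bits):
        corr = sum(samples[j] * carrier[j]
                   for j in range(i * span, (i + 1) * span))
        bits.append(1 if corr > 0 else 0)
    return bits


# Hypothetical host signal: 8 seconds of a 440 Hz tone at half scale.
RATE = 44100
host = [0.5 * math.sin(2 * math.pi * 440 * n / RATE) for n in range(8 * RATE)]
message = [1, 0, 1, 1, 0, 0, 1, 0]

marked = embed(host, message, key=1999)
recovered = extract(marked, len(message), key=1999)
print(recovered == message)
```

Because the pattern is spread across the whole signal rather than stored at one known location, there is no single spot to find and erase, which is the property the list above asks for. The price is the one this review keeps running into: the fainter the stamp, the more audio it takes to carry each bit reliably.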
Base Technology
Cognicity AudioKey is a digital watermarking solution that enables the distributor of a digital audio file to encode a digital message - say, a serial number - into the audio. This serial number could then be used to track distribution, to verify ownership of the music, or perhaps even to cause a charge to be incurred for downloading or playback. We'll discuss possible applications at the end of this article.
AudioKey is a first generation base technology product. It does not attempt to actually do anything with the embedded data: it doesn't "lock" a file from playback, charge anybody, or automatically report unauthorized use to any attorneys. It merely allows the user to embed and extract information manually. As people become familiar with the technology, it is expected that Cognicity will build new application-specific software, or, hopefully, will license the encoding and extraction technology to third parties who offer applications based on it.
Cognicity AudioKey
AudioKey software is very easy to use. Provide an audio file or a set of files to be processed, provide a short text file containing the data to embed into the audio, and press the "Embed" button. Then sit back and wait for the encoding to complete.
Retrieving an encoded message is just as easy. Just open an audio file or a batch of files and click the "Extract" button. Your text message will be retrieved.
Embedding and extracting are time-consuming processes. On my machine - a hopped-up Celeron 300A running at 450 MHz - it took well over a minute to encode or retrieve the text from a four-minute song. That's important because certain kinds of real-time applications of digital watermarking technology will require the data to be quickly extracted.
My other concern with this software was that it wouldn't be sufficiently robust to withstand copying and / or compression technology. Cognicity claims that once encoded, the watermark will "survive music editing, production, format conversion (including D/A and A/D conversions), compression, streaming, broadcast, etc.."
These are bold claims. In particular the words, "production" and, more notably, "etc." leaped off of this page right at me. Do these people have any idea how engineers can torture a digital audio file? Can this watermark really survive "etc." - for instance, Sonic Foundry's "gapper-snipper", reversing the audio, severe distortion, mixing with another source, or lo-fi duplication? I was determined to find out.
I started out by encoding a 4 minute long 16 bit stereo audio file with a serial number that consisted of the CD's UPC, the song number on the CD, and today's date. Since this was a "stress test" I encoded the audio with AudioKey's "High Robustness" setting. I tried extracting the data from the encoded file and was able to do so without error.
Next I tried a few "abusive" operations. First, I ran the song through two levels of compression until it was pumping and breathing like an old whore with a heart condition. AudioKey was able to recover the watermark without error. I also encoded the audio as an MP3 at the highest quality level - a 12:1 compression ratio. AudioKey was able to recover the code easily and without error. These two tests demonstrated that AudioKey's watermark would survive "hi-fi" data and audio compression.
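For a sense of scale, the 12:1 ratio in the MP3 test can be worked out from the 16/44.1 stereo format mentioned at the top of the article. The short Python calculation below is just that arithmetic; the variable names are mine, and the four-minute length matches the test song.

```python
BITS_PER_SAMPLE = 16
SAMPLE_RATE_HZ = 44_100
CHANNELS = 2
RATIO = 12                # the 12:1 compression ratio used in the test
SONG_SECONDS = 4 * 60     # the four-minute test song

# Uncompressed 16/44.1 stereo data rate, in bits per second.
pcm_bps = BITS_PER_SAMPLE * SAMPLE_RATE_HZ * CHANNELS

# Data rate after 12:1 compression.
mp3_bps = pcm_bps / RATIO

# File sizes for the whole song, in bytes.
pcm_bytes = pcm_bps * SONG_SECONDS // 8
mp3_bytes = pcm_bytes // RATIO

print(f"PCM:  {pcm_bps / 1000:.1f} kbit/s, {pcm_bytes / 1e6:.1f} MB")
print(f"12:1: {mp3_bps / 1000:.1f} kbit/s, {mp3_bytes / 1e6:.1f} MB")
```

That works out to roughly 1411 kbit/s uncompressed versus about 118 kbit/s compressed, or about 42 MB versus 3.5 MB for the song. For comparison, the common 128 kbit/s MP3 setting is a ratio of about 11:1.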
I tried some other, "hi-fi" distortion with good results. I ran the audio through Cakewalk's FX2 Tape Sim, with varying levels of "warmth" and "hiss", and in each case, AudioKey was able to extract the watermark.
Now I tried more extreme processes. I recorded the audio onto a standard chrome cassette tape, trying different kinds of noise reduction. Using dbx noise reduction, AudioKey was able to extract the watermark, but there were a couple of errors in the extracted data. But with Dolby B noise reduction, and with no noise reduction at all, AudioKey was unable to recover the watermark.
I was also curious about lo-fi Internet broadcast. When the file was encoded with MP3 at its "28.8 modem" setting, the resulting file could not be decoded. Consulting the manual on this issue reveals that Cognicity offers a special method for encoding RealAudio files. However, I was at a loss to understand how this helps me out as a user. If someone gets a CD-quality copy of my song, and encodes it in RealAudio, I want the watermark to survive.
The other application that really interested me was that of encoding a music sample. Manufacturers of sample and sound effects CDs would cherish a means of proving that their samples and sound effects were used without permission. Would AudioKey be able to decode the watermark after that sample had been used in a song, with other sound playing? So I encoded a sample and mixed the sample into a minute or so of music. AudioKey was not able to extract the watermark from the resulting mix.
I have one final complaint. The application, built in Java, does not adhere sufficiently closely to the Windows UI standards. Files open on single-clicks, right-clicking does nothing on anything, and windows and menus look and feel "different" from the other applications on my machine. Some of these deviations were enough to really irritate me - particularly the one about opening files with one click. After all, this is one convention that even Apple and Microsoft agree on - single-click to select, double-click to open. I found myself double-clicking on filenames and accidentally drilling all over the damned place.
Conclusions and Applications
In the end, Cognicity's claim that the watermark will "survive music editing, production, format conversion (including D/A and A/D conversions), compression, streaming, broadcast, etc.." was greatly overstated. I was easily able to confuse the software.
But let's put aside this obvious overselling of the tool. What about real-world use? Can a digital watermark survive the "normal" transmission / distribution means routinely used for real-world pirating? The answer to that question is yes. I do not believe that a 28.8-quality MP3 is a valid "pirate" of my music. I don't think that in this day and age of high-fidelity audio that anybody would consider such a lo-fi copy to be an acceptable substitute for a CD or a high-quality MP3. Nor is a cassette tape a valid pirate.
One other test I performed was to take the encoded song, invert its waveform, and mix it with the original - in effect "subtracting" one from the other so that only the watermark remains. I wanted to hear the difference between the two. I was very surprised to discover how loud the watermark was. AudioKey utilizes masking, which "hides" the data behind the audio at approximately 40 dB below the level of the audio. This technology makes the watermark very robust when encoded into material with a high average signal level, and very fragile when encoded into quiet or very dynamic material. The key benefit is that the distortion it introduces into the signal remains imperceptible.
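The difference test above is easy to reproduce numerically. The Python sketch below subtracts an original signal from a marked copy and reports how loud the leftover is relative to the original, in dB. The signals here are stand-ins I made up (a tone plus a much quieter tone playing the part of the watermark), not AudioKey output, and the function names are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def residual_level_db(original, marked):
    """Level of (marked - original) relative to the original, in dB."""
    residual = [m - o for o, m in zip(original, marked)]
    residual_rms = rms(residual)
    if residual_rms == 0.0:
        return float("-inf")  # the two files are identical
    return 20 * math.log10(residual_rms / rms(original))


# Stand-in signals: one second of a 440 Hz tone, plus a 1 kHz tone at
# 1/100th of its amplitude acting as the "watermark".
RATE = 44100
original = [0.5 * math.sin(2 * math.pi * 440 * n / RATE) for n in range(RATE)]
added = [0.005 * math.sin(2 * math.pi * 1000 * n / RATE) for n in range(RATE)]
marked = [o + a for o, a in zip(original, added)]

level = residual_level_db(original, marked)
print(round(level, 1))  # prints -40.0
```

An amplitude ratio of 1/100 corresponds to -40 dB, the same order as the masking level quoted above. Heard on its own, with the music that normally masks it removed, a residual at that level is plainly audible, which may explain the surprise.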
Nevertheless, the concept of introducing distortion into a recording is likely to meet with fierce opposition from die-hard audiophiles. In particular, those audiophiles who claim that MP3 files "sound like crap" aren't going to like watermarking at all. In the end, the forces of Audio Purity and Music Business are going to have to come to terms on an acceptable way to protect both fidelity and copyrights.
Cognicity AudioKey certainly sets some high expectations. I was disappointed that the technology did not live up to the claims made by the manufacturer. However, for many applications the technology is suitable.
So, what can we do with digital watermarking?
First off, although we have focused on the use of watermarking digital audio, it must be pointed out that this technology can be used to watermark graphics and video files. Cognicity provides three packages - AudioKey, ImageKey, and VideoKey - for these three applications. Because of the high bandwidth of video, VideoKey provides the most exciting opportunities. For example, AudioKey can embed up to 25 bytes of data per second of audio. VideoKey can embed up to 8 Kb/sec per color in a video file.
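To put the 25 bytes per second figure in perspective, here is a rough capacity calculation in Python for the kind of serial number used in the test earlier (UPC, song number, date). The serial layout, the sample UPC value, and the idea of repeating the message across the song are all assumptions for illustration; the article does not say how AudioKey formats its payload or uses spare capacity.

```python
import math

BYTES_PER_SECOND = 25   # AudioKey's stated maximum embedding rate
SONG_SECONDS = 4 * 60   # a four-minute song


def build_serial(upc, track, date):
    """Hypothetical serial layout: UPC, two-digit track number, YYYYMMDD date."""
    return f"{upc}-{track:02d}-{date}".encode("ascii")


# Placeholder values, not a real product code.
payload = build_serial("012345678905", 7, "19990531")

capacity = BYTES_PER_SECOND * SONG_SECONDS                    # bytes per song
seconds_needed = math.ceil(len(payload) / BYTES_PER_SECOND)   # audio per copy
copies = capacity // len(payload)                             # if repeated

print(len(payload), capacity, seconds_needed, copies)  # prints 24 6000 1 250
```

At the maximum rate, a serial of this size fits into about one second of audio, so a full song has room to carry it many times over. Whether that spare capacity is actually spent on redundancy is not something this review can confirm.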
Remember that this is a base technology - it is not application specific, but rather can be used in many ways. We have chosen to look at it from the point of view of copy and distribution protection, but Cognicity has proposed a number of possible applications for digital watermarking.
· Signal description - what kind of signal is this, how was it created, what formats are used
· Rights protection - tracking usage for pay-per-use applications
· E-commerce - linking the data object to a sales site
· Customized delivery - embedded advertisements or optional reproduction options like captions
· Fraud and tamper protection
Of course the most interesting applications to me are rights protection in an e-commerce environment. Suppose that "watcher" sites were created which would spy on MP3 transmissions. Similar technology is already in place in products like MP3 Spy, which can identify live MP3 broadcasts and bring them to your desktop. A Watermark Watcher would monitor MP3 broadcasts, identifying the source and the material being played back. Royalties could then be charged to the broadcast site and paid back to the artists. Similar technology could be used to identify downloads of watermarked material and cause a charge or some kind of cost recovery to be invoked.
It will be up to the applications integrators of the world to figure out how to exploit this technology. Key among them will be Microsoft. It is unknown what Microsoft's position will be regarding digital watermarking. Because effective utilization of this technology will probably require tight support from the operating system, Microsoft will remain a player to watch in the development of this technology.