Back in 2020, years before we knew one another, a video that surfaced on the internet captured both of our attention. This strange clip of Jay-Z reciting Hamlet’s famous “To Be, Or Not To Be” soliloquy triggered not only our curiosity, but that of many people on the internet [1]. Had Jay-Z discovered a sudden interest in English literature, we wondered? Was he sharing this newfound interest with the world? No, it turns out that this video was created by a YouTuber by the name of “Vocal Synthesis” using an audio deepfake. The YouTuber wrote that the video was generated with an open source program called “Tacotron 2”, which utilized a text-to-speech model that analyzed Jay-Z’s speech patterns and produced a dramatic reading of this Shakespearean piece of work [2]. When Jay-Z caught wind of this video and many similar ones, his entertainment agency, Roc Nation LLC, claimed that Vocal Synthesis was infringing his copyright in relation to his voice [3]. As laypeople, we initially thought that Jay-Z might have a pretty strong case against Vocal Synthesis. After all, with a distinctive and high-profile voice, shouldn’t he have the ability to protect his voice against impersonations and malicious uses? Now, with our newfound IP law knowledge from Professor Festinger’s class, we thought it might be interesting to revisit this issue and see how our views have changed.
First things first, it seems likely that Jay-Z’s representatives did not take an IP law class. Otherwise, it would have been clear to them that the better argument would have been made through trademark law. Copyright law only applies to works of expression that are fixed in some tangible form [4]. While there is a noticeable lack of case law in Canada on this matter, American case law such as Midler v Ford has made it very clear that a voice, by itself, may not be copyrighted because the sounds that it produces are not fixed [5]. Only where voices are recorded in a fixed form, such as in lyrics, songs, or speeches, may copyright protection then become available [6].
Returning to Jay-Z, his recognizable voice and the reputation associated with it contain flavours of trademark. The better question to ask then, for the purposes of our project, is whether we should consider allowing voices to be trademarked, particularly with the increased prevalence of AI programs that lower the barrier to mimicking an individual’s voice for a multitude of purposes.
Trademark & Voices
The purpose of a trademark is to distinguish a brand’s identity from others in the marketplace, which in turn protects the brand’s reputation [7]. Given that Jay-Z’s fame comes in large part from his voice, it is understandable that he might want to prevent others from being able to mimic and use his voice in ways that might harm his reputation. That being said, the current state of intellectual property law does not extend trademark protection to vocal styles. The only related protection offered is for a voice in association with a distinct good or service [8]. A well-known example of a sound mark is Homer Simpson’s iconic phrase, “D’oh!”, which is trademarked by 20th Century Studios in the United States [9]. The key here is that neither Homer Simpson’s voice nor the phrase alone would be protected, but the two used together are.
Illustrations of why voices cannot be trademarked exist in our daily lives. You may have noticed that certain contexts produce similar vocal patterns, leading individuals to sound the same. A common example of this is news anchors, whose standard on-air voices are lower, slower, and include more pauses than their day-to-day speech [10]. If a news anchor wanted to trademark their voice, it is not difficult to see how that might lead to some issues. An alternate argument for why such voices should not attract trademark protection is that trademarks are not registrable if their features are primarily dictated by a utilitarian function [11]. News anchors speak in this manner because the expectations of professionalism in the industry mean speaking without any regional accent and in an enunciated fashion [12]. One could say that this type of voice serves a utilitarian function within the industry.
This may be contrasted with a “TikTok voice”, where creators are known to speak in a colloquial and conversational manner in their content in order to better engage their audience [13]. While individuals can be colloquial and conversational without sounding the same per se, the “TikTok voice” that has resulted can also be seen as a utilitarian way to achieve this goal.
Arguments for Allowing Trademarking
While the general opinion on the matter is that trademarking voices is not and should not be allowed, there are still some strong arguments for why it could be considered in the future. One of the main purposes of trademark law is to avoid consumer confusion [14]. We want to avoid situations where consumers believe they are getting one thing, but have instead been tricked into getting something else that appears similar. While, as discussed above, case law has already established that copyright protection for voices does not make sense, perhaps we could consider incorporating some of the justifications for copyright law into trademark law. Namely, Théberge explains that the approach to copyright law should be to strike a balance between promoting the public interest in the encouragement and dissemination of works of art and intellect and obtaining a just reward for the creator [15]. Many voice artists have spent years working hard to develop a reputation, and deserve just compensation for their efforts. Allowing other people to use their voice in any way they want, including ways that might leave a stain on their reputation, does not seem fair. Thus, there may be an argument to be made for fair compensation for these voice artists despite the lack of a fixed work.
The Trademark Act in Canada last saw a significant amendment in 2019, which, in the world of legislation, is considered very recent [16]. The AI world, on the other hand, moves a lot faster. Many major AI developments have occurred in the last five years: GPT-3 in 2020, DALL-E in 2021, and autonomous vehicles becoming increasingly commonplace, just to name a few. As the usage of AI becomes more and more prevalent with each passing year, it seems that law touching this area should adapt as well. Back in 2019, we think it is unlikely that most of us imagined that today, anyone sitting at home could simply log on to their computer, plug in a short audio recording, and create a clone of someone’s voice within minutes. Just as we needed to adapt our approach to intellectual property in the age of the internet, where communication was greatly facilitated, it seems that the ease with which anyone can use these artificial intelligence tools calls for some further change.
One could argue that an alternate avenue for dealing with such issues is the right of publicity, which protects against the commercial exploitation of an individual’s or a business’s identity (e.g. image, name, voice, or likeness) [17]. However, this is restricted to exploitation for commercial uses. As it becomes increasingly simple for the general public to use AI tools online to create imitations of individuals’ voices, as mentioned above, we could envision many non-commercial uses that may nonetheless harm an individual’s reputation while not meeting the threshold for other causes of action such as defamation.
Arguments Against Trademarking
While these considerations seem to highlight some enticing arguments for providing trademark protections for voices, we should remain cautious in extending protection too broadly when it comes to this segment of the intellectual property domain. Firstly, as voices can – and do – sound similar, especially in certain specialized contexts, it can be challenging to assess the validity and merit of trademark protection. Without a clearly defined difference between impersonation and similarity, the trademarking of voices will likely open the floodgates to litigation and overwhelm the courts with cases that may not even involve purposeful imitation.
Further, while the concept of fair dealing in copyright protects users’ rights to use copyrighted work for various reasons, including parody, satire, and education, such protections do not exist in the world of trademarks. Consequently, the protection of voices under trademark would essentially prohibit the impersonation of public figures for the purpose of parody. Currently, in Canada and the United States, anyone is entitled to portray or imitate another person in a public forum, whether it be in a community performance, in a YouTube video, or broadcast on TV, as long as they are doing it for the sake of parody [18]. An illustrious example is Saturday Night Live, which frequently sees cast members impersonate celebrities’ and other noteworthy people’s appearances, voices, and mannerisms on a public broadcast network [19]. Under the concept of fair use, the American counterpart to Canada’s right of fair dealing, the show is protected from copyright and publicity rights claims. However, because this protection doesn’t apply to trademarks in Canada, the trademarking of voices may inhibit expressions of parody in a significant way.
An impersonation of an individual who has trademarked their voice, even where it is for the purpose of parody, may constitute trademark infringement if the impersonation is deemed to cause confusion. A person who is unfamiliar with the parody source may simply hear a voice impersonation and believe it to be from the person that is being impersonated. While visual media literacy is expanding among the public – with the younger generations in particular – less focus has been on educating them about media in audio form. Any ensuing confusion may lead to implications for trademark infringement, and parody and the freedom of expression may have to pay the price for people’s inability to distinguish real from fake.
Additionally, even in the absence of trademark protections, there are currently other avenues that may be able to regulate the use of famous voices for purposes that may be damaging to the voice owner’s reputation. Passing off is available as a remedy to those who feel that their voices have been used in a way that involves misrepresentation and consequently causes them damage [20]. If the owner of the voice in question has amassed goodwill in the form of a good name or reputation, they will be able to pursue a claim for passing off. In Rihanna v Topshop, the singer Rihanna successfully brought a claim for the use of her image on clothing, which was deemed to be a misrepresentation that damaged her reputation as a “fashion icon” [21]. Similarly, where a celebrity’s voice is used without their consent in a manner that misrepresents their involvement and may conceivably damage their goodwill in some way, they may be able to pursue a claim for passing off.
Final Thoughts
To wrap things up and to illustrate our point on how easy it is to create a clone of a well-known individual’s voice, we plugged a 30-second clip of Phineas Flynn, from the show Phineas and Ferb, into an AI voice generator called PlayHT, which offers voice cloning services, and generated the short audio clip below [22]. This process took only around 5 minutes to complete and was free for us to use. PlayHT also offers a “high fidelity” version, which requires only a subscription of around $30 a month and still presents quite a low barrier to entry. This is just one of the many accessible resources out there that offer such services at the click of a button.
Thanks to Vocal Synthesis and Tacotron, we were able to do a deeper dive into how AI developments might in turn affect the development of trademark law in Canada. Overall, we think that it is safe to conclude that despite the existence of a few compelling arguments for the addition of trademark protection for voices, Canada has gotten it right for the time being. However, it remains important that this area of law receives continuous attention and revision as further changes are sure to occur.
Authors’ note: This project contains mentions of Jay-Z in relation to AI and his reputation, which were written prior to the recent allegations brought against him and should not be viewed as endorsements of this individual or his reputation.
References
- https://www.youtube.com/watch?v=m7u-y9oqUSw
- ibid
- https://pitchfork.com/thepitch/what-does-jay-zs-fight-over-audio-deepfakes-mean-for-the-future-of-ai-music/
- https://ipwatchdog.com/2020/10/14/voices-copyrighting-deepfakes/id=126232/
- Midler v Ford Motor Co., 849 F. 2d 460 (9th Cir. 1988)
- Supra note 4
- Class notes – Week 10 Slide 233
- https://www.stratford.group/blog/auditory-trademarks#:~:text=If%20you%20produce%20a%20unique,in%20Canada%20for%20sound%20marks.
- https://www.uspto.gov/sites/default/files/76280750.mp3
- https://www.tiktok.com/@michellemackeytiktok/video/6955846007787654406?lang=en
- Class notes – Week 10 Slide 254
- https://www.businessinsider.com/how-american-broadcasters-all-came-to-sound-the-same-2023-1#:~:text=Experts%20told%20Insider%20the%20stereotypical,looked%20and%20sounded%20the%20same.
- https://www.vice.com/en/article/why-everyone-uses-tiktok-voice/
- https://canadian-trademark.ca/advanced-trademark-information/confusion/
- Théberge v. Galerie d’Art du Petit Champlain Inc., [2002] 2 S.C.R. 336 at para 30
- https://laws-lois.justice.gc.ca/eng/acts/t-13/20190618/P1TT3xt3.html
- https://ca.practicallaw.thomsonreuters.com/8-512-2969?transitionType=Default&contextData=(sc.Default)&firstPage=true#:~:text=The%20right%20of%20a%20natural,Name.
- Copyright Act, RSC, 1985, c C-42 at s 29
- https://youtu.be/R3N6Iqp8QIk?si=iPsNrUe4TTbh-pbP
- Class notes – Week 10 Slide 66
- https://www.techlaw.ie/2015/02/articles/intellectual-property/rihanna-v-topshop-uk-court-of-appeal-upholds-decision-in-landmark-passing-off-judgment/
- https://play.ht/?via=wmn89
Interesting post! I remember listening to a podcast that touched on this subject. The podcast host had discussed how AI was evolving to the point that it could mimic his voice, but he saw it as a potential opportunity. As part of his podcast, the host frequently had to make recordings for advertisements. With the development of AI however, he could instead license his voice to advertisers, who can then create the ads without his involvement. Thus, as an additional argument for allowing trademarking of voices, it could allow the trademark owner to further commercialize and profit from their own voice.
This is such an interesting topic! I actually encountered something similar when I came across a YouTube video featuring AI-generated music by an artist group I really like. The voices sounded so similar that I thought it was an official release at first. It really hit me how easily AI can mimic unique voices. This also made me think about voice actors, whose careers rely entirely on their distinctive sound. As AI-generated voices become more common, it raises important questions about ownership, consent, and how the law might evolve to protect their livelihoods.