Could OpenAI’s Sora text-to-video generator kill off jobs in Hollywood?
Some say it’s ‘game over’ for creative professionals in the film industry.
Artificial intelligence startup OpenAI has been teasing its new AI video generator, Sora, on social media in recent weeks. Last week, it revealed that it had also given actors and directors in Hollywood a first look at the technology – and a chance to try it out – before Sora is launched publicly.
OpenAI published a blog post on March 24 titled “Sora’s First Impressions”, showcasing the work that several creative studios and directors had produced using the video generator.
Some media experts speculate that Sora will be extremely disruptive to the film and creative industries.
Al Jazeera spoke to one executive who works in Hollywood, who asked us not to reveal his identity due to the sensitive nature of the subject. When asked what his initial reaction was when he saw Sora’s capability for the first time, he said: “My reaction to Sora was just like everyone else’s – my jaw hit the floor. It was like we were seeing our murderer, but it was beautiful at the same time. Just immediately impressive and terrifying.”
The tremors caused by Sora have already been felt by some in the industry.
In an interview with The Hollywood Reporter in February, actor, filmmaker and studio owner Tyler Perry said he would put his $800m studio expansion in Atlanta on hold after seeing Sora’s video-generating capability.
He added: “So I am very, very concerned that in the near future, a lot of jobs are going to be lost. I really, really feel that very strongly.”
What is Sora?
Sora is OpenAI’s text-to-video generative AI model. As with ChatGPT, you enter a text prompt, but instead of generating answers in text form, Sora generates videos up to one minute long.
A video example of Sora’s capability, released by OpenAI, can be seen below:
- Example prompt: “A movie trailer featuring the adventures of the 30-year-old spaceman wearing a red wool knitted motorcycle helmet, blue sky, salt desert, cinematic style, shot on 35mm film, vivid colours.”
Sam Altman, CEO of OpenAI, also posted several examples to his X account, including the following:
- Example prompt: “An instructional cooking session for homemade gnocchi hosted by a grandmother social media influencer set in a rustic Tuscan country kitchen with cinematic lighting.”
Sora is far from perfect. If you look closely at the “instructional cooking session” video, the spoon in the right hand disappears after the “grandmother” stops mixing. Although the videos are hyper-realistic, tell-tale flaws that give away the fakes are still present in some of the videos that Sora produces.
This raises another question: how well would a product like this work in the industry?
Our Hollywood insider said: “We will come out on the other side bigger and better because humans will have figured out their place atop a technology that is clearly more powerful than we can currently imagine. But the desire to act, write, direct, compose, collaborate, etc is deeply innate in humans. It’s not going anywhere. So is it bad for the humans in the industry? The answer is no, then yes, then no. Is it good for the industry itself? Yes.”
How does Sora work?
As with ChatGPT, users type in a text command, question or prompt and the AI responds – in the case of Sora, with generated video sequences.
To do this, Sora uses a combination of machine learning and natural language processing (NLP) to generate a video sequence. NLP is the branch of artificial intelligence concerned with how computers interpret human language. Machine learning allows Sora to improve over time, refining its responses by learning from patterns and feedback.
Sora uses “computer vision” to understand and interpret visual information from images or videos. Computer vision is the field of AI that enables Sora to “recognise” visual representations of real-world objects, people and environments from text descriptions which include visual language. For example, the prompts “cat moving” or “waves crashing in an ocean” indicate certain attributes and characteristics. Sora needs this visual language to interpret the text prompt, and then accurately present a visual depiction of an object.
Sora can harvest incomplete or partial data and transform it into comprehensible video content that looks very real. It works like a super-powered zoom tool: it starts with large, blurry blocks of colour or objects and then refines them into smaller, more defined shapes based on your prompt.
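The “blurry blocks refined into defined shapes” description corresponds to a diffusion-style denoising process, which OpenAI has said underpins Sora. The toy Python sketch below is an illustration only, not Sora’s unpublished code: the function name, the blending rule and the cheat of blending toward a known target are our assumptions. A real model would instead predict the noise to remove from the text prompt at each step.

```python
import random

def denoise(target, steps=50, seed=0):
    """Start from pure noise and iteratively refine toward `target`.

    Each step removes a fraction of the remaining noise, mimicking how
    a diffusion model turns blurry blocks into defined shapes.
    (Toy illustration only; Sora's actual method is not public.)
    """
    rng = random.Random(seed)
    image = [rng.uniform(-1.0, 1.0) for _ in target]  # pure noise
    for step in range(steps):
        # A real model *predicts* the noise from the text prompt;
        # here we cheat and blend toward the known target pixels.
        alpha = 1.0 / (steps - step)  # remove more noise near the end
        image = [(1 - alpha) * px + alpha * tgt
                 for px, tgt in zip(image, target)]
    return image

target = [0.2, 0.8, -0.5, 0.0]  # "true" pixels of the final frame
result = denoise(target)
print(max(abs(px - tgt) for px, tgt in zip(result, target)))  # prints 0.0
```

On the final step `alpha` reaches 1.0, so the last trace of noise is removed entirely; earlier steps only nudge the noisy image part of the way, which is why intermediate frames in a diffusion process still look blurry.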
What does Sora mean for creative jobs in the film industry?
It is still unclear which, if any, tasks normally undertaken by human creators could be taken over by Sora. The AI’s ability to replicate camera shots, lighting and characters on the fly makes for uncharted territory for directors and filmmakers. However, film professionals expect it to shake up the industry considerably.
One Hollywood insider, who spoke to Al Jazeera on condition of anonymity, said: “I don’t see it as a threat to production so much as a threat to the way production is done as we currently know it. We’ve seen events like this in the past, particularly in post-production when folks started editing on laptops instead of the big expensive post houses. Lots of people got wiped out in that transition while others could suddenly afford a proper editor without the overhead a post house demands.”
When asked whose jobs could be replaced by AI generators, he added: “Maybe asking ‘who will get replaced’ is the wrong question. I think it’s the system that will get chipped away at and replaced. In a couple of years, maybe the term ‘director’ will refer to the guy who prompts the AI, and the rest is done completely digitally. And if that approach is accepted by audiences, and makes money, and makes people feel human emotion – then it’s game over for most of us.”
What are the copyright and legal issues?
Sora pulls content from already existing images and videos, then recreates a video based on the user’s prompt. Who exactly owns that regenerated video? Should a fee be paid to each of the photo and video creators and characters whose work Sora draws upon to create the final video? These are questions which have yet to be fully answered.
At the root of many of the above questions is how to track the originator of any content generated, including the individuals who have been included in the final video.
Speaking on his YouTube channel, technology lawyer Paul Haswell explained: “If someone’s just using an AI model and then it inadvertently somehow sucks in some data that then ends up looking like you, what are your rights – is your personal data actually being misused? How can you prove that your data was used to create that likeness?”
He added: “Suddenly you find yourself an actor in a completely AI-generated soap. You may be world famous, yet get no credit for it. You might have a squeaky voice rather than a deep voice but your face would be the same. For example, you would have no credit because you’ve essentially been used by AI, hoovered up and regurgitated into another format.”
There are also international considerations as copyright law is different depending on the country. If the video originated in one country and is distributed in another, whose copyright law applies?
On his blog, Wallace Collins, an entertainment lawyer who specialises in copyright and trademark law, warned that Sora would expand all these problems “exponentially” and could even lead to social unrest or other forms of social disruption.
“AI has already disrupted copyright law for creators, particularly in the music space, and has challenged established copyright and intellectual property norms in the entertainment world. Without some type of common sense regulations in place, Sora could be used by the most vile of individuals to create videos that could defile, mislead and scare people, or even instigate riots based on the appearance of something that is completely fabricated but entirely realistic in appearance.”
How will these issues be decided?
A significant portion of the legal discussion surrounding generative AI revolves around who should be considered the author of what these tools generate, as it relates to “fair use”. Fair use provisions in copyright law allow limited use of copyrighted material, or the transformation of a copyrighted work into a different piece of work.
At present, there is no legal precedent covering the current advances in text-to-video generation. However, in December last year, the New York Times filed a federal copyright infringement lawsuit against OpenAI, maker of ChatGPT (a text-to-text generation tool), and Microsoft, maker of Copilot, in the Southern District of New York (the federal district court in Manhattan). The Times alleges that ChatGPT provides users with near-verbatim copies of content the Times has already published.
Ian Crosby, a lawyer for the Times, said: “Defendants seek to free-ride on the Times’s massive investment in its journalism by using it to build substitutive products without permission or payment. That’s not fair use by any measure.”
In February, OpenAI filed a motion to dismiss the Times’ case in federal court.
Two more copyright infringement cases were filed in the Manhattan court against OpenAI in February – one by The Intercept and the other a joint case filed by Raw Story and AlterNet.