Credit: Mashable composite: Ian Moore / Boarding1Now / iStock / Getty Images

OpenAI recently introduced a groundbreaking video generation model named Sora. The demonstrations showcased complex, high-quality, photorealistic videos generated from simple text prompts. For example, a video generated from the prompt “Reflections in the window of a train traveling through the Tokyo suburbs” looked authentic, with the feel of phone-shot footage that captured the train’s reflections and passengers without any distortions.

Similarly, a video generated from the prompt “A movie trailer featuring the adventures of the 30-year-old space man wearing a red wool knitted motorcycle helmet, blue sky, salt desert, cinematic style, shot on 35mm film, vivid colors” resembled a blend of Christopher Nolan’s and Wes Anderson’s styles.

Additionally, a video depicting golden retriever puppies playing in the snow realistically rendered soft fur and fluffy snow layers, creating an immersive visual experience.

That raises the big question: how did OpenAI accomplish this with Sora? OpenAI has shared little about its training data, but Sora is assumed to have been trained on a vast dataset of video content gathered from across the internet. Some speculate that this dataset includes copyrighted material; OpenAI has not responded to requests for clarification on Sora’s training data.


In its technical documentation, OpenAI primarily discusses how Sora works, describing it as a diffusion model that converts visual data into “patches” the model can interpret. Details about where that visual data came from, however, remain scarce.
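To give a rough sense of what splitting visual data into patches can mean, here is a minimal Python sketch. It is not OpenAI’s implementation: the function name, the patch sizes, and the choice to work directly on raw pixel values are all assumptions made purely for illustration.

```python
def to_spacetime_patches(video, pt, ph, pw):
    """Split a video (frames x height x width nested lists) into flat patches.

    Each patch spans pt frames, ph rows and pw columns. Dimensions must be
    exact multiples of the patch sizes (a simplifying assumption).
    """
    frames, height, width = len(video), len(video[0]), len(video[0][0])
    if frames % pt or height % ph or width % pw:
        raise ValueError("video dimensions must be divisible by patch size")

    patches = []
    for t0 in range(0, frames, pt):
        for y0 in range(0, height, ph):
            for x0 in range(0, width, pw):
                # Flatten one block of pixels into a single list of values.
                patch = [
                    video[t][y][x]
                    for t in range(t0, t0 + pt)
                    for y in range(y0, y0 + ph)
                    for x in range(x0, x0 + pw)
                ]
                patches.append(patch)
    return patches


# A toy 4-frame, 4x4-pixel grayscale "video" whose pixel values are counters.
video = [[[t * 16 + y * 4 + x for x in range(4)] for y in range(4)] for t in range(4)]
patches = to_spacetime_patches(video, pt=2, ph=2, pw=2)
print(len(patches), len(patches[0]))
```

In this toy case the video becomes a flat sequence of small blocks, each covering a few pixels across a few frames, which is the kind of uniform unit a model can process one after another.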

OpenAI says it drew inspiration from large language models, which acquire versatile capabilities by training on internet-scale data. That vague remark is the document’s only allusion, and an indirect one, to the source of Sora’s training data. The document further states that “training text-to-video generation systems necessitates a significant volume of videos with corresponding textual descriptions,” which implies reliance on internet-derived visual content.

The ethics and legality of how AI training data is obtained have been under scrutiny since OpenAI launched ChatGPT. Both OpenAI and Google have been accused of training their language models on data taken from social media, Reddit, Quora, Wikipedia, private literature collections, and news outlets.

The rationale for scraping internet data for training often rests on its public availability, but the actual legal status may be more complex. For instance, the New York Times has filed a lawsuit against OpenAI and Microsoft for alleged copyright violations, accusing them of using its content without permission or proper attribution. The uncertainty over how video models like Sora were trained might trigger responses from the entertainment industry.

As OpenAI stays secretive about its training methods, concern among artists and creators is growing. Justine Bateman, a filmmaker and SAG-AFTRA generative AI advisor, was sharply critical, stating that every aspect of such AI developments is potentially built on unauthorized use of artists’ work.

Professionals in creative sectors are apprehensive about advancements like Sora and their impact on their industries. People working in film VFX, for example, are voicing concern and uncertainty about the future in light of these developments.

OpenAI acknowledges the potential risks associated with Sora, particularly deepfakes and disinformation, and says it is addressing them through rigorous testing. The company is currently conducting red-team evaluations to identify and mitigate inappropriate and harmful content. OpenAI also mentions plans to engage with policymakers, educators, and artists to explore positive applications for the technology, but concerns about the development and use of Sora persist.



Cecily Mauran

Cecily is a tech reporter at Mashable covering AI, Apple, and emerging tech trends. With a master’s degree from Columbia Journalism School, she has extensive experience working with startups and social impact businesses. Prior to this, she co-founded a startup consulting business focusing on emerging entrepreneurial hubs across continents. Follow Cecily on Twitter at @cecily_mauran.

