The marriage of advanced media asset management with emerging AI platforms for media analysis offers powerful potential for transforming media workflows. In this article, we’ll describe how the integration of these two technologies can make it easier for operations to access, manage, and archive content.
In today’s fast-paced media environments, more new content is being created than production teams can possibly manage without specialized tools. At the same time, the clock is ticking for digitizing historical content that exists in legacy, analog formats like tape before the original content degrades. It’s critical that all of these assets be logged and tagged so that they can be found easily, but teams have no time to do this essential work. Meanwhile, the current generation of media asset management tools has grown up in an environment where they were starved of metadata. As a result, content teams’ options are limited to pulling technical metadata from media files or streams, extracting meaning from file and folder names, or manual logging.
Artificial intelligence (AI) is beginning to change how media organizations meet these challenges. A new and emerging breed of AI platforms for media analysis, when paired with leading-edge media asset management (MAM) tools, offers great potential for transforming media workflows and making it easier than ever for operations to access, manage, and archive tremendous volumes of content. Through powerful tools such as speech-to-text and automatic language translation, AI engines bring new power to the MAM task of logging and tagging content, with the ability to tag assets automatically based on attributes such as people, places, things, and even sentiment.
It sounds almost too good to be true: suddenly you can unlock the potential of all of your content and make it immediately searchable, reusable, and monetisable. At last, you can get some traction on those digitization projects and get a better handle on all of the content in your existing library! But wait - while the potential exists to realize these benefits someday, the truth is that the technology needs to overcome some issues in order to become mainstream.
One area that needs improvement is accuracy. While AI analysis is getting better all the time, particularly with speech-to-text offerings from players such as Google, Microsoft, Amazon, and IBM, fine-tuning is still needed. For instance, the engine might not be able to distinguish between U.K. and American English, and abbreviations and jargon are likely to generate mistakes. The industry is still working on easy methods to train the AI engine to recognize these language variations and correct mistakes. Also, for image or video analysis, the sophistication of AI tools varies considerably. Some platforms offer only very basic video analysis, meaning the best way to capture metadata for people, places, objects, and sentiments is to make a set of image sequences and analyze them manually.
AI aggregators can help users avoid some of the costs and complexities of setup by making it easier to choose the right AI engine for a specific task. But even so, picking the AI tool that’s best for a given activity is not trivial. At the same time, cost structures across the industry are far from transparent, making it difficult to work out the total expense of applying AI to a media library. It’s a multi-step process: first, you have to work out how to get your content into the AI engine, which is often in the cloud. That might involve creating a video proxy, separating the audio files, creating an image sequence, and other steps, and then uploading the content and managing its lifecycle. Should you leave the content on the vendor platform or delete it to save on storage? Is it in the right format for the AI engines to understand? Which AI tool should you run, and is there a separate cost for each style of analysis? There might be different price tiers for different content formats; for instance, 4K assets might cost more. With each vendor having its own price list, it’s pretty difficult to compare apples to apples.
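To make the cost questions above concrete, here is a minimal sketch of how a team might estimate the bill for one vendor. Everything in it is an assumption made for illustration: the `PriceList` and `Asset` structures, the per-minute pricing model, the 4K surcharge factor, and all of the figures are hypothetical and do not reflect any real vendor’s price list.

```python
from dataclasses import dataclass


@dataclass
class PriceList:
    """Hypothetical prices (USD) for one vendor; not real figures."""
    per_minute: dict                    # analysis type -> price per media minute
    uhd_multiplier: float = 1.0         # assumed surcharge factor for 4K assets
    storage_per_gb_month: float = 0.0   # cost of leaving proxies on the platform


@dataclass
class Asset:
    minutes: float    # duration of the asset
    is_4k: bool       # whether the 4K price tier applies
    proxy_gb: float   # size of the uploaded proxy


def estimate_cost(assets, prices, analyses, months_stored=0):
    """Sum analysis and storage charges across a library for one vendor."""
    total = 0.0
    for asset in assets:
        factor = prices.uhd_multiplier if asset.is_4k else 1.0
        # Each style of analysis is assumed to be billed separately.
        for analysis in analyses:
            total += asset.minutes * prices.per_minute[analysis] * factor
        # Storage only costs money if the content stays on the platform.
        total += asset.proxy_gb * prices.storage_per_gb_month * months_stored
    return round(total, 2)


library = [Asset(minutes=60, is_4k=False, proxy_gb=1.0),
           Asset(minutes=30, is_4k=True, proxy_gb=2.0)]
vendor_a = PriceList({"speech_to_text": 0.02, "object_detection": 0.10},
                     uhd_multiplier=1.5, storage_per_gb_month=0.05)

keep = estimate_cost(library, vendor_a,
                     ["speech_to_text", "object_detection"], months_stored=2)
delete = estimate_cost(library, vendor_a,
                       ["speech_to_text", "object_detection"], months_stored=0)
print(f"keep proxies: ${keep}, delete after analysis: ${delete}")
```

Running the same library through one `PriceList` per vendor gives a rough like-for-like comparison, though real price lists may include minimum charges, volume discounts, or egress fees that this sketch ignores.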
Also, the technology is advancing so quickly that any AI analysis done today may have to be refreshed later, as the tools improve. Managing these refreshed data sets, especially if they have been corrected or updated by a human after the original analysis, adds another layer of complexity. And of course security is a concern, especially if the data is uploaded to cloud providers.
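One possible way to handle a refresh, sketched here under stated assumptions, is to treat human-corrected metadata as authoritative and accept new AI output only where no verified segment already covers that stretch of the timeline. The `Segment` structure, the `human_verified` flag, and the overlap rule are hypothetical choices for illustration, not a description of how any particular MAM works.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float                  # seconds from the start of the asset
    end: float
    text: str
    human_verified: bool = False  # set when a person has corrected the segment


def overlaps(a, b):
    """True when two segments share any stretch of the timeline."""
    return a.start < b.end and b.start < a.end


def merge_refresh(existing, refreshed):
    """Keep human-verified segments; take the newer AI output everywhere else."""
    kept = [s for s in existing if s.human_verified]
    fresh = [s for s in refreshed
             if not any(overlaps(s, k) for k in kept)]
    return sorted(kept + fresh, key=lambda s: s.start)


existing = [Segment(0, 5, "hello wrld"),
            Segment(5, 10, "BBC News", human_verified=True)]
refreshed = [Segment(0, 5, "hello world"),
             Segment(5, 10, "bbc knews")]

merged = merge_refresh(existing, refreshed)
print([s.text for s in merged])
```

A production system would probably also keep the superseded versions for audit purposes rather than discarding them, which this sketch does not attempt.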
The AI-MAM Connection
As these powerful AI technologies continue to mature, strong media asset management capabilities will become increasingly important. On the metadata side, tools that can store, search, and easily correct a huge volume of time-based metadata are crucial. Good metadata and user interface design are vital to keep the system from overloading users with too much information. And on the workflow and automation side, feeding the AI engines with the right data and automating the analysis, while keeping down costs, will separate the true enterprise offerings from the also-rans.
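As a rough illustration of what time-based metadata might look like in practice, the sketch below stores each AI-generated tag with a time range and a confidence score, then filters searches by a confidence threshold so that low-quality results do not swamp the user. The `Tag` fields, the `search` function, and the default threshold of 0.8 are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Tag:
    asset_id: str
    start: float       # seconds from the start of the asset
    end: float
    kind: str          # e.g. "person", "logo", "transcript"
    value: str
    confidence: float  # engine-reported score between 0 and 1


def search(tags, query, kind=None, min_confidence=0.8):
    """Return matching tags above the threshold, ordered by asset and time."""
    q = query.lower()
    hits = [t for t in tags
            if q in t.value.lower()
            and (kind is None or t.kind == kind)
            and t.confidence >= min_confidence]
    return sorted(hits, key=lambda t: (t.asset_id, t.start))


tags = [
    Tag("match_01", 312.0, 318.5, "transcript", "What a goal!", 0.93),
    Tag("match_01", 40.0, 44.0, "logo", "Goal Sports", 0.55),
    Tag("promo_07", 12.0, 15.0, "transcript", "the goal of this project", 0.88),
]

for hit in search(tags, "goal", kind="transcript"):
    print(hit.asset_id, hit.start, hit.value)
```

Because every hit carries a time range, a user interface could jump straight to the relevant moment in the clip instead of returning the whole asset.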
So what might an AI-powered MAM solution look like? One approach is to supercharge the MAM system’s logging, tagging, and search functions through integrations with leading AI vendors and aggregators, such as Google, Microsoft, Amazon, and IBM. Integrations with best-of-breed AI platforms and cognitive engines could allow the MAM to leverage advanced AI-based speech recognition and video/image analysis, with the flexibility to be deployed either in the cloud or in hybrid on-premises/cloud environments. Here are a few of the advanced capabilities that could result:
- Speech-to-text, to automatically create transcripts and time-based metadata
- Language translation
- Place analysis, including identification of buildings and locations without using GPS-tagged shots
- Object and scene detection (e.g. daytime shots or shots of specific animals)
- Sentiment analysis, for finding and retrieving all content that expresses a certain emotion or sentiment (e.g. “find me celebrations (in a sports event)”)
- Logo detection, to identify when certain brands appear in shots
- Text recognition, to enable text to be extracted from characters in video
- People recognition, for identifying people, including executives and celebrities
The next frontier
Of course, these capabilities are just the start. The MAM system can also be a powerful tool to train and improve AI engines; e.g. content manually tagged in the MAM could perhaps be used to identify the executives in a corporation. The MAM could use this manual tagging to train AI engines to do a better job of logging and tagging new content.
The industry is being transformed by AI and the explosion in sometimes low-quality metadata. Only the most powerful, flexible, easy-to-integrate, secure, and scalable MAM platforms are embracing this challenge and will thrive.
In the right hands, AI becomes the key that unlocks the next generation of MAM technologies.