Applying AI to Summarize Key Video Topics

Dr. Manish Gupta is a co-founder and CEO of VideoKen, a video technology startup, and the Infosys Foundation Chair Professor at IIIT Bangalore. Previously, Manish served as Vice President and Director of Xerox Research Centre India and held various leadership positions with IBM, including Director, IBM Research - India and Chief Technologist, IBM India/South Asia. As a Senior Manager at the IBM T.J. Watson Research Center in Yorktown Heights, New York, Manish led the team developing system software for the Blue Gene/L supercomputer. IBM was awarded a National Medal of Technology and Innovation for Blue Gene by US President Barack Obama in 2009. Manish holds a Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign. He has co-authored about 75 papers, with more than 6,000 citations in Google Scholar (and an h-index of 45), and has been granted 19 US patents. While at IBM, Manish received two Outstanding Technical Achievement Awards, an Outstanding Innovation Award and the Lou Gerstner Team Award for Client Excellence. Manish is currently serving as the chair of IKDD, the ACM India Special Interest Group on Knowledge Discovery and Data Mining, and was General Co-Chair for the IKDD Conference on Data Sciences 2015. He is an ACM Fellow, a Fellow of the Indian National Academy of Engineering, and a recipient of a Distinguished Alumnus Award from IIT Delhi.

I spoke to him about VideoKen and his journey thus far.

Namagiri: How did VideoKen come about?

Dr Gupta: VideoKen is a spin-off from a research project at Xerox Labs in India. I had asked my team to look at a problem: despite the widespread availability of online videos on practically any topic, learners' engagement with those videos tends to be low. Our team got the inspiration from textbooks. People don’t just read a textbook from cover to cover. They look up the table of contents at the beginning and the index at the end to identify the key topics covered in the book, and at any given time they usually read just the portion covering a specific topic. We felt that learning videos should be similarly navigable, and since videos do not come with these indices, we decided to apply artificial intelligence techniques to generate them automatically. We realized later that our indexing capabilities would benefit not just learning videos, but all kinds of informational videos, including marketing and communication videos.

When Xerox went through a major transition, we signed an agreement to purchase the intellectual property associated with this project and launched VideoKen as a new company in January 2017. In addition to the core team that came from Xerox Research, I was able to sign up Ashish Vikram (my classmate from IIT Delhi, who was then a VP of Engineering with Flipkart) and Vishnu Raned (VP of Sales at Zend Technologies, who had previously worked with Ashish at Rational Software) as co-founders.

Namagiri: Which user contexts does VideoKen address?

Dr Gupta: Enterprises are producing an increasingly large number of informational videos, such as webinars, customer event presentations, product marketing, training and internal communication videos, for a variety of stakeholders, including customers, partners, and employees. However, these videos are highly underutilized, given the paucity of time and short attention span of people while consuming information. Most people would view only the first few minutes (or seconds) of these videos before dropping off.

VideoKen applies AI techniques to the audio and visual content of a video to generate three indices (table of contents, phrase cloud, and searchable transcript) that summarize the key topics covered in the video and let the viewer jump directly to the portion covering a topic of interest. This often increases the average watch time of these videos by a factor of 2 to 8. Furthermore, VideoKen makes these videos more discoverable, and gives organizations insight into the topics that interest the viewers of their video content. VideoKen also helps enterprises generate business leads more effectively from their marketing videos.

VideoKen is being used by over 40 customers, including enterprises like Accenture, Bosch, Kotak Mahindra Bank, Netgear, Oracle, and TCS. Another class of customers is e-learning companies, such as UpGrad and TalentSprint, universities like IIT Kharagpur, University of Mumbai and Meghe Group of Institutions, and organizations like Asian Development Bank Institute. Some of the public websites using VideoKen include ACM TechTalks and prestigious conferences like NeurIPS, ACM MobiCom, and Falling Walls.
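
To make the idea of a searchable transcript and a phrase cloud concrete, here is a minimal sketch in Python. It is not VideoKen's implementation, which the interview does not describe. The `Segment` type, the function names, the tiny stopword list, and the sample transcript are all hypothetical, and plain word counts stand in for whatever phrase extraction a real system would use.

```python
from collections import Counter
from dataclasses import dataclass
import re


@dataclass
class Segment:
    """One timestamped piece of a transcript (hypothetical structure)."""
    start_seconds: float
    text: str


# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "for", "on", "how"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def search_transcript(segments, query):
    """Return the start times of segments whose text contains the query.

    Matching is a case-insensitive substring test, so a viewer could
    jump to any returned timestamp.
    """
    needle = query.lower()
    return [s.start_seconds for s in segments if needle in s.text.lower()]


def phrase_counts(segments, top_n=10):
    """Count non-stopword words across all segments, most frequent first.

    Single-word counts are a simplification of a phrase cloud.
    """
    counts = Counter(
        word
        for s in segments
        for word in tokenize(s.text)
        if word not in STOPWORDS
    )
    return counts.most_common(top_n)


# Made-up sample transcript for illustration only.
transcript = [
    Segment(0.0, "Welcome to the webinar on video indexing"),
    Segment(95.0, "How a table of contents is generated"),
    Segment(240.0, "Searching the transcript for pricing details"),
]

if __name__ == "__main__":
    print(search_transcript(transcript, "pricing"))
    print(phrase_counts(transcript, top_n=5))
```

Storing a start time with every segment is what turns a text match into a jump target: the search returns positions in the video instead of snippets of text.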

Namagiri: IP-driven startups are not the norm in India. VideoKen has a robust patent portfolio, with granted and pending patents. How has the IP focus helped?

Dr Gupta: We have 7 granted US patents and more pending. The IP focus led to a unique and powerful solution to the problem of low levels of engagement seen with informational videos. Our solution often creates a “wow” moment when we present it to our potential customers. The uniqueness of our technology globally has also enabled us to enter the highly competitive US market. We have won numerous awards for our innovation, such as the “Best in Class AI Start-up” award (one of three) at NASSCOM Technology & Leadership Forum 2019, NASSCOM AI Game Changer Award 2018, and Elevate 100 from the State of Karnataka in 2017.

However, there is also a downside to being innovative – many people are unaware of the kind of indexing that is possible for videos (existing notions of video indexing in the marketplace are usually limited to closed captioning), and hence it does not show up as a requirement or even an item on the customers’ wish list. We have to create the market by making more people aware of this capability and its benefits.

Namagiri: What is the go-to-market strategy for your startup? Is the focus India-specific, or beyond?

Dr Gupta: Virtually every organization creates videos and is a potential customer of our Software as a Service (SaaS) offering. We approach most customers directly – we started by targeting the Learning & Development groups of enterprises, but now are also seeing traction with Sales Enablement, Marketing, and Communication functions of companies, who are all using informational videos. We have integrated our AI engine with public video platforms like YouTube and Vimeo, as well as enterprise video platforms like Kaltura, JW Player, Brightcove, Wistia, VdoCipher, and with video conferencing providers like Zoom. We have signed formal partnership agreements with many of them. For instance, we are a premier partner for Kaltura. While our initial set of customers were primarily from the Indian market, we are seeing much higher growth in the US market and expect the US market to account for the bulk of our business very soon.

Namagiri: Let me ask one last question: What’s the best advice you’ve ever received?

Dr Gupta: The best advice, which I also pass on to people whom I mentor, is to aim high, persevere, and not be afraid of failure. Even if you fail a few times, success on a single ambitious goal usually carries you much further than a series of incremental advances. Furthermore, the very act of going beyond your comfort zone leads to significant learning and growth.
