This Tokyo-based startup aims to replace the physical camera with a single software platform
While working at Yahoo in Japan, Issay Yoshida was involved in the research and development of computer vision, computer graphics, and machine learning, and helped launch several innovative applications. Around 2016, he felt the need for a virtual camera platform that would be the foundation for video chat, live streaming, and video production.
Five years back, computer vision had just started picking up, and Issay felt that the physical camera could be replaced with a single software platform. So, he founded EmbodyMe in Tokyo, Japan, in 2016.
“The first product we developed was software which allowed users to create a 3D avatar from a single photo. After that, we set our sights on developing ‘xpression’, which facilitates the control of any person's face and head in an image or video in real time to generate new video content. It was released in May 2018,” Issay tells YourStory.
The EmbodyMe team
The team had just started developing xpression camera when the COVID-19 pandemic struck.
“Our goal was to create a foundational virtual camera platform intended for video chatting, live streaming, and video production, which, in the age of COVID-19, have all become indispensable modes of communication,” he adds.
What does it do?
Instead of directly capturing frames with a physical camera, xpression enables users to animate anyone in an image or video in real time, according to their facial expressions, voice, and body movements.
“You can pretend to be any person and have a Zoom conversation, stream on Twitch, or create a YouTube video. The platform can be used to conduct any activity that is essential in the age of coronavirus, such as remote work, online medical care, online classes, and online events,” Issay says.
In 2020, the team realised that the video communication market was growing rapidly owing to the pandemic, creating many opportunities to capitalise on. With this in mind, xpression camera was created to bolster existing forms of sharing and socialising that were gaining relevance.
“You can assume the likeness of anyone, and redefine your own video sharing experience, or produce original content. It’s useful in a variety of environments and settings such as remote work and learning, medical care, virtual events, social media, creative content creation, personal endeavours, and so on,” adds Issay.
The xpression camera is a virtual camera app based on EmbodyMe’s patented core technology, which uses deep learning to let users imprint the movements of their face and head onto the face in an image they choose.
It uses 3D Dense Face Tracking that can track over 50,000 3D points, precisely pinpointing the features of the entire face, mirroring the user’s expressions.
“Our deep generative model learns to generate any visual element that is indistinguishable from reality, and we streamline this process so that it works seamlessly on low-end PCs and mobile devices interactively – on both image and video, enabling it to handle high-resolution images and videos in real-time,” Issay says.
The revenue model
When Issay hit upon the idea of EmbodyMe, he set up the core team by reaching out to his colleagues and friends at Yahoo. Today, the core team consists of 10 people and the startup has 15 members.
Though the startup is based in Tokyo, the team says that more than 144,594 of its users are from India. EmbodyMe functions primarily as an R&D company, so the team has dedicated most of its efforts to joint research and development with other enterprises rather than chasing profits.
“As of late, our primary focus has been on the development and ensuring that the app is accessible to as many users as possible. Thus, we decided to keep it free and that prompted us to set aside selling the units of xpression camera by making assessments of the profitability. However, in the future, we hope to revisit our current model and make changes based on the success and current status of the application in the market,” Issay says.
The team hopes to adopt a freemium model by the end of 2021, under which users will pay an average of $14 per month for additional features that expand the functionality of the free version of xpression camera.
Under this model, the team is looking to provide three types of paid plans: a Basic plan for casual video-chat users; a Pro plan for video creators, such as virtual YouTubers; and an Enterprise plan intended for video platforms such as Zoom, Microsoft Teams, and Google Meet.
The startup has raised an undisclosed amount of early-stage funding from IncubateFund, DEEPCORE (SoftBank's AI-focused fund), TechStars, and Deep30.
The market and future
Due to remote working, the phenomenon of Zoom fatigue is gaining prominence. However, COVID-19 has made virtual communication integral to our daily lives.
Many startups, such as San Francisco-based Mmhmm, are looking at ways to make video communication fun. There is also Dyte, a Y Combinator-backed, made-in-India video calling platform that competes with Zoom and allows users to integrate plug-ins (apps) right into the call.
Remote working has transformed the way teams meet and communicate. Owing to this, many startups are entering the video calling space. US-based Zoom has seen 30X growth in users since the start of the pandemic.
“Our technology is 50 times faster than any of our competitors, and works in real time. The one-of-a-kind technology that we’ve developed can work on video chats, live streams, and games, and requires no pre-processing time. It can handle high-resolution images and videos in real-time,” Issay says.
Speaking about the future, Issay says the ultimate goal for EmbodyMe is to be a leader in the video communication market. The company aims to construct a world in which any visual content imaginable can be created using deep learning – not only for the video-chat market, but also for video creation and streaming sectors.
The team is in the process of developing technology so that it can be used for professional video production as well.
“Another consequence of the COVID-19 pandemic has been that due to the lockdown, it has become increasingly difficult to film content, and in many cases, we’ve seen industries come to a complete standstill as many in-person activities were halted indefinitely. We want to give filmmakers, producers, showrunners, actors, and all those involved in film production an opportunity to renew their craft by eliminating the necessity of in-person interaction, and make them free of constraints of collaborative production,” Issay says.
He adds that the team aims to let creators explore their talents through technology.
“Moreover, we’re seeking to provide new avenues for our favourite celebrities and characters to appear in a wider array of media. We also hope to license our tech to platform applications, such as Zoom, Twitch, and Discord, as well as companies in other industries,” Issay signs off.