Microsoft’s New Speech and Hologram Tech Can Make You Fluent in Any Language
Technology never fails to surprise us! And we, too, curiously await the announcement of new trends and technological disruptions. Developers were in awe after the announcements Apple made at WWDC 2019. However, experimentation and innovation know no boundaries. Rising above the ongoing wave of technological enhancements, the tech giant Microsoft is back with an eye-popping technology.
Yes, you heard that right.
At its recent Inspire partner conference, Microsoft unveiled one of the most fascinating advances in the field of human communication. Julia White, an executive at Microsoft, took the stage to demonstrate a new hologram model that is tailored to speak in different languages, but in the exact voice of its human counterpart.
Building a lifelike hologram is a challenge in itself, but Microsoft has leaped ahead, integrating present-day technologies into a model that can not only build your hologram but also make it speak in your voice.
The solution is backed by two of the most talked-about technologies: Mixed Reality and neural text-to-speech. Such a technology could well reshape the future into one where humans no longer need to deal with communication barriers.
The Ground-Breaking Technology - Quick Run Through
The demo certainly captured huge attention, with developers applauding as White concluded it. However, it took tons of modifications before the model was finally launched.
This all-new Azure AI hologram model is Microsoft's product; using it, one can create a lifelike human replica capable of speaking in one's own personalized voice. The key idea behind the AI-driven hologram model is to bridge the language and communication gap between the parties involved. No matter where you are or what language you speak, your hologram can be present at remote locations and speak in any desired language.
The first public demonstration of the model was given by Julia White at Microsoft Inspire 2019, held at the Mandalay Bay Convention Center from Sunday, July 14 to July 18. To start with, Microsoft scanned her during a visit to one of the company's Mixed Reality studios. There, her speech was recorded in English and a photographic duplicate of her was created. This duplicate would later be seen on stage delivering a speech in Japanese.
Microsoft then brought in Azure AI's neural text-to-speech technology to capture White's signature voice. Once the two technologies were merged, the audience saw the holographic model of Julia speaking Japanese through HoloLens.
To be precise, the audience saw an exact image of Julia White, wearing the same clothes and addressing them in Japanese.
No wonder the audience was astonished!
Though the technology is not yet widely available, it points to a future without the communication barrier between speaker and audience.
Microsoft Hologram – Technology at A Glance
The resulting model is no doubt excellent. However, it is also worth looking at the set of technologies behind it. Mixed Reality and neural text-to-speech, as we know, are the two technologies merged to turn this fiction into reality.
From a more technical point of view, the project involves two major tasks:
- Rendering a holographic view of the speaker
- Converting speech from one language to another
To attain the above, Microsoft used:
- Mixed Reality & Azure AI Holograms
Having emerged as a game-changing combination, the duo enables virtual teleportation from one point in space to another, regardless of time and location constraints. In this way, Microsoft succeeded in transporting the holographic view of the speaker across space. One thing to note, however, is that the speaker's image is pre-scanned at the Mixed Reality studio prior to projection.
- Neural TTS
Next, Microsoft had to deliver the translation in the speaker's own voice, and so neural text-to-speech technology was used. Using deep learning, the technology can reproduce the speaker's voice, taking into account the speaker's stress and intonation patterns. Neural TTS also avoids the pitfalls of traditional TTS and further optimizes the generated voice. Voice synthesis and prosody prediction are performed to add fluidity to the sampled voice.
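To make the text-to-speech step a little more concrete: TTS services, including Azure's, typically accept their input as SSML (Speech Synthesis Markup Language), which names a voice and can carry prosody hints such as rate and pitch. The sketch below only builds such a document for text that has already been translated. It is an illustration, not the pipeline Microsoft used on stage; the function name build_ssml and the voice name are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SSML specification.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text, lang, voice_name, rate="0%", pitch="0%"):
    """Wrap already-translated text in a minimal SSML document.

    voice_name is whatever identifier the TTS service assigns to a voice;
    the value used in the usage example below is a made-up placeholder.
    """
    speak = ET.Element(
        "speak",
        {"version": "1.0", "xmlns": SSML_NS, "xml:lang": lang},
    )
    voice = ET.SubElement(speak, "voice", {"name": voice_name})
    # Prosody hints let the caller nudge speaking rate and pitch.
    prosody = ET.SubElement(voice, "prosody", {"rate": rate, "pitch": pitch})
    prosody.text = text  # ElementTree escapes special characters on output
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical voice name, for illustration only.
    print(build_ssml("こんにちは、皆さん。", "ja-JP", "ja-JP-ExampleNeural"))
```

The resulting string would then be sent to a speech service for synthesis; that call is left out here because it depends on the service and credentials in use.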
Though the technology is revolutionary, it will be a long while before it becomes part of everyday life.
To get a holographic image, you would first need to be captured at the Mixed Reality studio. On top of that, buying a HoloLens is a costly affair: $3,500, to be precise. So it is a pretty hefty investment, and you might need to wait. But the fact is that the technology has arrived, and industries are trying hard to make use of it.
The Future Is Here
True, all of this seems pretty enticing, and surely it is, but Microsoft is innovating faster than we can imagine. The organization is doing amazing things and has many more rabbits to pull out of its hat.
Imagine a leader addressing an audience in one city while viewers all across the world experience it as if he were actually sitting in their room. Wonderful, right? This is what the all-new AI hologram aims to do. It might not reach your home right now, but sooner or later it will make its way to every doorstep. Similarly, the new speech translation technology is going to be a game-changer, especially when used alongside the holograms.
In combination, these two emerging ideas will change how meetings are held, the importance of learning languages, the nature of corporate travel, and many related things at once. You may already sense the impact all of this is going to have on business, political, and global communication.
Summing up, Mixed Reality and AI together make the world a lot more futuristic. In Julia's words: ‘All of these technologies exist today. The future is here.’