Achieving a natural collaboration between VTubers and real artists: what is the “volumetric video technology” introduced in “Buzz Rhythm LIVE V”? [Interview with Canon & Balus] - MoguLive

The music event “Buzz Rhythm LIVE V 2023” was held on July 29th. It is the annual live event of the music program “Buzz Rhythm 02”, hosted by the comedian Bakarhythm, and this year it was held as a virtual live for the first time. Performances by popular VTubers and real artists were streamed online, and a live viewing was also held at Ikebukuro HUMAX Cinemas.

There were also special stages such as “Hoshimachi Suisei x Fuji Fabric” and “Mori Calliope x Creepy Nuts”, and the curtain closed on a high note.

Canon Inc.’s volumetric video technology and Balus Co., Ltd.’s virtual live technology made this virtual-real collaboration possible. The combination of both companies’ advanced video technologies must have surprised many viewers with how natural the images looked.

How was the special stage where VTubers and real artists stood side by side realized? MoguLive interviewed the people in charge at Canon Inc., which provided the volumetric video technology for Buzz Rhythm LIVE V 2023, and at Balus Co., Ltd., which was in charge of developing and operating the live system and producing the virtual stage for the event. We asked them about what went on behind the scenes and about their commitment to the technology.

It is also used for sports broadcasts at stadiums! What is a “volumetric video system” that can convert human movements into 3D data in just 3 seconds?

――In the first place, what kind of technology is the “volumetric video system” used in Buzz Rhythm LIVE V 2023?

Canon contact:
A volumetric video system captures images with multiple cameras placed around the subject and generates 3D data from them. Because the footage is reconstructed as 3D data, viewers can enjoy the images from any viewpoint, including positions where no actual camera exists, such as up in the air or on the field where a sports match is being played.

Another strength of our system is its ability to handle vast spaces in real time. It can be used for sports broadcasts in stadiums that involve large movements, and it is also possible to shoot an entire studio with multiple performers at the same time.

――I heard that examples of the use of the volumetric video system include broadcasts of professional baseball games and live music performances by artists. Please let me know if there are other examples of using this technology.

Canon contact:
Examples of its use in sports include international soccer and rugby tournaments in Japan, baseball games at the Tokyo Dome, professional basketball in the United States, and, as a slightly different example, bicycle racing.

Examples of studio use include live streaming, music videos, commercials, dramas, variety shows, fashion shows, lectures, and teaching materials for technical guidance. The system can also capture objects in the space and convert them into data, but I would appreciate it if you think of it as basically a technology for capturing “people”.

――All of the sports you mentioned as examples involve vigorous movements, but does that mean that even people moving at high speed can be converted into data with high accuracy?

Canon contact:
The system was originally designed and implemented with sports in mind, so it can handle fast movements; in fact, that is its specialty. Of course, it can also capture fast movements when shooting in the studio, and we are confident in its accuracy.

Another feature of this system is its real-time capability, which allows it to be used for live sports broadcasts. By performing video processing with our proprietary technology, we were able to reduce the delay until the data is output to just 3 seconds.

――What kind of difficulties did you face in putting this technology into practical use?

Canon contact:
Real-time processing was one difficulty, but we also had to solve various other problems before we could actually shoot.

For example, in the case of sports, we initially considered arranging the cameras to focus on areas of interest, such as in front of the goal. However, doing so makes image quality vary depending on where on the field the action takes place. The wider the shooting space, the more the quality differs from place to place.

We thought about increasing the number of cameras to eliminate that difference, but that would increase costs and make the system harder to build. Therefore, instead of simply adding cameras, we focused on placing them optimally to cover the entire space, and by repeatedly running detailed simulations, we became able to handle large spaces.

In addition, since the subject is shot from a distance in a large space, processing has to be performed while estimating the position at which each installed camera is pointing. However, there are situations where that alone is not enough. This mainly applies to stadiums, but during a game the venue itself shakes. We were able to deal with the shaking by applying vibration-damping measures, but there were other issues that had to be resolved at each site, so we had a hard time in the beginning.
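The placement trade-off described above (cameras clustered on an area of interest versus cameras spread to cover the whole space) can be illustrated with a small coverage simulation. This is only a sketch under stated assumptions, not Canon's method: the 2D camera model `(x, y, heading_deg)`, the `max_range_m` and `half_fov_deg` parameters, the grid step, and both example layouts are hypothetical. Only the 100 m x 75 m field size comes from the interview.

```python
import math

FIELD_LENGTH_M = 100.0  # field size quoted in the interview
FIELD_WIDTH_M = 75.0


def grid_points(step_m):
    """Sample the field on a regular grid of (x, y) points."""
    nx = int(FIELD_LENGTH_M // step_m)
    ny = int(FIELD_WIDTH_M // step_m)
    return [(i * step_m, j * step_m) for i in range(nx + 1) for j in range(ny + 1)]


def sees(camera, point, max_range_m, half_fov_deg):
    """True if `point` is within the camera's range and horizontal field of view.

    `camera` is (x, y, heading_deg); this flat 2D model is an assumption.
    """
    cx, cy, heading_deg = camera
    dx, dy = point[0] - cx, point[1] - cy
    distance = math.hypot(dx, dy)
    if distance == 0:
        return True
    if distance > max_range_m:
        return False
    bearing = math.degrees(math.atan2(dy, dx))
    # Smallest signed angle between the camera heading and the bearing to the point.
    offset = (bearing - heading_deg + 180.0) % 360.0 - 180.0
    return abs(offset) <= half_fov_deg


def coverage_counts(cameras, step_m=5.0, max_range_m=90.0, half_fov_deg=30.0):
    """Map each grid point to the number of cameras that can see it."""
    return {
        p: sum(sees(c, p, max_range_m, half_fov_deg) for c in cameras)
        for p in grid_points(step_m)
    }


def coverage_spread(counts):
    """Return (min, max) cameras per point; a large gap means uneven quality."""
    values = counts.values()
    return min(values), max(values)


if __name__ == "__main__":
    # Hypothetical layouts: four cameras aimed at one goal mouth versus
    # four cameras at the corners aimed at the field centre.
    clustered = [(25.0, -10.0, 118.0), (25.0, 85.0, -118.0),
                 (-10.0, 20.0, 60.0), (-10.0, 55.0, -60.0)]
    spread = [(-10.0, -10.0, 38.0), (110.0, -10.0, 142.0),
              (-10.0, 85.0, -38.0), (110.0, 85.0, -142.0)]
    for name, layout in (("clustered", clustered), ("spread", spread)):
        print(name, coverage_spread(coverage_counts(layout)))
```

Comparing the minimum and maximum counts for each layout gives a rough sense of how evenly a layout covers the field, which is the property the interviewee says was tuned through repeated simulation. A real system would also have to model camera height, lens characteristics, and occlusion.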

――So, you need to do some research before you start shooting.

Canon contact:
For stadiums where we are shooting for the first time, we first investigate the design data of the venue, then consider the placement of the cameras and conduct simulations.

However, since we have accumulated shooting know-how over the years, we don’t need to do many test shots in the field. A small team goes to survey the site, we run simulations in advance using the data they bring back, and we can go into the actual shoot fully prepared.

Just to be clear, we have been developing the volumetric video system since around 2016, and what I have just described are technical issues that have already been resolved. So I would like to note that it did not take much time to prepare for this project, which was shot in cooperation with Balus.

――When you hear the word “stadium,” you imagine a fairly large space.

Canon contact:
The largest areas are rugby and soccer fields, a space 100 m long and 75 m wide. As an actual venue, we have a track record of shooting at Nissan Stadium.

――Is there any difference in shooting between a music stage like this “Buzz Rhythm LIVE V 2023” and sports held in stadiums and arenas?

Canon contact:
You can think of it as basically the same system. However, considering the distance between the subject and the cameras, we believe studio shooting can output images of more stable quality than a large stadium.

Also, at this event, when we brought the real artists into the virtual space, we captured their equipment together with them. We believe this is possible only because our system can capture not just the performers’ movements but a wide area as 3D data.


(The volumetric video studio in the Canon Kawasaki office. The space is surrounded by a green screen, and numerous cameras are installed on the walls and ceiling. On this day, a microphone and keyboard had been brought in for the rehearsal of “Buzz Rhythm LIVE V 2023”.)

――There seems to be room for further applications in the future. You said that you already have a track record of utilization in various fields, but please tell us about your future prospects.

Canon contact:
As with this example, we think the next request from users who have seen this kind of live video will be to watch it from the direction they prefer. Therefore, we are currently developing technology with the aim of providing a service that lets users freely manipulate volumetric footage from their own devices.

Also, in terms of applications, services using virtual spaces such as AR and VR have recently been appearing one after another. We are currently exploring whether volumetric video can be put to good use in such virtual spaces so that more users can enjoy it.

The Balus motion capture system, familiar from VTubers’ VR/AR live shows

――Next, please tell us about the motion capture system used by VTubers.

Balus contact:
We have been using an optical motion capture system called “VICON” since 2018. We usually shoot in our own studio, but this time we brought our equipment to Canon’s studio and set it up there.

We mainly provide motion capture for VTubers, but we are not a motion capture technology company. We plan, produce, and distribute events such as virtual live performances from scratch. That is our main activity.

Even though we say “virtual event” as a single phrase, the content of the event and the conditions of the venue vary, so we propose the best method case by case. For example, sometimes we transmit motion from our studio to the venue for a live performance, and sometimes we bring LED screens or transparent displays to the site and capture motion on the spot. There are also simpler events, such as fan meetings.

In short, we use the motion capture system to put on events such as virtual live performances.

――You said that you often record in your own studio, but is it unusual to use an external motion capture system to produce an event?

Balus contact:
For example, in the case of live music performances held at live houses, delays inevitably occur when motion is transmitted from our studio. The round-trip delay is only about 2 seconds, but bringing the system to the site and capturing motion there makes it easier to convey a “live” atmosphere.

Especially at a large-scale event like this one, with technical issues involved, transmitting from the studio was not an option. Remote communication makes it difficult to link the real and virtual worlds, and conversations lag by a beat. Normally we often adopt the method of “bringing in only the virtual live system and transmitting the motion from the studio,” so it is more a case of “if necessary, we also have the option of bringing in a motion capture system and setting it up on-site.”


(The Balus motion capture system set up in the Canon Kawasaki office for the rehearsal of Buzz Rhythm LIVE V 2023. It uses a room on the same floor as the volumetric video studio mentioned above.)

――I think the system provided by Balus is familiar to VTuber fans. On the other hand, I felt that combining it with the volumetric video technology introduced in this live show was a special case that had never been seen before. How did this attempt come about?

Balus contact:
Actually, the project itself, “collaboration between VTubers and real artists using volumetric video technology,” was something I proposed in-house. The impetus was a live performance using AR that was held in 2022 and regarded as cutting-edge in the VTuber industry.

I watched that live as one of the viewers, and while I could certainly tell they were doing amazing things, something bothered me at the same time: the limits of the camera work. In an AR live, the camera cannot be moved as freely as in a normal music live, and the stage and venue can only be shown from specific angles of view. As a result, I had the impression that the longer the live went on, the more tiring it became to watch. I felt that this, “the live can only be shown with limited camera work”, is a problem with AR live.

So, while discussing internally how we could eliminate the limitations of camera work, we arrived at volumetric video technology. With this technology, it might be possible to realize collaboration between VTubers and real artists in a space with no restrictions on camera work. This idea led to the launch of the project, which eventually led to an opportunity to talk with Canon.

――By “restrictions on camera work”, do you mean that when real and virtual images are combined into a live video, differing angles of view create a sense of incongruity, so there is a limit to what can be expressed visually?

Balus contact:
That’s right. To add to that, we have long been working on AR live performances in which VTubers and real artists appear together.

For those, we adopted a method of matching the angle of view of the live-action camera with that of the virtual camera, but the angle of view was limited by physical constraints such as the size of the studio and the number of cameras. There were various restrictions, such as only being able to show the stage from a limited range of angles, or being unable to express depth well because the layering order of the images was fixed.

It was also a time when several live music shows using volumetric video technology were being talked about. However, there had not yet been a case featuring a VTuber, so we thought it would be good to take it on. It originally started as an independent project, but the plan changed little by little and was eventually realized in its present form.

――So the idea itself was there before this event. What was the process behind introducing it in “Buzz Rhythm LIVE V 2023”?

Balus contact:
We couldn’t put together a live show with this technology alone, so we consulted Nittele ClaN, which has strengths in comprehensive content production such as casting, live production, and promotion. They then came back with a proposal: “We’re planning an event like this in the summer. What do you think?”

Delays, a sense of incongruity on stage, restrictions on camera work: ingenuity and ideas for solving each problem

――Now that you have talked about each technology, how was the virtual and real collaboration in “Buzz Rhythm LIVE V 2023” realized? Also, please tell us what kind of ingenuity went into the process.

Balus contact:
There were many challenges to overcome before this real-virtual collaboration could be realized.

A particularly big one was, again, the problem of a “sense of incongruity”. For real artists and VTubers to appear in the same space, they must blend in on stage naturally. When different kinds of entities stand side by side in the same place, we needed to create a state in which viewers feel no stress about how they look. We tackled this mainly through two approaches: “lighting” and “special effects (smoke)”.

For the lighting, we used different lighting on the real side and the VTuber side to bring their appearances closer together. By using smoke, we were able to express the depth of the space as a whole, which made it feel as though the VTuber and the real artist existed in the same space.

Another issue was communication. As mentioned earlier, this is the problem of delay: the lag that occurs between the real side and the virtual side. From the viewer’s point of view too, it feels wrong when there is a lag in the conversation between performers who appear to be standing in the same space. For the communication lag, we used the method of “matching the delay values”.

Finally, camera work. By using volumetric video technology, we eliminated the restrictions on camera angle of view that are common in AR live and realized camera work that viewers do not tire of. I think these are the three main challenges of this event and the ideas used to solve them.

――On the problem of “delay” in particular, my understanding is that operators have struggled with it in many virtual live performances so far. You mentioned “matching the delay value”, but may I ask you for more details?

Balus contact:
As a premise, Canon’s volumetric video system, which “generates 3D data in 3 seconds,” is an astonishing technology to begin with, but even so, I think “3 seconds” is long enough to become a bottleneck in human communication.

So what did we do to eliminate the delay? In a word: we decided to treat “the time of the volumetric world, which is rendered with a 3-second delay” and “the time in which the performers actually perform” separately.

First, the real and virtual performers are shown a fixed-point image, composited in real time, on a monitor set up in front of them. Canon’s volumetric video studio has a green screen, so the video shot by a camera there and the video from the virtual side are combined and output to the monitor. What is shown is just a fixed-point image taken from the front of the stage, but because it is a simple “real-time composite image” that does not go through the volumetric system, there is no delay between the real artists and the VTubers. They can communicate while checking each other’s positions on it.

Balus contact:
While the performers perform watching that video, we produce the broadcast video at the same time. In this video, the volumetric output data is rendered on the stage in a virtual space, so the camera work can be created freely as a “live video”.

To summarize: first, by preparing a “real-time composite fixed-point video in which the real and virtual performers stand on the same stage”, the performers can communicate and perform without stress. At the same time, we create a “live video using the volumetric data that is output 3 seconds later”. By splitting things up in this way, we achieved natural communication between the performers, a natural stage performance, and realistic live footage with free camera work.
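The two-timeline idea summarized above can be sketched as a simple delay buffer. This is a minimal illustration, not the actual Balus or Canon implementation: the `DelayLine` class, the `simulate` function, the one-frame-per-second tick, and the string "frames" are all hypothetical. Only the 3-second volumetric latency comes from the interview. The sketch assumes the virtual-side motion data is held back by the same 3 seconds so that each broadcast frame pairs volumetric and motion data captured at the same moment, while the performers' monitor receives an undelayed composite.

```python
from collections import deque

VOLUMETRIC_DELAY_S = 3.0  # latency quoted in the interview


class DelayLine:
    """Holds timestamped frames and releases them once they are `delay_s` old."""

    def __init__(self, delay_s):
        self.delay_s = delay_s
        self._frames = deque()

    def push(self, capture_time_s, frame):
        self._frames.append((capture_time_s, frame))

    def pop_ready(self, now_s):
        """Return frames captured at or before now_s - delay_s, oldest first."""
        ready = []
        while self._frames and self._frames[0][0] <= now_s - self.delay_s:
            ready.append(self._frames.popleft())
        return ready


def simulate(ticks, delay_s=VOLUMETRIC_DELAY_S):
    """Run `ticks` one-second steps and return (monitor_feed, broadcast_feed)."""
    mocap_delay = DelayLine(delay_s)
    monitor_feed = []    # what the performers see: undelayed 2D composite
    broadcast_feed = []  # what viewers see: both sources from the same capture time
    for now in range(ticks):
        mocap_frame = f"mocap@{now}"
        mocap_delay.push(now, mocap_frame)
        # Real-time path: green-screen camera plus virtual side, no volumetric step.
        monitor_feed.append((now, f"greenscreen@{now}", mocap_frame))
        # Delayed path: the volumetric frame for a capture time becomes available
        # delay_s later, so the matching mocap frame is released at the same moment.
        for capture_time, delayed_mocap in mocap_delay.pop_ready(now):
            broadcast_feed.append((now, f"volumetric@{capture_time}", delayed_mocap))
    return monitor_feed, broadcast_feed


if __name__ == "__main__":
    monitor, broadcast = simulate(5)
    print(broadcast)
```

In this toy run the monitor feed has a frame for every tick, while the broadcast feed starts 3 ticks later and always combines data with matching capture times, which is one way to read “matching the delay value”.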

――It seems that more difficult adjustments were made than we had imagined.

Balus contact:
When you look at the footage of the actual show, it may look as if we were “just doing a music live in real time”, but in reality this is what was going on behind the scenes.

In a way, you could think of it as having two studios: “a studio that moves in real time” and “a studio where everyone moves 3 seconds later”. By operating with that mindset, we managed to realize this real-virtual collaboration stage.

At the beginning, when adjusting the delay, we proceeded on the optimistic assumption that “synchronization should work without any problems on the system side”, but when we actually tried it, difficulties emerged because our two sides had different ways of thinking about things like delay calculation. Our engineers and Canon’s engineers took the lead in doing a lot of fine-tuning. There was no end to the various adjustments, but I think the most time-consuming was probably the adjustment needed to reproduce the movements of the real artists volumetrically so that they could perform together with the VTubers in the virtual live space.

Canon contact:
Balus must have had a lot of trouble with the part about “displaying images in real time”. Balus worked on making the data lighter, while Canon worked on real-time processing. Combining the two in the end was very difficult.

Balus contact:
To begin with, we had different systems and studios, so we had to spend a lot of time reconciling everything in order to combine them into one system. Since each system is built on a different way of thinking, we made adjustments while checking the parts that needed to match at the system level.

The “success” of a live experience is to make people feel that real and virtual artists are “together as a matter of course.”

――From what you have said so far, I can tell that you spent a lot of time on trial and error in order to realize a natural collaboration between the real and virtual worlds. Did you get a real sense of achievement from working on this live production?

Balus contact:
It was good that we were able to make “performers talking in real time actually appear that way on screen”. By taking on, one by one, the various challenges that had been high hurdles in live performances until now, we were able to output video of a real artist and a VTuber standing side by side on the same time axis. I feel that realizing a real-virtual collaboration stage that does not feel out of place is a great achievement of this event.

Canon contact:
In this live performance, we were able to create a wider range of staging by fusing the free expression of the virtual world with the realistic expression of the real world. Beyond the performances by the real artists and VTubers themselves, we achieved very natural communication between them, and I think it was a live show that delivered images no one had seen before.

――Speaking of virtual live performances, this is a field that Balus has been involved in on a regular basis.

Balus contact:
Volumetric video technology has made it possible for real artists to appear in virtual space in 3D. As a result, the so-called “virtual-only staging” that we have been applying to VTubers until now, such as drastically changing a person’s scale, can now be applied to real artists as well.

Even an effect that is familiar to VTuber fans looks different when applied to a real artist. By applying the expressions we have developed for VTubers and the experience we have built up in virtual live performances to real artists, it may be seen as a new kind of staging. I think it would be great if we could create new staging and expressions without limits, which is possible only because the performers appear in 3D.

――From now on, some people may want to check the archives of the live performances, and others may want to check the live footage again after reading this interview. Are there any points that should be noted when watching the stage of “Buzz Rhythm LIVE V 2023”?

Balus contact:
Conversely, I think it would be nice if viewers felt that “it’s so natural that I can’t tell what’s special”. It looks as if the real artist and the VTuber are really there together, and unless someone tells you, you don’t notice anything odd. No matter which direction the camera moves, it feels as if they are really there. I think the “success” of a live experience is to make people feel that real and virtual artists are “naturally together”. I would be happy if viewers feel that they are “really in the same space” rather than “seemingly there”.

Buzz Rhythm LIVE V 2023

(* Time shift viewing available until 23:59 on September 1)
