“In physical AI, this model would have to capture the 3D visual geometry and physical laws — gravity, friction, collisions, etc. — involved in interacting with all types of objects in arbitrary environments,” said Kenny Siebert, AI research engineer at Standard Bots.

World models then help robots understand and evaluate the consequences of the actions they may take. Some world models generate short video-like simulations of possible outcomes at each step, which helps robots choose the best action.
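Conceptually, this "imagine outcomes, then act" loop resembles model-based planning. The sketch below is a minimal, hypothetical illustration only; the `WorldModel` class, its `predict` method, and the scoring scheme are assumptions for clarity, not any vendor's actual API. The idea is that a learned world model simulates a short rollout for each candidate action and the robot picks the action with the best predicted outcome.

```python
import numpy as np

# Hypothetical interface: a learned world model that, given the current
# observation and a candidate action, predicts the next observation and
# a scalar outcome score (e.g., task progress minus collision risk).
class WorldModel:
    def predict(self, observation: np.ndarray, action: np.ndarray):
        """Return (predicted_next_observation, predicted_score)."""
        raise NotImplementedError


def choose_best_action(model: WorldModel,
                       observation: np.ndarray,
                       candidate_actions: list[np.ndarray],
                       horizon: int = 5) -> np.ndarray:
    """Pick the candidate action whose simulated rollout scores highest.

    For each candidate action, the world model "imagines" a short rollout
    (here the action is simply repeated for `horizon` steps) and the
    predicted outcome scores are accumulated.
    """
    best_action, best_score = None, float("-inf")
    for action in candidate_actions:
        obs, total = observation, 0.0
        for _ in range(horizon):
            obs, score = model.predict(obs, action)
            total += score
        if total > best_score:
            best_action, best_score = action, total
    return best_action
```

In a real system the rollout would come from a trained video or dynamics model and the score from a task-specific objective, but the selection loop follows the same pattern.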

“I think the difference with world models is [that] it’s not enough just to predict words on a sign or the pixels that might happen next, but it has to actually understand what might happen,” Galda said. For example, a robot could read signs such as “stop” or “dangerous zone” on a factory floor or on the road and understand that it has to be extra cautious moving forward.
