Researchers at the University of Surrey have trained an artificial intelligence (AI) system to predict the three-dimensional (3D) pose of dogs from two-dimensional (2D) images. The team, led by postgraduate research student Moira Shooter, says the approach could have applications ranging from ecology to animation.
To generate training data, the researchers turned to the popular video game Grand Theft Auto. By modding the game, they replaced the main character with dogs of eight different breeds and recorded the animals' behavior. The resulting dataset, dubbed DigiDogs, comprises 27,900 frames of dogs sitting, walking, barking, and running under different environmental conditions.
Training AI on DigiDogs: A leap forward in predictive capabilities
Teaching an AI system to infer 3D information from 2D images normally requires 3D ‘ground truth’ for the system to learn from. For humans, this is typically captured with motion capture suits, an approach that is far harder to apply to dogs. The researchers worked around the lack of canine motion capture data by training their model on DigiDogs, where the virtual dogs supply the 3D information that would otherwise have to be recorded from real animals.
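To illustrate why synthetic data helps here, the following minimal Python sketch shows how a 3D skeleton known to a renderer can be paired with its 2D image coordinates to form one training example. It is not the researchers' code: the pinhole camera model, the joint names, the coordinates, and the functions `project_pinhole` and `make_training_pair` are all assumptions made for illustration.

```python
from typing import Dict, Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def project_pinhole(point: Point3D, focal_px: float, cx: float, cy: float) -> Point2D:
    """Project a camera-space 3D point (x right, y down, z forward) to pixels.

    Assumes an ideal pinhole camera with no lens distortion.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal_px * x / z + cx, focal_px * y / z + cy)


def make_training_pair(
    joints_3d: Dict[str, Point3D], focal_px: float, cx: float, cy: float
) -> Tuple[Dict[str, Point2D], Dict[str, Point3D]]:
    """Return (2D joints, 3D joints): the input/target pair a pose model learns from."""
    joints_2d = {
        name: project_pinhole(p, focal_px, cx, cy) for name, p in joints_3d.items()
    }
    return joints_2d, joints_3d


# Hypothetical joints in metres, in camera space (not taken from DigiDogs).
skeleton = {
    "nose": (0.0, -0.2, 2.0),
    "tail_base": (0.5, -0.1, 2.5),
    "front_left_paw": (-0.1, 0.3, 2.0),
}

joints_2d, joints_3d = make_training_pair(skeleton, focal_px=1000.0, cx=640.0, cy=360.0)
for name, (u, v) in joints_2d.items():
    print(f"{name}: pixel=({u:.1f}, {v:.1f}) 3d={joints_3d[name]}")
```

With a real animal, only the 2D pixels are observed and the 3D coordinates would have to be measured separately. In a rendered scene both are available for every frame, which is what makes a game a practical source of ground truth.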
Ms. Shooter emphasized the versatility of the approach, envisioning applications ranging from wildlife conservation to virtual world development. Although the model was trained on CGI dogs, it shows potential for predicting 3D skeletal models from photographs of real animals. That capability could help conservationists identify injured wildlife and help artists create more lifelike animals in virtual environments.
Future directions
Moving forward, the research team aims to refine the system using Meta’s DINOv2 model so that it predicts 3D poses accurately from images of real dogs. By bridging the gap between virtual and real-world data, they hope to improve the model’s performance and broaden the range of scenarios in which it can be used. Ms. Shooter noted that 3D poses carry considerably more information than 2D photographs.
The University of Surrey’s work shows how virtual simulation can stand in for real-world data that is difficult to collect. If the approach transfers reliably to real animals, combining simulation, AI, and real-world images could benefit fields from wildlife conservation to entertainment.