audio2photoreal: Synthesizing Photorealistic Codec Avatars from Audio
This repository profile is provided by osrepos.com, an open source repository discovery platform.
Summary
audio2photoreal is a GitHub repository from Facebook Research that provides code and a dataset for generating photorealistic Codec Avatars driven by audio input. The project synthesizes human embodiment in conversations and includes tools for training, testing, and running pretrained models.
Introduction
The audio2photoreal repository from Facebook Research is a PyTorch implementation for synthesizing photorealistic Codec Avatars directly from audio input. It provides the code and dataset needed to generate human embodiment in conversational settings, along with tools for training, testing, and running pretrained models. Researchers and developers can use it to explore applications in computer graphics, AI, and virtual reality.
Installation
To get started with audio2photoreal, follow these steps for a quick setup and demo run. Ensure you have CUDA 11.7 and gcc/g++ 9.0 for PyTorch3D compatibility.
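Before installing, it can help to confirm which compiler and CUDA toolkit versions are on the PATH. The following is a minimal sketch, not part of the repository: the helper names `parse_version` and `tool_version` are hypothetical, and it assumes that `gcc`, `g++`, and `nvcc` each print a dotted version number when called with `--version`.

```python
import re
import shutil
import subprocess


def parse_version(text):
    """Return the first dotted version number in text as a tuple of ints, or None."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups() if part is not None)


def tool_version(command, flag="--version"):
    """Run `command flag` and parse its version; return None if the tool is missing."""
    path = shutil.which(command)
    if path is None:
        return None
    result = subprocess.run([path, flag], capture_output=True, text=True, check=False)
    return parse_version(result.stdout + result.stderr)


if __name__ == "__main__":
    # Report what is installed; comparing against CUDA 11.7 and gcc/g++ 9.0 is left to the reader.
    for tool in ("gcc", "g++", "nvcc"):
        version = tool_version(tool)
        if version is None:
            print(f"{tool}: not found or version not recognized")
        else:
            print(f"{tool}: {'.'.join(str(n) for n in version)}")
```

The script only reports versions. It does not decide whether a given version is compatible with PyTorch3D.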
First, create a Conda environment and install the necessary components, which include environment configuration, rendering assets, prerequisite models, and pretrained models:
conda create --name a2p_env python=3.9
conda activate a2p_env
sh demo/install.sh
Once the installation is complete, you can run the interactive demo:
python -m demo.demo
This demo allows you to record audio and then render corresponding photorealistic videos.
Examples
The audio2photoreal project enables the generation of photorealistic avatars from audio. You can experiment with the provided demo or delve into generating face and body movements separately.
A quick way to experience the project is through its interactive demo, where you can record an audio clip and generate a video of a photorealistic avatar speaking and moving in sync with your voice.
For more advanced usage, you can generate face codes and body poses independently using the pretrained models. For instance, to generate face codes for a participant like PXB184:
python -m sample.generate --model_path checkpoints/diffusion/c1_face/model000155000.pt --num_samples 10 --num_repetitions 5 --timestep_respacing ddim500 --guidance_param 10.0
After generating face codes, you can then generate body poses, optionally combining them for a full photorealistic avatar visualization:
python -m sample.generate --model_path checkpoints/diffusion/c1_pose/model000340000.pt --resume_trans checkpoints/guide/c1_pose/checkpoints/iter-0100000.pt --num_samples 10 --num_repetitions 5 --timestep_respacing ddim500 --guidance_param 2.0 --face_codes ./checkpoints/diffusion/c1_face/samples_c1_face_000155000_seed10_/results.npy --pose_codes ./checkpoints/diffusion/c1_pose/samples_c1_pose_000340000_seed10_guide_iter-0100000.pt/results.npy --plot
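When running these commands repeatedly with different checkpoints, it may be convenient to assemble the argument list in Python. The sketch below is an illustration only: `build_generate_command` is a hypothetical helper, not part of the repository, and it uses only the flags that appear in the two commands above. It does not check that a given combination of flags is accepted by `sample.generate`.

```python
import shlex


def build_generate_command(model_path, guidance_param, num_samples=10,
                           num_repetitions=5, timestep_respacing="ddim500",
                           resume_trans=None, face_codes=None,
                           pose_codes=None, plot=False):
    """Assemble a `python -m sample.generate` argument list from the flags shown above."""
    cmd = ["python", "-m", "sample.generate", "--model_path", model_path]
    if resume_trans is not None:
        cmd += ["--resume_trans", resume_trans]
    cmd += [
        "--num_samples", str(num_samples),
        "--num_repetitions", str(num_repetitions),
        "--timestep_respacing", timestep_respacing,
        "--guidance_param", str(guidance_param),
    ]
    if face_codes is not None:
        cmd += ["--face_codes", face_codes]
    if pose_codes is not None:
        cmd += ["--pose_codes", pose_codes]
    if plot:
        cmd.append("--plot")
    return cmd


# Rebuild the face-code command from the example above and print it as a shell line.
face_cmd = build_generate_command(
    "checkpoints/diffusion/c1_face/model000155000.pt", guidance_param=10.0
)
print(shlex.join(face_cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root once the checkpoints are in place.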
For an immediate hands-on experience without local setup, try the official Colab demo.
Why Use audio2photoreal?
audio2photoreal provides a framework for synthesizing human embodiment from audio, which is useful in several ways:
- Research Foundation: It gives researchers in computer vision, graphics, and AI a working baseline to build on when advancing the state of the art in digital human creation.
- Realistic Digital Humans: The project's ability to create highly convincing avatars driven by speech has implications for virtual assistants, realistic video conferencing, and immersive virtual reality experiences.
- Comprehensive Toolkit: With train and test code, pretrained models, and access to a dataset, it provides a complete ecosystem for both experimentation and development.
- Open-Source Contribution: As a Facebook Research project, it contributes valuable open-source resources to the community, fostering innovation in the field.
Links
- GitHub Repository: https://github.com/facebookresearch/audio2photoreal
- Research Paper: "From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations"
- Colab Demo: Try the interactive demo on Google Colab