Revolutionizing 3D Model Generation Using Diffusion-Based Generative AI
Abstract— The emergence of generative AI has introduced a new approach to 3D modeling: the automatic synthesis of complex models from plain text prompts. Traditional 3D modeling is labor-intensive and demands a great deal of knowledge and time. This study presents a generative AI-based system that applies diffusion models to produce high-quality 3D models with minimal manual effort. The system takes natural language input, so users can create detailed, textured 3D models without mastering specialized technical skills. It is a multi-faceted tool that can serve industries such as gaming, architecture, and animation, because it is easy to operate and widely accessible. The purpose of this paper is to simplify the creative process of 3D model generation using diffusion models.

Keywords— Generative AI, 3D Modeling, Diffusion Models, Creative Automation, AI in Design, Text-to-3D

I. INTRODUCTION
Recent advances in artificial intelligence have opened up many possibilities, including the generation of 3D models from text descriptions. This technology bridges the gap between NLP and computer graphics to create realistic 3D models from text input. Despite these improvements, 3D object modeling still faces many problems, chiefly low accuracy, a lack of visual integrity, and severe constraints when many different materials are involved. Most manual design and scanning procedures are time-consuming and labor-intensive. Recent advances in deep learning have improved these techniques through more powerful generative modeling. However, models such as GANs and VAEs still have problems: they rely on prior knowledge, have limited text understanding, and struggle to create 3D representations. Diffusion models have shown strong promise in image generation, and they work by transforming noise into a structured output such as a 3D model. In this work, we apply a diffusion model to text-to-3D generation in order to obtain a sound model that captures the nuances of the text better. Our approach promises a substantial improvement over existing models, since it enhances the visual quality of 3D models and is compatible with annotations. The study also presents a comparative analysis and describes the features of the proposed approach. This work further discusses the challenges involved and how AI insights could help mitigate them, enabling optimizations of a 3D model.

II. LITERATURE SURVEY
Reference  Authors             Model/Method                      Datasets                            Accuracy
[1]        Chen et al.         IT3D                              ShapeNet, ModelNet                  85%
[2]        Canfes et al.       3D Avatar Generation & Editing    3D Avatar Dataset, Mixamo           90%
[3]        Raj et al.          DreamBooth3D                      COCO, Flickr30K                     87%
[4]        Chen et al.         Text-to-Texture Synthesis         Textures from OpenImages            82%
[5]        Li et al.           DreamFont3D                       Google Fonts, custom Font Dataset   88%
[6]        Gorbatsevich et al. GAN-based Terrain Modeling        Terrain Dataset, OpenTopography     83%
[7]        Dundar et al.       Gradual Learning in 3D Networks   Synthetic 3D Shapes Dataset         86%
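As a point of reference for the diffusion models discussed above, the following is a minimal sketch of the standard noising step that such models are trained to invert, applied here to a toy 3D point set. It is not the system described in this paper: the linear schedule, the function names (`make_alpha_bar`, `add_noise`, `estimate_clean`), and the random points are illustrative assumptions, and the true noise is used in place of a trained network's prediction.

```python
import math
import random


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (assumed); returns cumulative products alpha_bar[t]."""
    alpha_bar, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def add_noise(points, t, alpha_bar, rng):
    """Forward process: x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps, with a = alpha_bar[t]."""
    a = alpha_bar[t]
    eps = [[rng.gauss(0.0, 1.0) for _ in p] for p in points]
    noisy = [[math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(p, n)]
             for p, n in zip(points, eps)]
    return noisy, eps


def estimate_clean(noisy, eps_pred, t, alpha_bar):
    """Invert the noising equation given a prediction of the noise."""
    a = alpha_bar[t]
    return [[(x - math.sqrt(1.0 - a) * e) / math.sqrt(a) for x, e in zip(p, n)]
            for p, n in zip(noisy, eps_pred)]


rng = random.Random(0)
# Toy stand-in for a 3D shape: eight random points in the unit cube (hypothetical data).
points = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(8)]
alpha_bar = make_alpha_bar()
noisy, eps = add_noise(points, 500, alpha_bar, rng)
# A trained denoiser would predict eps from (noisy, t, text prompt); here the true noise is reused.
recovered = estimate_clean(noisy, eps, 500, alpha_bar)
```

In a text-to-3D setting, the noise prediction would come from a network conditioned on the prompt, and generation would repeat the denoising over many steps starting from pure noise.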
Authorized licensed use limited to: Zhejiang University. Downloaded on July 06,2025 at 08:06:51 UTC from IEEE Xplore. Restrictions apply.
2024 Eighth International Conference on Parallel, Distributed and Grid Computing (PDGC)
[6] Gorbatsevich, Vladimir, Mikhail Melnichenko, and Oleg Vygolov. "Enhancing detail of 3D terrain models using GAN." In Modeling Aspects in Optical Metrology VII, vol. 11057, pp. 296-302. SPIE, 2019.
[7] Dundar, Aysegul, Jun Gao, Andrew Tao, and Bryan Catanzaro. "Progressive learning of 3d reconstruction network from 2d gan data." IEEE Transactions on Pattern Analysis and Machine Intelligence (2023).
[8] Liu, Jerry, Fisher Yu, and Thomas Funkhouser. "Interactive 3D modeling with a generative adversarial network." In 2017 International Conference on 3D Vision (3DV), pp. 126-134. IEEE, 2017.
[9] Wang, Xiaolong, and Abhinav Gupta. "Generative image modeling using style and structure adversarial networks." In European conference on computer vision, pp. 318-335. Cham: Springer International Publishing, 2016.
[10] Zdziebko, Paweł, and Krzysztof Holak. "Synthetic image generation using the finite element method and blender graphics program for modeling of vision-based measurement systems." Sensors 21, no. 18 (2021): 6046.
[11] Khan, Sallar, Sallar Channa, Syed Abbas Ali, Muhammad Haaris Khan, Arhum Hayat Qazi, and Kamran Mengal. "3D Modeling for Wildlife Encyclopedia Using Blender." 3C Tecnología. Glosas de innovación aplicadas a la pyme. Special Issue, November 2019 (2019): 133-147.
[12] Kuzina, Valentina, and Alexander Koshev. "3D Modelling of Construction Objects Based on the Integrated AutoCAD System." In IOP Conference Series: Materials Science and Engineering, vol. 960, no. 3, p. 032040. IOP Publishing, 2020.
[13] Atieh, Abd Alrahman. "Auto Generate CAD Drawings From Text Descriptions: TEXT-TO-CAD." (2024).
[14] Li, Canlin, Chao Yin, Jiajie Lu, and Lizhuang Ma. "Automatic 3D scene generation based on Maya." In 2009 IEEE 10th International Conference on Computer-Aided Industrial Design & Conceptual Design, pp. 981-985. IEEE, 2009.
[15] Malah, Mehdi, Ramzi Agaba, and Fayçal Abbas. "Generating 3D Reconstructions Using Generative Models." In Applications of Generative AI, pp. 403-419. Cham: Springer International Publishing, 2024.
[16] Wijmans, Johannes G., and Richard W. Baker. "The solution-diffusion model: a review." Journal of membrane science 107, no. 1-2 (1995): 1-21.
[17] Ho, Cheng-Ju, Chen-Hsuan Tai, Yen-Yu Lin, Ming-Hsuan Yang, and Yi-Hsuan Tsai. "Diffusion-ss3d: Diffusion model for semi-supervised 3d object detection." Advances in Neural Information Processing Systems 36 (2023): 49100-49112.
[18] Waibel, Dominik JE, Ernst Röell, Bastian Rieck, Raja Giryes, and Carsten Marr. "A diffusion model predicts 3d shapes from 2d microscopy images." In 2023 IEEE 20th International Symposium on Biomedical Imaging (ISBI), pp. 1-5. IEEE, 2023.
[19] Karnewar, Animesh, Andrea Vedaldi, David Novotny, and Niloy J. Mitra. "Holodiffusion: Training a 3d diffusion model using 2d images." In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 18423-18433. 2023.
[20] Yang, Haibo, Yang Chen, Yingwei Pan, Ting Yao, Zhineng Chen, Zuxuan Wu, Yu-Gang Jiang, and Tao Mei. "DreamMesh: Jointly Manipulating and Texturing Triangle Meshes for Text-to-3D Generation." arXiv preprint arXiv:2409.07454 (2024).
[21] Selvarajan, S. "A comprehensive study on modern optimization techniques for engineering applications." Artif Intell Rev 57, 194 (2024).
[22] Swathi, A., Sandeep Kumar, Shilpa Rani, Abhishek Jain, and Ramakrishna Kumar MVNM. "Emotion Classification using Feature Extraction of Facial Expression." In 2022 2nd International Conference on Technological Advancements in Computational Sciences (ICTACS), pp. 283-288. IEEE, 2022.
[23] Gowroju, Swathi, and Sandeep Kumar. "Robust pupil segmentation using UNET and morphological image processing." In 2021 International Mobile, Intelligent, and Ubiquitous Computing Conference (MIUCC), pp. 105-109. IEEE, 2021.
[24] Gowroju, Swathi, and Sandeep Kumar. "Robust deep learning technique: U-net architecture for pupil segmentation." In 2020 11th IEEE Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), pp. 0609-0613. IEEE, 2020.
[25] Gowroju, Swathi, K. Sravani, N. Santhosh Ramchandar, D. Sai Kamesh, and J. Nasrasimha Murthy. "Robust Indian Currency Recognition Using Deep Learning." In Advanced Informatics for Computing Research: 4th International Conference, ICAICR 2020, Gurugram, India, December 26–27, 2020, Revised Selected Papers, Part I 4, pp. 477-486. Springer Singapore, 2021.
[26] Swathi, A., and Shilpa Rani. "Intelligent fatigue detection by using ACS and by avoiding false alarms of fatigue detection." In Innovations in Computer Science and Engineering: Proceedings of the Sixth ICICSE 2018, pp. 225-233. Springer Singapore, 2019.
[27] Gowroju, Swathi, and Sandeep Kumar. "Robust deep learning technique: U-net architecture for pupil segmentation." In 2020 11th IEEE Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), pp. 0609-0613. IEEE, 2020.