Architecture & Urban Research Institute


Architectural Institute of Korea | Journal, August 2025

Title A Generation Method and Evaluation of Architectural Facade Design Using Stable Diffusion with LoRA and ControlNet
Authors Park, Jungmin; Hong, Soonmin; Choo, Seungyeon
Publisher Architectural Institute of Korea
Published in Journal of the Architectural Institute of Korea, Vol.41 No.8 (2025-08)
Pages start page 85, 12 pages in total
ISSN 2733-6247
Subject classification Planning and Design / Theory
Keywords Stable Diffusion; Architecture massing; Facade Design; Generative AI; LoRA; ControlNet
Abstract 1 (translated from Korean) This study proposes a new method for generating architectural facade images by combining Stable Diffusion with LoRA and ControlNet. Standard Stable Diffusion is limited in how well it expresses architectural elements and materials; to address this, the study applies architecture-specific training using LoRA together with ControlNet's Canny Edge and Depth Map control. The generated images were analyzed through CLIP evaluation and GPT-4V-based evaluation. The study suggests that the method can raise efficiency in the early architectural design process and contribute to the automation of facade design.
Abstract 2 This study proposes a novel approach for generating architectural facade images by combining the Stable Diffusion model with Low-Rank
Adaptation (LoRA) and ControlNet. The standard Stable Diffusion model faces limitations in accurately reflecting architectural elements and
material characteristics, which are critical in the design process. To address these challenges, this research integrates domain-specific
fine-tuning using LoRA and precise shape control through ControlNet. LoRA allows the model to effectively learn architectural styles and
details, ensuring better representation of essential design elements such as windows, balconies, and facade materials. Meanwhile, ControlNet
utilizes Canny Edge and Depth Map information to enhance shape accuracy and spatial consistency, enabling more reliable image generation.
The generated images were evaluated through Contrastive Language-Image Pretraining (CLIP) scores for quantitative analysis and
GPT-4V-based qualitative evaluation, providing a more comprehensive understanding of architectural coherence and visual fidelity. The
GPT-4V assessment offered insights into spatial relationships, contextual relevance, and material expression that are not easily captured
through traditional metrics. This combined approach reduces the repetitive manual adjustments commonly required in text-prompt-based image
generation and facilitates a more intuitive and efficient design process during the early stages of architectural planning. By improving control
over detailed architectural features, the proposed method contributes to the automation of facade design, offering significant potential for
real-world applications in architectural design and visualization. Future research will focus on expanding the dataset to include diverse
architectural styles and validating its practical application in design and construction.
Holding institution Architectural Institute of Korea
Language Korean
DOI https://doi.org/10.5659/JAIK.2025.41.8.85
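
Note on the CLIP-based evaluation mentioned in the abstract: the record does not state the exact formula the authors used. The sketch below only illustrates one common formulation, in which an image embedding and a text-prompt embedding are compared by cosine similarity, negative values are clipped to zero, and the result is scaled. The function names, the `scale` default of 100, and the toy vectors are assumptions made for illustration; in practice the embeddings would come from a CLIP image encoder and text encoder rather than being written by hand.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def clip_style_score(
    image_emb: Sequence[float],
    text_emb: Sequence[float],
    scale: float = 100.0,  # assumed scaling factor, not taken from the paper
) -> float:
    """Scaled cosine similarity, with negative similarities clipped to zero."""
    return scale * max(cosine_similarity(image_emb, text_emb), 0.0)


if __name__ == "__main__":
    # Toy 2-D vectors standing in for real CLIP embeddings (hypothetical values).
    facade_image_emb = [3.0, 4.0]
    prompt_emb = [4.0, 3.0]
    print(clip_style_score(facade_image_emb, prompt_emb))
```

A higher score means the generated facade image sits closer to its text prompt in the shared embedding space. It says nothing about the spatial or material qualities that the abstract assigns to the GPT-4V-based qualitative evaluation.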