LegoGPT
This is the official repository for LegoGPT, the first approach for generating physically stable LEGO brick models from text prompts.
Installation
Prerequisites: Gurobi
Running stability analysis requires a Gurobi licence to use Gurobi. Academics may request a free licence from the Gurobi website here.
Installing as a standalone project
This repo uses the Python project manager uv. To install this repo as a standalone project, first install all prerequisites. Then,
- Clone the repo:
git clone "https://github.com/AvaLovelace1/LegoGPT.git" && cd LegoGPT. - (Optional, required for running the
inferscript) Follow these instructions to install ImportLDraw and Blender, required for rendering LEGO visualizations:- Install the ImportLDraw submodule with
git submodule update --init.- Some files in the ImportLDraw submodule are stored using the Git LFS system. To download these files,
install Git LFS,
cdinto the ImportLDraw directory, and rungit lfs pull.
- Some files in the ImportLDraw submodule are stored using the Git LFS system. To download these files,
install Git LFS,
- Download the LDraw parts library
complete.zipfrom here, and extract it in your home directory. - Install Blender.
- Install the ImportLDraw submodule with
- Finally, install uv. A virtual environment will be
created, and the remaining dependencies installed automatically, upon invoking
uv run [SCRIPT_NAME].
Installing as a package
To install this repo as a package in your own Python project, first install all prerequisites. Then, run
uv add "https://github.com/AvaLovelace1/LegoGPT.git"
if using uv, or
pip install "https://github.com/AvaLovelace1/LegoGPT.git"
if using pip.
Running inference interactively
You can run inference with the fine-tuned LegoGPT model using:
uv run infer
This script starts an interactive session where you can input a prompt and get a response from the model. The model weights will automatically be downloaded from Hugging Face; they can be found here.
If you wish to run inference with a different set of model weights, specify them using the --model_name_or_path
option. See uv run infer -h for a full list of options.
Example interaction
Here is an example interaction using the infer script:
> uv run infer
Enter a prompt, or <Return> to exit: Table featuring a flat rectangular surface over four evenly spaced legs.
Enter a filename to save the output image (default=output.png): output.png
Enter a generation seed (default=42): 42
Generating...
Set parameter Username
Academic license - for non-commercial use only - expires 2026-02-19
--------------------
Finished generating in 63.53s.
Total # bricks: 59
Total # brick rejections: 98
Brick rejection reasons: {'collision': 5, 'already_rejected': 93}
Total # regenerations: 4
Saved results to /home/apun/LegoGPT/output.txt, /home/apun/LegoGPT/output.ldr, and /home/apun/LegoGPT/output.png
--------------------
Enter another prompt, or <Return> to exit:
Three output files are created: output.png, output.txt, and output.ldr.
output.png contains a rendered image of the generated LEGO structure:
output.txt contains the LEGO structure in brick-by-brick text format, where each line of the form hxw (x,y,z)
represents a LEGO brick of height h and width w at position (x,y,z):
1x2 (16,18,0)
1x2 (16,13,0)
2x2 (0,18,0)
2x2 (0,13,0)
1x2 (16,18,1)
[...]
And finally, output.ldr contains the LEGO structure in LDraw format, which can be opened with any LDraw-compatible
software.
Running fine-tuning
LegoGPT was created by fine-tuning Llama-3.2-1B-Instruct on the custom LEGO dataset StableText2Lego, converted into instructional format. We used Hugging Face TRL with Accelerate for fine-tuning.
To replicate the fine-tuning process, follow these instructions:
- Prepare the LEGO dataset for fine-tuning with
uv run prepare_finetuning_dataset --input_path cmu-gil/StableText2Lego --output_path [FINETUNING_DATASET_PATH]. This converts the dataset into the instructional format required for fine-tuning LLaMA.- If you wish to run fine-tuning with your own LEGO dataset, replace
cmu-gil/StableText2Legowith the path to your dataset. This dataset should have the fields "captions" and "lego". The "lego" field should contain a LEGO structure in the text format described in the paper, and the "captions" field should contain a list of one or more descriptions of the LEGO structure.
- If you wish to run fine-tuning with your own LEGO dataset, replace
- Download the pretrained Llama-3.2-1B-Instruct model to
some directory
[PRETRAINED_DIR]. IMPORTANT: Replace theconfig.json,special_tokens_map.json, andtokenizer_config.jsonfiles with the ones in thefinetuning_config_filesdirectory. This specifies thepad_tokento be different from theeos_token, fixing a fine-tuning issue where the model will not learn to output EOS tokens properly. - Initialize the Accelerate config file with
uv run accelerate config. - Run fine-tuning with
uv run ./finetune.zsh [PRETRAINED_DIR] [OUTPUT_DIR] [RUN_NAME] [FINETUNING_DATASET_PATH]. The fine-tuned model will be saved to[OUTPUT_DIR]/[RUN_NAME].