# LegoGPT

## Installation

### Prerequisites
Running stability analysis requires a Gurobi licence. Academics may request a free licence from the Gurobi website.
### Installing as a standalone project

This repo uses the Python project manager uv. To install this repo as a standalone project:
- Clone the repo: `git clone "git@github.com:AvaLovelace1/LegoGPT.git" && cd LegoGPT`.
- Install the following dependencies required for rendering LEGO visualizations:
    - Install the ImportLDraw submodule with `git submodule init && git submodule update`.
    - Some files in the ImportLDraw submodule are stored using Git LFS. To download these files, install Git LFS, `cd` into the ImportLDraw directory, and run `git lfs pull`.
    - Download the LDraw parts library `complete.zip` from here, and extract it in your home directory.
- Finally, install uv. A virtual environment will be created, and the remaining dependencies installed automatically, upon invoking `uv run [SCRIPT_NAME]`.
### Installing as a package

To install this repo as a package in your own Python project, run

```sh
uv add "git+ssh://git@github.com/AvaLovelace1/LegoGPT.git"
```

if using uv, or

```sh
pip install "git+ssh://git@github.com/AvaLovelace1/LegoGPT.git"
```

if using pip.
## Running inference interactively

You can run inference with the fine-tuned LegoGPT model using:

```sh
uv run infer --model_name_or_path MODEL_PATH
```
This script starts an interactive session where you can input a prompt and get a response from the model. See `uv run infer -h` for a full list of options.
### Example interaction

Here is an example interaction using the `infer` script:
```
uv run infer --model_name_or_path '/data/apun/finetuned_hf/LegoGPT'
Enter a prompt, or <Return> to exit: Table featuring a flat rectangular surface over four evenly spaced legs.
Enter a filename to save the output image (default=output.png): output.png
Enter a generation seed (default=42): 42
Generating...
Set parameter Username
Academic license - for non-commercial use only - expires 2026-02-19
--------------------
Finished generating in 63.53s.
Total # bricks: 59
Total # brick rejections: 98
Brick rejection reasons: {'collision': 5, 'already_rejected': 93}
Total # regenerations: 4
Saved results to /home/apun/LegoGPT/output.txt, /home/apun/LegoGPT/output.ldr, and /home/apun/LegoGPT/output.png
--------------------
Enter another prompt, or <Return> to exit:
```
Three output files are created: `output.png`, `output.txt`, and `output.ldr`.

`output.png` contains a rendered image of the generated LEGO structure.

`output.txt` contains the LEGO structure in brick-by-brick text format, where each line of the form `hxw (x,y,z)` represents a LEGO brick of height `h` and width `w` at position `(x,y,z)`:
```
1x2 (16,18,0)
1x2 (16,13,0)
2x2 (0,18,0)
2x2 (0,13,0)
1x2 (16,18,1)
[...]
```
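This text format is easy to post-process. As a minimal sketch (the `parse_bricks` helper and its regex are illustrative, not part of the LegoGPT codebase), each line can be parsed into integer tuples like so:

```python
import re

# Matches one line of LegoGPT's brick-by-brick format: "hxw (x,y,z)".
BRICK_RE = re.compile(r"(\d+)x(\d+)\s+\((\d+),(\d+),(\d+)\)")

def parse_bricks(text):
    """Return a list of (h, w, x, y, z) integer tuples, one per brick line."""
    bricks = []
    for line in text.strip().splitlines():
        m = BRICK_RE.fullmatch(line.strip())
        if m is None:
            raise ValueError(f"unrecognised brick line: {line!r}")
        bricks.append(tuple(int(g) for g in m.groups()))
    return bricks

sample = """1x2 (16,18,0)
2x2 (0,13,0)"""
print(parse_bricks(sample))  # [(1, 2, 16, 18, 0), (2, 2, 0, 13, 0)]
```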
Finally, `output.ldr` contains the LEGO structure in LDraw format, which can be opened with any LDraw-compatible software.
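To give a feel for the LDraw side, here is a hypothetical converter from a parsed brick to an LDraw type-1 line. The LDraw facts are standard (1 stud = 20 LDU, 1 brick height = 24 LDU, a type-1 line is `1 <colour> x y z a..i <part>.dat`, and 3004/3003 are the official 1x2/2x2 brick parts), but the axis mapping, colour code, and part table are assumptions of this sketch, not LegoGPT's actual conversion:

```python
# Assumed mapping from (h, w) footprint to official LDraw part files.
PART_IDS = {(1, 2): "3004.dat", (2, 2): "3003.dat"}

def brick_to_ldr_line(h, w, x, y, z, colour=14):
    """Emit one LDraw type-1 line for a brick at grid position (x, y, z).

    Assumes x/y are horizontal stud coordinates and z is the layer index;
    LDraw's -Y axis points up, so layers scale by -24 LDU. Uses an
    identity rotation matrix (no brick rotation).
    """
    part = PART_IDS[(h, w)]
    return f"1 {colour} {x * 20} {-z * 24} {y * 20} 1 0 0 0 1 0 0 0 1 {part}"

print(brick_to_ldr_line(1, 2, 16, 18, 0))
# 1 14 320 0 360 1 0 0 0 1 0 0 0 1 3004.dat
```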
## Fine-tuning LegoGPT

We use Hugging Face TRL with Accelerate for fine-tuning. To run fine-tuning, follow these instructions:
- Start with a LEGO dataset with the fields "caption" and "lego". The "caption" field should contain a description of the LEGO model, and the "lego" field should contain the corresponding LEGO model, in the text format described in the paper.
- Prepare the dataset for fine-tuning with `uv run prepare_finetuning_dataset --input_path LEGO_DATASET_PATH --output_path FINETUNING_DATASET_PATH`.
- Download the pretrained Llama-3.2-1B-Instruct model to some directory `[PRETRAINED_DIR]`. IMPORTANT: Replace the `config.json`, `special_tokens_map.json`, and `tokenizer_config.json` files with the ones in the `finetuning_config_files` directory. This specifies the `pad_token` to be different from the `eos_token`, fixing a fine-tuning issue where the model will not learn to output EOS tokens properly.
- Initialize the Accelerate config file with `uv run accelerate config`.
- Run fine-tuning with `uv run ./finetune.zsh [PRETRAINED_DIR] [OUTPUT_DIR] [RUN_NAME] [FINETUNING_DATASET_PATH]`. The fine-tuned model will be saved to `[OUTPUT_DIR]/[RUN_NAME]`.
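For reference, a single dataset record with the two documented fields might look like the following. This assumes a JSON-lines layout; the fields "caption" and "lego" are from the instructions above, but the on-disk container format is an assumption of this sketch:

```python
import json

# One hypothetical record: a caption paired with its LEGO model in the
# brick-by-brick text format described in the paper.
record = {
    "caption": "Table featuring a flat rectangular surface over four evenly spaced legs.",
    "lego": "1x2 (16,18,0)\n1x2 (16,13,0)\n2x2 (0,18,0)\n2x2 (0,13,0)",
}

# In a JSONL dataset, each record would occupy one serialized line.
line = json.dumps(record)
print(json.loads(line)["caption"])
```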