Weekly Challenges

Pilsner · ‎08-02-2025

My solution:

RM_MOHANTY · ‎08-02-2025

LEGO Dataset Explanations

To identify a LEGO model based on an inventory ID, we are using four related datasets. Each one plays a unique role in helping us connect the inventory to the final LEGO set information, such as name, year, number of parts, and the theme it belongs to.

Dataset 1 – Inventory Sets

Column Name Description

id	Unique inventory ID for a specific LEGO box or set of bricks
version	Version of the inventory (not used in our joins)
set_num	LEGO set number that this inventory ID belongs to

Purpose:
This dataset tells us which LEGO set each inventory ID refers to.

Example: Inventory ID 11148 belongs to set_num = 75911-1.

Dataset 2 – Sets

Column Name Description

set_num	LEGO set number (primary key)
name	Name of the LEGO set
year	Year the set was released
theme_id	Reference to the theme or collection the set belongs to

Purpose:
This dataset provides the actual details of each LEGO set, including its name, release year, and the theme it belongs to (through theme_id).

Dataset 3 – Set Parts Info

Column Name Description

model_num	Same as set_num — identifies the LEGO set
num_parts	Total number of parts in the set

Purpose:
This dataset tells us how many bricks or parts are in each LEGO set.
We join it using set_num = model_num.

Dataset 4 – Themes

Column Name Description

id	Unique identifier for the LEGO theme
name	Name of the theme (e.g., "Star Wars", "Technic", "City")
parent_id	Used for nested themes (not used here)

Purpose:
This dataset gives us the theme name for each theme_id.
We use it to convert theme IDs into readable theme labels.

Summary of Data Flow

Start with Dataset 1 (inventory ID)
Join with Dataset 2 to get set details (name, year, theme ID)
Join with Dataset 3 to get number of parts
Join with Dataset 4 to get the theme name

Ajay_Singha · ‎08-02-2025

Loved it!!

Weekly Challenges

IDEAS WANTED

Challenge #365: LEGO® My Data!