mirror of https://github.com/getml/sqlgen.git synced 2026-01-06 09:30:07 -06:00

Files

Dr. Patrick Urbanke (劉自成) 775f15babf Added aggregations and GROUP BY (#17 )

2025-06-08 12:30:59 +02:00

7.2 KiB

Raw Blame History

`sqlgen::group_by` and Aggregations

The sqlgen::group_by interface provides a type-safe way to group and aggregate data in SQL queries. It supports various aggregation functions like count, avg, sum, min, max, and count_distinct.

Usage

Basic Aggregations

Perform aggregations without grouping:

using namespace sqlgen;

struct Children {
    int num_children;
    int num_last_names;
    double avg_age;
    double max_age;
    double min_age;
    double sum_age;
};

const auto get_children = select_from<Person>(
    avg("age"_c).as<"avg_age">(),
    count().as<"num_children">(),
    max("age"_c).as<"max_age">(),
    min("age"_c).as<"min_age">(),
    sum("age"_c).as<"sum_age">(),
    count_distinct("last_name"_c).as<"num_last_names">()
) | where("age"_c < 18) | to<Children>;

const auto children = sqlite::connect()
    .and_then(get_children)
    .value();

This generates the following SQL:

SELECT 
    AVG("age") as "avg_age",
    COUNT(*) as "num_children",
    MAX("age") as "max_age",
    MIN("age") as "min_age",
    SUM("age") as "sum_age",
    COUNT(DISTINCT "last_name") as "num_last_names"
FROM "Person"
WHERE "age" < 18;

Group By with Aggregations

Group data and perform aggregations:

using namespace sqlgen;

struct Children {
    std::string last_name;
    int num_children;
    int num_last_names;
    double avg_age;
    double max_age;
    double min_age;
    double sum_age;
};

const auto get_children = select_from<Person>(
    "last_name"_c,
    avg("age"_c).as<"avg_age">(),
    count().as<"num_children">(),
    max("age"_c).as<"max_age">(),
    min("age"_c).as<"min_age">(),
    sum("age"_c).as<"sum_age">(),
    count_distinct("last_name"_c).as<"num_last_names">()
) | where("age"_c < 18) | group_by("last_name"_c) | to<std::vector<Children>>;

const auto children = sqlite::connect()
    .and_then(drop<Person> | if_exists)
    .and_then(write(std::ref(people)))
    .and_then(get_children)
    .value();

This generates the following SQL:

SELECT 
    "last_name",
    AVG("age") as "avg_age",
    COUNT(*) as "num_children",
    MAX("age") as "max_age",
    MIN("age") as "min_age",
    SUM("age") as "sum_age",
    COUNT(DISTINCT "last_name") as "num_last_names"
FROM "Person"
WHERE "age" < 18
GROUP BY "last_name";

Available Aggregation Functions

The following aggregation functions are available:

count() - Counts all rows
count_distinct(col) - Counts distinct values in a column
avg(col) - Calculates average of a column
sum(col) - Calculates sum of a column
max(col) - Finds maximum value in a column
min(col) - Finds minimum value in a column

Each aggregation function can be aliased using the .as<"alias">() syntax.

Aggregating Nullable Values

When working with nullable columns (e.g., std::optional<T>), it's important to understand how aggregations handle null values:

The result of aggregating a nullable column must also be nullable (e.g., std::optional<double>)
Null values are excluded from aggregation calculations
If all values in a group are null, the aggregation result will be null

Example with Nullable Values

struct Person {
    sqlgen::PrimaryKey<uint32_t> id;
    std::string first_name;
    std::string last_name;
    std::optional<double> age;  // Nullable age field
};

struct Children {
    int num_children;           // Non-nullable count
    int num_last_names;        // Non-nullable count
    std::optional<double> avg_age;  // Nullable aggregation
    std::optional<double> max_age;  // Nullable aggregation
    std::optional<double> min_age;  // Nullable aggregation
    std::optional<double> sum_age;  // Nullable aggregation
};

const auto get_children = select_from<Person>(
    avg("age"_c).as<"avg_age">(),      // Results in std::optional<double>
    count().as<"num_children">(),      // Results in int (non-nullable)
    max("age"_c).as<"max_age">(),      // Results in std::optional<double>
    min("age"_c).as<"min_age">(),      // Results in std::optional<double>
    sum("age"_c).as<"sum_age">(),      // Results in std::optional<double>
    count_distinct("last_name"_c).as<"num_last_names">()
) | where("age"_c < 18 or "age"_c.is_null()) | to<Children>;

In this example:

The age column is nullable (std::optional<double>)
All aggregations of age result in nullable values (std::optional<double>)
The count() and count_distinct() functions return non-nullable integers
The where clause includes both non-null values less than 18 and null values
If all values in a group are null, the aggregation results (avg, max, min, sum) will be null

Important Notes

Always use nullable types (std::optional<T>) for aggregated values when the source column is nullable
Count operations (count(), count_distinct()) always return non-nullable integers
Null values are excluded from aggregation calculations (avg, sum, max, min)
When all values in a group are null, the aggregation result will be null
The type system will enforce these rules at compile time

The `.as` Operator

The .as operator allows you to alias columns and expressions in your SQL queries. This is particularly important when mapping query results to C++ structs, as the field names in your struct must match the aliases used in the query.

Usage

You can use .as in several ways:

// Method 1: Using the .as() member function
"field1"_c.as<"field">()

// Method 2: Using the pipe operator with as
"nullable"_c | as<"nullable_field">

// Method 3: Using as directly with values
as<"one">(1)
"hello" | as<"hello">

Important Notes

Field names in your result struct must exactly match the aliases used in the query
Fields are matched by name, not by order
You can alias any column, expression, or literal value
The alias must be a string literal (using the as<"alias"> syntax)

Example

struct Result {
    std::string field;           // Matches "field1"_c.as<"field">()
    double avg_field2;           // Matches avg("field2"_c).as<"avg_field2">()
    std::optional<std::string> nullable_field;  // Matches "nullable"_c | as<"nullable_field">
    int one;                     // Matches as<"one">(1)
    std::string hello;           // Matches "hello" | as<"hello">
};

const auto query = select_from<TestTable>(
    "field1"_c.as<"field">(),
    avg("field2"_c).as<"avg_field2">(),
    "nullable"_c | as<"nullable_field">,
    as<"one">(1),
    "hello" | as<"hello">
) | to<Result>;

In this example, each field in the Result struct must have a name that matches one of the aliases used in the query. The order of the fields in the struct doesn't matter, but the names must match exactly.

Notes

The group_by clause must be used before order_by or limit clauses
You cannot use group_by multiple times in the same query
Aggregation functions can be used with or without group_by
The result type must match the structure of your select statement. Note that fields are matched by name, not order.
All aggregations are type-safe and will map to appropriate SQL types
The Result<...> type provides error handling; use .value() to extract the result (will throw an exception if there's an error) or handle errors as needed

7.2 KiB Raw Blame History

sqlgen::group_by and Aggregations

Usage

Basic Aggregations

Group By with Aggregations

Available Aggregation Functions

Aggregating Nullable Values

Example with Nullable Values

Important Notes

The .as Operator

Usage

Important Notes

Example

Notes

7.2 KiB

Raw Blame History

`sqlgen::group_by` and Aggregations

The `.as` Operator