Reference
Contents
Index
Base.merge
TulipaIO.as_table
TulipaIO.create_tbl
TulipaIO.create_tbl
TulipaIO.create_tbl
TulipaIO.create_tbl
TulipaIO.flow_from_json
TulipaIO.flow_from_json_impl!
TulipaIO.get_table
TulipaIO.get_tbl_name
TulipaIO.json_get
TulipaIO.read_csv_folder
TulipaIO.read_esdl_json
TulipaIO.reduce_unless
TulipaIO.resolve!
TulipaIO.show_tables
Base.merge — Method

merge(args...)

Given a set of structs, merge them and return a single struct. Fields are merged when they are equal or nothing. Anything else raises an error with a summary of the fields with conflicting values.
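To illustrate the rule, here is a minimal sketch (not TulipaIO's implementation; the AssetLike struct and its fields are made up):

# Sketch of the documented merge rule on a hypothetical two-field struct.
struct AssetLike
    capacity::Union{Float64,Nothing}
    efficiency::Union{Float64,Nothing}
end

# Keep a field when the two values agree, take the non-nothing one otherwise,
# and raise an error on a genuine conflict.
pick(a, b) = a === nothing ? b : (b === nothing || a == b ? a : error("conflicting values: $a vs $b"))

merge_sketch(x::AssetLike, y::AssetLike) =
    AssetLike(pick(x.capacity, y.capacity), pick(x.efficiency, y.efficiency))

merge_sketch(AssetLike(100.0, nothing), AssetLike(100.0, 0.9))  # AssetLike(100.0, 0.9)
# merge_sketch(AssetLike(100.0, nothing), AssetLike(50.0, 0.9)) raises an error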
TulipaIO.as_table — Method

as_table(op::Function, con::DB, name::String, args...)

Temporarily "import" a Julia object into a DuckDB session. It does this by first creating a DataFrame; args... are passed on to the DataFrame constructor as is. The DataFrame is registered with the DuckDB connection con as the table name. This function can be used with a do-block like this:
using DuckDB: DBInterface, DB, query
using DataFrames: DataFrame
con = DBInterface.connect(DB)
as_table(con, "mytbl", (;col=collect(1:5))) do con, name
query(con, "SELECT col, col+2 as 'shift_2' FROM '$name'")
end |> DataFrame
# output
5×2 DataFrame
 Row │ col     shift_2
     │ Int64?  Int64?
─────┼─────────────────
   1 │      1        3
   2 │      2        4
   3 │      3        5
   4 │      4        6
   5 │      5        7
TulipaIO.create_tbl — Method

create_tbl(
    con::DB,
    base_source::String,
    alt_source::String;
    on::Vector{Symbol},
    cols::Vector{Symbol},
    name::String = "",
    fill::Bool = true,
    fill_values::Union{Missing,Dict} = missing,
    tmp::Bool = false,
    show::Bool = false,
)
Create a table from two sources. The first is used as the base, and the second provides alternative values via a LEFT JOIN, i.e. all rows in the base source are retained.

Either source can be a table in DuckDB or a file source, as in the single source variant.

The resulting table is saved as the table name. The name of the created table is returned. The behaviour of tmp and show is identical to the single source variant.

The LEFT JOIN is performed on the columns specified by on. The set of columns picked from the alternative source after the join is specified by cols.

If the alternative source has a subset of rows, the default behaviour is to back-fill the corresponding values from the base table. If this is not desired, fill can be set to false; in that case those entries will be missing values.

To fill with an alternative value instead, set fill_values to a dictionary where the keys are column names and the values are the corresponding fill values. For any columns not present in the dictionary, it falls back to back-filling.

TODO: In the future an "error" option would also be supported, to fail loudly when the number of rows does not match between the base and alternative source.
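A usage sketch (file and column names here are hypothetical, chosen only to show the call shape):

using DuckDB: DBInterface, DB
using TulipaIO: create_tbl

con = DBInterface.connect(DB)

# Keep all rows of the (hypothetical) base file, but take capacity from the
# alternative file wherever the name column matches; unmatched rows are back-filled.
create_tbl(
    con,
    "assets.csv",
    "assets-alternative.csv";
    on = [:name],
    cols = [:capacity],
    name = "assets_merged",
)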
TulipaIO.create_tbl — Method

create_tbl(
    con::DB,
    source::String;
    name::String = "",
    tmp::Bool = false,
    show::Bool = false,
    types = Dict(),
)
Create a table from a file source (CSV, Parquet, line delimited JSON, etc.).

The resulting table is saved as the table name. The name of the created table is returned.

Optionally, if show is true, the table is returned as a Julia DataFrame. This can be useful for interactive debugging in the Julia REPL.

It is also possible to create the table as a temporary table by setting the tmp flag, i.e. the table is session scoped; it is deleted when you close the connection with DuckDB.

When show is false and name was not provided, a table name is automatically generated from the basename of the filename.

To enforce the data type of a column, you can provide the keyword argument types as a dictionary with column names as keys and the corresponding DuckDB types as values.
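A usage sketch (the file name is hypothetical, and the exact key and value types accepted by types are an assumption based on the description above):

using DuckDB: DBInterface, DB
using TulipaIO: create_tbl

con = DBInterface.connect(DB)

# Load a (hypothetical) CSV into a session-scoped table, pinning one column's type.
tbl_name = create_tbl(con, "assets.csv"; name = "assets", tmp = true,
                      types = Dict("capacity" => "DOUBLE"))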
TulipaIO.create_tbl — Method

create_tbl(
    con::DB,
    source::String,
    cols::Dict{Symbol, T};
    on::Symbol,
    name::String = "",
    where_::String = "",
    tmp::Bool = false,
    show::Bool = false,
) where T
Create a table from a source (either a DuckDB table or a file), where a column can be set to the value provided by the dictionary cols. The keys are the column names and the values are the column entries. Note that, unlike the vector variant of this function, all entries in a column are set to the same value.

All other options and behaviour are the same as the vector variant of this function.
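A usage sketch (table and column names are hypothetical): every row of the year column in the result gets the same value.

# con is a DuckDB connection as in the earlier sketches; "assets" is assumed to
# be an existing table (or file) with a name column.
create_tbl(con, "assets", Dict(:year => 2030);
           on = :name, name = "assets_2030")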
TulipaIO.create_tbl — Method

create_tbl(
    con::DB,
    source::String,
    cols::Dict{Symbol,Vector{T}};
    on::Symbol,
    name::String,
    tmp::Bool = false,
    show::Bool = false,
) where T <: Union{Int64, Float64, String, Bool}
Create a table from a source (either a DuckDB table or a file), where columns can be set to the vectors provided in the dictionary cols. The keys are the new column names, and the vector values are the column entries. This transform is very similar to the two-source create_tbl, except that the alternative source is a data structure in Julia.

The resulting table is saved as the table name. The name of the created table is returned.

All other options behave as in the two-source version of create_tbl.
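A usage sketch (names are hypothetical; the vector length is assumed to match the number of rows after the join):

# Replace the capacity column of a (hypothetical) three-row "assets" table with
# values supplied from Julia, joining on the name column.
create_tbl(con, "assets", Dict(:capacity => [10.0, 20.0, 30.0]);
           on = :name, name = "assets_new_capacity")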
TulipaIO.flow_from_json — Method

flow_from_json(json)

Returns an array of from/to node names from a JSON document (as parsed by JSON3.jl):

[(from_name, to_name, Asset(...)), (..., ..., ...), ...]
TulipaIO.flow_from_json_impl! — Method

flow_from_json_impl!(json, flows; find_edge)

Find all flows (from/to node names) from a JSON document.

json: JSON document
flows: the flows are returned by appending to this vector
find_edge: function invoked as find_edge(asset::JSON3.Object) to find the flows originating from an asset
TulipaIO.get_table — Method

df = get_table(connection, table_name)
query = get_table(Val(:raw), connection, table_name)

Run the SELECT * FROM table_name SQL command.

The Val(:raw) variant returns the raw output from DuckDB; otherwise we construct a DataFrame.
TulipaIO.get_tbl_name — Method

get_tbl_name(source::String, tmp::Bool)

Generate a table name from a filename by removing special characters. If tmp is true, the table name is prefixed with 't_'.
TulipaIO.json_get — Method

json_get(json, reference; trunc = 0)

Given a JSON document, find the object pointed to by the reference (e.g. "//@<key>.<array_idx>/@<key>"); truncate the last trunc components of the reference.
TulipaIO.read_csv_folder — Method

read_csv_folder(connection, folder)

Read all CSV files in the folder and create a table for each in the connection.

Keyword arguments:

table_name_prefix = ""
table_name_suffix = ""
schemas = Dict(): a dictionary of dictionaries, where the inner dictionary is a table schema (partial schemas are allowed). The keys of the outer dictionary are the table names.
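A usage sketch (the folder, table, and column names are hypothetical; the exact key and value types of the schema dictionaries are an assumption based on the description above):

using DuckDB: DBInterface, DB
using TulipaIO: read_csv_folder

con = DBInterface.connect(DB)

# Read every CSV file in ./data, pinning one column type of the (hypothetical)
# assets table; the remaining columns are inferred.
read_csv_folder(con, "data";
    schemas = Dict("assets" => Dict("capacity" => "DOUBLE")),
)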
TulipaIO.read_esdl_json — Method

read_esdl_json(json_path)

This is the entry point for the parser. It reads the ESDL JSON file at json_path and returns an array of from/to node names, along with a struct of the Asset type. The Asset attribute values are determined by combining the attribute values of the from and to ESDL asset nodes. If the two nodes have conflicting attribute values, an error is raised:

[(from_name, to_name, Asset(...)), (..., ..., ...), ...]
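A usage sketch (the file path is hypothetical):

using TulipaIO: read_esdl_json

# Parse a (hypothetical) ESDL JSON file and list the flows it describes.
flows = read_esdl_json("energy_system.esdl.json")
for (from, to, asset) in flows
    println(from, " -> ", to)
end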
TulipaIO.reduce_unless — Method

reduce_unless(fn, itr; init, sentinel)

A version of reduce that stops if the reduction returns sentinel at any point.

fn: reduction function
itr: iterator to reduce
init: initial value (unlike the standard reduce, this is mandatory)
sentinel: stop if the reduction returns sentinel

Returns the reduced value, or sentinel.
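A minimal sketch of the described semantics (not TulipaIO's implementation), followed by a hypothetical call:

# Fold itr with fn, but stop as soon as the accumulator becomes the sentinel.
function reduce_unless_sketch(fn, itr; init, sentinel)
    acc = init
    for x in itr
        acc = fn(acc, x)
        isequal(acc, sentinel) && return sentinel
    end
    return acc
end

# Sum values, giving up at the first nothing:
reduce_unless_sketch((acc, x) -> x === nothing ? missing : acc + x,
                     [1, 2, nothing, 4]; init = 0, sentinel = missing)  # missing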
TulipaIO.resolve! — Method

resolve!(field, values, errs)

Given a set of values, ensure they are either all equal or nothing. On failure, push field to errs.

field: the field name to push to errs on failure
values: values to check
errs: vector of field names with errors

Returns the resolved value.
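A minimal sketch of the described behaviour (not TulipaIO's implementation; the field name is hypothetical):

# Keep the unique non-nothing value; on a conflict, record the field name in errs.
function resolve_sketch(field, values, errs)
    distinct = unique(v for v in values if v !== nothing)
    if length(distinct) > 1
        push!(errs, field)
        return nothing
    end
    return isempty(distinct) ? nothing : first(distinct)
end

errs = Symbol[]
resolve_sketch(:capacity, [100.0, nothing, 100.0], errs)  # 100.0, errs unchanged
resolve_sketch(:capacity, [100.0, 50.0], errs)            # nothing; :capacity pushed to errs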
TulipaIO.show_tables — Method

df = show_tables(connection)
query = show_tables(Val(:raw), connection)

Run the SHOW TABLES SQL command.

The Val(:raw) variant returns the raw output from DuckDB; otherwise we construct a DataFrame.