Reference
Contents
Index
Base.merge
TulipaIO.as_table
TulipaIO.create_tbl
TulipaIO.create_tbl
TulipaIO.create_tbl
TulipaIO.create_tbl
TulipaIO.flow_from_json
TulipaIO.flow_from_json_impl!
TulipaIO.get_table
TulipaIO.get_tbl_name
TulipaIO.json_get
TulipaIO.read_csv_folder
TulipaIO.read_esdl_json
TulipaIO.reduce_unless
TulipaIO.resolve!
TulipaIO.show_tables
Base.merge — Method

merge(args...)

Given a set of structs, merge them and return a single struct. Fields are merged when they are equal or nothing. Anything else raises an error with a summary of the fields with conflicting values.
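To illustrate the rule, here is a minimal sketch (not TulipaIO's implementation; the AssetLike struct and its fields are made up):

# Sketch of the documented merge rule on a hypothetical two-field struct.
struct AssetLike
    capacity::Union{Float64,Nothing}
    efficiency::Union{Float64,Nothing}
end

# Keep a field when the two values agree, take the non-nothing one otherwise,
# and raise an error on a genuine conflict.
pick(a, b) = a === nothing ? b : (b === nothing || a == b ? a : error("conflicting values: $a vs $b"))

merge_sketch(x::AssetLike, y::AssetLike) =
    AssetLike(pick(x.capacity, y.capacity), pick(x.efficiency, y.efficiency))

merge_sketch(AssetLike(100.0, nothing), AssetLike(100.0, 0.9))  # AssetLike(100.0, 0.9)
# merge_sketch(AssetLike(100.0, nothing), AssetLike(50.0, 0.9)) raises an error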
TulipaIO.as_table — Method

as_table(op::Function, con::DB, name::String, args...)

Temporarily "import" a Julia object into a DuckDB session. It does this by first creating a DataFrame; args... are passed on to the DataFrame constructor as is. The DataFrame is registered with the DuckDB connection con as the table name. This function can be used with a do-block like this:
using DuckDB: DBInterface, DB, query
using DataFrames: DataFrame
con = DBInterface.connect(DB)
as_table(con, "mytbl", (;col=collect(1:5))) do con, name
query(con, "SELECT col, col+2 as 'shift_2' FROM '$name'")
end |> DataFrame
# output
5×2 DataFrame
 Row │ col     shift_2
     │ Int64?  Int64?
─────┼─────────────────
   1 │      1        3
   2 │      2        4
   3 │      3        5
   4 │      4        6
   5 │      5        7
TulipaIO.create_tbl — Method

create_tbl(
    con::DB,
    base_source::String,
    alt_source::String;
    on::Vector{Symbol},
    cols::Vector{Symbol},
    name::String = "",
    fill::Bool = true,
    fill_values::Union{Missing,Dict} = missing,
    tmp::Bool = false,
    show::Bool = false,
)
Create a table from two sources. The first is used as the base, and the second provides alternative values via a LEFT JOIN, i.e. all rows in the base source are retained.

Either source can be a table in DuckDB or a file source, as in the single source variant.

The resulting table is saved as the table name. The name of the created table is returned. The behaviour of tmp and show is identical to the single source variant.

The LEFT JOIN is performed on the columns specified by on. The set of columns picked from the alternative source after the join is specified by cols.

If the alternative source has a subset of rows, the default behaviour is to back-fill the corresponding values from the base table. If this is not desired, fill can be set to false; in that case those entries will be missing values.

To fill with an alternative value instead, set fill_values to a dictionary where the keys are column names and the values are the corresponding fill values. For any columns not present in the dictionary, it falls back to back-filling.

TODO: In the future an "error" option would also be supported, to fail loudly when the number of rows does not match between the base and alternative source.
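A usage sketch (file and column names here are hypothetical, chosen only to show the call shape):

using DuckDB: DBInterface, DB
using TulipaIO: create_tbl

con = DBInterface.connect(DB)

# Keep all rows of the (hypothetical) base file, but take capacity from the
# alternative file wherever the name column matches; unmatched rows are back-filled.
create_tbl(
    con,
    "assets.csv",
    "assets-alternative.csv";
    on = [:name],
    cols = [:capacity],
    name = "assets_merged",
)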
TulipaIO.create_tbl — Method

create_tbl(
    con::DB,
    source::String;
    name::String = "",
    tmp::Bool = false,
    show::Bool = false,
    types = Dict(),
)
Create a table from a file source (CSV, Parquet, line delimited JSON, etc.).

The resulting table is saved as the table name. The name of the created table is returned.

Optionally, if show is true, the table is returned as a Julia DataFrame. This can be useful for interactive debugging in the Julia REPL.

It is also possible to create the table as a temporary table by setting the tmp flag, i.e. the table is session scoped; it is deleted when you close the connection with DuckDB.

When show is false and name was not provided, a table name is automatically generated from the basename of the filename.

To enforce the data type of a column, you can provide the keyword argument types as a dictionary with column names as keys and the corresponding DuckDB types as values.
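A usage sketch (the file name is hypothetical, and the exact key and value types accepted by types are an assumption based on the description above):

using DuckDB: DBInterface, DB
using TulipaIO: create_tbl

con = DBInterface.connect(DB)

# Load a (hypothetical) CSV into a session-scoped table, pinning one column's type.
tbl_name = create_tbl(con, "assets.csv"; name = "assets", tmp = true,
                      types = Dict("capacity" => "DOUBLE"))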
TulipaIO.create_tbl — Method

create_tbl(
    con::DB,
    source::String,
    cols::Dict{Symbol, T};
    on::Symbol,
    name::String = "",
    where_::String = "",
    tmp::Bool = false,
    show::Bool = false,
) where T
Create a table from a source (either a DuckDB table or a file), where a column can be set to the value provided by the dictionary cols. The keys are the column names and the values are the column entries. Note that, unlike the vector variant of this function, all entries in a column are set to the same value.

All other options and behaviour are the same as the vector variant of this function.
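A usage sketch (table and column names are hypothetical): every row of the year column in the result gets the same value.

# con is a DuckDB connection as in the earlier sketches; "assets" is assumed to
# be an existing table (or file) with a name column.
create_tbl(con, "assets", Dict(:year => 2030);
           on = :name, name = "assets_2030")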
TulipaIO.create_tbl — Method

create_tbl(
    con::DB,
    source::String,
    cols::Dict{Symbol,Vector{T}};
    on::Symbol,
    name::String,
    tmp::Bool = false,
    show::Bool = false,
) where T <: Union{Int64, Float64, String, Bool}
Create a table from a source (either a DuckDB table or a file), where columns can be set to the vectors provided in the dictionary cols. The keys are the new column names, and the vector values are the column entries. This transform is very similar to the two-source create_tbl, except that the alternative source is a data structure in Julia.

The resulting table is saved as the table name. The name of the created table is returned.

All other options behave as in the two-source version of create_tbl.
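A usage sketch (names are hypothetical; the vector length is assumed to match the number of rows after the join):

# Replace the capacity column of a (hypothetical) three-row "assets" table with
# values supplied from Julia, joining on the name column.
create_tbl(con, "assets", Dict(:capacity => [10.0, 20.0, 30.0]);
           on = :name, name = "assets_new_capacity")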
TulipaIO.flow_from_json — Method

flow_from_json(json)

Returns an array of from/to node names from a JSON document (as parsed by JSON3.jl):

[(from_name, to_name, Asset(...)), (..., ..., ...), ...]
TulipaIO.flow_from_json_impl! — Method

flow_from_json_impl!(json, flows; find_edge)

Find all flows (from/to node names) from a JSON document.

json: JSON document
flows: the flows are returned by appending to this vector
find_edge: function invoked as find_edge(asset::JSON3.Object) to find the flows originating from an asset
TulipaIO.get_table — Method

df = get_table(connection, table_name)
query = get_table(Val(:raw), connection, table_name)

Run the SELECT * FROM table_name SQL command.

The Val(:raw) variant returns the raw output from DuckDB; otherwise we construct a DataFrame.
TulipaIO.get_tbl_name — Method

get_tbl_name(source::String, tmp::Bool)

Generate a table name from a filename by removing special characters. If tmp is true, the table name is prefixed with 't_'.
TulipaIO.json_get — Method

json_get(json, reference; trunc = 0)

Given a JSON document, find the object pointed to by the reference (e.g. "//@<key>.<array_idx>/@<key>"); truncate the last trunc components of the reference.
TulipaIO.read_csv_folder — Method

read_csv_folder(connection, folder)

Read all CSV files in the folder and create a table for each in the connection.

Keyword arguments:

table_name_prefix = ""
table_name_suffix = ""
schemas = Dict(): a dictionary of dictionaries, where the inner dictionary is a table schema (partial schemas are allowed). The keys of the outer dictionary are the table names.
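A usage sketch (the folder, table, and column names are hypothetical; the exact key and value types of the schema dictionaries are an assumption based on the description above):

using DuckDB: DBInterface, DB
using TulipaIO: read_csv_folder

con = DBInterface.connect(DB)

# Read every CSV file in ./data, pinning one column type of the (hypothetical)
# assets table; the remaining columns are inferred.
read_csv_folder(con, "data";
    schemas = Dict("assets" => Dict("capacity" => "DOUBLE")),
)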
TulipaIO.read_esdl_json — Method

read_esdl_json(json_path)

This is the entry point for the parser. It reads the ESDL JSON file at json_path and returns an array of from/to node names, along with a struct of the Asset type. The Asset attribute values are determined by combining the attribute values of the from and to ESDL asset nodes. If the two nodes have conflicting attribute values, an error is raised:

[(from_name, to_name, Asset(...)), (..., ..., ...), ...]
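A usage sketch (the file path is hypothetical):

using TulipaIO: read_esdl_json

# Parse a (hypothetical) ESDL JSON file and list the flows it describes.
flows = read_esdl_json("energy_system.esdl.json")
for (from, to, asset) in flows
    println(from, " -> ", to)
end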
TulipaIO.reduce_unless — Method

reduce_unless(fn, itr; init, sentinel)

A version of reduce that stops if the reduction returns sentinel at any point.

fn: reduction function
itr: iterator to reduce
init: initial value (unlike the standard reduce, this is mandatory)
sentinel: stop if the reduction returns sentinel

Returns the reduced value, or sentinel.
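A minimal sketch of the described semantics (not TulipaIO's implementation), followed by a hypothetical call:

# Fold itr with fn, but stop as soon as the accumulator becomes the sentinel.
function reduce_unless_sketch(fn, itr; init, sentinel)
    acc = init
    for x in itr
        acc = fn(acc, x)
        isequal(acc, sentinel) && return sentinel
    end
    return acc
end

# Sum values, giving up at the first nothing:
reduce_unless_sketch((acc, x) -> x === nothing ? missing : acc + x,
                     [1, 2, nothing, 4]; init = 0, sentinel = missing)  # missing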
TulipaIO.resolve! — Method

resolve!(field, values, errs)

Given a set of values, ensure they are either all equal or nothing. On failure, push field to errs.

field: the field name to push to errs on failure
values: values to check
errs: vector of field names with errors

Returns the resolved value.
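A minimal sketch of the described behaviour (not TulipaIO's implementation; the field name is hypothetical):

# Keep the unique non-nothing value; on a conflict, record the field name in errs.
function resolve_sketch(field, values, errs)
    distinct = unique(v for v in values if v !== nothing)
    if length(distinct) > 1
        push!(errs, field)
        return nothing
    end
    return isempty(distinct) ? nothing : first(distinct)
end

errs = Symbol[]
resolve_sketch(:capacity, [100.0, nothing, 100.0], errs)  # 100.0, errs unchanged
resolve_sketch(:capacity, [100.0, 50.0], errs)            # nothing; :capacity pushed to errs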
TulipaIO.show_tables — Method

df = show_tables(connection)
query = show_tables(Val(:raw), connection)

Run the SHOW TABLES SQL command.

The Val(:raw) variant returns the raw output from DuckDB; otherwise we construct a DataFrame.