Reference

TulipaClustering.append_period_from_source_df_as_rp! - Method
append_period_from_source_df_as_rp!(df; source_df, period, rp, key_columns)

Extracts a period with index period from source_df and appends it as a representative period with index rp to df, using key_columns as keys.

Examples

julia> source_df = DataFrame([:period => [1, 1, 2, 2], :time_step => [1, 2, 1, 2], :a .=> "b", :value => 5:8])
4×4 DataFrame
 Row │ period  time_step  a       value
     │ Int64   Int64      String  Int64
─────┼──────────────────────────────────
   1 │      1          1  b           5
   2 │      1          2  b           6
   3 │      2          1  b           7
   4 │      2          2  b           8

julia> df = DataFrame([:rep_period => [1, 1, 2, 2], :time_step => [1, 2, 1, 2], :a .=> "a", :value => 1:4])
4×4 DataFrame
 Row │ rep_period  time_step  a       value
     │ Int64       Int64      String  Int64
─────┼──────────────────────────────────────
   1 │          1          1  a           1
   2 │          1          2  a           2
   3 │          2          1  a           3
   4 │          2          2  a           4

julia> TulipaClustering.append_period_from_source_df_as_rp!(df; source_df, period = 2, rp = 3, key_columns = [:time_step, :a])
6×4 DataFrame
 Row │ rep_period  time_step  a       value
     │ Int64       Int64      String  Int64
─────┼──────────────────────────────────────
   1 │          1          1  a           1
   2 │          1          2  a           2
   3 │          2          1  a           3
   4 │          2          2  a           4
   5 │          3          1  b           7
   6 │          3          2  b           8

TulipaClustering.combine_periods! - Method
combine_periods!(df)

Modifies a dataframe df by combining the columns time_step and period into a single column time_step of global time steps. The period duration is inferred automatically from the maximum time step value, assuming that periods start with time step 1.

Examples

julia> df = DataFrame([:period => [1, 1, 2], :time_step => [1, 2, 1], :value => 1:3])
3×3 DataFrame
 Row │ period  time_step  value
     │ Int64   Int64      Int64
─────┼──────────────────────────
   1 │      1          1      1
   2 │      1          2      2
   3 │      2          1      3

julia> TulipaClustering.combine_periods!(df)
3×2 DataFrame
 Row │ time_step  value
     │ Int64      Int64
─────┼──────────────────
   1 │         1      1
   2 │         2      2
   3 │         3      3
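
The conversion to global time steps can be sketched in plain Julia, independently of the package. The helper name global_time_step is hypothetical, and the formula assumes that every period starts at time step 1 and that the period duration equals the largest time step observed.

```julia
# Hypothetical helper, not part of TulipaClustering: maps (period, time_step)
# to a global time step, assuming each period holds `period_duration` steps.
global_time_step(period, time_step, period_duration) =
    (period - 1) * period_duration + time_step

periods = [1, 1, 2]
time_steps = [1, 2, 1]

# Infer the period duration from the largest time step, as described above.
period_duration = maximum(time_steps)

global_steps = global_time_step.(periods, time_steps, period_duration)
```

With the data from the example above, global_steps reproduces the time_step column of the combined dataframe.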

TulipaClustering.df_to_matrix_and_keys - Method
df_to_matrix_and_keys(df, key_columns)

Converts a dataframe df (in a long format) to a matrix, ignoring the columns specified as key_columns. The key columns are converted from long to wide format and returned alongside the matrix.

Examples

julia> df = DataFrame([:period => [1, 1, 2, 2], :time_step => [1, 2, 1, 2], :a .=> "a", :value => 1:4])
4×4 DataFrame
 Row │ period  time_step  a       value
     │ Int64   Int64      String  Int64
─────┼──────────────────────────────────
   1 │      1          1  a           1
   2 │      1          2  a           2
   3 │      2          1  a           3
   4 │      2          2  a           4

julia> m, k = TulipaClustering.df_to_matrix_and_keys(df, [:time_step, :a]); m
2×2 Matrix{Float64}:
 1.0  3.0
 2.0  4.0

julia> k
2×2 DataFrame
 Row │ time_step  a
     │ Int64      String
─────┼───────────────────
   1 │         1  a
   2 │         2  a
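
The long-to-matrix conversion can be pictured as a column-major reshape, with one column per period. This is only a sketch of the idea: it assumes the rows are already sorted by period and then by key, and the variable names are illustrative.

```julia
# Value column of the long-format table, ordered by period and then by key.
value_column = [1, 2, 3, 4]

# Number of distinct key combinations (rows of the key dataframe).
n_keys = 2

# Each period becomes one column of the matrix.
matrix = reshape(Float64.(value_column), n_keys, :)
```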

TulipaClustering.find_auxiliary_data - Method
find_auxiliary_data(clustering_data)

Calculates auxiliary data associated with the clustering_data. These include:

  • key_columns_demand: key columns in the demand dataframe
  • key_columns_generation_availability: key columns in the generation availability dataframe
  • period_duration: duration of time periods (in time steps)
  • last_period_duration: duration of the last period
  • n_periods: total number of periods
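
As an illustration of the last three quantities, the sketch below derives them from the period and time_step columns of a small made-up table. It assumes periods are numbered from 1 and that only the last period can be shorter; whether the package computes them exactly this way is not confirmed here.

```julia
# Period and time-step columns of a small long-format table (made-up data).
periods = [1, 1, 1, 2, 2, 2, 3, 3]
time_steps = [1, 2, 3, 1, 2, 3, 1, 2]

n_periods = maximum(periods)
period_duration = maximum(time_steps)

# The last period may be shorter than the others.
last_period_duration = maximum(time_steps[periods .== n_periods])
```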

TulipaClustering.find_period_weights - Method
find_period_weights(period_duration, last_period_duration, n_periods, drop_incomplete_periods)

Finds weights of two different types of periods in the clustering data:

  • complete periods: these are all of the periods with length equal to period_duration.
  • incomplete last period: if the duration of the last period is less than period_duration, it is incomplete.
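
The docstring does not spell out how the weights are computed. One plausible rescaling, shown purely as an assumption, makes the complete periods account for the time steps lost when an incomplete last period is dropped. All numbers and variable names are illustrative.

```julia
period_duration = 24
last_period_duration = 12
n_periods = 4

# The last period is incomplete when it is shorter than the others.
is_last_incomplete = last_period_duration < period_duration

# Assumed rescaling: ratio of all time steps to the time steps that remain
# after the incomplete last period is dropped.
total_steps = (n_periods - 1) * period_duration + last_period_duration
kept_steps = (n_periods - 1) * period_duration
rescaled_weight = total_steps / kept_steps
```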

TulipaClustering.find_representative_periods - Method
find_representative_periods(clustering_data; n_rp = 10, rescale_demand_data = true, drop_incomplete_last_period = false, method = :k_means, distance = SqEuclidean(), args...)

Finds representative periods via data clustering.

  • clustering_data: the data to perform clustering on.
  • n_rp: number of representative periods to find.
  • rescale_demand_data: if true, demands are first divided by the maximum demand value, so that they are between zero and one like the generation availability data.
  • drop_incomplete_last_period: controls how the last period is treated if it is incomplete. If set to true, the incomplete period is dropped and the weights are rescaled accordingly; otherwise, clustering is done for n_rp - 1 periods, and the last period is added as a special shorter representative period.
  • method: clustering method to use, either :k_means or :k_medoids.
  • distance: semimetric used to measure distance between data points.
  • other named arguments can be provided; they are passed to the clustering method.
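
The effect of rescale_demand_data can be sketched with a plain vector: dividing by the maximum puts demand on the same zero-to-one scale as availability data. The numbers are made up, and the sketch assumes nonnegative demand.

```julia
# Made-up demand values in arbitrary units.
demand = [120.0, 300.0, 240.0, 60.0]

# Divide by the largest value so that all entries lie between zero and one.
rescaled_demand = demand ./ maximum(demand)
```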

TulipaClustering.fit_rep_period_weights! - Method
fit_rep_period_weights!(clustering_result; weight_type, tol, args...)

Given the initial weight guesses, finds better weights for convex or conical combinations of representative periods. For conical weights, it is possible to bound the total weight by one.

The arguments:

  • clustering_result: the result of running TulipaClustering.find_representative_periods
  • weight_type: the type of weights to find; possible values are:
    • :convex: each period is represented as a convex sum of the representative periods (a sum with nonnegative weights adding up to one)
    • :conical: each period is represented as a conical sum of the representative periods (a sum with nonnegative weights)
    • :conical_bounded: each period is represented as a conical sum of the representative periods (a sum with nonnegative weights) with the total weight bounded from above by one.
  • tol: algorithm's tolerance; when the weights are adjusted by a value less than or equal to tol, they stop being fitted further.
  • other arguments control the projected subgradient method; they are passed through to TulipaClustering.projected_subgradient_descent!.

TulipaClustering.fit_rep_period_weights! - Method
fit_rep_period_weights!(weight_matrix, clustering_matrix, rp_matrix; weight_type, tol, args...)

Given the initial weight guesses, finds better weights for convex or conical combinations of representative periods. For conical weights, it is possible to bound the total weight by one.

The arguments:

  • weight_matrix: the initial guess for weights; the weights are adjusted using a projected subgradient descent method
  • clustering_matrix: the matrix of raw clustering data
  • rp_matrix: the matrix of raw representative period data
  • weight_type: the type of weights to find; possible values are:
    • :convex: each period is represented as a convex sum of the representative periods (a sum with nonnegative weights adding up to one)
    • :conical: each period is represented as a conical sum of the representative periods (a sum with nonnegative weights)
    • :conical_bounded: each period is represented as a conical sum of the representative periods (a sum with nonnegative weights) with the total weight bounded from above by one.
  • tol: algorithm's tolerance; when the weights are adjusted by a value less than or equal to tol, they stop being fitted further.
  • show_progress: if true, a progress bar will be displayed.
  • other arguments control the projected subgradient method; they are passed through to TulipaClustering.projected_subgradient_descent!.
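
Each weight type corresponds to a constraint set, and a projected method needs a projection onto that set. The sketch below gives standard Euclidean projections for the three sets. The function names are hypothetical, and it is an assumption that the package uses these exact routines.

```julia
# :conical weights: projection onto the nonnegative orthant.
project_conical(w) = max.(w, 0.0)

# :convex weights: Euclidean projection onto the probability simplex,
# using the standard sort-based algorithm.
function project_convex(w)
    u = sort(w; rev = true)
    css = cumsum(u)
    # Largest index k such that u[k] + (1 - css[k]) / k > 0.
    k = findlast(i -> u[i] + (1 - css[i]) / i > 0, eachindex(u))
    tau = (css[k] - 1) / k
    return max.(w .- tau, 0.0)
end

# :conical_bounded weights: nonnegative weights whose sum is at most one.
function project_conical_bounded(w)
    p = project_conical(w)
    return sum(p) <= 1 ? p : project_convex(w)
end

convex_weights = project_convex([2.0, 2.0])
bounded_weights = project_conical_bounded([0.2, -0.3])
```

The bounded case reuses the other two: if clipping negative entries already gives a total of at most one, the clipped vector is the projection; otherwise the point is projected onto the simplex.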

TulipaClustering.matrix_and_keys_to_df - Method
matrix_and_keys_to_df(matrix, keys)

Converts the matrix matrix to a dataframe, appending the key columns given by keys.

Examples

julia> m = [1.0 3.0; 2.0 4.0]
2×2 Matrix{Float64}:
 1.0  3.0
 2.0  4.0

julia> k = DataFrame([:time_step => 1:2, :a .=> "a"])
2×2 DataFrame
 Row │ time_step  a
     │ Int64      String
─────┼───────────────────
   1 │         1  a
   2 │         2  a

julia> TulipaClustering.matrix_and_keys_to_df(m, k)
4×4 DataFrame
 Row │ rep_period  time_step  a       value
     │ Int64       Int64      String  Float64
─────┼────────────────────────────────────────
   1 │          1          1  a           1.0
   2 │          1          2  a           2.0
   3 │          2          1  a           3.0
   4 │          2          2  a           4.0

TulipaClustering.projected_subgradient_descent! - Method
projected_subgradient_descent!(x; gradient, projection, niters, rtol, learning_rate, adaptive_grad)

Fits x using the projected subgradient descent scheme.
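
A minimal sketch of a projected descent loop, assuming a fixed step size: take a step against the (sub)gradient, then project back onto the feasible set. The function sketch_projected_descent! is a hypothetical stand-in, not the package's implementation. It borrows the keyword names gradient, projection, niters and learning_rate from the signature, with assumed meanings and made-up defaults, and omits rtol and adaptive_grad.

```julia
# Hypothetical stand-in for illustration only.
function sketch_projected_descent!(x; gradient, projection, niters = 100, learning_rate = 0.1)
    for _ in 1:niters
        # Step against the gradient, then project onto the feasible set.
        x .= projection(x .- learning_rate .* gradient(x))
    end
    return x
end

# Minimize the squared distance to `target` over the nonnegative orthant.
target = [1.0, -2.0]
x = [0.0, 0.0]
sketch_projected_descent!(
    x;
    gradient = v -> 2 .* (v .- target),
    projection = v -> max.(v, 0.0),
)
```

In this toy problem the first coordinate converges to 1, while the second is held at 0 by the projection.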

The arguments:


TulipaClustering.read_csv_with_schema - Method
read_csv_with_schema(file_path, schema)

Reads the CSV file at file_path, validating the data using the schema. It is assumed that the file's header is on the second row. The first row of the file contains metadata that is not used.
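
The file layout described above (metadata row, then header, then data) can be illustrated with a dependency-free parser. The function sketch_read_with_schema is hypothetical and is not how the package reads files. It assumes comma-separated fields without quoting and a schema given as a dictionary from column names to types.

```julia
# Hypothetical helper: parses CSV text whose first row is metadata and whose
# second row is the header, converting each field to the type in `schema`.
function sketch_read_with_schema(io::IO, schema)
    lines = readlines(io)
    header = Symbol.(split(lines[2], ','))  # the first row is skipped
    columns = Dict(name => schema[name][] for name in header)
    for line in lines[3:end]
        isempty(line) && continue
        for (name, field) in zip(header, split(line, ','))
            # `parse` throws if a field does not match the schema type.
            push!(columns[name], parse(schema[name], field))
        end
    end
    return columns
end

io = IOBuffer("""
,MW
time_step,value
1,5.0
2,6.0
""")
columns = sketch_read_with_schema(io, Dict(:time_step => Int, :value => Float64))
```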

TulipaClustering.split_into_periods! - Method
split_into_periods!(df; period_duration=nothing)

Modifies a dataframe df by separating the column time_step into periods of length period_duration. The new data is written into two columns:

  • period: the period ID;
  • time_step: the time step within the current period.

If period_duration is nothing, then all of the time steps are within the same period with index 1.

Examples

julia> df = DataFrame([:time_step => 1:4, :value => 5:8])
4×2 DataFrame
 Row │ time_step  value
     │ Int64      Int64
─────┼──────────────────
   1 │         1      5
   2 │         2      6
   3 │         3      7
   4 │         4      8

julia> TulipaClustering.split_into_periods!(df; period_duration=2)
4×3 DataFrame
 Row │ period  time_step  value
     │ Int64   Int64      Int64
─────┼──────────────────────────
   1 │      1          1      5
   2 │      1          2      6
   3 │      2          1      7
   4 │      2          2      8

julia> df = DataFrame([:period => [1, 1, 2], :time_step => [1, 2, 1], :value => 1:3])
3×3 DataFrame
 Row │ period  time_step  value
     │ Int64   Int64      Int64
─────┼──────────────────────────
   1 │      1          1      1
   2 │      1          2      2
   3 │      2          1      3

julia> TulipaClustering.split_into_periods!(df; period_duration=1)
3×3 DataFrame
 Row │ period  time_step  value
     │ Int64   Int64      Int64
─────┼──────────────────────────
   1 │      1          1      1
   2 │      2          1      2
   3 │      3          1      3

julia> TulipaClustering.split_into_periods!(df)
3×3 DataFrame
 Row │ period  time_step  value
     │ Int64   Int64      Int64
─────┼──────────────────────────
   1 │      1          1      1
   2 │      1          2      2
   3 │      1          3      3
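
The split shown in the first example above amounts to one-based integer division, which Julia's Base functions fld1 and mod1 express directly. This is a sketch of the arithmetic, not the package's code.

```julia
time_steps = 1:4
period_duration = 2

# One-based floor division gives the period ID.
periods = fld1.(time_steps, period_duration)

# One-based remainder gives the time step within the period.
steps_in_period = mod1.(time_steps, period_duration)
```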

TulipaClustering.validate_df_and_find_key_columns - Method
validate_df_and_find_key_columns(df)

Checks that dataframe df contains the necessary columns and returns a list of columns that act as keys (i.e., unique data identifiers within different periods).

Examples

julia> df = DataFrame([:period => [1, 1, 2], :time_step => [1, 2, 1], :a .=> "a", :value => 1:3])
3×4 DataFrame
 Row │ period  time_step  a       value
     │ Int64   Int64      String  Int64
─────┼──────────────────────────────────
   1 │      1          1  a           1
   2 │      1          2  a           2
   3 │      2          1  a           3

julia> TulipaClustering.validate_df_and_find_key_columns(df)
2-element Vector{Symbol}:
 :time_step
 :a

julia> df = DataFrame([:value => 1])
1×1 DataFrame
 Row │ value
     │ Int64
─────┼───────
   1 │     1

julia> TulipaClustering.validate_df_and_find_key_columns(df)
ERROR: DomainError with 1×1 DataFrame
 Row │ value
     │ Int64
─────┼───────
   1 │     1:
DataFrame must contain columns `time_step` and `value`

TulipaClustering.weight_matrix_to_df - Method
weight_matrix_to_df(weights)

Converts a weight matrix from a (sparse) matrix, which is more convenient for internal computations, to a dataframe, which is better for saving into a file. Zero weights are dropped to avoid cluttering the dataframe.
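
The idea can be sketched with the SparseArrays standard library, where findnz lists only the stored entries of a sparse matrix. The field names period, rep_period and weight are assumptions made for illustration, and plain named tuples stand in for dataframe rows.

```julia
using SparseArrays

# Rows are periods, columns are representative periods (made-up weights).
weights = sparse([1, 2, 3, 3], [1, 2, 1, 2], [1.0, 1.0, 0.25, 0.75])

# `findnz` returns the row indices, column indices and values of stored entries.
period, rep_period, weight = findnz(weights)

rows = [
    (period = p, rep_period = r, weight = w) for
    (p, r, w) in zip(period, rep_period, weight)
]
```

Only the four stored entries appear in rows; the two zero weights of the 3×2 matrix are omitted.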

TulipaClustering.write_csv_with_prefixes - Method
write_csv_with_prefixes(file_path, df; prefixes)

Writes the dataframe df into a csv file at file_path. If prefixes are provided, they are written above the column names. For example, these prefixes can contain metadata describing the columns.
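
The resulting file layout can be sketched without any packages. The function sketch_write_with_prefixes is hypothetical, and it assumes that prefixes is a collection of rows, each written as one line above the header.

```julia
# Hypothetical helper: writes optional prefix rows, then the header, then data.
function sketch_write_with_prefixes(io::IO, names, rows; prefixes = nothing)
    if prefixes !== nothing
        for prefix_row in prefixes
            println(io, join(prefix_row, ','))
        end
    end
    println(io, join(names, ','))
    for row in rows
        println(io, join(row, ','))
    end
end

io = IOBuffer()
sketch_write_with_prefixes(
    io,
    ["time_step", "value"],
    [(1, 5.0), (2, 6.0)];
    prefixes = [["", "MW"]],
)
output = String(take!(io))
```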
