Analysis Methods

Analysis Methods work on openoa.plant.PlantData objects to produce high-level analyses, such as the long-term AEP. These methods chain together the more generic utils modules to create reproducible analysis workflows.

All models use mixin classes to provide three additional methods:

  • cls.from_dict(data_dictionary), which creates a class instance from a dictionary of inputs. The dictionary can be shared across workflows if particular settings work better, or used when users don’t wish to use a standard class definition interface.

  • cls.set_values(run_parameter_dictionary), which enables users to set any or all of the analysis parameters that are allowed to be manually set post-initialization (see cls.run_parameters for specific parameter listings per class, or the documentation of cls.run()).

  • cls.reset_defaults(which="""None, a single parameter, or a list of parameters"""), which allows a user to reset all, one, or several of the analysis parameters back to the class defaults.
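The three mixin methods can be pictured with a small stand-in class. This is only a sketch of the behaviour described above, not OpenOA's implementation: ParameterMixin, ToyAnalysis, and the choice to ignore unknown dictionary keys are all assumptions made for illustration.

```python
from dataclasses import dataclass, fields


class ParameterMixin:
    """Hypothetical mixin mimicking the three methods described above."""

    run_parameters = ()  # names that may be changed after initialization

    @classmethod
    def from_dict(cls, data: dict):
        # Assumption for this sketch: keys that are not fields are ignored.
        names = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in names})

    def set_values(self, values: dict) -> None:
        # Only parameters listed in run_parameters may be changed here.
        for name, value in values.items():
            if name not in self.run_parameters:
                raise ValueError(f"{name} cannot be set after initialization")
            setattr(self, name, value)

    def reset_defaults(self, which=None) -> None:
        # None resets every run parameter; a string resets a single one.
        if which is None:
            which = self.run_parameters
        elif isinstance(which, str):
            which = [which]
        defaults = {f.name: f.default for f in fields(self)}
        for name in which:
            setattr(self, name, defaults[name])


@dataclass
class ToyAnalysis(ParameterMixin):
    uncertainty_meter: float = 0.005
    time_resolution: str = "MS"
    run_parameters = ("uncertainty_meter", "time_resolution")


analysis = ToyAnalysis.from_dict({"uncertainty_meter": 0.01, "unused_key": 1})
analysis.set_values({"time_resolution": "D"})
analysis.reset_defaults("uncertainty_meter")
print(analysis)
```

After these calls the meter uncertainty is back at its class default while the time resolution keeps the manually set value.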

class openoa.analysis.aep.MonteCarloAEP(plant, reg_temperature=False, reg_wind_direction=False, reanalysis_products: Sequence | str | int | float | None = None, uncertainty_meter=0.005, uncertainty_losses=0.05, uncertainty_windiness=(10.0, 20.0), uncertainty_loss_max=(10.0, 20.0), outlier_detection=False, uncertainty_outlier=(1.0, 3.0), uncertainty_nan_energy=0.01, time_resolution: str = 'MS', end_date_lt: str | pd.Timestamp = None, reg_model='lin', ml_setup_kwargs={})

A serial (Pandas-driven) implementation of the benchmark PRUF operational analysis. This module collects standard processing and analysis methods for estimating plant-level operational AEP and uncertainty.

The preprocessing should run in this order:
  1. Process revenue meter energy - creates monthly/daily data frame, gets revenue meter on monthly/daily basis, and adds data flag

  2. Process loss estimates - add monthly/daily curtailment and availability losses to monthly/daily data frame

  3. Process reanalysis data - add monthly/daily density-corrected wind speeds, temperature (if used) and wind direction (if used) from several reanalysis products to the monthly/daily data frame

  4. Set up Monte Carlo - create the necessary Monte Carlo inputs to the OA process

  5. Run AEP Monte Carlo - run the OA process iteratively to get distribution of AEP results

The end result is a distribution of AEP results, which we use to assess the expected AEP and its associated uncertainty.
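The idea of turning repeated runs into an AEP distribution can be sketched with the standard library alone. The function names, the single multiplicative meter error per iteration, and the ordinary least-squares fit are assumptions chosen to keep the example short. They are not the sampling scheme or regression used by MonteCarloAEP.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def toy_monte_carlo_aep(wind, energy, long_term_wind, num_sim=200,
                        uncertainty_meter=0.005, seed=0):
    """Return one annual energy estimate per Monte Carlo iteration."""
    rng = random.Random(seed)
    results = []
    for _ in range(num_sim):
        # Assumption: one multiplicative meter error per iteration.
        factor = rng.gauss(1.0, uncertainty_meter)
        sampled = [e * factor for e in energy]
        slope, intercept = fit_line(wind, sampled)
        # Apply the fit to 12 long-term monthly wind speeds and sum them.
        results.append(sum(slope * w + intercept for w in long_term_wind))
    return results


wind = [5.0, 6.0, 7.0, 8.0, 6.5, 5.5]
energy = [2.0 * w for w in wind]  # made-up monthly energy values
aep = toy_monte_carlo_aep(wind, energy, long_term_wind=[6.5] * 12)
aep_mean = statistics.fmean(aep)
aep_std = statistics.stdev(aep)
print(round(aep_mean, 1), round(aep_std, 2))
```

The mean of the resulting list plays the role of the expected AEP and its standard deviation plays the role of the uncertainty.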

Parameters:
  • plant (PlantData) – PlantData object from which PlantAnalysis should draw data.

  • reg_temperature (bool) – Indicator to include temperature (True) or not (False) as a regression input. Defaults to False.

  • reg_wind_direction (bool) – Indicator to include wind direction (True) or not (False) as a regression input. Defaults to False.

  • reanalysis_products (list[str]) – List of reanalysis products to use for Monte Carlo sampling. Defaults to None, which pulls all the products contained in plant.reanalysis.

  • uncertainty_meter (float) – Uncertainty on revenue meter data. Defaults to 0.005.

  • uncertainty_losses (float) – Uncertainty on long-term losses. Defaults to 0.05.

  • uncertainty_windiness (tuple[int, int]) – Number of years to use for the windiness correction. Defaults to (10, 20).

  • uncertainty_loss_max (tuple[int, int]) – Threshold for the combined availability and curtailment monthly losses. Defaults to (10, 20).

  • outlier_detection (bool) – whether to perform (True) or not (False - default) outlier detection filtering. Defaults to False.

  • uncertainty_outlier (tuple[float, float]) – Min and max thresholds (Monte-Carlo sampled) for the outlier detection filter. At monthly resolution, this is the tuning constant for Huber’s t function for a robust linear regression. At daily/hourly resolution, this is the number of stdev of wind speed used as threshold for the bin filter. Defaults to (1, 3).

  • uncertainty_nan_energy (float) – Threshold to flag days/months based on NaNs. Defaults to 0.01.

  • time_resolution (string) – whether to perform the AEP calculation at monthly (“ME” or “MS”), daily (“D”) or hourly (“h”) time resolution. Defaults to “MS”.

  • end_date_lt (string or pandas.Timestamp) – The last date to use for the long-term correction. Note that only the component of the date corresponding to the time_resolution argument is considered. If None, the end of the last complete month of reanalysis data will be used. Defaults to None.

  • reg_model (string) – Which model to use for the regression (“lin” for linear, “gam” for general additive, “gbm” for gradient boosting, or “etr” for extra trees). At monthly time resolution only linear regression is allowed because of the reduced number of data points. Defaults to “lin”.

  • ml_setup_kwargs (kwargs) – Keyword arguments to openoa.utils.machine_learning_setup.MachineLearningSetup class. Defaults to {}.

Method generated by attrs for class MonteCarloAEP.

run(num_sim: int, reg_model: str = None, reanalysis_products: list[str] = None, uncertainty_meter: float = None, uncertainty_losses: float = None, uncertainty_windiness: float | tuple[float, float] = None, uncertainty_loss_max: float | tuple[float, float] = None, outlier_detection: bool = None, uncertainty_outlier: float | tuple[float, float] = None, uncertainty_nan_energy: float = None, time_resolution: str = None, end_date_lt: str | Timestamp | None = None, ml_setup_kwargs: dict = None) -> None

Process all appropriate data and run the MonteCarlo AEP analysis.

Note

If None is provided for any of the inputs, the last used value of that input is used for the analysis. If no prior value was set, the model’s default is used.

Parameters:
  • num_sim (int) – number of simulations to perform

  • reanalysis_products (list[str]) – List of reanalysis products to use for Monte Carlo sampling. Defaults to None, which pulls all the products contained in plant.reanalysis.

  • uncertainty_meter (float) – Uncertainty on revenue meter data. Defaults to 0.005.

  • uncertainty_losses (float) – Uncertainty on long-term losses. Defaults to 0.05.

  • uncertainty_windiness (tuple[int, int]) – Number of years to use for the windiness correction. Defaults to (10, 20).

  • uncertainty_loss_max (tuple[int, int]) – Threshold for the combined availability and curtailment monthly losses. Defaults to (10, 20).

  • outlier_detection (bool) – whether to perform (True) or not (False - default) outlier detection filtering. Defaults to False.

  • uncertainty_outlier (tuple[float, float]) – Min and max thresholds (Monte-Carlo sampled) for the outlier detection filter. At monthly resolution, this is the tuning constant for Huber’s t function for a robust linear regression. At daily/hourly resolution, this is the number of stdev of wind speed used as threshold for the bin filter. Defaults to (1, 3).

  • uncertainty_nan_energy (float) – Threshold to flag days/months based on NaNs. Defaults to 0.01.

  • time_resolution (string) – whether to perform the AEP calculation at monthly (“ME” or “MS”), daily (“D”) or hourly (“h”) time resolution. Defaults to “MS”.

  • end_date_lt (string or pandas.Timestamp) – The last date to use for the long-term correction. Note that only the component of the date corresponding to the time_resolution argument is considered. If None, the end of the last complete month of reanalysis data will be used. Defaults to None.

  • reg_model (string) – Which model to use for the regression (“lin” for linear, “gam” for general additive, “gbm” for gradient boosting, or “etr” for extra trees). At monthly time resolution only linear regression is allowed because of the reduced number of data points. Defaults to “lin”.

  • ml_setup_kwargs (kwargs) – Keyword arguments to openoa.utils.machine_learning_setup.MachineLearningSetup class. Defaults to {}.

Returns:

None

groupby_time_res(df)

Group pandas dataframe based on the time resolution chosen in the calculation.

Parameters:

df (dataframe) – dataframe that needs to be grouped based on time resolution used

Returns:

None

calculate_aggregate_dataframe()

Perform pre-processing of the plant data to produce a monthly/daily data frame to be used in AEP analysis.

process_revenue_meter_energy()
Initial creation of monthly data frame:
  1. Populate monthly/daily data frame with energy data summed from 10-min QC’d data

  2. For each monthly/daily value, find percentage of NaN data used in creating it and flag if percentage is greater than 0
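The two steps above amount to a grouped sum plus a per-group count of missing values. A minimal sketch using plain Python containers follows. The function name, the row layout, and the "greater than 0" flag rule taken literally from step 2 are assumptions for illustration, not the pandas code used internally.

```python
import math
from collections import defaultdict
from datetime import datetime, timedelta


def monthly_energy_with_nan_flag(timestamps, energy):
    """Sum energy per (year, month) and report the share of NaN readings."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    nans = defaultdict(int)
    for ts, value in zip(timestamps, energy):
        key = (ts.year, ts.month)
        counts[key] += 1
        if math.isnan(value):
            nans[key] += 1
        else:
            totals[key] += value
    rows = {}
    for key in counts:
        nan_fraction = nans[key] / counts[key]
        rows[key] = {
            "energy": totals[key],
            "nan_fraction": nan_fraction,
            "nan_flag": nan_fraction > 0,  # flag any month containing NaNs
        }
    return rows


# Six 10-minute readings that straddle the January/February boundary.
start = datetime(2020, 1, 31, 23, 30)
timestamps = [start + timedelta(minutes=10 * i) for i in range(6)]
energy = [1.0, float("nan"), 1.0, 2.0, 2.0, 2.0]
rows = monthly_energy_with_nan_flag(timestamps, energy)
print(rows)
```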

process_loss_estimates()

Append availability and curtailment losses to monthly data frame.

process_reanalysis_data()
Process reanalysis data for use in PRUF plant analysis:
  • calculate density-corrected wind speed and wind components

  • get monthly/daily average wind speeds and components

  • calculate monthly/daily average wind direction

  • calculate monthly/daily average temperature

  • append monthly/daily averages to monthly/daily energy data frame

trim_monthly_df()

Remove first and/or last month of data if the raw data had an incomplete number of days.
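A rough illustration of this trimming rule, assuming the number of days with data is already known for each month. The helper name and the dictionary input are hypothetical, and the real method operates on the aggregate data frame.

```python
import calendar


def trim_incomplete_months(days_observed):
    """Drop the first and/or last month when it has fewer days than the calendar month.

    days_observed maps (year, month) to the number of days with raw data.
    """
    keys = sorted(days_observed)

    def complete(key):
        # calendar.monthrange returns (weekday of first day, days in month).
        return days_observed[key] >= calendar.monthrange(*key)[1]

    if keys and not complete(keys[0]):
        keys = keys[1:]
    if keys and not complete(keys[-1]):
        keys = keys[:-1]
    return keys


kept = trim_incomplete_months(
    {(2020, 1): 20, (2020, 2): 29, (2020, 3): 31, (2020, 4): 12}
)
print(kept)
```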

calculate_long_term_losses()

This function calculates long-term availability and curtailment losses based on the reported data grouped by the time resolution, filtering for those data that are deemed representative of average plant performance.

setup_monte_carlo_inputs()

Create and populate the data frame defining the simulation parameters. This data frame is stored as self.mc_inputs.

filter_outliers(n)

This function filters outliers based on a combination of range filter, unresponsive sensor filter, and window filter.

We use a memoized function to store the regression data in a dictionary for each combination as it comes up in the Monte Carlo simulation. This saves significant computational time by not having to run the robust linear regression for each Monte Carlo iteration.

Parameters:

n (int) – The Monte Carlo iteration number

Returns:

Filtered monthly/daily data ready for linear regression

Return type:

pandas.DataFrame

set_regression_data(n)

This will be called for each iteration of the Monte Carlo simulation and will do the following:

  1. Randomly sample monthly/daily revenue meter, availability, and curtailment data based on specified uncertainties and correlations

  2. Randomly choose one reanalysis product

  3. Calculate gross energy from randomized energy data

  4. Normalize gross energy to 30-day months

  5. Filter results to remove months/days with NaN data and with combined losses that exceed the Monte Carlo sampled max threshold

  6. Return the wind speed and normalized gross energy to be used in the regression relationship

Parameters:

n (int) – The Monte Carlo iteration number

Returns:

Monte-Carlo sampled wind speeds and other variables (temperature, wind direction), if used in the regression, and the Monte-Carlo sampled normalized gross energy (pandas.Series).

Return type:

pandas.Series
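The sampling, gross energy, normalization, and filtering steps listed for this method can be sketched for monthly data as below. Several things here are assumptions made for illustration: the row dictionaries, one multiplicative factor per iteration for meter and loss data, losses expressed as fractions of gross energy so that gross = net / (1 - losses), and a loss threshold given as a fraction. The reanalysis product choice (step 2) is left out.

```python
import calendar
import random


def sample_regression_rows(rows, uncertainty_meter, uncertainty_losses, loss_max, rng):
    """Return (wind speed, 30-day normalized gross energy) pairs for one iteration."""
    meter_factor = rng.gauss(1.0, uncertainty_meter)
    loss_factor = rng.gauss(1.0, uncertainty_losses)
    out = []
    for row in rows:
        if row["nan_flag"]:
            continue  # drop months flagged for missing data
        energy = row["energy"] * meter_factor
        avail = row["availability"] * loss_factor
        curt = row["curtailment"] * loss_factor
        if avail + curt > loss_max:
            continue  # drop months whose combined losses exceed the threshold
        gross = energy / (1.0 - avail - curt)
        days = calendar.monthrange(row["year"], row["month"])[1]
        out.append((row["wind_speed"], gross * 30.0 / days))
    return out


rows = [
    {"year": 2020, "month": 1, "energy": 90.0, "availability": 0.05,
     "curtailment": 0.05, "wind_speed": 7.0, "nan_flag": False},
    {"year": 2020, "month": 2, "energy": 80.0, "availability": 0.02,
     "curtailment": 0.01, "wind_speed": 6.5, "nan_flag": True},
    {"year": 2020, "month": 3, "energy": 60.0, "availability": 0.20,
     "curtailment": 0.10, "wind_speed": 6.0, "nan_flag": False},
]
# Zero uncertainty keeps the example deterministic.
pairs = sample_regression_rows(rows, 0.0, 0.0, loss_max=0.2, rng=random.Random(0))
print(pairs)
```

Only the January row survives: February carries a NaN flag and March exceeds the loss threshold.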

run_regression(n)

Run robust linear regression between Monte-Carlo generated monthly/daily gross energy, wind speed, temperature and wind direction (if used).

Parameters:

n (int) – The Monte Carlo iteration number.

Returns:

A trained regression model.

run_AEP_monte_carlo()

Loop through the OA process a number of times and return an array of AEP results each time.

Returns:

numpy.ndarray – Array of AEP, long-term availability, and long-term curtailment calculations.

sample_long_term_reanalysis()

This function returns the long-term monthly/daily wind speeds based on the Monte-Carlo generated sample of:

  1. The reanalysis product

  2. The number of years to use in the long-term correction

Returns:

the windiness-corrected or ‘long-term’ monthly/daily wind speeds

Return type:

pandas.DataFrame
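One way to picture the two sampled quantities is to draw a product and a number of years, then average each calendar month over that window. The sketch below is an illustration under assumptions: the function names, the dictionary layout, and the product keys are hypothetical, and the actual method returns a pandas.DataFrame.

```python
import random
from collections import defaultdict


def long_term_monthly_means(monthly_wind, end_year, num_years):
    """Average each calendar month over the last num_years years up to end_year."""
    first_year = end_year - num_years + 1
    buckets = defaultdict(list)
    for (year, month), speed in monthly_wind.items():
        if first_year <= year <= end_year:
            buckets[month].append(speed)
    return {month: sum(v) / len(v) for month, v in sorted(buckets.items())}


def sample_long_term_wind(products, end_year, year_bounds, rng):
    """Draw one product and one window length, then compute the monthly means."""
    name = rng.choice(sorted(products))
    num_years = rng.randint(*year_bounds)  # both bounds are inclusive
    return name, num_years, long_term_monthly_means(products[name], end_year, num_years)


# Illustrative product keys and made-up January wind speeds.
products = {
    "era5": {(2018, 1): 6.0, (2019, 1): 7.0, (2020, 1): 8.0},
    "merra2": {(2018, 1): 5.0, (2019, 1): 6.0, (2020, 1): 7.0},
}
last_two = long_term_monthly_means(products["era5"], end_year=2020, num_years=2)
name, num_years, sampled = sample_long_term_wind(products, 2020, (1, 3), random.Random(0))
print(last_two, name, num_years, sampled)
```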

sample_long_term_losses(gross_lt)

This function calculates long-term availability and curtailment losses based on the Monte Carlo sampled historical availability and curtailment data. To estimate long-term losses, average percentage monthly losses are weighted by monthly long-term gross energy.

Parameters:

gross_lt (pandas.Series) – Time series of long-term gross energy

Returns:

Long-term availability loss expressed as a fraction, and long-term curtailment loss expressed as a fraction (float).

Return type:

float
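The weighting described for this method is an energy-weighted mean of the monthly loss fractions. A small worked sketch with made-up numbers follows. The function name and the dictionaries keyed by calendar month are assumptions for illustration.

```python
def weighted_long_term_loss(monthly_loss_fraction, gross_lt):
    """Weight average monthly loss fractions by long-term gross energy."""
    total = sum(gross_lt[m] for m in monthly_loss_fraction)
    weighted = sum(monthly_loss_fraction[m] * gross_lt[m] for m in monthly_loss_fraction)
    return weighted / total


# January: 2 % loss on 300 units; July: 6 % loss on 100 units.
loss = weighted_long_term_loss({1: 0.02, 7: 0.06}, {1: 300.0, 7: 100.0})
print(loss)
```

Because January carries three times the gross energy of July, the result (6 + 6) / 400 sits closer to the January loss than a plain average of the two months would.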

plot_normalized_monthly_reanalysis_windspeed(xlim: tuple[datetime, datetime] = (None, None), ylim: tuple[float, float] = (None, None), return_fig: bool = False, figure_kwargs: dict = {}, plot_kwargs: dict = {}, legend_kwargs: dict = {}) -> None | tuple[Figure, Axes]

Make a plot of the normalized annual average wind speeds from reanalysis data to show general trends for each, and highlighting the period of record for the plant data.

Parameters:
  • aep (openoa.analysis.MonteCarloAEP) – An initialized MonteCarloAEP object.

  • xlim (tuple[datetime.datetime, datetime.datetime], optional) – A tuple of datetimes representing the x-axis plotting display limits. Defaults to (None, None).

  • ylim (tuple[float, float], optional) – A tuple of the y-axis plotting display limits. Defaults to (None, None).

  • return_fig (bool, optional) – Flag to return the figure and axes objects. Defaults to False.

  • figure_kwargs (dict, optional) – Additional figure instantiation keyword arguments that are passed to plt.figure(). Defaults to {}.

  • plot_kwargs (dict, optional) – Additional plotting keyword arguments that are passed to ax.plot(). Defaults to {}.

  • legend_kwargs (dict, optional) – Additional legend keyword arguments that are passed to ax.legend(). Defaults to {}.

Returns:

If return_fig is True, then the figure and axes objects are returned for further tinkering/saving.

Return type:

None | tuple[matplotlib.pyplot.Figure, matplotlib.pyplot.Axes]

plot_reanalysis_gross_energy_data(outlier_threshold: int, xlim: tuple[float, float] = (None, None), ylim: tuple[float, float] = (None, None), return_fig: bool = False, figure_kwargs: dict = {}, plot_kwargs: dict = {}, legend_kwargs: dict = {}) -> None | tuple[Figure, Axes]

Makes a plot of the gross energy vs wind speed for each reanalysis product, with outliers highlighted in a contrasting color and separate marker.

Parameters:
  • reanalysis (dict[str, pandas.DataFrame]) – PlantData.reanalysis dictionary of reanalysis DataFrame.

  • outlier_threshold (float) – Outlier threshold (typical range of 1 to 4) which adjusts outlier sensitivity detection.

  • xlim (tuple[float, float], optional) – A tuple of the x-axis plotting display limits. Defaults to (None, None).

  • ylim (tuple[float, float], optional) – A tuple of the y-axis plotting display limits. Defaults to (None, None).

  • return_fig (bool, optional) – Flag to return the figure and axes objects. Defaults to False.

  • figure_kwargs (dict, optional) – Additional figure instantiation keyword arguments that are passed to plt.figure(). Defaults to {}.

  • plot_kwargs (dict, optional) – Additional plotting keyword arguments that are passed to ax.scatter(). Defaults to {}.

  • legend_kwargs (dict, optional) – Additional legend keyword arguments that are passed to ax.legend(). Defaults to {}.

Returns:

If return_fig is True, then the figure and axes objects are returned for further tinkering/saving.

Return type:

None | tuple[matplotlib.pyplot.Figure, matplotlib.pyplot.Axes]

plot_aggregate_plant_data_timeseries(xlim: tuple[datetime, datetime] = (None, None), ylim_energy: tuple[float, float] = (None, None), ylim_loss: tuple[float, float] = (None, None), return_fig: bool = False, figure_kwargs: dict = {}, plot_kwargs: dict = {}, legend_kwargs: dict = {})

Plot timeseries of monthly/daily gross energy, availability and curtailment.

Parameters:
  • data (pandas.DataFrame) – A pandas DataFrame containing energy production and losses.

  • energy_col (str) – The name of the column in data containing the energy production.

  • loss_cols (list[str]) – The name(s) of the column(s) in data containing the loss data.

  • energy_label (str) – The legend label and y-axis label for the energy plot.

  • loss_labels (list[str]) – The legend labels for the losses plot.

  • xlim (tuple[datetime.datetime, datetime.datetime], optional) – A tuple of datetimes representing the x-axis plotting display limits. Defaults to (None, None).

  • ylim_energy (tuple[float, float], optional) – A tuple of the y-axis plotting display limits for the gross energy plot (top figure). Defaults to (None, None).

  • ylim_loss (tuple[float, float], optional) – A tuple of the y-axis plotting display limits for the loss plot (bottom figure). Defaults to (None, None).

  • return_fig (bool, optional) – Flag to return the figure and axes objects. Defaults to False.

  • figure_kwargs (dict, optional) – Additional figure instantiation keyword arguments that are passed to plt.figure(). Defaults to {}.

  • plot_kwargs (dict, optional) – Additional plotting keyword arguments that are passed to ax.scatter(). Defaults to {}.

  • legend_kwargs (dict, optional) – Additional legend keyword arguments that are passed to ax.legend(). Defaults to {}.

Returns:

If return_fig is True, then the figure and axes objects are returned for further tinkering/saving.

Return type:

None | tuple[matplotlib.pyplot.Figure, tuple[matplotlib.pyplot.Axes, matplotlib.pyplot.Axes]]

plot_result_aep_distributions(xlim_aep: tuple[float, float] = (None, None), xlim_availability: tuple[float, float] = (None, None), xlim_curtail: tuple[float, float] = (None, None), ylim_aep: tuple[float, float] = (None, None), ylim_availability: tuple[float, float] = (None, None), ylim_curtail: tuple[float, float] = (None, None), return_fig: bool = False, figure_kwargs: dict = {}, plot_kwargs: dict = {}, annotate_kwargs: dict = {}) -> None | tuple[Figure, Axes]

Plot a distribution of AEP values from the Monte-Carlo OA method

Parameters:
  • xlim_aep (tuple[float, float], optional) – A tuple of floats representing the x-axis plotting display limits for the AEP subplot. Defaults to (None, None).

  • xlim_availability (tuple[float, float], optional) – A tuple of floats representing the x-axis plotting display limits for the availability subplot. Defaults to (None, None).

  • xlim_curtail (tuple[float, float], optional) – A tuple of floats representing the x-axis plotting display limits for the curtailment subplot. Defaults to (None, None).

  • ylim_aep (tuple[float, float], optional) – A tuple of floats representing the y-axis plotting display limits for the AEP subplot. Defaults to (None, None).

  • ylim_availability (tuple[float, float], optional) – A tuple of floats representing the y-axis plotting display limits for the availability subplot. Defaults to (None, None).

  • ylim_curtail (tuple[float, float], optional) – A tuple of floats representing the y-axis plotting display limits for the curtailment subplot. Defaults to (None, None).

  • return_fig (bool, optional) – Flag to return the figure and axes objects. Defaults to False.

  • figure_kwargs (dict, optional) – Additional figure instantiation keyword arguments that are passed to plt.figure(). Defaults to {}.

  • plot_kwargs (dict, optional) – Additional plotting keyword arguments that are passed to ax.hist(). Defaults to {}.

  • annotate_kwargs (dict, optional) – Additional annotation keyword arguments that are passed to ax.annotate(). Defaults to {}.

Returns:

If return_fig is True, then the figure and axes objects are returned for further tinkering/saving.

Return type:

None | tuple[matplotlib.pyplot.Figure, matplotlib.pyplot.Axes]

plot_aep_boxplot(x: Series, xlabel: str, ylim: tuple[float, float] = (None, None), with_points: bool = False, points_label: str = 'Individual AEP Estimates', return_fig: bool = False, figure_kwargs: dict = {}, plot_kwargs_box: dict = {}, plot_kwargs_points: dict = {}, legend_kwargs: dict = {}) -> None | tuple[Figure, Axes]

Plot box plots of AEP results sliced by a specified Monte Carlo parameter

Parameters:
  • x (pandas.Series) – The data that splits the results in y.

  • xlabel (str) – The x-axis label.

  • ylim (tuple[float, float], optional) – A tuple of the y-axis plotting display limits. Defaults to (None, None).

  • with_points (bool, optional) – Flag to plot the individual points like a seaborn swarmplot. Defaults to False.

  • points_label (str, optional) – Legend label for the points, if plotting. Defaults to “Individual AEP Estimates”.

  • return_fig (bool, optional) – Flag to return the figure and axes objects. Defaults to False.

  • figure_kwargs (dict, optional) – Additional figure instantiation keyword arguments that are passed to plt.figure(). Defaults to {}.

  • plot_kwargs_box (dict, optional) – Additional plotting keyword arguments that are passed to ax.boxplot(). Defaults to {}.

  • plot_kwargs_points (dict, optional) – Additional plotting keyword arguments that are passed to ax.boxplot(). Defaults to {}.

  • legend_kwargs (dict, optional) – Additional legend keyword arguments that are passed to ax.legend(). Defaults to {}.

Returns:

If return_fig is True, then the figure object, axes object, and a dictionary of the boxplot objects are returned for further tinkering/saving.

Return type:

None | tuple[matplotlib.pyplot.Figure, matplotlib.pyplot.Axes, dict]

class openoa.analysis.turbine_long_term_gross_energy.TurbineLongTermGrossEnergy(plant, UQ=True, num_sim=20000, reanalysis_products: Sequence | str | int | float | None = None, uncertainty_scada=0.005, wind_bin_threshold: NDArrayFloat = (1.0, 3.0), max_power_filter: NDArrayFloat = (0.8, 0.9), correction_threshold: NDArrayFloat = (0.85, 0.95))

Calculates long-term gross energy for each turbine in a wind farm using methods implemented in the utils subpackage for data processing and analysis.

The method proceeds as follows:

  1. Filter turbine data for normal operation

  2. Calculate daily means of wind speed, wind direction, and air density from reanalysis products

  3. Calculate daily sums of energy from each turbine

  4. Fit daily data (features are atmospheric variables, response is turbine power) using a generalized additive model (GAM)

  5. Apply model results to long-term atmospheric variables to calculate long-term gross energy for each turbine

A Monte Carlo approach is implemented to obtain a distribution of results, from which uncertainty can be quantified for the long-term gross energy estimate. A pandas DataFrame of long-term gross energy values is produced, containing each turbine in the wind farm. Note that this gross energy metric does not back out losses associated with waking or turbine performance. Rather, gross energy in this context is what the turbine would have produced under normal operation (i.e. excluding downtime and underperformance).

Required schema of PlantData:

  • _scada_freq

  • reanalysis products with columns [‘time’, ‘WMETR_HorWdSpdU’, ‘WMETR_HorWdSpdV’, ‘WMETR_HorWdSpd’, ‘WMETR_AirDen’]

  • scada with columns: [‘time’, ‘asset_id’, ‘WMET_HorWdSpd’, ‘WTUR_W’, ‘WTUR_SupWh’]

Parameters:
  • UQ (bool) – Indicator to perform (True) or not (False) uncertainty quantification.

  • num_sim (int) – Number of simulations to run when UQ is True, otherwise set to 1. Defaults to 20000.

  • uncertainty_scada (float) – Uncertainty imposed on the SCADA data when UQ is True only. Defaults to 0.005.

  • reanalysis_products (list[str]) – List of reanalysis products to use for Monte Carlo sampling. Defaults to None, which pulls all the products contained in plant.reanalysis.

  • wind_bin_threshold (tuple) – The filter threshold for each vertical bin, expressed as number of standard deviations from the median in each bin. When UQ is True, then this should be a tuple of the lower and upper limits of this threshold, otherwise a single value should be used. Defaults to (1.0, 3.0)

  • max_power_filter (tuple) – Maximum power threshold, in the range (0, 1], to which the bin filter should be applied. When UQ is True, then this should be a tuple of the lower and upper limits of this filter, otherwise a single value should be used. Defaults to (0.8, 0.9).

  • correction_threshold (tuple) – The threshold, in the range of (0, 1], above which daily scada energy data should be corrected. When UQ is True, then this should be a tuple of the lower and upper limits of this threshold, otherwise a single value should be used. Defaults to (0.85, 0.95)

Method generated by attrs for class TurbineLongTermGrossEnergy.

run(num_sim: int | None = None, reanalysis_products: list[str] | None = None, uncertainty_scada: float | None = None, wind_bin_threshold: float | tuple[float, float] | None = None, max_power_filter: float | tuple[float, float] | None = None, correction_threshold: float | tuple[float, float] | None = None) -> None

Pre-process the run-specific data settings for each simulation, then fit and apply the model for each simulation.

Note

If None is provided for any of the inputs, the last used value of that input is used for the analysis. If no prior value was set, the model’s default is used.

Parameters:
  • num_sim (int) – Number of simulations to run when UQ is True, otherwise set to 1. Defaults to 20000.

  • uncertainty_scada (float) – Uncertainty imposed on the SCADA data when UQ is True only. Defaults to 0.005.

  • reanalysis_products (list[str]) – List of reanalysis products to use for Monte Carlo sampling. Defaults to None, which pulls all the products contained in plant.reanalysis.

  • wind_bin_threshold (tuple) – The filter threshold for each vertical bin, expressed as number of standard deviations from the median in each bin. When UQ is True, then this should be a tuple of the lower and upper limits of this threshold, otherwise a single value should be used. Defaults to (1.0, 3.0)

  • max_power_filter (tuple) – Maximum power threshold, in the range (0, 1], to which the bin filter should be applied. When UQ is True, then this should be a tuple of the lower and upper limits of this filter, otherwise a single value should be used. Defaults to (0.8, 0.9).

  • correction_threshold (tuple) – The threshold, in the range of (0, 1], above which daily scada energy data should be corrected. When UQ is True, then this should be a tuple of the lower and upper limits of this threshold, otherwise a single value should be used. Defaults to (0.85, 0.95)

setup_inputs() -> None

Create and populate the data frame defining the simulation parameters. This data frame is stored as self._inputs.

sort_scada_by_turbine() -> None

Sorts the SCADA DataFrame by the asset_id and timestamp index columns, respectively.

filter_turbine_data() -> None

Apply a set of filtering algorithms to the turbine wind speed vs power curve to flag data not representative of normal turbine operation

Performs the following manipulations:
  1. Drops any scada rows that don’t have any windspeed or energy data

  2. Flags windspeed values outside the range [0, 40]

  3. Flags windspeed values that have stayed the same for at least 3 straight readings

  4. Flags power values less than 2% of turbine capacity when wind speed above cut-in

  5. Flags windspeed and power values that don’t mutually coincide within a reasonable range

  6. Combine the flags using an “or” combination to be a new column in scada: “flag_final”
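The flag-then-combine pattern in the list above can be sketched on plain lists. The sketch covers the range, unresponsive-sensor, and low-power checks and the final "or" combination. It skips the bin filter of step 5, and the function name, the cut_in default, and the list inputs are assumptions for illustration rather than the filters in the utils subpackage.

```python
def flag_scada(wind_speed, power, capacity, cut_in=3.0, repeats=3):
    """Return one boolean per reading; True marks data to exclude."""
    n = len(wind_speed)

    # Wind speeds outside [0, 40] are flagged.
    range_flag = [not (0.0 <= ws <= 40.0) for ws in wind_speed]

    # Runs of identical wind speed of at least `repeats` readings are flagged.
    frozen_flag = [False] * n
    run_start = 0
    for i in range(1, n + 1):
        if i == n or wind_speed[i] != wind_speed[run_start]:
            if i - run_start >= repeats:
                for j in range(run_start, i):
                    frozen_flag[j] = True
            run_start = i

    # Power below 2 % of capacity while wind speed is above cut-in is flagged.
    low_power_flag = [
        ws > cut_in and p < 0.02 * capacity for ws, p in zip(wind_speed, power)
    ]

    # Combine the individual flags with a logical "or".
    return [a or b or c for a, b, c in zip(range_flag, frozen_flag, low_power_flag)]


wind_speed = [5.0, 5.0, 5.0, 8.0, 45.0, 9.0]
power = [500.0, 500.0, 500.0, 10.0, 1500.0, 1200.0]
flag_final = flag_scada(wind_speed, power, capacity=2000.0)
print(flag_final)
```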

setup_daily_reanalysis_data() -> None

Process reanalysis data to daily means for later use in the GAM model.

filter_sum_impute_scada() -> None

Filter SCADA data for unflagged data, gather SCADA energy data into daily sums, and correct daily summed energy based on amount of missing data and a threshold limit. Finally, impute missing data for each turbine based on reported energy data from other highly correlated turbines.
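The correction of daily sums for missing data can be read as scaling by the share of readings that are present. The sketch below assumes that days with at least the threshold share of valid readings are scaled up and that sparser days are left as missing for the imputation step. The function name and this exact rule are assumptions, not a transcription of the method.

```python
def corrected_daily_energy(daily_sum, n_valid, n_expected, correction_threshold):
    """Scale a daily energy sum up to a full day when enough readings exist."""
    fraction = n_valid / n_expected
    if fraction < correction_threshold:
        return None  # too little data; left for imputation in this sketch
    return daily_sum / fraction


# 10-minute data gives 144 expected readings per day.
mostly_complete = corrected_daily_energy(90.0, 135, 144, correction_threshold=0.9)
too_sparse = corrected_daily_energy(50.0, 72, 144, correction_threshold=0.9)
print(mostly_complete, too_sparse)
```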

setupturbine_model_dict() -> None

Setup daily atmospheric variable averages and daily energy sums by turbine.

fit_model() -> None

Fit the daily turbine energy sum to the daily atmospheric variable averages with a GAM, using wind speed, wind direction, and air density as features.

apply_model(i: int) -> None

Apply the model to the reanalysis data to calculate long-term gross energy for each turbine.

Parameters:

i (int) – The Monte Carlo iteration number.

plot_filtered_power_curves(turbines: list[str] | None = None, flag_labels: tuple[str, str] | None = None, max_cols: int = 3, xlim: tuple[float, float] = (None, None), ylim: tuple[float, float] = (None, None), legend: bool = False, return_fig: bool = False, figure_kwargs: dict = {}, legend_kwargs: dict = {}, plot_kwargs: dict = {})

Plot the raw and flagged power curve data.

Parameters:
  • turbines (list[str], optional) – The list of turbines to be plotted, if not all of the keys in data.

  • flag_labels (tuple[str, str], optional) – The labels to give to the scatter points, where the first entry is the flagged points, and the second entry corresponds to the standard power curve. Defaults to None.

  • max_cols (int, optional) – The maximum number of columns in the plot. Defaults to 3.

  • xlim (tuple[float, float], optional) – A tuple of the x-axis (min, max) values. Defaults to (None, None).

  • ylim (tuple[float, float], optional) – A tuple of the y-axis (min, max) values. Defaults to (None, None).

  • legend (bool, optional) – Set to True to place a legend in the figure, otherwise set to False. Defaults to False.

  • return_fig (bool, optional) – Set to True to return the figure and axes objects, otherwise set to False. Defaults to False.

  • figure_kwargs (dict, optional) – Additional keyword arguments that should be passed to plt.figure(). Defaults to {}.

  • plot_kwargs (dict, optional) – Additional keyword arguments that should be passed to ax.scatter(). Defaults to {}.

  • legend_kwargs (dict, optional) – Additional keyword arguments that should be passed to ax.legend(). Defaults to {}.

Returns:

If return_fig is True, then the figure and axes objects are returned for further tinkering/saving.

Return type:

None | tuple[matplotlib.pyplot.Figure, matplotlib.pyplot.Axes]

plot_daily_fitting_result(turbines: list[str] | None = None, flag_labels: tuple[str, str, str] = ('Modeled', 'Imputed', 'Input'), max_cols: int = 3, xlim: tuple[float, float] = (None, None), ylim: tuple[float, float] = (None, None), legend: bool = False, return_fig: bool = False, figure_kwargs: dict = {}, legend_kwargs: dict = {}, plot_kwargs: dict = {})

Plot the raw, imputed, and modeled power curve data.

Parameters:
  • turbines (list[str], optional) – The list of turbines to be plotted, if not all of the keys in data.

  • flag_labels (tuple[str, str, str], optional) – The labels to give to the scatter points, corresponding to the modeled, imputed, and input data, respectively. Defaults to (“Modeled”, “Imputed”, “Input”).

  • max_cols (int, optional) – The maximum number of columns in the plot. Defaults to 3.

  • xlim (tuple[float, float], optional) – A tuple of the x-axis (min, max) values. Defaults to (None, None).

  • ylim (tuple[float, float], optional) – A tuple of the y-axis (min, max) values. Defaults to (None, None).

  • legend (bool, optional) – Set to True to place a legend in the figure, otherwise set to False. Defaults to False.

  • return_fig (bool, optional) – Set to True to return the figure and axes objects, otherwise set to False. Defaults to False.

  • figure_kwargs (dict, optional) – Additional keyword arguments that should be passed to plt.figure(). Defaults to {}.

  • plot_kwargs (dict, optional) – Additional keyword arguments that should be passed to ax.scatter(). Defaults to {}.

  • legend_kwargs (dict, optional) – Additional keyword arguments that should be passed to ax.legend(). Defaults to {}.

Returns:

If return_fig is True, then the figure and axes objects are returned for further tinkering/saving.

Return type:

None | tuple[matplotlib.pyplot.Figure, matplotlib.pyplot.Axes]

class openoa.analysis.electrical_losses.ElectricalLosses(plant, UQ: bool = False, num_sim=20000, uncertainty_meter: float = 0.005, uncertainty_scada: float = 0.005, uncertainty_correction_threshold: ndarray[Any, dtype[float64]] | tuple[float, float] | float = (0.9, 0.995))[source]#

A serial implementation of calculating the average monthly and annual electrical losses at a wind power plant, and the associated uncertainty. Energy output from the turbine SCADA meter and the wind plant revenue meter are used to estimate electrical losses.

First, the daily sums of turbine and revenue meter energy are calculated over the days in the plant’s period of record for which all turbines and the revenue meter contain every considered timestep. Electrical losses are then calculated as the difference between the total turbine energy production and the meter production over those concurrent days.
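The daily calculation described above can be sketched roughly as follows. This is a minimal illustration in plain Python rather than the pandas operations the library relies on; the record layout, the function names, and the choice to express the loss as a fraction of turbine energy are assumptions made for this example, not part of OpenOA's API.

```python
from collections import defaultdict


def complete_daily_sums(records, steps_per_day):
    """Sum energy per day, keeping only days that report every expected step.

    `records` is an iterable of (day, energy) pairs, one per reported time step.
    For turbine data, `steps_per_day` would be (number of turbines) x (steps per day).
    """
    totals, counts = defaultdict(float), defaultdict(int)
    for day, energy in records:
        totals[day] += energy
        counts[day] += 1
    return {day: totals[day] for day in totals if counts[day] == steps_per_day}


def electrical_loss_fraction(turbine_daily, meter_daily):
    """Loss over the days both sources cover, as a fraction of turbine energy."""
    shared = set(turbine_daily) & set(meter_daily)
    turbine_total = sum(turbine_daily[day] for day in shared)
    meter_total = sum(meter_daily[day] for day in shared)
    return (turbine_total - meter_total) / turbine_total


# Hypothetical data: day "d2" is missing one turbine record, so it is dropped.
turbine_records = [("d1", 50.0), ("d1", 50.0), ("d2", 40.0)]
meter_records = [("d1", 49.0), ("d1", 49.0), ("d2", 39.0), ("d2", 39.0)]

turbine_daily = complete_daily_sums(turbine_records, steps_per_day=2)
meter_daily = complete_daily_sums(meter_records, steps_per_day=2)
loss = electrical_loss_fraction(turbine_daily, meter_daily)
```

Only day "d1" is complete in both sources, so the loss is computed from that day alone.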

For uncertainty quantification, a Monte Carlo (MC) approach is used to sample the revenue meter data and SCADA data with a default 0.5% imposed uncertainty, alongside a sampled filtering parameter. The uncertainty in estimated electrical losses is quantified as the standard deviation of the distribution of losses obtained from the MC sampling.
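A simplified sketch of this Monte Carlo idea is shown below. It assumes that each simulation scales the meter and SCADA totals by a factor drawn from a normal distribution centered on 1; the sampling distribution, the function name, and the omission of the sampled filtering parameter are assumptions of this example and not a description of the library's internals.

```python
import random
import statistics


def simulate_losses(turbine_energy, meter_energy, num_sim=1000,
                    uncertainty_meter=0.005, uncertainty_scada=0.005, seed=0):
    """Return (mean, standard deviation) of the simulated loss fractions."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    losses = []
    for _ in range(num_sim):
        # Assumed: multiplicative, normally distributed measurement error.
        meter = meter_energy * rng.gauss(1.0, uncertainty_meter)
        scada = turbine_energy * rng.gauss(1.0, uncertainty_scada)
        losses.append((scada - meter) / scada)
    return statistics.mean(losses), statistics.stdev(losses)


mean_loss, loss_uncertainty = simulate_losses(100.0, 98.0)
```

The standard deviation of the simulated losses plays the role of the reported uncertainty.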

If the revenue meter data is not provided on a daily or sub-daily basis (e.g. monthly), then the sum of daily turbine energy is corrected for any missing reported energy data from the turbines based on the ratio of the expected number of data points per day to the actual data points available. The daily corrected sum of turbine energy is then summed on a monthly basis. Electrical loss is then the difference between the total corrected turbine energy production and meter production over those concurrent months.
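The correction for missing turbine data can be illustrated with a short sketch. The tuple layout and function names are hypothetical, and the example assumes 144 expected data points per day (10-minute data for a single turbine).

```python
from collections import defaultdict


def corrected_daily_energy(daily_sum, n_reported, n_expected):
    """Scale a daily turbine energy sum by expected / reported data points."""
    if n_reported <= 0:
        raise ValueError("cannot correct a day with no reported data")
    return daily_sum * n_expected / n_reported


def corrected_monthly_energy(daily_rows):
    """Sum the corrected daily energy per month.

    `daily_rows` holds (month, daily_sum, n_reported, n_expected) tuples.
    """
    monthly = defaultdict(float)
    for month, daily_sum, n_reported, n_expected in daily_rows:
        monthly[month] += corrected_daily_energy(daily_sum, n_reported, n_expected)
    return dict(monthly)


rows = [
    ("2020-01", 90.0, 108, 144),   # 75% of the data reported
    ("2020-01", 120.0, 144, 144),  # complete day, left unchanged
    ("2020-02", 50.0, 72, 144),    # half of the data reported
]
monthly_energy = corrected_monthly_energy(rows)
```

The monthly totals would then be compared with the monthly revenue meter energy to obtain the loss.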

Parameters:
  • plant (PlantData) – An openoa.plant.PlantData object that has been validated with at least openoa.plant.PlantData.analysis_type = “ElectricalLosses”.

  • UQ (bool) – Indicator to perform (True) or not (False) uncertainty quantification.

  • num_sim (int) – Number of Monte Carlo simulations to perform.

  • uncertainty_meter (float) – Uncertainty imposed on the revenue meter data (for UQ = True case).

  • uncertainty_scada (float) – Uncertainty imposed on the SCADA data (for UQ = True case).

  • uncertainty_correction_threshold (tuple | float) – Data availability thresholds, in the range of (0, 1), under which months should be eliminated. If UQ = True, then a 2-element tuple containing an upper and lower bound for a randomly selected value should be given, otherwise, a scalar value should be provided.

Method generated by attrs for class ElectricalLosses.

run(num_sim: int | None = None, uncertainty_meter: ndarray[Any, dtype[float64]] | float = None, uncertainty_scada: ndarray[Any, dtype[float64]] | float = None, uncertainty_correction_threshold: ndarray[Any, dtype[float64]] | tuple[float, float] | float = None)[source]#

Run the electrical losses calculation.

Note

If None is provided to any of the inputs, then the last used input value will be used for the analysis; if no prior value was set, the model’s default is used.

Parameters:
  • num_sim (int) – Number of Monte Carlo simulations to perform.

  • uncertainty_meter (float) – Uncertainty imposed on the revenue meter data (for UQ = True case).

  • uncertainty_scada (float) – Uncertainty imposed on the SCADA data (for UQ = True case).

  • uncertainty_correction_threshold (tuple | float) – Data availability thresholds, in the range of (0, 1], under which months should be eliminated. If UQ = True, then a 2-element tuple containing an upper and lower bound for a randomly selected value should be given, otherwise, a scalar value should be provided.
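The fallback behavior described in the note for run() can be pictured with a small stand-in class. RunParameters is purely hypothetical and only mimics the "None keeps the last used value" rule; it is not how OpenOA implements its mixin classes.

```python
class RunParameters:
    """Hypothetical stand-in that remembers the last used run parameters."""

    _defaults = {"num_sim": 20000, "uncertainty_meter": 0.005}

    def __init__(self):
        self.values = dict(self._defaults)

    def run(self, **overrides):
        for name, value in overrides.items():
            if name not in self.values:
                raise KeyError(f"unknown run parameter: {name}")
            if value is not None:  # None keeps the previously used value
                self.values[name] = value
        return dict(self.values)


params = RunParameters()
first = params.run(num_sim=None)   # nothing set before, so the default applies
second = params.run(num_sim=50)    # explicit value replaces the default
third = params.run(num_sim=None)   # None now falls back to the last used value
```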

setup_inputs()[source]#

Create and populate the data frame defining the simulation parameters. This data frame is stored as self.inputs.

process_scada()[source]#

Calculate daily sum of turbine energy only for days when all turbines are reporting at all time steps.

process_meter()[source]#

Calculate daily sum of meter energy only for days when meter data is reporting at all time steps.

calculate_electrical_losses()[source]#

Apply Monte Carlo approach to calculate electrical losses and their uncertainty based on the difference in the sum of turbine and metered energy over the compiled days.

plot_monthly_losses(xlim: tuple[datetime | None, datetime | None] = (None, None), ylim: tuple[float | None, float | None] = (None, None), return_fig: bool = False, figure_kwargs: dict = {}, legend_kwargs: dict = {}, plot_kwargs: dict = {}) None | tuple[Figure, Axes][source]#

Plots the monthly timeseries of electrical losses as a percent.

Parameters:
  • xlim (tuple[datetime | None, datetime | None], optional) – A tuple of the x-axis (min, max) values. Defaults to (None, None).

  • ylim (tuple[float | None, float | None], optional) – A tuple of the y-axis (min, max) values. Defaults to (None, None).

  • return_fig (bool, optional) – Set to True to return the figure and axes objects, otherwise set to False. Defaults to False.

  • figure_kwargs (dict, optional) – Additional keyword arguments that should be passed to plt.figure(). Defaults to {}.

  • plot_kwargs (dict, optional) – Additional keyword arguments that should be passed to ax.plot(). Defaults to {}.

  • legend_kwargs (dict, optional) – Additional keyword arguments that should be passed to ax.legend(). Defaults to {}.

Returns:

If return_fig is True, then the figure and axes objects are returned in addition to showing the plot.

Return type:

None | tuple[plt.Figure, plt.Axes]

class openoa.analysis.eya_gap_analysis.EYAGapAnalysis(plant: PlantData, eya_estimates: dict, oa_results: dict, data: list = _Nothing.NOTHING, compiled_data: list = _Nothing.NOTHING)[source]#

Performs a gap analysis between the estimated annual energy production (AEP) from an energy yield assessment (EYA) and the actual AEP as measured from an operational assessment (OA).

The gap analysis is based on comparing the following three key metrics:

  1. Availability loss

  2. Electrical loss

  3. Sum of turbine ideal energy

Here turbine ideal energy is defined as the energy produced during ‘normal’ or ‘ideal’ turbine operation, i.e., no downtime or considerable underperformance events. This value encompasses several different aspects of an EYA (wind resource estimate, wake losses, turbine performance, and blade degradation) and in most cases should have the largest impact in a gap analysis relative to the first two metrics.

This gap analysis method is fairly straightforward. Relevant EYA and OA metrics are passed in when defining the class, differences between EYA estimates and OA results are calculated, and then a ‘waterfall’ plot is created showing the differences between the EYA and OA-estimated AEP values and how they are linked to differences in the three key metrics.

Parameters:
  • plant (PlantData object) – PlantData object from which EYAGapAnalysis should draw data.

  • eya_estimates (EYAEstimate) – Numpy array with EYA estimates listed in required order

  • oa_results (OAResults) – Numpy array with OA results listed in required order.

Method generated by attrs for class EYAGapAnalysis.

run()[source]#

Run the EYA Gap analysis functions in order by calling this function.

Parameters:

(None)

Returns:

(None)

compile_data()[source]#

Compiles the EYA and OA metrics, and computes the differences.

Returns:

The list of EYA AEP, and differences in turbine gross energy, availability losses, electrical losses, and unaccounted losses.

Return type:

list[float]
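To make the structure of the compiled list concrete, here is a hedged sketch of one way such waterfall steps could be computed. The function name, its arguments, the sign conventions, and the scaling of loss differences by the EYA turbine ideal energy are all assumptions of this example; consult the source code for the library's actual calculation.

```python
def compile_gap(eya_aep, eya_tie, eya_availability_loss, eya_electrical_loss,
                oa_aep, oa_tie, oa_availability_loss, oa_electrical_loss):
    """Return [EYA AEP, TIE diff, availability diff, electrical diff, unexplained].

    Energies are in GWh/yr and losses are fractions (e.g., 0.05 for 5%).
    """
    tie_diff = oa_tie - eya_tie
    # Assumed: a higher OA loss than the EYA loss lowers the energy.
    availability_diff = (eya_availability_loss - oa_availability_loss) * eya_tie
    electrical_diff = (eya_electrical_loss - oa_electrical_loss) * eya_tie
    # Whatever remains after the three metrics closes the gap to the OA AEP.
    unexplained = oa_aep - (eya_aep + tie_diff + availability_diff + electrical_diff)
    return [eya_aep, tie_diff, availability_diff, electrical_diff, unexplained]


steps = compile_gap(
    eya_aep=100.0, eya_tie=120.0, eya_availability_loss=0.04, eya_electrical_loss=0.02,
    oa_aep=95.0, oa_tie=115.0, oa_availability_loss=0.05, oa_electrical_loss=0.02,
)
```

By construction the steps sum to the OA AEP, which is why a waterfall plot needs one more label than there are values in the list.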

plot_waterfall(index: list[str] = ['EYA AEP', 'TIE', 'Availability\nLosses', 'Electrical\nLosses', 'Unexplained', 'OA AEP'], ylabel: str = 'Energy (GWh/yr)', ylim: tuple[float, float] = (None, None), return_fig: bool = False, plot_kwargs: dict = {}, figure_kwargs: dict = {}) None | tuple[source]#

Produce a waterfall plot showing the progression from the EYA estimates to the calculated OA estimates of AEP.

Parameters:
  • index (list) – List of string values to be used for x-axis labels, which should have one more value than the number of points in data to account for the resulting OA total. Defaults to [“EYA AEP”, “TIE”, “Availability Losses”, “Electrical Losses”, “Unexplained”, “OA AEP”].

  • ylabel (str) – The y-axis label. Defaults to “Energy (GWh/yr)”.

  • ylim (tuple[float | None, float | None]) – The y-axis minimum and maximum display range. Defaults to (None, None).

  • return_fig (bool, optional) – Set to True to return the figure and axes objects, otherwise set to False. Defaults to False.

  • figure_kwargs (dict, optional) – Additional keyword arguments that should be passed to plt.figure(). Defaults to {}.

  • plot_kwargs (dict, optional) – Additional keyword arguments that should be passed to ax.plot(). Defaults to {}.

  • legend_kwargs (dict, optional) – Additional keyword arguments that should be passed to ax.legend(). Defaults to {}.

Returns:

If return_fig is True, then the figure and axes objects are returned in addition to showing the plot.

Return type:

None | tuple[plt.Figure, plt.Axes]

class openoa.analysis.wake_losses.WakeLosses(plant, wind_direction_col='WMET_HorWdDir', wind_direction_data_type: str = 'scada', wind_direction_asset_ids: list[str] = None, UQ=True, num_sim=100, start_date: str | pd.Timestamp = None, end_date: str | pd.Timestamp = None, reanalysis_products: Sequence | str | int | float | None = None, end_date_lt: str | pd.Timestamp = None, wd_bin_width: float = 5.0, freestream_sector_width: float | tuple[float, float] = (50.0, 110.0), freestream_power_method: str = 'mean', freestream_wind_speed_method: str = 'mean', correct_for_derating: bool = True, derating_filter_wind_speed_start: float | tuple[float, float] = (4.0, 5.0), max_power_filter: float | tuple[float, float] = (0.92, 0.98), wind_bin_mad_thresh: float | tuple[float, float] = (4.0, 13.0), wd_bin_width_LT_corr: float = 5.0, ws_bin_width_LT_corr: float = 1.0, num_years_LT: int | tuple[int, int] = (10, 20), assume_no_wakes_high_ws_LT_corr: bool = True, no_wakes_ws_thresh_LT_corr: float = 13.0, min_ws_bin_lin_reg: float = 3.0, bin_count_thresh_lin_reg: int = 50)[source]#

A serial implementation of a method for estimating wake losses from SCADA data. Wake losses are estimated for the entire wind plant as well as for each individual turbine for a) the period of record for which data are available, and b) the estimated long-term wind conditions the wind plant will experience based on historical reanalysis wind resource data.

The method is comprised of the following core steps:
  1. Calculate a representative wind plant-level wind direction at each time step using the mean wind direction of the specified wind turbines or meteorological (met) towers. Note that time steps for which any necessary plant-level or turbine-level data are missing are discarded.

    1. If UQ is selected, wake losses are calculated multiple times using a Monte Carlo approach with randomly chosen analysis parameters and randomly sampled, with replacement, time steps for each iteration. The remaining steps described below are performed for each Monte Carlo iteration. If UQ is not used, wake losses are calculated once using the specified analysis parameters for the full set of available time steps.

  2. Identify the set of derated, curtailed, or unavailable turbines (i.e., turbines whose power production is limited not by wake losses but by operating mode) for each time step using a power curve outlier detection method.

  3. Calculate the average wind speed and power production for the set of normally operating (i.e., not derated) freestream turbines for each time step.

    1. Freestream turbines are those without any upstream turbines located within a user-specified sector of wind directions centered on the representative plant-level wind direction.

  4. Calculate the POR wake losses for the wind plant by comparing the potential energy production (sum of the mean freestream power production at each time step multiplied by the number of turbines in the wind power plant) to the actual energy production (sum of the actual wind plant power production at each time step). This procedure is then used to estimate the wake losses for each individual wind turbine.

    1. If correct_for_derating is True, then the potential power production of the wind plant is assumed to be the actual power produced by the derated turbines plus the mean power production of the freestream turbines for all other turbines in the wind plant. Again, a similar procedure is used to estimate individual turbine wake losses.

  5. Finally, estimate the long-term corrected wake losses using the long-term historical reanalysis data. Note that the long-term correction is determined for each reanalysis product specified by the user. If UQ is used, a random reanalysis product is selected each iteration. If UQ is not selected, the long-term corrected wake losses are calculated as the average wake losses determined for all reanalysis products.

    1. Calculate the long-term occurrence frequencies for a set of wind direction and wind speed bins based on the hourly reanalysis data (typically, 10-20 years).

    2. Next, using a linear regression, compare the mean freestream wind speeds calculated from the SCADA data to the wind speeds from the reanalysis data and correct to remove biases.

    3. Compute the average potential and actual wind power plant production using the representative wind plant wind directions from the SCADA or met tower data in conjunction with the corrected freestream wind speeds for each wind direction and wind speed bin.

    4. Estimate the long-term corrected wake losses by comparing the long-term corrected potential and actual energy production. These are computed by weighting the average potential and actual power production for each wind condition bin with the long-term frequencies.

    5. Repeat to estimate the long-term corrected wake losses for each individual turbine.
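The freestream classification in step 3 can be sketched geometrically. The example below assumes turbine coordinates in meters (x east, y north), a wind direction given as the direction the wind comes from (degrees clockwise from north), and hypothetical function names; it ignores the wind direction binning that the method applies.

```python
import math


def bearing_deg(from_xy, to_xy):
    """Compass bearing from one point to another, in degrees [0, 360)."""
    dx = to_xy[0] - from_xy[0]
    dy = to_xy[1] - from_xy[1]
    return math.degrees(math.atan2(dx, dy)) % 360.0


def angular_difference(a, b):
    """Smallest absolute difference between two directions, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def freestream_turbines(coords, wind_direction, sector_width):
    """Turbines with no other turbine inside the upstream sector."""
    half_width = sector_width / 2.0
    freestream = []
    for name, xy in coords.items():
        has_upstream = any(
            angular_difference(bearing_deg(xy, other_xy), wind_direction) <= half_width
            for other, other_xy in coords.items()
            if other != name
        )
        if not has_upstream:
            freestream.append(name)
    return sorted(freestream)


# Hypothetical layout: T2 sits 1 km north of T1, T3 sits 1 km east of T1.
coords = {"T1": (0.0, 0.0), "T2": (0.0, 1000.0), "T3": (1000.0, 0.0)}
from_north = freestream_turbines(coords, wind_direction=0.0, sector_width=60.0)
from_south = freestream_turbines(coords, wind_direction=180.0, sector_width=60.0)
```

With wind from the north, T1 is waked by T2; with wind from the south, T2 is waked by T1.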

Parameters:
  • plant (PlantData) – An openoa.plant.PlantData object that has been validated with at least openoa.plant.PlantData.analysis_type = “WakeLosses”.

  • wind_direction_col (string, optional) – Column name to use for wind direction. Defaults to “WMET_HorWdDir”

  • wind_direction_data_type (string, optional) – Data type to use for wind directions (“scada” for turbine measurements or “tower” for meteorological tower measurements). Defaults to “scada”.

  • wind_direction_asset_ids (list, optional) – List of asset IDs (turbines or met towers) used to calculate the average wind direction at each time step. If None, all assets of the corresponding data type will be used. Defaults to None.

  • UQ (bool, optional) – Determines whether to perform uncertainty quantification using Monte Carlo simulation (True) or provide a single wake loss estimate (False). Defaults to True.

  • start_date (pandas.Timestamp or string, optional) – Start datetime for wake loss analysis. If None, the earliest SCADA datetime will be used. Default is None.

  • end_date (pandas.Timestamp or string, optional) – End datetime for wake loss analysis. If None, the latest SCADA datetime will be used. Default is None.

  • reanalysis_products (list, optional) – List of reanalysis products to use for long-term correction. If UQ = True, a single product will be selected from this list each Monte Carlo iteration. Defaults to [“merra2”, “era5”].

  • end_date_lt (string or pandas.Timestamp) – The last date to use for the long-term correction. If None, the most recent date common to all reanalysis products will be used.

  • wd_bin_width (float, optional) – Wind direction bin size when identifying freestream wind turbines (degrees). Defaults to 5 degrees.

  • freestream_sector_width (tuple | float, optional) – Wind direction sector size to use when identifying freestream wind turbines (degrees). If no turbines are located upstream of a particular turbine within the sector, the turbine will be classified as a freestream turbine. When UQ = True, then this should be a tuple of the lower and upper bounds for the Monte Carlo sampling, and when UQ = False this should be a single value. If None, then a default value of 90 degrees will be used if UQ = False and a default value of (50, 110) will be used if UQ = True. Defaults to None.

  • freestream_power_method (str, optional) – Method used to determine the representative power production of the freestream turbines (“mean”, “median”, “max”). Defaults to “mean”.

  • freestream_wind_speed_method (str, optional) – Method used to determine the representative wind speed of the freestream turbines (“mean”, “median”). Defaults to “mean”.

  • correct_for_derating (bool, optional) – Indicates whether derated, curtailed, or otherwise unavailable turbines should be flagged and excluded from the calculation of ideal freestream wind plant power production for a given time stamp. If True, ideal freestream power production will be calculated as the sum of the derated turbine powers added to the mean power of the freestream turbines in normal operation multiplied by the number of turbines operating normally in the wind plant. Defaults to True.

  • derating_filter_wind_speed_start (tuple | float, optional) – The wind speed above which turbines will be flagged as derated/curtailed/shutdown if power is less than 1% of rated power (m/s). Only used when correct_for_derating is True. This should be a tuple when UQ = True (values are Monte-Carlo sampled within the specified range) or a single value when UQ = False. If undefined (None), a value of 4.5 m/s will be used if UQ = False and values of (4.0, 5.0) will be used if UQ = True. Defaults to None.

  • max_power_filter (tuple | float, optional) – Maximum power threshold, defined as a fraction of rated power, to which the power curve bin filter should be applied. Only used when correct_for_derating = True. This should be a tuple when UQ = True (values are Monte-Carlo sampled within the specified range) or a single value when UQ = False. If undefined (None), a value of 0.95 will be used if UQ = False and values of (0.92, 0.98) will be used if UQ = True. Defaults to None.

  • wind_bin_mad_thresh (tuple | float, optional) – The filter threshold for each power bin used to identify derated/curtailed/shutdown turbines, expressed as the number of median absolute deviations above the median wind speed. Only used when correct_for_derating is True. This should be a tuple when UQ = True (values are Monte-Carlo sampled within the specified range) or a single value when UQ = False. If undefined (None), a value of 7.0 will be used if UQ = False and values of (4.0, 13.0) will be used if UQ = True. Defaults to None.

  • wd_bin_width_LT_corr (float, optional) – Size of wind direction bins used to calculate long-term frequencies from historical reanalysis data and correct wake losses during the period of record (degrees). Defaults to 5 degrees.

  • ws_bin_width_LT_corr (float, optional) – Size of wind speed bins used to calculate long-term frequencies from historical reanalysis data and correct wake losses during the period of record (m/s). Defaults to 1 m/s.

  • num_years_LT (tuple | int, optional) – Number of years of historical reanalysis data to use for long-term correction. This should be a tuple when UQ = True (values are Monte-Carlo sampled within the specified range) or a single value when UQ = False. If undefined (None), a value of 20 will be used if UQ = False and values of (10, 20) will be used if UQ = True. Defaults to None.

  • assume_no_wakes_high_ws_LT_corr (bool, optional) – If True, wind direction and wind speed bins for which operational data are missing above a certain wind speed threshold are corrected by assigning the wind turbines’ rated power to both the actual and potential power production variables during the long term-correction process. This assumes there are no wake losses above the wind speed threshold. Defaults to True.

  • no_wakes_ws_thresh_LT_corr (float, optional) – The wind speed threshold (inclusive) above which rated power is assigned to both the actual and potential power production variables if operational data are missing for any wind direction and wind speed bin during the long term-correction process. This wind speed corresponds to the wind speed measured at freestream wind turbines. Only used if assume_no_wakes_high_ws_LT_corr = True. Defaults to 13 m/s.

  • min_ws_bin_lin_reg (float, optional) – The minimum wind speed bin to consider when finding linear regression from SCADA freestream wind speeds to reanalysis wind speeds. Defaults to 3.0.

  • bin_count_thresh_lin_reg (int, optional) – The minimum number of samples required in a wind speed bin to include when finding linear regression from SCADA freestream wind speeds to reanalysis wind speeds. Defaults to 50.
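Several of the parameters above accept a tuple when UQ = True and a single value otherwise. The sketch below shows one way per-iteration values could be drawn from such inputs. The uniform and integer sampling, the function names, and the dictionary layout are assumptions for illustration only.

```python
import random


def sample_parameter(value, rng):
    """Draw from a (low, high) tuple; pass single values through unchanged."""
    if isinstance(value, tuple):
        low, high = value
        if isinstance(low, int) and isinstance(high, int):
            return rng.randint(low, high)  # inclusive integer range
        return rng.uniform(low, high)
    return value


def draw_iterations(num_sim, parameters, reanalysis_products, seed=0):
    """Build one dictionary of analysis settings per Monte Carlo iteration."""
    rng = random.Random(seed)
    draws = []
    for _ in range(num_sim):
        draw = {name: sample_parameter(value, rng) for name, value in parameters.items()}
        # One reanalysis product is picked at random for each iteration.
        draw["reanalysis_product"] = rng.choice(reanalysis_products)
        draws.append(draw)
    return draws


parameters = {
    "freestream_sector_width": (50.0, 110.0),
    "num_years_LT": (10, 20),
    "wd_bin_width": 5.0,
}
draws = draw_iterations(100, parameters, ["merra2", "era5"])
```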

Method generated by attrs for class WakeLosses.

check_reanalysis_products(attribute: Attribute, value: list[str]) None[source]#

Checks that the provided reanalysis products actually exist in the reanalysis data.

run(num_sim: int | None = None, reanalysis_products: list[str] | None = None, wd_bin_width: float | None = None, freestream_sector_width: float | None = None, freestream_power_method: str | None = None, freestream_wind_speed_method: str | None = None, correct_for_derating: bool | None = None, derating_filter_wind_speed_start: float | None = None, max_power_filter: float | None = None, wind_bin_mad_thresh: float | None = None, wd_bin_width_LT_corr: float | None = None, ws_bin_width_LT_corr: float | None = None, num_years_LT: int | None = None, assume_no_wakes_high_ws_LT_corr: bool | None = None, no_wakes_ws_thresh_LT_corr: float | None = None, min_ws_bin_lin_reg: float | None = None, bin_count_thresh_lin_reg: int | None = None)[source]#

Estimates wake losses by comparing wind plant energy production to energy production of the turbines identified as operating in freestream conditions. Wake losses are expressed as a fractional loss (e.g., 0.05 indicates a wake loss value of 5%).
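As a rough illustration of the period-of-record comparison, the sketch below computes a fractional wake loss from per-time-step turbine powers. The input layout and function name are hypothetical, and the freestream and derated sets are taken as given rather than detected.

```python
def por_wake_loss(time_steps, correct_for_derating=True):
    """Fractional wake loss: 1 - actual energy / potential energy.

    Each time step is a dict with "power" (turbine -> kW), "freestream"
    (set of turbine names), and "derated" (set of turbine names).
    """
    actual = potential = 0.0
    for step in time_steps:
        power = step["power"]
        derated = step["derated"] if correct_for_derating else set()
        normal_freestream = [t for t in step["freestream"] if t not in derated]
        if not normal_freestream:
            continue  # no freestream reference, so the time step is discarded
        freestream_mean = sum(power[t] for t in normal_freestream) / len(normal_freestream)
        actual += sum(power.values())
        # Derated turbines contribute their actual power; all other turbines
        # are assumed capable of the mean freestream power.
        potential += sum(power[t] for t in derated)
        potential += freestream_mean * (len(power) - len(derated))
    return 1.0 - actual / potential


steps = [
    {"power": {"A": 1000.0, "B": 800.0, "C": 900.0}, "freestream": {"A"}, "derated": set()},
]
loss = por_wake_loss(steps)

derated_steps = [
    {"power": {"A": 1000.0, "B": 200.0, "C": 900.0}, "freestream": {"A"}, "derated": {"B"}},
]
loss_with_derating = por_wake_loss(derated_steps)
```

In the first case the plant produces 2700 kW against a potential of 3000 kW, a loss of about 0.10; in the second, the derated turbine B is not counted as a wake loss.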

Note

If None is provided to any of the inputs, then the last used input value will be used for the analysis; if no prior value was set, the model’s default is used.

Parameters:
  • num_sim (int, optional) – Number of Monte Carlo iterations to perform. Only used if UQ = True. Defaults to 100.

  • wd_bin_width (float, optional) – Wind direction bin size when identifying freestream wind turbines (degrees). Defaults to 5 degrees.

  • freestream_sector_width (tuple | float, optional) – Wind direction sector size to use when identifying freestream wind turbines (degrees). If no turbines are located upstream of a particular turbine within the sector, the turbine will be classified as a freestream turbine. When UQ = True, then this should be a tuple of the lower and upper bounds for the Monte Carlo sampling, and when UQ = False this should be a single value. If None, then a default value of 90 degrees will be used if UQ = False and a default value of (50, 110) will be used if UQ = True. Defaults to None.

  • freestream_power_method (str, optional) – Method used to determine the representative power production of the freestream turbines (“mean”, “median”, “max”). Defaults to “mean”.

  • freestream_wind_speed_method (str, optional) – Method used to determine the representative wind speed of the freestream turbines (“mean”, “median”). Defaults to “mean”.

  • correct_for_derating (bool, optional) – Indicates whether derated, curtailed, or otherwise unavailable turbines should be flagged and excluded from the calculation of ideal freestream wind plant power production for a given time stamp. If True, ideal freestream power production will be calculated as the sum of the derated turbine powers added to the mean power of the freestream turbines in normal operation multiplied by the number of turbines operating normally in the wind plant. Defaults to True.

  • derating_filter_wind_speed_start (tuple | float, optional) – The wind speed above which turbines will be flagged as derated/curtailed/shutdown if power is less than 1% of rated power (m/s). Only used when correct_for_derating is True. This should be a tuple when UQ = True (values are Monte-Carlo sampled within the specified range) or a single value when UQ = False. If undefined (None), a value of 4.5 m/s will be used if UQ = False and values of (4.0, 5.0) will be used if UQ = True. Defaults to None.

  • max_power_filter (tuple | float, optional) – Maximum power threshold, defined as a fraction of rated power, to which the power curve bin filter should be applied. Only used when correct_for_derating = True. This should be a tuple when UQ = True (values are Monte-Carlo sampled within the specified range) or a single value when UQ = False. If undefined (None), a value of 0.95 will be used if UQ = False and values of (0.92, 0.98) will be used if UQ = True. Defaults to None.

  • wind_bin_mad_thresh (tuple | float, optional) – The filter threshold for each power bin used to identify derated/curtailed/shutdown turbines, expressed as the number of median absolute deviations above the median wind speed. Only used when correct_for_derating is True. This should be a tuple when UQ = True (values are Monte-Carlo sampled within the specified range) or a single value when UQ = False. If undefined (None), a value of 7.0 will be used if UQ = False and values of (4.0, 13.0) will be used if UQ = True. Defaults to None.

  • wd_bin_width_LT_corr (float, optional) – Size of wind direction bins used to calculate long-term frequencies from historical reanalysis data and correct wake losses during the period of record (degrees). Defaults to 5 degrees.

  • ws_bin_width_LT_corr (float, optional) – Size of wind speed bins used to calculate long-term frequencies from historical reanalysis data and correct wake losses during the period of record (m/s). Defaults to 1 m/s.

  • num_years_LT (tuple | int, optional) – Number of years of historical reanalysis data to use for long-term correction. This should be a tuple when UQ = True (values are Monte-Carlo sampled within the specified range) or a single value when UQ = False. If undefined (None), a value of 20 will be used if UQ = False and values of (10, 20) will be used if UQ = True. Defaults to None.

  • assume_no_wakes_high_ws_LT_corr (bool, optional) – If True, wind direction and wind speed bins for which operational data are missing above a certain wind speed threshold are corrected by assigning the wind turbines’ rated power to both the actual and potential power production variables during the long term-correction process. This assumes there are no wake losses above the wind speed threshold. Defaults to True.

  • no_wakes_ws_thresh_LT_corr (float, optional) – The wind speed threshold (inclusive) above which rated power is assigned to both the actual and potential power production variables if operational data are missing for any wind direction and wind speed bin during the long term-correction process. This wind speed corresponds to the wind speed measured at freestream wind turbines. Only used if assume_no_wakes_high_ws_LT_corr = True. Defaults to 13 m/s.

  • min_ws_bin_lin_reg (float, optional) – The minimum wind speed bin to consider when finding linear regression from SCADA freestream wind speeds to reanalysis wind speeds. Defaults to 3.0.

  • bin_count_thresh_lin_reg (int, optional) – The minimum number of samples required in a wind speed bin to include when finding linear regression from SCADA freestream wind speeds to reanalysis wind speeds. Defaults to 50.

plot_wake_losses_by_wind_direction(plot_norm_energy: bool = True, turbine_id: str | None = None, xlim: tuple[float, float] = (None, None), ylim_efficiency: tuple[float, float] = (None, None), ylim_energy: tuple[float, float] = (None, None), return_fig: bool = False, figure_kwargs: dict | None = None, plot_kwargs_line: dict = {}, plot_kwargs_fill: dict = {}, legend_kwargs: dict = {})[source]#

Plots wake losses in the form of wind farm efficiency as well as normalized wind plant energy production for both the period of record and with the long-term correction as a function of wind direction.

Parameters:
  • plot_norm_energy (bool, optional) – If True, include a plot of normalized wind plant energy production as a function of wind direction in addition to the wind farm efficiency plot. Defaults to True.

  • turbine_id (str, optional) – Turbine asset_id to plot wake losses for. If None, wake losses for the entire wind plant will be plotted. Defaults to None.

  • xlim (tuple[float, float], optional) – A tuple of floats representing the x-axis wind direction plotting display limits (degrees). Defaults to (None, None).

  • ylim_efficiency (tuple[float, float], optional) – A tuple of the y-axis plotting display limits for the wind farm efficiency plot (top plot). Defaults to (None, None).

  • ylim_energy (tuple[float, float], optional) – If plot_norm_energy is True, a tuple of the y-axis plotting display limits for the wind farm energy distribution plot (bottom plot). Defaults to (None, None).

  • return_fig (bool, optional) – Flag to return the figure and axes objects. Defaults to False.

  • figure_kwargs (dict, optional) – Additional figure instantiation keyword arguments that are passed to plt.figure(). Defaults to None.

  • plot_kwargs_line (dict, optional) – Additional plotting keyword arguments that are passed to ax.plot() for plotting lines for the wind farm efficiency and, if plot_norm_energy is True, energy distributions subplots. Defaults to {}.

  • plot_kwargs_fill (dict, optional) – If UQ is True, additional plotting keyword arguments that are passed to ax.fill_between() for plotting shading regions for 95% confidence intervals for the wind farm efficiency and, if plot_norm_energy is True, energy distributions subplots. Defaults to {}.

  • legend_kwargs (dict, optional) – Additional legend keyword arguments that are passed to ax.legend() for the wind farm efficiency and, if plot_norm_energy is True, energy distributions subplots. Defaults to {}.

Returns:

If return_fig is True, then the figure and axes object(s), corresponding to the wake loss plot or, if plot_norm_energy is True, wake loss and normalized energy plots, are returned for further tinkering/saving.

Return type:

None | tuple[matplotlib.pyplot.Figure, matplotlib.pyplot.Axes] | tuple[matplotlib.pyplot.Figure, tuple[matplotlib.pyplot.Axes, matplotlib.pyplot.Axes]]
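The quantity on the efficiency axis can be thought of as actual over potential energy within each wind direction bin. The sketch below is an assumption about how such a curve could be assembled from per-time-step values; the sample layout and function name are hypothetical, and no plotting is performed.

```python
from collections import defaultdict


def efficiency_by_direction(samples, bin_width=5.0):
    """Map each wind direction bin (left edge, degrees) to actual / potential.

    `samples` holds (wind_direction, actual_power, potential_power) tuples.
    """
    actual = defaultdict(float)
    potential = defaultdict(float)
    for direction, actual_power, potential_power in samples:
        left_edge = (direction % 360.0) // bin_width * bin_width
        actual[left_edge] += actual_power
        potential[left_edge] += potential_power
    return {edge: actual[edge] / potential[edge] for edge in sorted(potential)}


samples = [(1.0, 90.0, 100.0), (4.0, 80.0, 100.0), (182.0, 50.0, 100.0)]
efficiency = efficiency_by_direction(samples)
```

The first two samples fall in the 0 to 5 degree bin, and the third falls in the 180 to 185 degree bin.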

plot_wake_losses_by_wind_speed(plot_norm_energy: bool = True, turbine_id: str | None = None, xlim: tuple[float, float] = (None, None), ylim_efficiency: tuple[float, float] = (None, None), ylim_energy: tuple[float, float] = (None, None), return_fig: bool = False, figure_kwargs: dict | None = None, plot_kwargs_line: dict = {}, plot_kwargs_fill: dict = {}, legend_kwargs: dict = {})[source]#

Plots wake losses in the form of wind farm efficiency as well as normalized wind plant energy production for both the period of record and with the long-term correction as a function of wind speed.

Parameters:
  • plot_norm_energy (bool, optional) – If True, include a plot of normalized wind plant energy production as a function of wind speed in addition to the wind farm efficiency plot. Defaults to True.

  • turbine_id (str, optional) – Turbine asset_id to plot wake losses for. If None, wake losses for the entire wind plant will be plotted. Defaults to None.

  • xlim (tuple[float, float], optional) – A tuple of floats representing the x-axis wind speed plotting display limits (m/s). Defaults to (None, None).

  • ylim_efficiency (tuple[float, float], optional) – A tuple of the y-axis plotting display limits for the wind farm efficiency plot (top plot). Defaults to (None, None).

  • ylim_energy (tuple[float, float], optional) – If plot_norm_energy is True, a tuple of the y-axis plotting display limits for the wind farm energy distribution plot (bottom plot). Defaults to (None, None).

  • return_fig (bool, optional) – Flag to return the figure and axes objects. Defaults to False.

  • figure_kwargs (dict, optional) – Additional figure instantiation keyword arguments that are passed to plt.figure(). Defaults to None.

  • plot_kwargs_line (dict, optional) – Additional plotting keyword arguments that are passed to ax.plot() for plotting lines for the wind farm efficiency and, if plot_norm_energy is True, energy distributions subplots. Defaults to {}.

  • plot_kwargs_fill (dict, optional) – If UQ is True, additional plotting keyword arguments that are passed to ax.fill_between() for plotting shading regions for 95% confidence intervals for the wind farm efficiency and, if plot_norm_energy is True, energy distributions subplots. Defaults to {}.

  • legend_kwargs (dict, optional) – Additional legend keyword arguments that are passed to ax.legend() for the wind farm efficiency and, if plot_norm_energy is True, energy distributions subplots. Defaults to {}.

Returns:

If return_fig is True, then the figure and axes object(s), corresponding to the wake loss plot or, if plot_norm_energy is True, wake loss and normalized energy plots, are returned for further tinkering/saving.

Return type:

None | tuple[matplotlib.pyplot.Figure, matplotlib.pyplot.Axes] | tuple[matplotlib.pyplot.Figure, tuple[matplotlib.pyplot.Axes, matplotlib.pyplot.Axes]]

class openoa.analysis.yaw_misalignment.StaticYawMisalignment(plant, turbine_ids: list[str] | None = None, UQ=True, num_sim=100, ws_bins: list[float] = [5.0, 6.0, 7.0, 8.0], ws_bin_width=1.0, vane_bin_width=1.0, min_vane_bin_count: int = 100, max_abs_vane_angle=25.0, pitch_thresh=0.5, num_power_bins: int = 25, min_power_filter=0.01, max_power_filter: float | tuple[float, float] = (0.92, 0.98), power_bin_mad_thresh: float | tuple[float, float] = (4.0, 10.0), use_power_coeff: bool = False)[source]#

A method for estimating static yaw misalignment for different wind speed bins for each specified wind turbine as well as the average static yaw misalignment over all wind speed bins using turbine-level SCADA data.

The method consists of the following core steps, which are performed for each specified wind turbine. If UQ is selected, the steps are repeated multiple times using Monte Carlo simulation to produce a distribution of static yaw misalignment estimates from which 95% confidence intervals can be derived:

  1. Timestamps containing power curve outliers are removed. Specifically, pitch angles are limited to a specified threshold to remove timestamps when turbines are operating in or near above-rated conditions, where yaw misalignment has little impact on power performance. Next, to increase the likelihood that power performance deviations are caused by yaw misalignment, a power curve outlier detection filter is used to remove timestamps when the turbine is operating abnormally. If UQ is selected, power curve outlier detection parameters will be chosen randomly for each Monte Carlo iteration.

  2. The filtered SCADA data are divided into the specified wind speed bins based on wind speed measured by the nacelle anemometer. If UQ is selected, the data corresponding to each wind speed bin are randomly resampled with replacement in each Monte Carlo iteration (i.e., bootstrapping).

  3. For each wind speed bin, the power performance is binned by wind vane angle, where power performance can be defined as the raw power or as a normalized power coefficient formed by dividing the raw power by the wind speed cubed.

  4. A cosine exponent curve as a function of wind vane angle is fit to the binned power performance values, where the free parameters are the amplitude, the exponent applied to the cosine, and the wind vane angle offset where the peak of the cosine curve is located.

  5. For each wind speed bin, the static yaw misalignment is estimated as the difference between the wind vane angle where power performance is maximized, based on the wind vane angle offset for the best-fit cosine curve, and the mean wind vane angle.

  6. The overall yaw misalignment is estimated as the average yaw misalignment over all wind speed bins.
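Steps 4 to 6 above can be sketched with NumPy alone. This is a minimal illustration on synthetic, noise-free data, not the library's implementation: the names cos_curve and fit_cos_curve, the candidate offset grid, and the mean vane angle of 0.5 degrees are all assumptions made for the example. The fit here swaps in a grid search over the offset combined with log-linear least squares for the amplitude and exponent; the optimizer used by StaticYawMisalignment may differ.

```python
import numpy as np


def cos_curve(vane_deg, amplitude, exponent, offset_deg):
    """Cosine-exponent model: amplitude * cos(vane - offset) ** exponent."""
    return amplitude * np.cos(np.radians(vane_deg - offset_deg)) ** exponent


def fit_cos_curve(vane_deg, power, offsets_deg):
    """Fit the three free parameters of the cosine-exponent model.

    For each candidate offset, log(power) = log(amplitude) + exponent * log(cos)
    is linear in log(amplitude) and exponent, so those two are solved by least
    squares. The offset with the smallest squared error is kept.
    """
    log_power = np.log(power)
    best = None
    for offset in offsets_deg:
        log_cos = np.log(np.cos(np.radians(vane_deg - offset)))
        design = np.column_stack([np.ones_like(log_cos), log_cos])
        coeffs, *_ = np.linalg.lstsq(design, log_power, rcond=None)
        sse = float(np.sum((design @ coeffs - log_power) ** 2))
        if best is None or sse < best[0]:
            best = (sse, float(np.exp(coeffs[0])), float(coeffs[1]), float(offset))
    return best[1:]  # (amplitude, exponent, offset)


# Synthetic binned power performance for two wind speed bins (hypothetical values).
vane_bins = np.arange(-25.0, 26.0, 1.0)             # vane bin centers (degrees)
candidate_offsets = np.linspace(-10.0, 10.0, 201)   # 0.1 degree search grid
mean_vane_angle = 0.5                                # assumed mean vane angle (degrees)
true_offsets = {5.0: 3.0, 6.0: 4.0}                  # ws bin center -> true curve offset

misalignments = {}
for ws_bin, true_offset in true_offsets.items():
    binned_power = cos_curve(vane_bins, 1000.0, 2.0, true_offset)
    _, _, fitted_offset = fit_cos_curve(vane_bins, binned_power, candidate_offsets)
    # Step 5: vane angle of maximum power minus the mean vane angle.
    misalignments[ws_bin] = fitted_offset - mean_vane_angle

# Step 6: average over all wind speed bins.
overall_misalignment = float(np.mean(list(misalignments.values())))
```

With real data the binned power is noisy, so a nonlinear least-squares routine fitted directly in power space is likely a better choice than the log transform used here.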

Warning

This is a relatively simple method that has not yet been validated using data from wind turbines with known static yaw misalignments. Therefore, the results should be treated with caution. One known issue is that the method currently relies on nacelle wind speed measurements to determine the power performance as a function of wind vane angle. If the measured wind speed is affected by the amount of yaw misalignment, potential biases can exist in the estimated static yaw misalignment values.

Parameters:
  • plant (PlantData) – An openoa.plant.PlantData object that has been validated with at least openoa.plant.PlantData.analysis_type = “StaticYawMisalignment”.

  • turbine_ids (list, optional) – List of turbine IDs for which static yaw misalignment detection will be performed. If None, all turbines will be analyzed. Defaults to None.

  • UQ (bool, optional) – Determines whether to perform uncertainty quantification using Monte Carlo simulation (True) or provide a single yaw misalignment estimate (False). Defaults to True.

  • num_sim (int, optional) – Number of Monte Carlo iterations to perform. Only used if UQ = True. Defaults to 100.

  • ws_bins (list[float], optional) – Wind speed bin centers for which yaw misalignment detection will be performed (m/s). Defaults to [5.0, 6.0, 7.0, 8.0].

  • ws_bin_width (float, optional) – Wind speed bin size to use when detecting yaw misalignment for individual wind speed bins (m/s). Defaults to 1 m/s.

  • vane_bin_width (float, optional) – Wind vane bin size to use when detecting yaw misalignment (degrees). Defaults to 1 degree.

  • min_vane_bin_count (int, optional) – Minimum number of data points needed in a wind vane bin for it to be included when detecting yaw misalignment. Defaults to 100.

  • max_abs_vane_angle (float, optional) – Maximum absolute wind vane angle considered when detecting yaw misalignment. Defaults to 25 degrees.

  • pitch_thresh (float, optional) – Maximum blade pitch angle considered when detecting yaw misalignment. Defaults to 0.5 degrees.

  • num_power_bins (int, optional) – Number of power bins to use for power curve bin filtering to remove outlier data points. Defaults to 25.

  • min_power_filter (float, optional) – Minimum power threshold, defined as a fraction of rated power, to which the power curve bin filter should be applied. Defaults to 0.01.

  • max_power_filter (tuple | float, optional) – Maximum power threshold, defined as a fraction of rated power, to which the power curve bin filter should be applied. This should be a tuple when UQ = True (values are Monte-Carlo sampled within the specified range) or a single value when UQ = False. If undefined (None), a value of 0.95 will be used if UQ = False and values of (0.92, 0.98) will be used if UQ = True. Defaults to (0.92, 0.98).

  • power_bin_mad_thresh (tuple | float, optional) – The filter threshold for each power bin used to identify abnormal operation, expressed as the number of median absolute deviations from the median wind speed. This should be a tuple when UQ = True (values are Monte-Carlo sampled within the specified range) or a single value when UQ = False. If undefined (None), a value of 7.0 will be used if UQ = False and values of (4.0, 13.0) will be used if UQ = True. Defaults to None.

  • use_power_coeff (bool, optional) – If True, power performance as a function of wind vane angle will be quantified by normalizing power by the cube of the wind speed, approximating the power coefficient. If False, only power will be used. Defaults to False.

Method generated by attrs for class StaticYawMisalignment.
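The power curve bin filter controlled by num_power_bins, min_power_filter, max_power_filter, and power_bin_mad_thresh can be illustrated with a small NumPy sketch. It assumes equal-width power bins between the minimum and maximum power fractions and flags samples whose wind speed lies more than the given number of median absolute deviations from the bin's median wind speed. The function name mad_bin_filter and the sample data are hypothetical, and the library's own filter may treat bin edges and empty bins differently.

```python
import numpy as np


def mad_bin_filter(power, wind_speed, rated_power, num_power_bins=25,
                   min_power_frac=0.01, max_power_frac=0.95, mad_thresh=7.0):
    """Return a boolean mask that is True for samples flagged as outliers."""
    power = np.asarray(power, dtype=float)
    wind_speed = np.asarray(wind_speed, dtype=float)
    flagged = np.zeros(power.shape, dtype=bool)

    # Equal-width power bins between the min and max fractions of rated power.
    edges = np.linspace(min_power_frac * rated_power,
                        max_power_frac * rated_power,
                        num_power_bins + 1)
    for low, high in zip(edges[:-1], edges[1:]):
        in_bin = (power >= low) & (power < high)
        if not in_bin.any():
            continue
        median_ws = np.median(wind_speed[in_bin])
        mad = np.median(np.abs(wind_speed[in_bin] - median_ws))
        # Flag samples too many MADs away from the bin's median wind speed.
        flagged |= in_bin & (np.abs(wind_speed - median_ws) > mad_thresh * mad)
    return flagged


# Ten samples in one power bin; the last wind speed is far from the others.
sample_power = np.full(10, 500.0)
sample_ws = np.array([7.0, 7.1, 6.9, 7.2, 6.8, 7.0, 7.1, 6.9, 7.0, 12.0])
outliers = mad_bin_filter(sample_power, sample_ws, rated_power=1000.0)
```

In this example the bin median is 7.0 m/s and the MAD is about 0.1 m/s, so with the threshold of 7.0 only the 12.0 m/s sample is flagged.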

run(num_sim: int | None = None, ws_bins: list[float] | None = None, ws_bin_width: float | None = None, vane_bin_width: float | None = None, min_vane_bin_count: int | None = None, max_abs_vane_angle: float | None = None, pitch_thresh: float | None = None, num_power_bins: int | None = None, min_power_filter: float | None = None, max_power_filter: float | None = None, power_bin_mad_thresh: float | None = None, use_power_coeff: bool | None = None)[source]#

Estimates static yaw misalignment for each wind speed bin for each specified wind turbine. After performing power curve filtering to remove timestamps when pitch angle is above a threshold or the turbine is operating abnormally, best-fit cosine curves are found for binned power performance vs. wind vane angle for each wind speed bin and turbine. The difference between the wind vane angle where power is maximized, based on the best-fit cosine curve, and the mean wind vane angle is treated as the static yaw misalignment. If UQ is True, Monte Carlo simulations will be performed to produce a distribution of yaw misalignment values from which 95% confidence intervals can be determined.
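The Monte Carlo step described above can be illustrated by a short bootstrapping sketch. It is a generic example under stated assumptions, not the code run() executes: the helper bootstrap_ci is hypothetical, the statistic is a simple mean of synthetic vane angles rather than a fitted yaw misalignment, and the 95% confidence interval is assumed to be a percentile interval over the simulated estimates.

```python
import numpy as np


def bootstrap_ci(samples, statistic, num_sim=100, seed=0):
    """Resample with replacement num_sim times and return the estimates
    together with a 95% percentile interval."""
    rng = np.random.default_rng(seed)
    samples = np.asarray(samples, dtype=float)
    n = samples.shape[0]
    estimates = np.empty(num_sim)
    for i in range(num_sim):
        # Draw n indices with replacement (one bootstrap replicate).
        resampled = samples[rng.integers(0, n, size=n)]
        estimates[i] = statistic(resampled)
    lower, upper = np.percentile(estimates, [2.5, 97.5])
    return estimates, (float(lower), float(upper))


# Synthetic wind vane angles (degrees); values are illustrative only.
data_rng = np.random.default_rng(42)
vane_angles = data_rng.normal(loc=2.0, scale=5.0, size=500)
estimates, ci = bootstrap_ci(vane_angles, np.mean, num_sim=100)
```

In the method itself each iteration would also draw new outlier filter parameters and repeat the curve fit, so the spread of the estimates reflects both sampling and filtering uncertainty.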

Parameters:
  • num_sim (int, optional) – Number of Monte Carlo iterations to perform. Only used if UQ = True. Defaults to 100.

  • ws_bins (list[float], optional) – Wind speed bin centers for which yaw misalignment detection will be performed (m/s). Defaults to [5.0, 6.0, 7.0, 8.0].

  • ws_bin_width (float, optional) – Wind speed bin size to use when detecting yaw misalignment for individual wind speed bins (m/s). Defaults to 1 m/s.

  • vane_bin_width (float, optional) – Wind vane bin size to use when detecting yaw misalignment (degrees). Defaults to 1 degree.

  • min_vane_bin_count (int, optional) – Minimum number of data points needed in a wind vane bin for it to be included when detecting yaw misalignment. Defaults to 100.

  • max_abs_vane_angle (float, optional) – Maximum absolute wind vane angle considered when detecting yaw misalignment. Defaults to 25 degrees.

  • pitch_thresh (float, optional) – Maximum blade pitch angle considered when detecting yaw misalignment. Defaults to 0.5 degrees.

  • num_power_bins (int, optional) – Number of power bins to use for power curve bin filtering to remove outlier data points. Defaults to 25.

  • min_power_filter (float, optional) – Minimum power threshold, defined as a fraction of rated power, to which the power curve bin filter should be applied. Defaults to 0.01.

  • max_power_filter (tuple | float, optional) – Maximum power threshold, defined as a fraction of rated power, to which the power curve bin filter should be applied. This should be a tuple when UQ = True (values are Monte-Carlo sampled within the specified range) or a single value when UQ = False. If undefined (None), a value of 0.95 will be used if UQ = False and values of (0.92, 0.98) will be used if UQ = True. Defaults to None.

  • power_bin_mad_thresh (tuple | float, optional) – The filter threshold for each power bin used to identify abnormal operation, expressed as the number of median absolute deviations from the median wind speed. This should be a tuple when UQ = True (values are Monte-Carlo sampled within the specified range) or a single value when UQ = False. If undefined (None), a value of 7.0 will be used if UQ = False and values of (4.0, 13.0) will be used if UQ = True. Defaults to None.

  • use_power_coeff (bool, optional) – If True, power performance as a function of wind vane angle will be quantified by normalizing power by the cube of the wind speed, approximating the power coefficient. If False, only power will be used. Defaults to False.
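How the binning parameters above interact can be sketched for a single wind speed bin. This is an illustration under assumptions, not the library's code: the function bin_power_by_vane is hypothetical, samples are assigned to the nearest vane bin center by rounding, and the wind speed bin is taken as the center plus or minus half the bin width.

```python
import numpy as np


def bin_power_by_vane(wind_speed, vane_angle, power, ws_center,
                      ws_bin_width=1.0, vane_bin_width=1.0,
                      min_vane_bin_count=100, max_abs_vane_angle=25.0,
                      use_power_coeff=False):
    """Return (vane_bin_centers, mean_performance) for one wind speed bin."""
    wind_speed = np.asarray(wind_speed, dtype=float)
    vane_angle = np.asarray(vane_angle, dtype=float)
    power = np.asarray(power, dtype=float)

    # Keep samples inside the wind speed bin and the allowed vane angle range.
    keep = (np.abs(wind_speed - ws_center) <= ws_bin_width / 2) & (
        np.abs(vane_angle) <= max_abs_vane_angle
    )
    if use_power_coeff:
        # Approximate power coefficient: power divided by wind speed cubed.
        performance = power[keep] / wind_speed[keep] ** 3
    else:
        performance = power[keep]

    # Nearest vane bin center; adding 0.0 turns any -0.0 into 0.0.
    vane_bin = np.round(vane_angle[keep] / vane_bin_width) * vane_bin_width + 0.0

    centers, means = [], []
    for center in np.unique(vane_bin):
        members = performance[vane_bin == center]
        # Drop sparsely populated vane bins.
        if members.size >= min_vane_bin_count:
            centers.append(float(center))
            means.append(float(members.mean()))
    return np.array(centers), np.array(means)


# Tiny hypothetical data set; min_vane_bin_count is lowered to 2 to suit it.
ws = np.array([5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 9.0, 9.0])
vane = np.array([0.2, -0.3, 0.4, 2.1, 1.9, 30.0, 0.0, 0.0])
pwr = np.array([100.0, 110.0, 120.0, 90.0, 95.0, 50.0, 999.0, 999.0])
centers, means = bin_power_by_vane(ws, vane, pwr, ws_center=5.0,
                                   min_vane_bin_count=2)
```

Here the sample at 30 degrees is excluded by max_abs_vane_angle and the two 9 m/s samples fall outside the 5 m/s bin, leaving vane bins centered at 0 and 2 degrees.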

plot_yaw_misalignment_by_turbine(turbine_ids: list[str] | None = None, xlim: tuple[float, float] = (None, None), ylim: tuple[float, float] = (None, None), return_fig: bool = False, figure_kwargs: dict | None = None, plot_kwargs_curve: dict = {}, plot_kwargs_line: dict = {}, plot_kwargs_fill: dict = {}, legend_kwargs: dict = {})[source]#

Plots power performance vs. wind vane angle along with the best-fit cosine curve for each wind speed bin for each turbine specified. The mean wind vane angle and the wind vane angle where power performance is maximized are shown for each wind speed bin. Additionally, the yaw misalignments for each wind speed bin as well as the mean yaw misalignment averaged over all wind speed bins are listed. If UQ is True, 95% confidence intervals will be plotted for the binned power performance values and listed for the yaw misalignment estimates.

Parameters:
  • turbine_ids (list[str], optional) – Names of the turbines for which yaw misalignment data are plotted. If None, plots for all turbines for which yaw misalignment detection was performed will be generated. Defaults to None.

  • xlim (tuple[float, float], optional) – A tuple of floats representing the x-axis wind vane angle plotting display limits (degrees). Defaults to (None, None).

  • ylim (tuple[float, float], optional) – A tuple of the y-axis plotting display limits for the power performance vs. wind vane plots. Defaults to (None, None).

  • return_fig (bool, optional) – Flag to return the figure and axes objects. Defaults to False.

  • figure_kwargs (dict, optional) – Additional figure instantiation keyword arguments that are passed to plt.figure(). Defaults to None.

  • plot_kwargs_curve (dict, optional) – Additional plotting keyword arguments that are passed to ax.plot() for plotting lines for the power performance vs. wind vane plots. Defaults to {}.

  • plot_kwargs_line (dict, optional) – Additional plotting keyword arguments that are passed to ax.plot() for plotting vertical lines indicating mean vane angle and vane angle where power is maximized. Defaults to {}.

  • plot_kwargs_fill (dict, optional) – If UQ is True, additional plotting keyword arguments that are passed to ax.fill_between() for plotting shading regions for 95% confidence intervals for power performance vs. wind vane. Defaults to {}.

  • legend_kwargs (dict, optional) – Additional legend keyword arguments that are passed to ax.legend() for the power performance vs. wind vane plots. Defaults to {}.

Returns:

If return_fig is True, then a dictionary containing the figure and axes object(s) corresponding to the yaw misalignment plots for each turbine is returned for further tinkering/saving. The turbine names in the turbine_ids argument are the dictionary keys.

Return type:

None | dict of tuple[matplotlib.pyplot.Figure, matplotlib.pyplot.Axes]