Pipeline

Read in Data

Data is read in from a .csv file and marked into daily increments.

What Data Should Look Like

The imported Data should have columns reflecting results from the ddiv package IVFeatures command run on individual IV curves, Voc and Isc readings from any source (if possible), a string containing the IV curve for each timestamp, poay and temperature readings for the correct timestamp, and a timestamp. All column names should be in an explicit convention.

Use the ddiv package’s command IVFeatures to extract features from individual IV curves. See package documentation for ddiv on how to use this function (note that ddiv requires the segmented package, among others). Rename columns according to the standards below. Note that ddiv returns a list, so it’s a good idea to pass it’s results to data.frame

If your IV Tracer returns Voc and Isc readings, be sure to include them with column names ‘vocc’ and ‘ishc’. If not, rename the ‘Voc’ and ‘Isc’ from the ddiv results to ‘vocc’ and ‘ishc’.

To obtain the string containing an IV curve, use the df2chr function included in this package. Make sure your IV curves are in a dataframe containing voltage and current columns (in that order). it doesn’t matter what the columns are named.

Be sure to add the POA and module temperature (in celsius) data as well. If you have GHI instead of poay, convert it first to POA. This can be done using the Python package PVLib, using latitude, longitude, and elevation.

The timestamp in each column must match the IV curve with it.

Use the following variable names in order to ensure th package processes your data correctly:

Deprecated Name	HBase Name	Unit/Standard
Fill Factor	ffff	%
Open Circuit Voltage	vocc	V
Short Circuit Current	ishc	A
Max Power Voltage	vmpp	V
Max Power Current	impp	A
Max Power Point	pmpp	W

These are the key variable names – look at the example data below to see other names. The extracted features obtained with ddiv should have the correct names when they are extracted.

It is recommended that the period (the third parameter below) be chosen so that each psuedo-IV curve has 300 points or more. In order to achieve, this make sure there are at least 300 Isc and Voc pairs in each period. 10 minute intervals of collection is generally good for 7 days.

# below is the function used to read in our csv
# df_wbw <- read_df_raw_from_csv('../../data/sa42589-albsf_full_ddiv.csv', 0, 7)

colnames(df_wbw)
#>  [1] "tmst"     "vocc"     "ghir"     "ivdf"     "imxp"     "pmpp"    
#>  [7] "ishc"     "modt"     "vmxp"     "ffff"     "poay"     "eisc"    
#> [13] "ersh"     "evoc"     "exrs"     "epmp"     "eimp"     "evmp"    
#> [19] "exff"     "day"      "n_period" "date"

The POA column is converted from the measured ghir.

Extracting I-V features with ddiv

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(purrr)
library(magrittr)
#> 
#> Attaching package: 'magrittr'
#> The following object is masked from 'package:purrr':
#> 
#>     set_names
library(ddiv)
str_ddiv <- function(iv_str, pb) {
  iv <- char_to_df(iv_str)
  # initialize result for when IVfeature encounters an error
  err_df <- data.frame("Isc" = NA, "Rsh" = NA,
                       "Voc" = NA, "Rs" = NA,
                       "Pmp" = NA,"Imp" = NA,
                       "Vmp" = NA, "FF" = NA)

  # calculate ddiv with error trapping
  res <- df <- tryCatch(
    as.data.frame(IVfeature(I = iv$I, V = iv$V, num = 200, crt = 0.1, crtvalb = 0.2)),
    error = function(e) err_df
  )

  pb$tick()$print()
  return(res)
}

batch_ddiv <- function(df) {
  iv_list <- df$ivdf 
  pb <- dplyr::progress_estimated(length(iv_list))
  ddiv_df <- iv_list %>% map_dfr(~str_ddiv(iv_str = ., pb))
  res <- cbind(df$tmst, ddiv_df)
  return(res)
}

# it is recommended to test on a small number of iv curves
# to tweak the input parameters of ddiv::IVfeature function
test <- sample_n(df_wbw, 5)
ddiv_test <- batch_ddiv(test)
colnames(ddiv_test) <- c("tmst", "eisc", "ersh", "evoc", "exrs", "epmp", "eimp", "evmp", "exff" )

# bind ddiv result with original dataframe

test_res <- full_join(test, ddiv_test)
#> Joining, by = c("tmst", "eisc", "ersh", "evoc", "exrs", "epmp", "eimp", "evmp", "exff")

colnames(test_res)
#>  [1] "tmst"     "vocc"     "ghir"     "ivdf"     "imxp"     "pmpp"    
#>  [7] "ishc"     "modt"     "vmxp"     "ffff"     "poay"     "eisc"    
#> [13] "ersh"     "evoc"     "exrs"     "epmp"     "eimp"     "evmp"    
#> [19] "exff"     "day"      "n_period" "date"

Data from a narrow range of temperature and irradiance values are needed at one point for calculating various values. Create it now.


T_corr <- median_temp(df_wbw)

T_corr
#> [1] 30

Lastly, the power_loss_bat function, which deals with power loss modes, requires a subset of the larger dataframe. Use select_init_df to do this.


df_init <- select_init_df(df_wbw, days = 21)

colnames(df_init)
#>  [1] "tmst"     "vocc"     "ghir"     "ivdf"     "imxp"     "pmpp"    
#>  [7] "ishc"     "modt"     "vmxp"     "ffff"     "poay"     "eisc"    
#> [13] "ersh"     "evoc"     "exrs"     "epmp"     "eimp"     "evmp"    
#> [19] "exff"     "day"      "n_period" "date"

Psuedo-IV Curve Generation

This functionality is handled by a master function which calls a variety of subfunctions. IVXbyX creates the Psuedo-IV Curves.


df_full <- IVXbyX(df_wbw, corr_temp = "median", 4)

colnames(df_full)
#>  [1] "date"      "day"       "n_period"  "piv"       "pisc"      "pvoc"     
#>  [7] "prsh"      "prss"      "ppmp"      "pimp"      "pvmp"      "pfff"     
#> [13] "isc_1sun"  "voc_1sun"  "n"         "imp_fit"   "vmp_fit"   "pmp_fit"  
#> [19] "rs_fit"    "voc_adjR2" "voc_rmse"  "vmp_adjR2" "vmp_rmse"  "rs_adjR2" 
#> [25] "rs_rmse"

Power Loss Modes

res <- power_loss_phys_bat(df_full, df_init, corr_T = T_corr, N_c = 4);

# Generate Table
knitr::kable(res[1:5, ], caption = "Power Loss Modes")

Power Loss Modes
date	uni_I	rec	rs	I_mis
2019-03-01	8.971533	-2.6901230	-10.97729	-3.452138
2019-03-02	-15.278475	-1.7287541	-12.85827	9.204685
2019-03-09	1.617194	-0.5502143	-10.41154	-2.755782
2019-03-16	9.203099	-0.9165730	-15.22506	1.102804
2019-03-23	7.747943	0.2120270	-13.54368	-1.263013

Visualization of power loss modes

library(ggplot2)
ggplot(data = res, aes(x = date)) +
  geom_point(aes(y = uni_I, color = "Uniform current", shape = "Uniform current"), size = 2) +
  geom_point(aes(y = rec, color = "Recombination", shape = "Recombination"), size = 2) +
  geom_point(aes(y = rs, color = "Rs loss", shape = "Rs loss"),  size = 2) +
  geom_point(aes(y = I_mis, color = "I mismatch", shape = "I mismatch"), size = 2) +
  ylab(expression(paste(Delta, "Power (W)"))) +
  xlab("Date") +
  theme_bw() +
  theme(axis.text = element_text(size = 12),
        axis.title = element_text(size = 12),
        legend.text = element_text(size = 10),
        legend.position = "top") +
  scale_shape_manual(values = c(4, 8, 17, 16),
                     name = "Power loss \n mode") +
  scale_colour_discrete(name = "Power loss \n mode")