LearningRlab

There are three families of functions in LearningRlab:

  1. Main functions: these functions return the result of the calculation that the function represents.

  2. Explained functions: these functions return the result together with a step-by-step explanation of how it is obtained.

  3. User interactive functions: these functions interact with the user step by step, guiding them through the calculation that the function represents.
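
For example, the arithmetic mean is covered by one function in each family. The sketch below only calls the two functions whose usage is shown later in this document; interactive.mean is named in a comment because it starts a guided session driven by user input:

library(LearningRlab)

x <- c(1, 2, 3, 4)
mean_(x)           # main function: returns the value
explain.mean(x)    # explained function: prints the calculation step by step
# interactive.mean   # user interactive function: guides the user through the calculation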

Main Functions:

To illustrate the use of each function, we define a dataset to work with:

data <- c(1,1,2,3,4,7,8,8,8,10,10,11,12,15,20,22,25)
plot(data)

The arithmetic mean function:

mean_(data)
#> [1] 9.823529

The geometric mean function:

geometricMean_(data)
#> [1] 6.911414

The mode function:

mode_(data)
#> Factor appears  3  times in the vector.
#> Unique mode
#> [1] 8

The median function:

median_(data)
#> 
#> Sorted vector: 1 1 2 3 4 7 8 8 8 10 10 11 12 15 20 22 25
#> [1] 8

The standard deviation function:

standardDeviation_(data)
#> [1] 6.989364

The average absolute deviation function:

averageDeviation_(data)
#> [1] 5.460208

The variance function:

variance_(data)
#> [1] 48.85121

The quartiles function:

quartile_(data)
#>    Q1    Q2    Q3 
#>  4.25  8.50 12.75

The percentile function:

percentile_(data)
#> Percentile[ 10 ] =  1 
#> Percentile[ 20 ] =  3 
#> Percentile[ 30 ] =  7 
#> Percentile[ 40 ] =  8 
#> Percentile[ 50 ] =  8 
#> Percentile[ 60 ] =  10 
#> Percentile[ 70 ] =  11 
#> Percentile[ 80 ] =  15 
#> Percentile[ 90 ] =  22 
#> Percentile[ 100 ] =  25

The absolute frequency function:

frecuency_abs(data,1)
#> [1] 2

The relative frequency function:

frecuency_relative(data,20)
#> [1] 0.05882353

The absolute cumulative frequency function:

frecuency_absolute_acum(data,1)
#> [1] 2

The relative cumulative frequency function:

frecuency_relative_acum(data,20)
#> [1] 0.8823529

Explained Functions:

For each main function, there is an explained function that shows the calculation process:

explain.mean(data)
#> 
#> __MEAN CALCULUS__ 
#> 
#> Formula -> (x1 + x2 +..+xn) / num_elements
#> 
#> The mean of a dataset is calculated by adding each element of the dataset and dividing the result by the number of elements in the dataset. We'll give the user an example for better comprension.
#> 
#> __Use Example__
#> 
#> First of all, we need to know the content of the dataset/vector of numbers
#> 
#> The content of the vector is: 1 ,1 ,2 ,3 ,4 ,7 ,8 ,8 ,8 ,10 ,10 ,11 ,12 ,15 ,20 ,22 ,25
#> Now we need to add each element of the vector/dataset
#> The sum of the elements is:  167 
#> 
#> Next step, get the number of elements that we've examined
#> The length of the vector is  17 elements
#> 
#> Formula applied ->  167 / 17  =  9.82352941176471
#> Now try by your own! :D
#> 
#> Use interactive.mean function to practice.
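# A quick cross-check in base R (illustrative, not part of LearningRlab):
# the mean is the sum of the elements divided by the number of elements.
sum(data) / length(data)   # 9.823529, matching mean_(data)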
explain.geometricMean(data)
#> 
#> __GEOMETRIC MEAN CALCULUS__ 
#> 
#> Formula -> (x1 * x2 *..* xn)^( 1 / num_elements
#> 
#> The geometric mean of a dataset is calculated by multiplying each element of the dataset and raising the result to 1 divided by the number of elements in the dataset. We'll give the user an example for better comprension.
#> 
#> __Use Example__
#> 
#> First of all, we need to know the content of the dataset/vector of numbers
#> 
#> The content of the vector is: 1 ,1 ,2 ,3 ,4 ,7 ,8 ,8 ,8 ,10 ,10 ,11 ,12 ,15 ,20 ,22 ,25
#> Now we need to multiply each element of the vector/dataset
#> The product of the elements is:  1.87342848e+14 
#> 
#> Next step, get the number of elements that we've examined
#> The length of the vector is  17 elements
#> 
#> Formula applied -> ( 1.87342848e+14 ) ^ ( 1 / 17 ) =  6.91141369632174
#> Now try by your own! :D
#> 
#> Use interactive.geometricMean function to practice.
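# Base-R cross-check (illustrative): multiply all the elements and raise the
# product to 1 divided by the number of elements.
prod(data)^(1 / length(data))   # 6.911414, matching geometricMean_(data)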
explain.mode(data)
#> 
#> __MODE CALCULUS__ 
#> 
#> Formula -> Most repeated value of [Data]
#> 
#> The mode of a dataset is calculated by looking for the most repeated value in the dataset. If in a group there are two or several scores with the same frequency and that frequency is the maximum, the distribution is bimodal or multimodal, that is, it has several modes.
#> 
#> __Use Example__
#> 
#> First step : search the most repeated value
#> 
#> The content of the vector is: 1 ,1 ,2 ,3 ,4 ,7 ,8 ,8 ,8 ,10 ,10 ,11 ,12 ,15 ,20 ,22 ,25
#> Factor  8  appears  3  times in the vector.
#> 
#> Second step : check the dataset looking for a value with the same maximum frequency
#> 
#> If there are only 1 unique most repeated value, it is the mode.
#> If there are 2 values repeated with the same maximum frequency each value represents the mode. Bimodal dataset
#> If there are more than 2 values repeated with the same maximum frequency, it is a Multimodal dataset
#> 
#> Now try by your own! :D
#> 
#> Use interactive.mode function to practice.
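# Base-R cross-check (illustrative): tabulate the values and keep those whose
# frequency equals the maximum; one value means a unique mode, two or more
# values mean a bimodal or multimodal dataset.
counts <- table(data)
as.numeric(names(counts[counts == max(counts)]))   # 8, matching mode_(data)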
explain.median(data)
#> 
#> __MEDIAN CALCULUS__ 
#> 
#> Formula -> 1/2(n+1) where n -> vector size
#> 
#> The median of a dataset is the value in the middle of the sorted data. It's important to know that the data must be sorted. If the dataset has a pair number of elements, we should select both in the middle to add each other and get divided by two. If the dataset has a no pair number of elements, we should select the one in the middle.
#> 
#> __Use Example__
#> 
#> First step : identify if the vector has a pair number of elements
#> 
#> The content of the vector is: 1 ,1 ,2 ,3 ,4 ,7 ,8 ,8 ,8 ,10 ,10 ,11 ,12 ,15 ,20 ,22 ,25
#> 
#> Second step: depending of the number of elements
#> 
#> It has a ODD number of elements ( 17 )
#> 
#> We take the 'n/2' approaching up element
#> The result is :  8
#> Now try by your own! :D
#> 
#> Use interactive.median function to practice.
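# Base-R cross-check (illustrative): with an odd number of elements, the
# median is the middle value of the sorted vector.
sort(data)[(length(data) + 1) / 2]   # 8, matching median_(data)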
explain.standardDeviation(data)
#> 
#> __STANDARD DEVIATION CALCULUS__ 
#> 
#> Formula ->  square_root ((Summation(each_element - mean)^2) / num_elements)
#> 
#> The standard deviation of a dataset is calculated by adding the square of the diference between each element and the mean of the dataset. This sum will be dividing by the number of elements in the dataset and finally making the square root on the result. We'll give the user an example for better comprension.
#> 
#> __Use Example__
#> 
#> First of all, we need to know the content of the dataset/vector of numbers
#> 
#> The content of the vector is: 1 ,1 ,2 ,3 ,4 ,7 ,8 ,8 ,8 ,10 ,10 ,11 ,12 ,15 ,20 ,22 ,25
#> The mean of dataset is... 9.82352941176471
#> The square of the diference between each number and the mean of dataset is: 77.85467 ,77.85467 ,61.20761 ,46.56055 ,33.91349 ,7.972318 ,3.32526 ,3.32526 ,3.32526 ,0.03114187 ,0.03114187 ,1.384083 ,4.737024 ,26.79585 ,103.5606 ,148.2664 ,230.3253
#> Now we need to add each element of the vector/dataset
#> The sum of the squares is:  830.470588235294 
#> 
#> Next step, get the number of elements that we've examined
#> The length of the vector is  17 elements
#> 
#> Formula applied -> ( 830.4706 / 17 ) ^ (1/2) =  6.98936413936664
#> Now try by your own! :D
#> 
#> Use interactive.standardDeviation function to practice.
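# Base-R cross-check (illustrative): this is the population formula (dividing
# by n); note that base R's sd() divides by n - 1, so it gives a different value.
sqrt(sum((data - mean(data))^2) / length(data))   # 6.989364, matching standardDeviation_(data)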
explain.averageDeviation(data)
#> 
#> __AVERAGE DEVIATION CALCULUS__ 
#> 
#> Formula ->  (Summation(abs(each_element - mean))) / num_elements
#> 
#> The average deviation of a dataset is calculated by adding the absolute value of the diference between each element and the mean of the dataset. This sum will be dividing by the number of elements in the dataset. We'll give the user an example for better comprension.
#> 
#> __Use Example__
#> 
#> First of all, we need to know the content of the dataset/vector of numbers
#> 
#> The content of the vector is: 1 ,1 ,2 ,3 ,4 ,7 ,8 ,8 ,8 ,10 ,10 ,11 ,12 ,15 ,20 ,22 ,25
#> The mean of dataset is... 9.82352941176471
#> The absolute value of the diference between each number and the mean of dataset is: 8.823529 ,8.823529 ,7.823529 ,6.823529 ,5.823529 ,2.823529 ,1.823529 ,1.823529 ,1.823529 ,0.1764706 ,0.1764706 ,1.176471 ,2.176471 ,5.176471 ,10.17647 ,12.17647 ,15.17647
#> Now we need to add each element of the vector/dataset
#> The sum of the squares is:  92.8235294117647 
#> 
#> Next step, get the number of elements that we've examined
#> The length of the vector is  17 elements
#> 
#> Formula applied ->  92.82353 / 17  =  5.46020761245675
#> Now try by your own! :D
#> 
#> Use interactive.averageDeviation function to practice.
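# Base-R cross-check (illustrative): mean of the absolute deviations from the mean.
sum(abs(data - mean(data))) / length(data)   # 5.460208, matching averageDeviation_(data)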
explain.variance(data)
#> 
#> __VARIANCE CALCULUS__ 
#> 
#> Formula ->  (Summation(each_element - mean)^2) / num_elements
#> 
#> The variance of a dataset is calculated by adding the square of the diference between each element and the mean of the dataset. This sum will be dividing by the number of elements in the dataset. We'll give the user an example for better comprension.
#> 
#> __Use Example__
#> 
#> First of all, we need to know the content of the dataset/vector of numbers
#> 
#> The content of the vector is: 1 ,1 ,2 ,3 ,4 ,7 ,8 ,8 ,8 ,10 ,10 ,11 ,12 ,15 ,20 ,22 ,25
#> The mean of dataset is... 9.82352941176471
#> The square of the diference between each number and the mean of dataset is: 77.85467 ,77.85467 ,61.20761 ,46.56055 ,33.91349 ,7.972318 ,3.32526 ,3.32526 ,3.32526 ,0.03114187 ,0.03114187 ,1.384083 ,4.737024 ,26.79585 ,103.5606 ,148.2664 ,230.3253
#> Now we need to add each element of the vector/dataset
#> The sum of the squares is:  830.470588235294 
#> 
#> Next step, get the number of elements that we've examined
#> The length of the vector is  17 elements
#> 
#> Formula applied ->  830.4706 / 17  =  48.8512110726644
#> Now try by your own! :D
#> 
#> Use interactive.variance function to practice.
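# Base-R cross-check (illustrative): population variance (dividing by n); note
# that base R's var() divides by n - 1, so it gives a different value.
sum((data - mean(data))^2) / length(data)   # 48.85121, matching variance_(data)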
explain.quartile(data)
#> 
#> __QUARTILES CALCULUS__ 
#> 
#> Formula -> (k * N ) / 4 where k -> 1,2,3 and N -> vector size
#> 
#> The quartile divides the dataset in 4 parts as equal as possible.
#> 
#> __Use Example__
#> 
#> Step 1: The vector must be sorted.
#> 1 ,1 ,2 ,3 ,4 ,7 ,8 ,8 ,8 ,10 ,10 ,11 ,12 ,15 ,20 ,22 ,25
#> 
#> Step 2: Apply the formula (k * N) / 4 where 'k' is [1-3]
#> 
#> Q1 -> (1 *  17 ) / 4 =  4.25
#> Q2 -> (2 *  17 ) / 4 =  8.5
#> Q3 -> (3 *  17 ) / 4 =  12.75
#> 
#> Visualization with colors:
#> 1 ,1 ,2 ,3 ,4 ,7 ,8 ,8 ,8 ,10 ,10 ,11 ,12 ,15 ,20 ,22 ,25 
#> 
#> Q1 ->  4.25 || Q2 ->  8.5 || Q3 ->  12.75 || Q4 -> onwards
#> Now try by your own! :D
#> 
#> Use interactive.quartile function to practice.
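# Base-R cross-check (illustrative): the formula above returns the positions
# (k * N) / 4 themselves, so it will not coincide with base R's quantile(),
# which interpolates between data values.
(1:3) * length(data) / 4   # 4.25 8.50 12.75, matching quartile_(data)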
explain.percentile(data)
#> 
#> __PERCENTILES CALCULUS__ 
#> 
#> Formula -> (k * N ) / 100 where k -> [1-100] and N -> vector size
#> 
#> The percentile divides the dataset in 100 parts.
#> The percentile indicates, once the data is ordered from least to greatest, the value of the variable below which a given percentage is located on the data
#> 
#> __Use Example__
#> 
#> Step 1: The vector must be sorted.
#> 1 ,1 ,2 ,3 ,4 ,7 ,8 ,8 ,8 ,10 ,10 ,11 ,12 ,15 ,20 ,22 ,25
#> 
#> Step 2: Apply the formula (k * N) / 100 where 'k' is [1-100]
#> 
#> We will calculate the percentiles 1,25,37,50,92 in this example
#> 
#> Percentile 1 -> (1 *  17 ) / 100 =  0.17 
#>  .Round up the value to locate it in the vector ->  0.17  ~  1 
#>  ..In our data, the value is = 1 ,1 ,2 ,3 ,4 ,7 ,8 ,8 ,8 ,10 ,10 ,11 ,12 ,15 ,20 ,22 ,25
#> 
#> Percentile 25 -> (25 *  17 ) / 100 =  4.25 
#>  .Round up the value to locate it in the vector ->  4.25  ~  5 
#>  ..In our data, the value is = 1 ,1 ,2 ,3 ,4 ,7 ,8 ,8 ,8 ,10 ,10 ,11 ,12 ,15 ,20 ,22 ,25
#> 
#> Percentile 37 -> (37 *  17 ) / 100 =  6.29 
#>  .Round up the value to locate it in the vector ->  6.29  ~  7 
#>  ..In our data, the value is = 1 ,1 ,2 ,3 ,4 ,7 ,8 ,8 ,8 ,10 ,10 ,11 ,12 ,15 ,20 ,22 ,25
#> 
#> Percentile 50 -> (50 *  17 ) / 100 =  8.5 
#>  .Round up the value to locate it in the vector ->  8.5  ~  9 
#>  ..In our data, the value is = 1 ,1 ,2 ,3 ,4 ,7 ,8 ,8 ,8 ,10 ,10 ,11 ,12 ,15 ,20 ,22 ,25
#> 
#> Percentile 92 -> (92 *  17 ) / 100 =  15.64 
#>  .Round up the value to locate it in the vector ->  15.64  ~  16 
#>  ..In our data, the value is = 1 ,1 ,2 ,3 ,4 ,7 ,8 ,8 ,8 ,10 ,10 ,11 ,12 ,15 ,20 ,22 ,25
#> 
#> Now try by your own! :D
#> 
#> Use interactive.percentile function to practice.
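# Base-R cross-check (illustrative): apply the same "round the position up"
# rule to the sorted vector for a chosen set of percentiles.
s <- sort(data)
k <- c(10, 20, 50, 90)
s[ceiling(k * length(s) / 100)]   # 1 3 8 22, matching percentile_(data)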
explain.absolute_frecuency(data,10)
#> 
#> __ABSOLUTE FRECUENCY CALCULUS__ 
#> 
#> Formula -> N1 + N2 + N3 + ... + Nk -> Nk = X (Where 'X' is the element we want to examine)
#> 
#> The absolute frequency (Ni) of a value Xi is the number of times the value is in the set (X1, X2, ..., XN)
#> 
#> __Use Example__
#> 
#> All we need to do is count the number of times that the element  10  appears in our data set
#> 
#> Our data set: 1 ,1 ,2 ,3 ,4 ,7 ,8 ,8 ,8 ,10 ,10 ,11 ,12 ,15 ,20 ,22 ,25
#> 
#> Now count the number of times that the element  10  appears:  2 
#> 1 ,1 ,2 ,3 ,4 ,7 ,8 ,8 ,8 ,10 ,10 ,11 ,12 ,15 ,20 ,22 ,25
#> Now try by your own! :D
#> 
#> Use interactive.absolute_frecuency function to practice.
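# Base-R cross-check (illustrative): count how many elements are equal to the value.
sum(data == 10)   # 2, the absolute frequency of 10 shown above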
explain.relative_frecuency(data,8)
#> 
#> __RELATIVE FRECUENCY CALCULUS__ 
#> 
#> Formula -> (Abs_frec(X) / N ) -> Where 'X' is the element we want to examine
#> 
#> The relative frequency is the quotient between the absolute frequency of a certain value and the total number of data
#> 
#> __Use Example__
#> 
#> Step 1: count the number of times that the element  8  appears in our data set
#> 
#> Our data set: 1 ,1 ,2 ,3 ,4 ,7 ,8 ,8 ,8 ,10 ,10 ,11 ,12 ,15 ,20 ,22 ,25
#> 
#> Now count the number of times that the element  8  appears:  3 
#> 1 ,1 ,2 ,3 ,4 ,7 ,8 ,8 ,8 ,10 ,10 ,11 ,12 ,15 ,20 ,22 ,25
#> Step 2: divide it by the length of the data set
#> 
#> Solution --> relative_frecuency = (absolute_frecuency(x) / length(data)) =  3  /  17  =  0.176470588235294 .
#> 
#> Now try by your own! :D
#> 
#> Use interactive.relative_frecuency function to practice.
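# Base-R cross-check (illustrative): absolute frequency divided by the number of elements.
sum(data == 8) / length(data)   # 0.1764706, matching the result above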
explain.absolute_acum_frecuency(data,10)
#> 
#> __ABSOLUTE ACUMULATED FRECUENCY CALCULUS__ 
#> 
#> Formula -> Summation(abs_frecuency <= X ) -> Where 'X' is the element we want to examine
#> 
#> The absolute acumulated frequency is the sum of the absolute frequency of the values minors or equals than the value we want to examine
#> 
#> __Use Example__
#> 
#> Step 1: count the number of times that the elements minors or equals than  10  appears in our data set
#> 
#> Our data set: 1 ,1 ,2 ,3 ,4 ,7 ,8 ,8 ,8 ,10 ,10 ,11 ,12 ,15 ,20 ,22 ,25
#> 
#> Number of times that elements minors or equals to  10  appears =  11 
#> 1 ,1 ,2 ,3 ,4 ,7 ,8 ,8 ,8 ,10 ,10 ,11 ,12 ,15 ,20 ,22 ,25
#> Solution --> absolute_frecuency_acum = Summation(abs_frecuency <= X)  =  11 .
#> 
#> Now try by your own! :D
#> 
#> Use interactive.absolute_acum_frecuency function to practice.
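# Base-R cross-check (illustrative): count the elements less than or equal to the value.
sum(data <= 10)   # 11, matching the result above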
explain.relative_acum_frecuency(data,8)
#> 
#> __RELATIVE ACUMULATED FRECUENCY CALCULUS__ 
#> 
#> Formula -> (Summation(abs_frecuency <= X) / N ) -> Where 'X' is the element we want to examine
#> 
#> The relative acumulated frequency is the quotient between the sum of the absolute frequency of the values minors or equals than the value we want to examine, and the total number of data
#> 
#> __Use Example__
#> 
#> Step 1: count the number of times that the elements minors or equals than  8  appears in our data set
#> 
#> Our data set: 1 ,1 ,2 ,3 ,4 ,7 ,8 ,8 ,8 ,10 ,10 ,11 ,12 ,15 ,20 ,22 ,25
#> 
#> Number of times that elements minors or equals to  8  appears =  9 
#> 1 ,1 ,2 ,3 ,4 ,7 ,8 ,8 ,8 ,10 ,10 ,11 ,12 ,15 ,20 ,22 ,25
#> Step 2: divide it by the length of the data set
#> 
#> Solution --> relative_frecuency_acum = (Summation(abs_frecuency <= X) / length(data)) =  9  /  17  =  0.529411764705882 .
#> 
#> Now try by your own! :D
#> 
#> Use interactive.relative_acum_frecuency function to practice.
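# Base-R cross-check (illustrative): cumulative count divided by the number of elements.
sum(data <= 8) / length(data)   # 0.5294118, matching the result above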

User Interactive Functions:

These functions are designed so that the user can practice each calculation interactively. They are the following: