Mean, Mode, Variance and Standard Deviation
In math notation, when we see N in a formula it refers to the Population, and when we see n it refers to a Sample.
Mode
It is the value that appears most frequently in a series. Python's standard library provides an implementation in the statistics module (statistics.mode).
Note: s² = Σ(xᵢ − x̄)² / (n − 1) is the formula for the unbiased sample variance, since we are dividing by n − 1 rather than n.
Note: taking the square root of the unbiased sample variance (to get the standard deviation) reintroduces bias.


import math

valores = [1, 2, 3, 8, 4, 9.8, 6.5, 4, 3, 8, 5, 9, 3.3, 0, 4, 7, 9]
valores_clean_float = [float(x) for x in valores]

# Mean: sum of the values divided by how many there are
media = sum(valores_clean_float) / len(valores_clean_float)

# Mode: the value with the highest count
moda = max(valores_clean_float, key=valores_clean_float.count)

# Median: middle value of the sorted series
# (average of the two middle values when the length is even)
valores_clean_float_sorted = sorted(valores_clean_float)
list_size = len(valores_clean_float_sorted)
if list_size % 2 == 0:
    mediana = (valores_clean_float_sorted[int(list_size / 2) - 1] +
               valores_clean_float_sorted[int(list_size / 2)]) / 2
else:
    mediana = valores_clean_float_sorted[math.floor(list_size / 2)]

# Population variance: mean of the squared distances from the mean
# (note this divides by n, not n - 1)
squared_distance_from_mean = [round((x - media) ** 2, 2) for x in valores_clean_float_sorted]
variance = sum(squared_distance_from_mean) / list_size

# Standard deviation: square root of the variance
standard_deviation = math.sqrt(variance)

print('Media: ', media, '- Moda: ', moda, '- Mediana: ', mediana,
      '- Variância: ', variance, '- Standard Deviation: ', standard_deviation)
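For comparison, Python's statistics module computes the same measures directly, and makes the n versus n − 1 distinction from the notes above explicit:

import statistics

valores = [1, 2, 3, 8, 4, 9.8, 6.5, 4, 3, 8, 5, 9, 3.3, 0, 4, 7, 9]
print(statistics.mean(valores))       # mean
print(statistics.mode(valores))       # mode
print(statistics.median(valores))     # median
print(statistics.pvariance(valores))  # population variance (divides by n)
print(statistics.variance(valores))   # unbiased sample variance (divides by n - 1)
print(statistics.pstdev(valores))     # population standard deviation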
Simple Linear Regression

Simple linear regression fits a line y = mx + b to pairs (xᵢ, yᵢ). The slope and intercept can be estimated directly from the data:

m = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)²
b = ȳ − m · x̄

where x̄ and ȳ are the means of x and y.
A shortcut formula for the slope is

m = corr(x, y) × (s_y / s_x)

where corr(x, y) is the Pearson correlation between x and y, and s_x, s_y are their standard deviations.
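As a minimal sketch (not from the original post), here is the closed-form estimate for the sample data used below:

x = [1, 2, 4, 3, 5]
y = [1, 3, 3, 2, 5]

x_mean = sum(x) / len(x)  # 3.0
y_mean = sum(y) / len(y)  # 2.8

# Slope: covariance of x and y divided by the variance of x
m = (sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
     / sum((xi - x_mean) ** 2 for xi in x))
# Intercept: the fitted line passes through the point (x̄, ȳ)
b = y_mean - m * x_mean

print('m =', m, 'b =', b)  # m = 0.8 b = 0.4

The same line can also be approximated iteratively with stochastic gradient descent, as in the code below.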
x = [1, 2, 4, 3, 5]
y = [1, 3, 3, 2, 5]

m = 0.0
b = 0.0
grau_aprendizado = 0.01  # learning rate

# Stochastic gradient descent: 4 epochs over the dataset,
# updating m and b after every instance
for epoch in range(4):
    for i in range(len(x)):
        previsao = m * float(x[i]) + b  # prediction: y = mx + b
        erro = previsao - float(y[i])   # error = prediction(i) - y(i)
        m = m - grau_aprendizado * erro * float(x[i])
        b = b - grau_aprendizado * erro * 1.0
        print("m {} b {}".format(m, b))
Logistic Regression

Logistic regression models the probability of the default class by applying the logistic (sigmoid) function to a linear combination of the inputs:

ŷ = 1 / (1 + e^−(B0 + B1·x1 + … + Bn·xn))

The output is a value between 0 and 1 that can be rounded (e.g., at a 0.5 threshold) into a class prediction.
Each column in your input data has an associated B coefficient (a constant real value) that must be learned from your training data. The actual representation of the model that you would store in memory or in a file is the set of coefficients in the equation (the beta values, or B's).
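As a minimal sketch (not from the original post), this is how a prediction could be made once the B's are known; the single feature and the coefficient values here are made-up placeholders:

import math

def prever(b0, b1, x):
    # Logistic function applied to the linear combination B0 + B1*x
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

# Hypothetical coefficients learned from training data
b0, b1 = -0.4, 0.85
previsao = prever(b0, b1, 2.5)
print(previsao)  # probability that the instance belongs to class 1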
Linear Discriminant Analysis – LDA

Steps

1. Calculate the mean of each input variable for each class.
2. Calculate the class probabilities (the proportion of training instances that belong to each class).
3. Calculate the variance (a single variance shared across classes).

The representation model is a discriminant function evaluated once per class; the class with the largest value is the prediction:

discriminant(x) = x × (mean / variance) − mean² / (2 × variance) + ln(probability)
4. Making predictions – Just plug the values found above into the representation model
for X = 4.667797637 and Y = 0, the discriminant value is 12.3293558
for X = 4.667797637 and Y = 1, the discriminant value is -130.3349038
We can see that the discriminant value for Y = 0 (12.3293558) is larger than the discriminant value for Y = 1 (-130.3349038), therefore the model predicts Y = 0, which we know is correct in the dataset.
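As a minimal sketch (not from the original post), the prediction step could look like this; the means, variance, and class probabilities are placeholder estimates chosen only to roughly reproduce the discriminant values quoted above:

import math

def discriminant(x, mean, variance, prob):
    # discriminant(x) = x * (mean / variance) - mean^2 / (2 * variance) + ln(P(class))
    return x * (mean / variance) - (mean ** 2) / (2.0 * variance) + math.log(prob)

x = 4.667797637
d0 = discriminant(x, mean=4.97, variance=0.832, prob=0.5)   # class Y = 0 -> ~12.3
d1 = discriminant(x, mean=20.06, variance=0.832, prob=0.5)  # class Y = 1 -> ~-130
print('Predicted Y:', 0 if d0 > d1 else 1)  # Y = 0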
CART – Classification And Regression Trees

Steps
Sample Dataset
X1           X2           Y
2.771244718  1.784783929  0
1.728571309  1.169761413  0
3.678319846  2.81281357   0
3.961043357  2.61995032   0
2.999208922  2.209014212  0
7.497545867  3.162953546  1
9.00220326   3.339047188  1
7.444542326  0.476683375  1
10.12493903  3.234550982  1
6.642287351  3.319983761  1
1. Find the best Split Point Candidate for a feature by iterating through the dataset
IF X1 < 2.7712 THEN LEFT
IF X1 >= 2.7712 THEN RIGHT
X1           Y  Group
2.771244718  0  RIGHT
1.728571309  0  LEFT
3.678319846  0  RIGHT
3.961043357  0  RIGHT
2.999208922  0  RIGHT
7.497545867  1  RIGHT
9.00220326   1  RIGHT
7.444542326  1  RIGHT
10.12493903  1  RIGHT
6.642287351  1  RIGHT
1.2 Calculate the proportion of each class on each side
LEFT (1 instance, class 0):
proportion(Y = 0) = 1 / 1 = 1.0
proportion(Y = 1) = 0 / 1 = 0.0
RIGHT (9 instances: 4 of class 0, 5 of class 1):
proportion(Y = 0) = 4 / 9 ≈ 0.444
proportion(Y = 1) = 5 / 9 ≈ 0.556
1.3 Calculate the Gini index for this candidate, weighting each group's impurity by its relative size:
Gini(group) = 1 − Σ proportion²
Gini(LEFT) = 1 − (1.0² + 0.0²) = 0.0
Gini(RIGHT) = 1 − ((4/9)² + (5/9)²) ≈ 0.494
Gini(split) = (1/10) × 0.0 + (9/10) × 0.494 ≈ 0.444
1.4 Continue iterating over the dataset until you find the lowest Gini. In this case the lowest Gini index is at the candidate X1 = 6.6422.
IF X1 < 6.6422 THEN LEFT
IF X1 >= 6.6422 THEN RIGHT
X1           Y  Group
2.771244718  0  LEFT
1.728571309  0  LEFT
3.678319846  0  LEFT
3.961043357  0  LEFT
2.999208922  0  LEFT
7.497545867  1  RIGHT
9.00220326   1  RIGHT
7.444542326  1  RIGHT
10.12493903  1  RIGHT
6.642287351  1  RIGHT
LEFT (5 instances, all class 0):
proportion(Y = 0) = 5 / 5 = 1.0, proportion(Y = 1) = 0 / 5 = 0.0, Gini(LEFT) = 0.0
RIGHT (5 instances, all class 1):
proportion(Y = 0) = 0 / 5 = 0.0, proportion(Y = 1) = 5 / 5 = 1.0, Gini(RIGHT) = 0.0
Gini(split) = (5/10) × 0.0 + (5/10) × 0.0 = 0.0
This split results in a pure Gini index (0.0), because the classes are perfectly separated: the LEFT child node will classify instances as class 0 and the RIGHT as class 1.
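As a minimal sketch (not from the original post) of the procedure above, assuming the size-weighted Gini index used in steps 1.2 and 1.3:

def gini_for_split(dataset, feature, value):
    # Split the rows into LEFT (< value) and RIGHT (>= value) groups
    left = [row for row in dataset if row[feature] < value]
    right = [row for row in dataset if row[feature] >= value]
    total = len(dataset)
    gini = 0.0
    for group in (left, right):
        if not group:
            continue
        # Proportion of each class inside the group (last column is Y)
        p0 = sum(1 for row in group if row[-1] == 0) / len(group)
        p1 = 1.0 - p0
        # Weight the group's impurity by its relative size
        gini += (len(group) / total) * (1.0 - (p0 ** 2 + p1 ** 2))
    return gini

dataset = [
    [2.771244718, 1.784783929, 0],
    [1.728571309, 1.169761413, 0],
    [3.678319846, 2.81281357, 0],
    [3.961043357, 2.61995032, 0],
    [2.999208922, 2.209014212, 0],
    [7.497545867, 3.162953546, 1],
    [9.00220326, 3.339047188, 1],
    [7.444542326, 0.476683375, 1],
    [10.12493903, 3.234550982, 1],
    [6.642287351, 3.319983761, 1],
]
print(gini_for_split(dataset, 0, 2.771244718))  # ~0.444
print(gini_for_split(dataset, 0, 6.642287351))  # 0.0 -> the pure split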
What do you think?