Mean, Mode, Variance and Standard Deviation
In math notation, when we see N in a formula it refers to the Population, and when we see n it refers to a Sample.
Mode
It is the value that appears most frequently in a series. Python's standard library provides an implementation in the statistics module (statistics.mode).
Note: s² = Σ(xᵢ − x̄)² / (n − 1) is the formula for the unbiased sample variance, since we are dividing by n − 1 rather than n.
Note: taking the square root of the unbiased sample variance (to get the standard deviation) reintroduces bias.


import math

valores = [1, 2, 3, 8, 4, 9.8, 6.5, 4, 3, 8, 5, 9, 3.3, 0, 4, 7, 9]
valores_clean_float = [float(x) for x in valores]

# Mean: sum of the values divided by how many there are
media = sum(valores_clean_float) / len(valores_clean_float)

# Mode: the value with the highest count
moda = max(valores_clean_float, key=valores_clean_float.count)

# Median: middle value of the sorted series
# (average of the two middle values when the length is even)
valores_clean_float_sorted = sorted(valores_clean_float)
list_size = len(valores_clean_float_sorted)
if list_size % 2 == 0:
    mediana = (valores_clean_float_sorted[int(list_size / 2) - 1] +
               valores_clean_float_sorted[int(list_size / 2)]) / 2
else:
    mediana = valores_clean_float_sorted[math.floor(list_size / 2)]

# Population variance: mean of the squared distances from the mean
# (note this divides by n, not n - 1)
squared_distance_from_mean = [round((x - media) ** 2, 2) for x in valores_clean_float_sorted]
variance = sum(squared_distance_from_mean) / list_size

# Standard deviation: square root of the variance
standard_deviation = math.sqrt(variance)

print('Media: ', media, '- Moda: ', moda, '- Mediana: ', mediana,
      '- Variância: ', variance, '- Standard Deviation: ', standard_deviation)
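For comparison, Python's statistics module computes the same measures directly, and makes the n versus n − 1 distinction from the notes above explicit:

import statistics

valores = [1, 2, 3, 8, 4, 9.8, 6.5, 4, 3, 8, 5, 9, 3.3, 0, 4, 7, 9]
print(statistics.mean(valores))       # mean
print(statistics.mode(valores))       # mode
print(statistics.median(valores))     # median
print(statistics.pvariance(valores))  # population variance (divides by n)
print(statistics.variance(valores))   # unbiased sample variance (divides by n - 1)
print(statistics.pstdev(valores))     # population standard deviation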
Simple Linear Regression

Simple linear regression fits a line y = mx + b to pairs (xᵢ, yᵢ). The slope and intercept can be estimated directly from the data:

m = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)²
b = ȳ − m · x̄

where x̄ and ȳ are the means of x and y.
A shortcut formula for the slope is

m = corr(x, y) × (s_y / s_x)

where corr(x, y) is the Pearson correlation between x and y, and s_x, s_y are their standard deviations.
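As a minimal sketch (not from the original post), here is the closed-form estimate for the sample data used below:

x = [1, 2, 4, 3, 5]
y = [1, 3, 3, 2, 5]

x_mean = sum(x) / len(x)  # 3.0
y_mean = sum(y) / len(y)  # 2.8

# Slope: covariance of x and y divided by the variance of x
m = (sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
     / sum((xi - x_mean) ** 2 for xi in x))
# Intercept: the fitted line passes through the point (x̄, ȳ)
b = y_mean - m * x_mean

print('m =', m, 'b =', b)  # m = 0.8 b = 0.4

The same line can also be approximated iteratively with stochastic gradient descent, as in the code below.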
x = [1, 2, 4, 3, 5]
y = [1, 3, 3, 2, 5]

m = 0.0
b = 0.0
grau_aprendizado = 0.01  # learning rate

# Stochastic gradient descent: 4 epochs over the dataset,
# updating m and b after every instance
for epoch in range(4):
    for i in range(len(x)):
        previsao = m * float(x[i]) + b  # prediction: y = mx + b
        erro = previsao - float(y[i])   # error = prediction(i) - y(i)
        m = m - grau_aprendizado * erro * float(x[i])
        b = b - grau_aprendizado * erro * 1.0
        print("m {} b {}".format(m, b))
Logistic Regression

Logistic regression models the probability of the default class by applying the logistic (sigmoid) function to a linear combination of the inputs:

ŷ = 1 / (1 + e^−(B0 + B1·x1 + … + Bn·xn))

The output is a value between 0 and 1 that can be rounded (e.g., at a 0.5 threshold) into a class prediction.
Each column in your input data has an associated B coefficient (a constant real value) that must be learned from your training data. The actual representation of the model that you would store in memory or in a file is the set of coefficients in the equation (the beta values, or B's).
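As a minimal sketch (not from the original post), this is how a prediction could be made once the B's are known; the single feature and the coefficient values here are made-up placeholders:

import math

def prever(b0, b1, x):
    # Logistic function applied to the linear combination B0 + B1*x
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

# Hypothetical coefficients learned from training data
b0, b1 = -0.4, 0.85
previsao = prever(b0, b1, 2.5)
print(previsao)  # probability that the instance belongs to class 1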
Linear Discriminant Analysis – LDA

Steps

1. Calculate the mean of each input variable for each class.
2. Calculate the class probabilities (the proportion of training instances that belong to each class).
3. Calculate the variance (a single variance shared across classes).

The representation model is a discriminant function evaluated once per class; the class with the largest value is the prediction:

discriminant(x) = x × (mean / variance) − mean² / (2 × variance) + ln(probability)
4. Making predictions – Just plug the values found above into the representation model
for X = 4.667797637 and Y = 0, the discriminant value is 12.3293558
for X = 4.667797637 and Y = 1, the discriminant value is -130.3349038
We can see that the discriminant value for Y = 0 (12.3293558) is larger than the discriminant value for Y = 1 (-130.3349038), therefore the model predicts Y = 0, which we know is correct in the dataset.
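As a minimal sketch (not from the original post), the prediction step could look like this; the means, variance, and class probabilities are placeholder estimates chosen only to roughly reproduce the discriminant values quoted above:

import math

def discriminant(x, mean, variance, prob):
    # discriminant(x) = x * (mean / variance) - mean^2 / (2 * variance) + ln(P(class))
    return x * (mean / variance) - (mean ** 2) / (2.0 * variance) + math.log(prob)

x = 4.667797637
d0 = discriminant(x, mean=4.97, variance=0.832, prob=0.5)   # class Y = 0 -> ~12.3
d1 = discriminant(x, mean=20.06, variance=0.832, prob=0.5)  # class Y = 1 -> ~-130
print('Predicted Y:', 0 if d0 > d1 else 1)  # Y = 0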
CART – Classification And Regression Trees

Steps
Sample Dataset
X1           X2           Y
2.771244718  1.784783929  0
1.728571309  1.169761413  0
3.678319846  2.81281357   0
3.961043357  2.61995032   0
2.999208922  2.209014212  0
7.497545867  3.162953546  1
9.00220326   3.339047188  1
7.444542326  0.476683375  1
10.12493903  3.234550982  1
6.642287351  3.319983761  1
1. Find the best Split Point Candidate for a feature by iterating through the dataset
IF X1 < 2.7712 THEN LEFT
IF X1 >= 2.7712 THEN RIGHT
X1           Y  Group
2.771244718  0  RIGHT
1.728571309  0  LEFT
3.678319846  0  RIGHT
3.961043357  0  RIGHT
2.999208922  0  RIGHT
7.497545867  1  RIGHT
9.00220326   1  RIGHT
7.444542326  1  RIGHT
10.12493903  1  RIGHT
6.642287351  1  RIGHT
1.2 Calculate the proportion of each class on each side
LEFT (1 instance, class 0):
proportion(Y = 0) = 1 / 1 = 1.0
proportion(Y = 1) = 0 / 1 = 0.0
RIGHT (9 instances: 4 of class 0, 5 of class 1):
proportion(Y = 0) = 4 / 9 ≈ 0.444
proportion(Y = 1) = 5 / 9 ≈ 0.556
1.3 Calculate the Gini index for this candidate, weighting each group's impurity by its relative size:
Gini(group) = 1 − Σ proportion²
Gini(LEFT) = 1 − (1.0² + 0.0²) = 0.0
Gini(RIGHT) = 1 − ((4/9)² + (5/9)²) ≈ 0.494
Gini(split) = (1/10) × 0.0 + (9/10) × 0.494 ≈ 0.444
1.4 Continue iterating over the dataset until you find the lowest Gini. In this case the lowest Gini index is at the candidate X1 = 6.6422.
IF X1 < 6.6422 THEN LEFT
IF X1 >= 6.6422 THEN RIGHT
X1           Y  Group
2.771244718  0  LEFT
1.728571309  0  LEFT
3.678319846  0  LEFT
3.961043357  0  LEFT
2.999208922  0  LEFT
7.497545867  1  RIGHT
9.00220326   1  RIGHT
7.444542326  1  RIGHT
10.12493903  1  RIGHT
6.642287351  1  RIGHT
LEFT (5 instances, all class 0):
proportion(Y = 0) = 5 / 5 = 1.0, proportion(Y = 1) = 0 / 5 = 0.0, Gini(LEFT) = 0.0
RIGHT (5 instances, all class 1):
proportion(Y = 0) = 0 / 5 = 0.0, proportion(Y = 1) = 5 / 5 = 1.0, Gini(RIGHT) = 0.0
Gini(split) = (5/10) × 0.0 + (5/10) × 0.0 = 0.0
This split results in a pure Gini index (0.0), because the classes are perfectly separated: the LEFT child node will classify instances as class 0 and the RIGHT as class 1.
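As a minimal sketch (not from the original post) of the procedure above, assuming the size-weighted Gini index used in steps 1.2 and 1.3:

def gini_for_split(dataset, feature, value):
    # Split the rows into LEFT (< value) and RIGHT (>= value) groups
    left = [row for row in dataset if row[feature] < value]
    right = [row for row in dataset if row[feature] >= value]
    total = len(dataset)
    gini = 0.0
    for group in (left, right):
        if not group:
            continue
        # Proportion of each class inside the group (last column is Y)
        p0 = sum(1 for row in group if row[-1] == 0) / len(group)
        p1 = 1.0 - p0
        # Weight the group's impurity by its relative size
        gini += (len(group) / total) * (1.0 - (p0 ** 2 + p1 ** 2))
    return gini

dataset = [
    [2.771244718, 1.784783929, 0],
    [1.728571309, 1.169761413, 0],
    [3.678319846, 2.81281357, 0],
    [3.961043357, 2.61995032, 0],
    [2.999208922, 2.209014212, 0],
    [7.497545867, 3.162953546, 1],
    [9.00220326, 3.339047188, 1],
    [7.444542326, 0.476683375, 1],
    [10.12493903, 3.234550982, 1],
    [6.642287351, 3.319983761, 1],
]
print(gini_for_split(dataset, 0, 2.771244718))  # ~0.444
print(gini_for_split(dataset, 0, 6.642287351))  # 0.0 -> the pure split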
What do you think?