## Regression problem

In the study of any real process, whether it's cooking pasta or analyzing investments, one general principle holds: every process depends on a set of parameters. The taste of pasta depends on the temperature of the stove, the amount of water, the salt, the quality of the pasta, and so on. Mathematically this is written as follows:

Taste = f(temperature, volume of water, salt, ...)

So, let's deal with cooking a portion of pasta. We have a set of variables: the temperature of the stove, the volume of water, and the amount of salt. Let's set a goal: find out how the amount of water affects the taste of the pasta.

## Problem statement

How do we determine the effect of water volume on the taste of pasta? We need to conduct a series of experiments in which each batch of pasta is cooked with a different volume of water, while the other conditions (temperature and amount of salt) are held fixed. We set the temperature and amount of salt as follows:

| Parameter | Value |
|---|---|
| Temperature | 500 °C |
| Amount of salt | 15 g |

Table 1. Fixed values for the experiment

Let's start our experiments with different volumes of water, taking values from 500 ml to 2200 ml; each time we taste the pasta and record the result:

| # | Water volume | Rating |
|---|---|---|
| 1 | 500 ml | 2 |
| 2 | 600 ml | 3 |
| 3 | 700 ml | 4 |
| 4 | 800 ml | 5 |
| 5 | 900 ml | 6 |
| 6 | 1000 ml | 8 |
| 7 | 1100 ml | 9 |
| 8 | 1200 ml | 11 |
| 9 | 1300 ml | 14 |
| 10 | 1400 ml | 19 |
| 11 | 1500 ml | 23 |
| 12 | 1600 ml | 26 |
| 13 | 1700 ml | 31 |
| 14 | 1800 ml | 41 |
| 15 | 1900 ml | 45 |
| 16 | 2000 ml | 51 |
| 17 | 2100 ml | 69 |
| 18 | 2200 ml | 76 |

Table 2. Taste rating of the pasta depending on the volume of water

## Detection of dependence

So, we evaluate the taste of pasta as a function of the volume of water; mathematically, we study the function Taste = f(Volume). Regression analysis is precisely the process of identifying the function f in this dependence.

In regression analysis, functions (models) are divided into two types: linear and nonlinear.

Linear model

y = a + bx

Nonlinear model

y = ab^{x} + c

In order to build a **simple** regression model (function), you have to be bold and make an assumption, for example:

— This function is similar to a linear one!

When you have chosen a regression model, you begin to select its coefficients; for example, in the linear model y = a + bx you need to find a and b. A crude estimate is easy: take the slope b from the first and last observations, b = (76 - 2) / (2200 - 500) ≈ 0.044, and then pick a so that the line passes through the first point, a = 2 - 0.044 · 500 = -20. Performing this operation on our data, we get:

a = -20

b = 0.044

Taste = -20 + 0.044x
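This crude two-point fit can be sketched in a few lines of Python (variable names are mine, not from the article):

```python
# Taste ratings for water volumes from 500 ml to 2200 ml in 100 ml steps (Table 2)
volumes = list(range(500, 2300, 100))
ratings = [2, 3, 4, 5, 6, 8, 9, 11, 14, 19, 23, 26, 31, 41, 45, 51, 69, 76]

# Slope from the first and last observations, intercept through the first point
b = (ratings[-1] - ratings[0]) / (volumes[-1] - volumes[0])   # 74 / 1700 ≈ 0.0435
a = ratings[0] - b * volumes[0]                               # ≈ -19.76, rounded to -20 in the text

print(round(b, 3), round(a, 1))   # 0.044 -19.8
```

The article rounds these to b = 0.044 and a = -20, which is the model used below.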

Let's tabulate the values of our model:

| Volume | 500 ml | 600 ml | 700 ml | 800 ml | 900 ml | 1000 ml | 1100 ml | 1200 ml | 1300 ml |
|---|---|---|---|---|---|---|---|---|---|
| Model value | 2 | 6.4 | 10.8 | 15.2 | 19.6 | 24 | 28.4 | 32.8 | 37.2 |

| Volume | 1400 ml | 1500 ml | 1600 ml | 1700 ml | 1800 ml | 1900 ml | 2000 ml | 2100 ml | 2200 ml |
|---|---|---|---|---|---|---|---|---|---|
| Model value | 41.6 | 46 | 50.4 | 54.8 | 59.2 | 63.6 | 68 | 72.4 | 76.8 |

Table 3. Tabulated values of the regression model

Here's how it looks on the graph:

**Graph 1.** Linear regression model and the initial data

## Getting the result

With a stretch, of course, it looks similar, but for a mathematical conclusion we need to measure the spread between the model values and the real values. For this we use the residual sum of squares and the standard error:

RSS (residual sum of squares) = (2 - 2)^{2} + (6.4 - 3)^{2} + ... + (76.8 - 76)^{2} = 5172.6

√RSS = 71.92

S (standard error) = √(RSS / n) = √(5172.6 / 18) = 16.95
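These error measures are easy to verify with a short script (a sketch; the list and variable names are mine):

```python
import math

# Observations from Table 2: volumes 500 ml .. 2200 ml, taste ratings
volumes = list(range(500, 2300, 100))
ratings = [2, 3, 4, 5, 6, 8, 9, 11, 14, 19, 23, 26, 31, 41, 45, 51, 69, 76]

# First (two-point) model: Taste = -20 + 0.044 * volume
model = [-20 + 0.044 * v for v in volumes]

rss = sum((m - y) ** 2 for m, y in zip(model, ratings))  # residual sum of squares
std_err = math.sqrt(rss / len(ratings))                  # standard error

print(round(rss, 1))      # 5172.6
print(round(std_err, 2))  # 16.95
```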

What can we do with this regression model? It allows us to predict what will happen with, for example, 2300 ml or 2400 ml of water, without running the experiment itself:

Taste_{2300 ml} = -20 + 0.044 · 2300 = 81.2

Taste_{2400 ml} = -20 + 0.044 · 2400 = 85.6

And, of course, we can find out how much water is needed for perfect pasta (a taste rating of 100), by solving -20 + 0.044x = 100:

Water_{perfect pasta} = (100 + 20) / 0.044 ≈ 2727 ml
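Both directions, predicting a rating from a volume and inverting the model to find a volume for a target rating, fit in a small sketch (function names are mine):

```python
a, b = -20, 0.044   # coefficients of the first model

def predict_taste(volume_ml):
    """Taste rating predicted by the model: a + b * volume."""
    return a + b * volume_ml

def volume_for_taste(target_rating):
    """Invert taste = a + b * volume to get the required volume."""
    return (target_rating - a) / b

print(round(predict_taste(2300), 1))  # 81.2
print(round(predict_taste(2400), 1))  # 85.6
print(round(volume_for_taste(100)))   # 2727
```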

## Minimizing the error

So, we have our model y = a + bx and the real values of the function; the difference between the model and reality is the error we make in every experiment. That means we can build an error function, and once we have a function, we can always look for its minimum. This is exactly what we will do: find the minimum of the error function.

The error is the difference between the observed value and the modeled one. Since this difference can be both positive and negative, we need its magnitude; the easiest way to get it is to square the error (and extract the root afterwards if needed). So our error for every known observation is:

Y_{o} - value from observation, Y_{m} - value from model

e = (Y_{o} - Y_{m})^{2} = (Y_{o} - a - bx)^{2}

Total error:

S = Σe = Σ(Y_{o} - a - bx)^{2}

The function S is the error function that needs to be minimized; it depends on the parameters a and b. To find its minimum we will use a simple method: take the derivatives with respect to a and b and set them to zero (we omit more sophisticated minimization methods here).

Derivatives of the error function with respect to a and b:

dS/da = Σ2(a+bx-y)

dS/db = Σ2(a+bx-y)x

Minimum condition of the function:

Σ2(a+bx-y) = 0

Σ2(a+bx-y)x = 0

Simplify: cancel the factor of 2 and expand the brackets (n is the number of observations):

na + bΣx = Σy

aΣx + bΣx^{2} = Σxy

Find a solution:

Σx = 24 300

Σx^{2}= 37 650 000

Σy = 443

Σxy = 793 100

18·a + 24300·b = 443

24300·a + 37650000·b = 793100

Multiplying the first equation by 1350 and subtracting it from the second eliminates a:

4845000·b = 195050 ∴ b ≈ 0.0403 ≈ 0.04

a = (443 - 24300·0.0403) / 18 ≈ -29.7 ≈ -30
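The normal equations can be solved directly from the four sums computed above (a sketch; variable names are mine):

```python
# Observations from Table 2
volumes = list(range(500, 2300, 100))
ratings = [2, 3, 4, 5, 6, 8, 9, 11, 14, 19, 23, 26, 31, 41, 45, 51, 69, 76]

n = len(volumes)
sx = sum(volumes)                                   # Σx  = 24 300
sy = sum(ratings)                                   # Σy  = 443
sxx = sum(v * v for v in volumes)                   # Σx² = 37 650 000
sxy = sum(v * y for v, y in zip(volumes, ratings))  # Σxy = 793 100

# Solve  n*a + sx*b = sy  and  sx*a + sxx*b = sxy  for a and b
b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
a = (sy - b * sx) / n

print(round(b, 4), round(a, 2))   # 0.0403 -29.74
```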

Let's try our new model in action:

**Graph 3.** Linear regression model adjusted by the least squares method, y = -30 + 0.04x

RSS (residual sum of squares) = (-10 - 2)^{2} + (-6 - 3)^{2} + ... + (58 - 76)^{2} = 1175

√RSS = 34.28

S (standard error) = √(1175 / 18) = 8.08
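The same check as before, now for the adjusted model (a sketch; names are mine):

```python
import math

# Observations from Table 2
volumes = list(range(500, 2300, 100))
ratings = [2, 3, 4, 5, 6, 8, 9, 11, 14, 19, 23, 26, 31, 41, 45, 51, 69, 76]

# Adjusted least-squares model with the rounded coefficients from the text
model = [-30 + 0.04 * v for v in volumes]

rss = sum((m - y) ** 2 for m, y in zip(model, ratings))
std_err = math.sqrt(rss / len(ratings))

print(round(rss))         # 1175
print(round(std_err, 2))  # 8.08
```

The residual sum of squares drops from 5172.6 to 1175, so on the observed data the least-squares line fits much better than the two-point estimate.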

Taste_{2300 ml} = -30 + 0.04 · 2300 = 62

Taste_{2400 ml} = -30 + 0.04 · 2400 = 66

As you may have noticed, for these large volumes the predictions of our first model are closer to the truth than those of the adjusted model. Why? Because the model itself was chosen badly: the graph of the data looks more like an exponential, and even from knowledge of the process it is clear that a linear dependence does not belong here. But this was just an example of a linear regression model; read about more complex models, and how to choose one, in the following articles.