# Predicting Your Firm’s Future with Least Squares, Part II Linear least squares lets you find the line that best fits a set of data points but sometimes the points don’t lie along a line. The second part of this two-part article explains how you can use polynomial least squares to find curves that fit your data.As the first part of this article explained, sometimes you can use an advanced math concept — linear least squares — to predict the future. If you can find a line that closely fits your data, you can extend that line into the future to make predictions about what’s to come. If your data represents such things as web site traffic, retail sales, or electricity usage over time, then a good linear least squares fit may help you predict future revenue and expenses.But what if your data doesn’t fall along a neat line? Take a look at the data points plotted in Figure 1. These points clearly don’t lie in a line but they do form some sort of pattern. If you can find a curve that the points do fit, then you can use that curve to predict future values. Figure 1. Out of Line: These points don’t lie along a line but there is a clear pattern.If the points fall along some polynomial curve, then you can use techniques similar to those described earlier in this article to find a polynomial least squares fit. In other words, you can find a polynomial A0 + A1 * x + A2 * x2 + … + Ad * xd for values A0, A1, …, Ad to minimize the sum of the squares of the vertical distances between the points and the curve. In this equation, d is called the polynomial’s degree.Figure 2 shows a degree 2 polynomial least squares fit for the points shown in Figure 1. This curve fits the data pretty well. Figure 2. Polynomial Perfection: This data fits closely to a degree 2 polynomial.To find the least squares polynomial fit to the data, you need to use calculus that’s very similar to the calculus you used to find the linear least squares fit. (For more information on that calculus, see the first part of this article. You also need to use a little linear algebra. Neither of these is too hard but if you don’t want to work through them, you can skip the next few sections.

### A Little Linear Algebra

Suppose you have a series of n equations with n unknown values x1, x2, … , xn: A11 * x1 + A12 * x2 + … + A1n * xn = C1A21 * x1 + A22 * x2 + … + A2n * xn = C2…An1 * x1 + An2 * x2 + … + Ann * xn = Cn (Note: To keep this discussion general and more consistent with what you’ll see in other places, the variables here are x1, x2, and so on and the known constants are A11, A12, and so forth. When we use this to find polynomial least squares fits, you’ll be solving for A1, A2, and so forth.)One way to solve this system of equations is Gaussian elimination. (Actually I use a slightly modified version of Gaussian elimination that avoids the need for back substitution. For a more rigorous discussion of Gaussian elimination, look in Wolfram MathWorld or Wikipedia.)It’s a fairly simple method but learning it from a description can be a bit tricky so after the following description I’ll work through an example that should make things clearer. You can write the system of equations as a matrix equation like this: The first step is to put a 1 in the matrix’s upper left corner. That’s easy. Simply divide each of the entries in the first row by the value currently in the upper left corner. This is called normalizing the row. If you think about it, you’ll realize that this doesn’t change the truth of the first equation. All you’re doing is dividing both sides of the equation by the same number so the equation remains true.Now you have an augmented matrix that looks like the following where ? means an entry whose value isn’t important right now. You’ll need those values later but they don’t change the basic shape of the augmented matrix. Now you repeat these steps to place a 1 in the second row’s second column and 0s in all of the other entries in that column. Continue the process until the matrix has this form: Now if you go back to the original meaning of this matrix, you can read out the xi values. Using the values in this matrix to reconstruct the first equation you get x1 + 0 + 0 + … + 0 = V1 so x1 = V1. Using the other rows in the matrix you can see that xi = Vi. There’s one odd special case that may throw a monkey wrench into this process. When you try to change the ith entry on row i into a 1, there’s a chance that the entry will have value 0. In that case, you cannot divide the other entries in that row by 0 to normalize the row.

If that happens, simply swap that row with one of the following rows that does not have a 0 in this column and continue as before. (If there is no such row, then the system of equations does not have a unique solution.) If you think about what this swap means, you can see that it doesn’t change the truth of the equations. This operation corresponds to switching the original order of the equations and that certainly doesn’t change whether they are true or false.

### A Gaussian Example

So how about that example? Suppose you want to find a solution to the following three equations with three unknowns:

3 * x + 6 * y + 3 * z = 9
2 * x + 4 * y – z = -3
4 * x + 6 * y + 2 * z = 8

The augmented matrix for this system of equations is:  Now you can read off the solution: x = 2, y = -1, z = 3. If you plug these values into the original equations, you’ll see that they work.

Hopefully by now you understand Gaussian elimination. It’s fairly simple and only requires some easy multiplication, addition, and row swapping. The only catch is that there are a lot of operations so it’s easy to make mistakes. Fortunately the computer is very good at performing these sorts of operations without making mistakes.

The following code shows how the PolynomialLeastSquares example program (which is available for download in C# and Visual Basic versions) performs Gaussian elimination. I’ve put this in a separate method because it’s handy to have for solving other systems of equations not just those needed for polynomial least squares.
``// Perform Gaussian elimination on these coefficients.// Return the array of values that gives the solution.private double[] GaussianElimination(double[,] coeffs){    int max_equation = coeffs.GetUpperBound(0);    int max_coeff = coeffs.GetUpperBound(1);    for (int i = 0; i <= max_equation; i++)    {        // Use equation_coeffs[i, i] to eliminate the ith        // coefficient in all of the other equations.        // Find a row with non-zero ith coefficient.        if (coeffs[i, i] == 0)        {            for (int j = i + 1; j<= max_equation; j++)            {                // See if this one works.                if (coeffs[j, i] != 0)                {                    // This one works. Swap equations i and j.                    // This starts at k = i because all                    // coefficients to the left are 0.                    for (int k = i; k <= max_coeff; k++)                    {                        double temp = coeffs[i, k];                        coeffs[i, k] = coeffs[j, k];                        coeffs[j, k] = temp;                    }                    break;                }            }        }        // Make sure we found an equation with        // a non-zero ith coefficient.        double coeff_i_i = coeffs[i, i];        if (coeff_i_i == 0)        {            throw new ArithmeticException(String.Format(                "There is no unique solution for these points.",                coeffs.GetUpperBound(0) - 1));        }        // Normalize the ith equation.        for (int j = i; j <= max_coeff; j++)        {            coeffs[i, j] /= coeff_i_i;        }        // Use this equation value to zero out        // the other equations' ith coefficients.        for (int j = 0; j<= max_equation; j++)        {            // Skip the ith equation.            if (j != i)            {                // Zero the jth equation's ith coefficient.                double coef_j_i = coeffs[j, i];                for (int d = 0; d <= max_coeff; d++)                {                    coeffs[j, d] -= coeffs[i, d] * coef_j_i;                }            }        }    }    // At this point, the ith equation contains    // 2 non-zero entries:    //      The ith entry which is 1    //      The last entry coeffs[max_coeff]    // This means Ai = equation_coef[max_coeff].    double[] solution = new double[max_equation + 1];    for (int i = 0; i <= max_equation; i++)    {        solution[i] = coeffs[i, max_coeff];    }    // Return the solution values.    return solution;}``

### Back to Polynomial Least Squares

That was a long digression from polynomial least squares so let’s review where we were when the digression began. To find the best fit polynomial, you found that the partial derivative of the polynomial with respect to Aj is:

?E/?Aj = 2 * (?yi * xij – A0 * ?xij – A1 * ?xij+1 – … – AD * ?xij+D)

To find the best solution, you set this equation and the partial derivatives for the other Ai equal to 0 and solve for the Ai values. Now that you know how to perform Gaussian elimination, this is a snap. The only tricky part is figuring out what coefficients to use in the equations that you send into the Gaussian elimination method.

The following code shows how the PolynomialLeastSquares example program builds the coeffs array holding the augmented matrix that it passes to the GaussianElimination method.

``// Find the least squares polynomial fit.private double[] FindPolynomialLeastSquaresFit(    List points, int degree){    // Allocate space for (degree + 1) equations with     // (degree + 2) terms each.    double[,] coeffs = new double[degree + 1, degree + 2];    // Calculate the coefficients for the equations.    for (int j = 0; j <= degree; j++)    {        // Calculate the coefficients for the jth equation.        // Calculate the constant term for this equation.        coeffs[j, degree + 1] = 0;        foreach (PointF pt in points)        {            coeffs[j, degree + 1] -= Math.Pow(pt.X, j) * pt.Y;        }        // Calculate the other coefficients.        for (int a_sub = 0; a_sub <= degree; a_sub++)        {            // Calculate the dth coefficient.            coeffs[j, a_sub] = 0;            foreach (PointF pt in points)            {                coeffs[j, a_sub] -= Math.Pow(pt.X, a_sub + j);            }        }    }    // Solve the equations.    return GaussianElimination(coeffs);}``
``` ```

By using programs such as PolynomialLeastSquares, you can get a better understanding of your data. You can find the best fit of a given degree and then judge whether the data points fit that polynomial well.

Ideally if you know something about the process that generated the data, you may be able to figure out what degree to use for the polynomial, but if you can't, beware of fitting data to high degree polynomials. A degree d polynomial can exactly fit d + 1 points so if you use too high a degree you'll get a misleadingly exact match.For example, Figure 3 shows a degree 9 polynomial exactly fitting 10 data points. The curve fits the points exactly but that doesn't mean it's the best fit or the most useful one. Figure 3. Too Tight a Fit: A high-degree polynomial can fit most any data, possibly obscuring its true structure.

Figure 4 shows the same points fitted by a degree 2 polynomial. Although this doesn't fit the points exactly, it's probably is a better representation of the data than the previous curve.

For example, suppose you want to predict the y value when x is close to 0. The polynomial in Figure 4 would give you a value close to the bottom of the graph while the polynomial in Figure 3 would give you a value way off the top. Which is really better depends on your situation but unless something really complex is happening, the degree 2 curve is probably a better bet. Figure 4. Relaxed Fit: A degree 2 polynomial probably fits this data better than a degree 9 polynomial even though it doesn't fit the points exactly.

You can use techniques similar to those described here to find other kinds of least squares fits, too. For example, depending on your data you may want to use a logarithmic curve of the form y = a + b * ln(x), a power curve of the form y = a * xb, or an exponential curve of the form y = a * ebx. For more information on these kinds of curves, see Wolfram MathWorld. Share the Post: ### Cybersecurity Battles: Lapsus\$ Era Unfolds

In 2023, the cybersecurity field faces significant challenges due to the continuous transformation of threats and the increasing abilities of hackers. A prime example of this is the group of ### Web Application Supply Chain Security

Today’s web applications depend on a wide array of third-party components and open-source tools to function effectively. This reliance on external resources poses significant security risks, as malicious actors can ### Thrilling Battle: Germany Versus Huawei

The German interior ministry has put forward suggestions that would oblige telecommunications operators to decrease their reliance on equipment manufactured by Chinese firms Huawei and ZTE. This development comes after ### The iPhone 15’s Secrets and Surprises

As we dive into the most frequently asked questions and intriguing features, let us reiterate that the iPhone 15 brings substantial advancements in technology and design compared to its predecessors. ### iPhone 15: Performance, Camera, Battery

Apple’s highly anticipated iPhone 15 has finally hit the market, sending ripples of excitement across the tech industry. For those considering upgrading to this new model, three essential features come ### Virginia’s Clean Economy Act Soars Ahead

Virginia has made significant strides towards achieving its short-term carbon-free objectives as outlined in the Clean Economy Act of 2020. Currently, about 44,000 megawatts (MW) of wind, solar, and energy ### Innovative Energy Storage Solutions

The Department of Energy recently revealed a significant investment of \$325 million in advanced battery technologies to store excess renewable energy produced by solar and wind sources. This funding will ### Revolutionizing India’s Tech Sector with Renesas

Tushar Sharma, a semiconductor engineer at Renesas Electronics, met with Indian Prime Minister Narendra Modi to discuss the company’s support for India’s “Make in India” initiative. This initiative focuses on ### Thrilling East Windsor Mixed-Use Development

Real estate developer James Cormier, in collaboration with a partnership, has purchased 137 acres of land in Connecticut for \$1.15 million with the intention of constructing residential and commercial buildings. ### Inside Michigan’s Clean Energy Revolution

Democratic state legislators in Michigan continue to discuss and debate clean energy legislation in the hopes of establishing a comprehensive clean energy strategy for the state. A Senate committee meeting ### European Chips Act: What is it?

In response to the intensifying worldwide technology competition, Europe has unveiled the long-awaited European Chips Act. This daring legislative proposal aims to fortify Europe’s semiconductor supply chain and enhance its ### You Should Use Low-Code Platforms for Apps

As the demand for rapid software development increases, low-code platforms have emerged as a popular choice among developers for their ability to build applications with minimal coding. These platforms not ### Five Powerful Strategies to Bolster Your Cybersecurity

In today’s increasingly digital landscape, businesses of all sizes must prioritize cyber security measures to defend against potential dangers. Cyber security professionals suggest five simple technological strategies to help companies ### Huawei Dazzles with Electric Vehicles and Wireless Earbuds

During a prominent unveiling event, Huawei, the Chinese telecommunications powerhouse, kept quiet about its enigmatic new 5G phone and alleged cutting-edge chip development. Instead, Huawei astounded the audience by presenting ### Digital Banking Needs Cybersecurity

The banking, financial, and insurance (BFSI) sectors are pioneers in digital transformation, using web applications and application programming interfaces (APIs) to provide seamless services to customers around the world. Rising