Predicting Your Firm’s Future with Least Squares, Part II

Predicting Your Firm’s Future with Least Squares, Part II

Linear least squares lets you find the line that best fits a set of data points but sometimes the points don’t lie along a line. The second part of this two-part article explains how you can use polynomial least squares to find curves that fit your data.As the first part of this article explained, sometimes you can use an advanced math concept — linear least squares — to predict the future. If you can find a line that closely fits your data, you can extend that line into the future to make predictions about what’s to come. If your data represents such things as web site traffic, retail sales, or electricity usage over time, then a good linear least squares fit may help you predict future revenue and expenses.But what if your data doesn’t fall along a neat line? Take a look at the data points plotted in Figure 1. These points clearly don’t lie in a line but they do form some sort of pattern. If you can find a curve that the points do fit, then you can use that curve to predict future values. Figure 1. Out of Line: These points don’t lie along a line but there is a clear pattern.If the points fall along some polynomial curve, then you can use techniques similar to those described earlier in this article to find a polynomial least squares fit. In other words, you can find a polynomial A0 + A1 * x + A2 * x2 + … + Ad * xd for values A0, A1, …, Ad to minimize the sum of the squares of the vertical distances between the points and the curve. In this equation, d is called the polynomial’s degree.Figure 2 shows a degree 2 polynomial least squares fit for the points shown in Figure 1. This curve fits the data pretty well. Figure 2. Polynomial Perfection: This data fits closely to a degree 2 polynomial.To find the least squares polynomial fit to the data, you need to use calculus that’s very similar to the calculus you used to find the linear least squares fit. (For more information on that calculus, see the first part of this article. You also need to use a little linear algebra. Neither of these is too hard but if you don’t want to work through them, you can skip the next few sections.

A Little Calculus

Consider the general equation for a polynomial of degree d. A0 + A1 * x + A2 * x2 + … + Ad * xdYou need to find the values of A0, A1, A2, … Ad that minimizes the sum of the squares of the vertical distances between the points and the curve. For the ith data point, you can plug in the value xi to find the corresponding value on the curve. You can then subtract that value from the data point’s y coordinate yi and square the result to get the ith error term. Ei = (yi – (A0 + A1 * xi + A2 * xi2 + … + Ad * xid))2This is all analogous to the discussion of linear least squares in the first part of this article. In fact, if the polynomial has degree 1, this is the same equation you use for linear least squares.The next step is to take partial derivatives of this equation with respect to the values you want to find: A0, A1, A2, … Ad. That may seem like a terrifying mess but it’s really not too bad if you use the same three rules described in the first part of the article.To take the partial derivative of a function E with respect to variable A0, written ?/?A0(E) or ?E/?A0, you can use these rules: The partial derivative of a sum is the sum of the derivatives. The partial derivative of a constant (something that doesn’t have the variable A0 in it) is 0. The partial derivative of a function to a power ?/?A0(FK) equals K * (FK-1) * ?/?A0(F).If you use these three rules, you will find that the partial derivative with respect to Aj of the ith error term is: ?Ei/?Aj = 2 * (yi – (A0 + A1 * xi + … + Ad * xid)) * (-xij)= -2 * (yi * xij – A0 * xij – A1 * xij+1 – … – Ad * xij+d) Adding up the terms for the all of the data points you get: ?E/?Aj = ??Ei/?Aj = 2 * ?(yi * xij – A0 * xij – A1 * xij+1 – … – Ad * xij+d) If you pull out the Ai terms you can rewrite this as: ?E/?Aj = 2 * (?yi * xij – A0 * ?xij – A1 * ?xij+1 – … – Ad * ?xij+d)This equation is intimidating but all of the xi and yi values are coordinates of the data points so they’re just numbers. That means this equation has the following form for some constant S values.?E/?Aj = 2 * (S – A0 * S0 – A1 * S1 – … – Ad * Sd)This is a fairly simple equation with d + 1 unknowns A0, A1, … , Ad.When you take the partial derivatives with respect to Aj as j varies from 0 to d, you get d + 1 equations with d + 1 unknowns. Now you can solve the equations to find the coefficients and you’re done.For a linear least squares fit, solving 2 equations with 2 unknowns isn’t too hard. For this more general case where the degree of the polynomial might be large, however, you need a more general solution. That’s where the linear algebra comes in.

A Little Linear Algebra

Suppose you have a series of n equations with n unknown values x1, x2, … , xn: A11 * x1 + A12 * x2 + … + A1n * xn = C1A21 * x1 + A22 * x2 + … + A2n * xn = C2…An1 * x1 + An2 * x2 + … + Ann * xn = Cn (Note: To keep this discussion general and more consistent with what you’ll see in other places, the variables here are x1, x2, and so on and the known constants are A11, A12, and so forth. When we use this to find polynomial least squares fits, you’ll be solving for A1, A2, and so forth.)One way to solve this system of equations is Gaussian elimination. (Actually I use a slightly modified version of Gaussian elimination that avoids the need for back substitution. For a more rigorous discussion of Gaussian elimination, look in Wolfram MathWorld or Wikipedia.)It’s a fairly simple method but learning it from a description can be a bit tricky so after the following description I’ll work through an example that should make things clearer. You can write the system of equations as a matrix equation like this:

The first step is to put a 1 in the matrix’s upper left corner. That’s easy. Simply divide each of the entries in the first row by the value currently in the upper left corner. This is called normalizing the row. If you think about it, you’ll realize that this doesn’t change the truth of the first equation. All you’re doing is dividing both sides of the equation by the same number so the equation remains true.Now you have an augmented matrix that looks like the following where ? means an entry whose value isn’t important right now. You’ll need those values later but they don’t change the basic shape of the augmented matrix.

Now you repeat these steps to place a 1 in the second row’s second column and 0s in all of the other entries in that column. Continue the process until the matrix has this form:

Now if you go back to the original meaning of this matrix, you can read out the xi values. Using the values in this matrix to reconstruct the first equation you get x1 + 0 + 0 + … + 0 = V1 so x1 = V1. Using the other rows in the matrix you can see that xi = Vi. There’s one odd special case that may throw a monkey wrench into this process. When you try to change the ith entry on row i into a 1, there’s a chance that the entry will have value 0. In that case, you cannot divide the other entries in that row by 0 to normalize the row.

If that happens, simply swap that row with one of the following rows that does not have a 0 in this column and continue as before. (If there is no such row, then the system of equations does not have a unique solution.) If you think about what this swap means, you can see that it doesn’t change the truth of the equations. This operation corresponds to switching the original order of the equations and that certainly doesn’t change whether they are true or false.

A Gaussian Example

So how about that example? Suppose you want to find a solution to the following three equations with three unknowns:

3 * x + 6 * y + 3 * z = 9
2 * x + 4 * y – z = -3
4 * x + 6 * y + 2 * z = 8

The augmented matrix for this system of equations is:

Now you can read off the solution: x = 2, y = -1, z = 3. If you plug these values into the original equations, you’ll see that they work.

Hopefully by now you understand Gaussian elimination. It’s fairly simple and only requires some easy multiplication, addition, and row swapping. The only catch is that there are a lot of operations so it’s easy to make mistakes. Fortunately the computer is very good at performing these sorts of operations without making mistakes.

The following code shows how the PolynomialLeastSquares example program (which is available for download in C# and Visual Basic versions) performs Gaussian elimination. I’ve put this in a separate method because it’s handy to have for solving other systems of equations not just those needed for polynomial least squares.
// Perform Gaussian elimination on these coefficients.// Return the array of values that gives the solution.private double[] GaussianElimination(double[,] coeffs){    int max_equation = coeffs.GetUpperBound(0);    int max_coeff = coeffs.GetUpperBound(1);    for (int i = 0; i <= max_equation; i++)    {        // Use equation_coeffs[i, i] to eliminate the ith        // coefficient in all of the other equations.        // Find a row with non-zero ith coefficient.        if (coeffs[i, i] == 0)        {            for (int j = i + 1; j<= max_equation; j++)            {                // See if this one works.                if (coeffs[j, i] != 0)                {                    // This one works. Swap equations i and j.                    // This starts at k = i because all                    // coefficients to the left are 0.                    for (int k = i; k <= max_coeff; k++)                    {                        double temp = coeffs[i, k];                        coeffs[i, k] = coeffs[j, k];                        coeffs[j, k] = temp;                    }                    break;                }            }        }        // Make sure we found an equation with        // a non-zero ith coefficient.        double coeff_i_i = coeffs[i, i];        if (coeff_i_i == 0)        {            throw new ArithmeticException(String.Format(                "There is no unique solution for these points.",                coeffs.GetUpperBound(0) - 1));        }        // Normalize the ith equation.        for (int j = i; j <= max_coeff; j++)        {            coeffs[i, j] /= coeff_i_i;        }        // Use this equation value to zero out        // the other equations' ith coefficients.        for (int j = 0; j<= max_equation; j++)        {            // Skip the ith equation.            if (j != i)            {                // Zero the jth equation's ith coefficient.                double coef_j_i = coeffs[j, i];                for (int d = 0; d <= max_coeff; d++)                {                    coeffs[j, d] -= coeffs[i, d] * coef_j_i;                }            }        }    }    // At this point, the ith equation contains    // 2 non-zero entries:    //      The ith entry which is 1    //      The last entry coeffs[max_coeff]    // This means Ai = equation_coef[max_coeff].    double[] solution = new double[max_equation + 1];    for (int i = 0; i <= max_equation; i++)    {        solution[i] = coeffs[i, max_coeff];    }    // Return the solution values.    return solution;}

Back to Polynomial Least Squares

That was a long digression from polynomial least squares so let’s review where we were when the digression began. To find the best fit polynomial, you found that the partial derivative of the polynomial with respect to Aj is:

?E/?Aj = 2 * (?yi * xij – A0 * ?xij – A1 * ?xij+1 – … – AD * ?xij+D)

To find the best solution, you set this equation and the partial derivatives for the other Ai equal to 0 and solve for the Ai values. Now that you know how to perform Gaussian elimination, this is a snap. The only tricky part is figuring out what coefficients to use in the equations that you send into the Gaussian elimination method.

The following code shows how the PolynomialLeastSquares example program builds the coeffs array holding the augmented matrix that it passes to the GaussianElimination method.

// Find the least squares polynomial fit.private double[] FindPolynomialLeastSquaresFit(    List points, int degree){    // Allocate space for (degree + 1) equations with     // (degree + 2) terms each.    double[,] coeffs = new double[degree + 1, degree + 2];    // Calculate the coefficients for the equations.    for (int j = 0; j <= degree; j++)    {        // Calculate the coefficients for the jth equation.        // Calculate the constant term for this equation.        coeffs[j, degree + 1] = 0;        foreach (PointF pt in points)        {            coeffs[j, degree + 1] -= Math.Pow(pt.X, j) * pt.Y;        }        // Calculate the other coefficients.        for (int a_sub = 0; a_sub <= degree; a_sub++)        {            // Calculate the dth coefficient.            coeffs[j, a_sub] = 0;            foreach (PointF pt in points)            {                coeffs[j, a_sub] -= Math.Pow(pt.X, a_sub + j);            }        }    }    // Solve the equations.    return GaussianElimination(coeffs);}

By using programs such as PolynomialLeastSquares, you can get a better understanding of your data. You can find the best fit of a given degree and then judge whether the data points fit that polynomial well.

Ideally if you know something about the process that generated the data, you may be able to figure out what degree to use for the polynomial, but if you can't, beware of fitting data to high degree polynomials. A degree d polynomial can exactly fit d + 1 points so if you use too high a degree you'll get a misleadingly exact match.For example, Figure 3 shows a degree 9 polynomial exactly fitting 10 data points. The curve fits the points exactly but that doesn't mean it's the best fit or the most useful one.

Figure 3. Too Tight a Fit: A high-degree polynomial can fit most any data, possibly obscuring its true structure.

Figure 4 shows the same points fitted by a degree 2 polynomial. Although this doesn't fit the points exactly, it's probably is a better representation of the data than the previous curve.

For example, suppose you want to predict the y value when x is close to 0. The polynomial in Figure 4 would give you a value close to the bottom of the graph while the polynomial in Figure 3 would give you a value way off the top. Which is really better depends on your situation but unless something really complex is happening, the degree 2 curve is probably a better bet.

Figure 4. Relaxed Fit: A degree 2 polynomial probably fits this data better than a degree 9 polynomial even though it doesn't fit the points exactly.

You can use techniques similar to those described here to find other kinds of least squares fits, too. For example, depending on your data you may want to use a logarithmic curve of the form y = a + b * ln(x), a power curve of the form y = a * xb, or an exponential curve of the form y = a * ebx. For more information on these kinds of curves, see Wolfram MathWorld.

devx-admin

devx-admin

Share the Post:
Game Changer

How ChatGPT is Changing the Game

The AI-powered tool ChatGPT has taken the computing world by storm, receiving high praise from experts like Brex design lead, Pietro Schirano. Developed by OpenAI,

Future of Cybersecurity

Cybersecurity Battles: Lapsus$ Era Unfolds

In 2023, the cybersecurity field faces significant challenges due to the continuous transformation of threats and the increasing abilities of hackers. A prime example of

Apple's AI Future

Inside Apple’s AI Expansion Plans

Rather than following the widespread pattern of job cuts in the tech sector, Apple’s CEO Tim Cook disclosed plans to increase the company’s UK workforce.

AI Finance

AI Stocks to Watch

As investor interest in artificial intelligence (AI) grows, many companies are highlighting their AI product plans. However, discovering AI stocks that already generate revenue from

Web App Security

Web Application Supply Chain Security

Today’s web applications depend on a wide array of third-party components and open-source tools to function effectively. This reliance on external resources poses significant security

Thrilling Battle

Thrilling Battle: Germany Versus Huawei

The German interior ministry has put forward suggestions that would oblige telecommunications operators to decrease their reliance on equipment manufactured by Chinese firms Huawei and

Game Changer

How ChatGPT is Changing the Game

The AI-powered tool ChatGPT has taken the computing world by storm, receiving high praise from experts like Brex design lead, Pietro Schirano. Developed by OpenAI, ChatGPT is known for its

Future of Cybersecurity

Cybersecurity Battles: Lapsus$ Era Unfolds

In 2023, the cybersecurity field faces significant challenges due to the continuous transformation of threats and the increasing abilities of hackers. A prime example of this is the group of

Apple's AI Future

Inside Apple’s AI Expansion Plans

Rather than following the widespread pattern of job cuts in the tech sector, Apple’s CEO Tim Cook disclosed plans to increase the company’s UK workforce. The main area of focus

AI Finance

AI Stocks to Watch

As investor interest in artificial intelligence (AI) grows, many companies are highlighting their AI product plans. However, discovering AI stocks that already generate revenue from generative AI, such as OpenAI,

Web App Security

Web Application Supply Chain Security

Today’s web applications depend on a wide array of third-party components and open-source tools to function effectively. This reliance on external resources poses significant security risks, as malicious actors can

Thrilling Battle

Thrilling Battle: Germany Versus Huawei

The German interior ministry has put forward suggestions that would oblige telecommunications operators to decrease their reliance on equipment manufactured by Chinese firms Huawei and ZTE. This development comes after

iPhone 15 Unveiling

The iPhone 15’s Secrets and Surprises

As we dive into the most frequently asked questions and intriguing features, let us reiterate that the iPhone 15 brings substantial advancements in technology and design compared to its predecessors.

Chip Overcoming

iPhone 15 Pro Max: Overcoming Chip Setbacks

Apple recently faced a significant challenge in the development of a key component for its latest iPhone series, the iPhone 15 Pro Max, which was unveiled just a week ago.

Performance Camera

iPhone 15: Performance, Camera, Battery

Apple’s highly anticipated iPhone 15 has finally hit the market, sending ripples of excitement across the tech industry. For those considering upgrading to this new model, three essential features come

Battery Breakthrough

Electric Vehicle Battery Breakthrough

The prices of lithium-ion batteries have seen a considerable reduction, with the cost per kilowatt-hour dipping under $100 for the first occasion in two years, as reported by energy analytics

Economy Act Soars

Virginia’s Clean Economy Act Soars Ahead

Virginia has made significant strides towards achieving its short-term carbon-free objectives as outlined in the Clean Economy Act of 2020. Currently, about 44,000 megawatts (MW) of wind, solar, and energy

Renewable Storage Innovation

Innovative Energy Storage Solutions

The Department of Energy recently revealed a significant investment of $325 million in advanced battery technologies to store excess renewable energy produced by solar and wind sources. This funding will

Renesas Tech Revolution

Revolutionizing India’s Tech Sector with Renesas

Tushar Sharma, a semiconductor engineer at Renesas Electronics, met with Indian Prime Minister Narendra Modi to discuss the company’s support for India’s “Make in India” initiative. This initiative focuses on

Development Project

Thrilling East Windsor Mixed-Use Development

Real estate developer James Cormier, in collaboration with a partnership, has purchased 137 acres of land in Connecticut for $1.15 million with the intention of constructing residential and commercial buildings.

USA Companies

Top Software Development Companies in USA

Navigating the tech landscape to find the right partner is crucial yet challenging. This article offers a comparative glimpse into the top software development companies in the USA. Through a

Software Development

Top Software Development Companies

Looking for the best in software development? Our list of Top Software Development Companies is your gateway to finding the right tech partner. Dive in and explore the leaders in

India Web Development

Top Web Development Companies in India

In the digital race, the right web development partner is your winning edge. Dive into our curated list of top web development companies in India, and kickstart your journey to

USA Web Development

Top Web Development Companies in USA

Looking for the best web development companies in the USA? We’ve got you covered! Check out our top 10 picks to find the right partner for your online project. Your

Clean Energy Adoption

Inside Michigan’s Clean Energy Revolution

Democratic state legislators in Michigan continue to discuss and debate clean energy legislation in the hopes of establishing a comprehensive clean energy strategy for the state. A Senate committee meeting

Chips Act Revolution

European Chips Act: What is it?

In response to the intensifying worldwide technology competition, Europe has unveiled the long-awaited European Chips Act. This daring legislative proposal aims to fortify Europe’s semiconductor supply chain and enhance its

Revolutionized Low-Code

You Should Use Low-Code Platforms for Apps

As the demand for rapid software development increases, low-code platforms have emerged as a popular choice among developers for their ability to build applications with minimal coding. These platforms not

Cybersecurity Strategy

Five Powerful Strategies to Bolster Your Cybersecurity

In today’s increasingly digital landscape, businesses of all sizes must prioritize cyber security measures to defend against potential dangers. Cyber security professionals suggest five simple technological strategies to help companies

Global Layoffs

Tech Layoffs Are Getting Worse Globally

Since the start of 2023, the global technology sector has experienced a significant rise in layoffs, with over 236,000 workers being let go by 1,019 tech firms, as per data

Huawei Electric Dazzle

Huawei Dazzles with Electric Vehicles and Wireless Earbuds

During a prominent unveiling event, Huawei, the Chinese telecommunications powerhouse, kept quiet about its enigmatic new 5G phone and alleged cutting-edge chip development. Instead, Huawei astounded the audience by presenting

Cybersecurity Banking Revolution

Digital Banking Needs Cybersecurity

The banking, financial, and insurance (BFSI) sectors are pioneers in digital transformation, using web applications and application programming interfaces (APIs) to provide seamless services to customers around the world. Rising