icrosoft Excel is a powerful tool for writing user interfaces for spreadsheet applications. Apart from the ability to display information in a tabular form, Excel has scores of functions capable of addressing most calculation needs?and if those aren’t sufficient, you can create your own functions, because Excel has Visual Basic for Applications (VBA) embedded in it.

One powerful function that shipped with Excel is the IF function. The IF function is very similar to the IF statement in most programming languages, giving applications a way to choose one of several options. However, unlike the IF statement, you can not add comments to an IF function, because comments aren’t permitted within function calls in worksheet cells. Therefore, it can be very difficult to understand a complex nested IF function.

Moreover, a programming language lets you indent lines of code, which helps make it much easier to read and comprehend what that code does. The worksheet IF function does not allow indentation.

Finally, you can nest worksheet IF functions, so it’s easy to end up with a formula that’s difficult to understand. Here’s an example:

` =IF(price_br="BID",IF(B10<>"",IF(Q10<>"", (AD10-Q10),IF(T10<>"",(AD10-T10),IF(X10<>"", (AD10-X10),IF(P10="N"," Hello","No Hello")))),""),IF(B10<>"",IF(Q10<>"",(AD10- Q10),IF(T10<>"",(AD10-((U10-T10)/2+T10)), IF(X10<>"",(AD10-((Y10-X10)/2+X10)), IF(P10="N",("Hello"),"No() Hello")))),""))`

Lest you think I simply made this formula up as a “straw man” example, I didn’t. I saw it in a real spreadsheet in a real financial institution (although I did change some literal strings to disguise it).

If you are in a pressure situation where you need to debug a spreadsheet formula, you will yearn for a tool to help simplify reading and understanding the code. Excel does give you some minimal help by color-coding the opening and closing braces, but the Formula Formatter tool discussed in this article goes several steps further. After you install the Formula Formatter, right clicking on any cell containing an IF function allows you to view the formula in a nicely-formatted manner, and with any nested IF functions neatly indented.

The IF function examines a worksheet formula and converts it into pseudo code, displaying the function in an easy-to-understand form. For example, after parsing and formatting the IF function, the formula shown earlier looks like this:

` START_IF_00 =IF(price_br="BID" then START_IF_01 IF(B10<>"" then START_IF_02 IF(Q10<>"" then (AD10-Q10) else START_IF_03 IF(T10<>"" then (AD10-T10) else START_IF_04 IF(X10<>"" then (AD10-X10) else START_IF_05 IF(P10="N" then " Hello" else "No Hello") END_IIF_05 ) END_IIF_04 ) END_IIF_03 ) END_IIF_02 else "") END_IIF_01 else START_IF_06 IF(B10<>"" then START_IF_07 IF(Q10<>"" then (AD10-Q10) else START_IF_08 IF(T10<>"" then (AD10-((U10-T10)/2+T10)) else START_IF_09 IF(X10<>"" then (AD10-((Y10-X10)/2+X10)) else START_IF_10 IF(P10="N" then ("Hello") else "No() Hello") END_IIF_10 ) END_IIF_09 ) END_IIF_08 ) END_IIF_07 else "") END_IIF_06 ) END_IIF_00`

As you can see, while still complex, it’s far easier to read than the original packed one-line version.

**Building an IF Function Parser**

The code to parse an IF formula consists of a .NET class, callable by VBA in Excel. The heart of the parser is the .NET framework’s Regular Expression class, used to identify each IF function. Syntactically, an IF function in Excel always takes this form:

` IF(A=B,Y,Z)`

It’s worth pointing out that any of the items A,B,Y, or Z in the preceding formula could be another IF function or indeed *any* built in or user-defined function in Excel. The key to isolating IF statements is to identify where each function begins and ends. The opening and closing parentheses provide an indication; however it’s not quite as simple as that. First, you need to identify the opening or closing parentheses that signify either the beginning or end of an IF function. But remember, parentheses can also occur in literal strings. Moreover, other functions can contain parentheses, so the parser must unambiguously identify the specific parentheses belonging to the IF formula.

Using regular expressions simplifies the problem considerably. The parser uses the .NET framework class because Excel VBA doesn’t include a robust regular expression parser. For example, the code below captures all the IF functions in a formula:

` string patternIF= "IF\("; Regex reg = new Regex( patternIF, RegexOptions.Multiline | RegexOptions.IgnoreCase); MatchCollection matches = reg.Matches(sourceCopy);`

Because parentheses are crucial, the code needs to handle braces that are part of other Excel formulas. To identify those formulas that do not take any parameters (such as Date()), you can use the code below:

` string pattern = @"[a-zA-Z-9]*()"; Regex reg = new Regex( pattern, RegexOptions.IgnoreCase | RegexOptions.Multiline); MatchCollection matches = reg.Matches(ifTest);`

The parser must also identify all the literal strings in a formula, using this pattern:

` string localPattern = @"x22[sa-zA-Z()]*x22"; Regex regLocal = new Regex( localPattern,RegexOptions.Multiline | RegexOptions.IgnoreCase);`

So the logic follows this procedure:

- A user selects a formula
- The parser identifies all the IF functions the formula contains, and then figures out which of those is the innermost IF function.
- It replaces any literal strings and all built-in or user-defined functions in the formula with tokens, using regular expression pattern matching.
- Because IF functions can be nested, the parser also needs to work out the relationship between the various IF functions to indent the pseudo code properly.
- Finally, the parser breaks down each IF statement into three parts, the IF part, the Then part and the Else part. As it reconstructs the formula from the tokens, the display code ensures that each part of the “IF” function appears on a separate line, and adds indentation based on where the IF function lies in the whole formula.

**Exposing the .Net Assembly to COM Clients**

To expose this assembly for use by unmanaged COM Clients, you must declare a .NET Interface and define a GUID attribute:

` [Guid("43A8EB80-A566-4985-BEDE-58312478029D")] public interface IParser { void parse(string source); }`

The .NET assembly implements the IParser interface:

` [Guid("22FAF7FA-8FCA-4eca-8E28-11DBC07475F4")] public class FormulaAnalyser : IParser`

Building the application creates a type library called FormulaParser.tlb. Any COM client can now use the type library to access public exposed functionality in the .NET assembly.

**Connecting to a .NET Assembly from Excel**

To use this tool in Excel, you must be able to run macros, so you should click the Tools menu and set your Macros|Security setting to medium. You’ll find an Excel add-in file in the downloadable code. An add-in file has an .xla extension and usually contains code only.

First, open up the add-in file called FormulaFormatter.xla. To see or edit the code for the add-in, press Alt + F11. Click the Tools menu and select References. If you see a reference to FormulaParser.tlb, unselect it. Browse to the location where you copied the type library FormulaParser.tlb and select that file. Before leaving the code window, make sure you save the add-in by clicking the File menu and selecting?”Save FormulaFormatter.xla” (alternatively, press Ctrl + S).

The main purpose of this .xla add-in is to be able to call the Formula Formatter when a user right-clicks in a cell containing an IF function. A bit of VBA code sets up the shortcut menu for the Excel application:

` Private Sub App_SheetBeforeRightClick( _ ByVal Sh As Object, ByVal Target As Range, _ Cancel As Boolean) Dim bRet As Boolean If Target.HasFormula Then If Target.HasArray Then Debug.Print "Array formula" End If If InStr(1, Target.formula, "if(", _ vbTextCompare) > 0 Then Application.CommandBars("Cell").Controls _ (FORMULA_MENU_NAME).Enabled = True Else Application.CommandBars("Cell").Controls _ (FORMULA_MENU_NAME).Enabled = False End If Else Application.CommandBars("Cell").Controls _ (FORMULA_MENU_NAME).Enabled = False End If End Sub`

**Setting Up the Formula Formatter**

At this point, the add-in is set up and you are ready to use the Formula Formatter. Open any workbook of interest. To bring the Formula Formatter into your environment, you need to set a reference to the FormulaFormatter.xla add-in. Go to the Tools menu and select Add-in. Click the Browse button, and navigate to the location of the FormulaFormatter.xla on your file system.

? | |

Figure 1. Using the Formula Formatter Add-in: To use the add-in, right click on any cell containing an IF function, and you’ll see the option “Format Formula” on the popup menu. |

? | |

Figure 2. The Formula Formatter Window: Selecting the “Format Formula” option from a cell’s context menu displays the IF formula formatted and indented for enhanced readability. |

After successfully registering the add-in, right click on any cell where the formula contains an IF function, and you should see the option “Format Formula” as shown in Figure 1.

When you select the “Format Formula” option, you’ll see the formula appear nicely formatted and indented in a popup window (see Figure 2).

This tool is a starting point. I’ve tested it against a number of formulas and so far, it’s worked just fine; however, because it’s impossible to envisage all the possible formulas that people might use in real life, it’s very possible that the Formula Formatter may need more tweaking. The good news is that you can download the code for the tool, so feel free to tweak it to your heart’s content.

One convenient enhancement you might consider making to the Formula Formatter is adding the ability to let users edit the formula in pseudo code, and then re-apply their changes to the worksheet. Essentially, you’d need to perform the reverse of the formula parsing process shown in this article. If you examine the code and understand how the formula was converted to pseudo-code, it should become apparent what you would need to do.