Login | Register   
RSS Feed
Download our iPhone app
Browse DevX
Sign up for e-mail newsletters from DevX


Build this IF Function Parser to Simplify Debugging Excel Formulas : Page 2

If you're tired of trying to decipher long Excel IF formulas displayed in a single line of code, try this convenient Formula Formatter add-in that displays such formulas neatly indented in a popup window.




Application Security Testing: An Integral Part of DevOps

Building an IF Function Parser
The code to parse an IF formula consists of a .NET class, callable by VBA in Excel. The heart of the parser is the .NET framework's Regular Expression class, used to identify each IF function. Syntactically, an IF function in Excel always takes this form:


It's worth pointing out that any of the items A,B,Y, or Z in the preceding formula could be another IF function or indeed any built in or user-defined function in Excel. The key to isolating IF statements is to identify where each function begins and ends. The opening and closing parentheses provide an indication; however it's not quite as simple as that. First, you need to identify the opening or closing parentheses that signify either the beginning or end of an IF function. But remember, parentheses can also occur in literal strings. Moreover, other functions can contain parentheses, so the parser must unambiguously identify the specific parentheses belonging to the IF formula.

Using regular expressions simplifies the problem considerably. The parser uses the .NET framework class because Excel VBA doesn't include a robust regular expression parser. For example, the code below captures all the IF functions in a formula:

string patternIF= "IF\\("; Regex reg = new Regex( patternIF, RegexOptions.Multiline | RegexOptions.IgnoreCase); MatchCollection matches = reg.Matches(sourceCopy);

Because parentheses are crucial, the code needs to handle braces that are part of other Excel formulas. To identify those formulas that do not take any parameters (such as Date()), you can use the code below:

string pattern = @"[a-zA-Z\0-9]*\(\)"; Regex reg = new Regex( pattern, RegexOptions.IgnoreCase | RegexOptions.Multiline); MatchCollection matches = reg.Matches(ifTest);

The parser must also identify all the literal strings in a formula, using this pattern:

string localPattern = @"\x22[\sa-zA-Z\(\)]*\x22"; Regex regLocal = new Regex( localPattern,RegexOptions.Multiline | RegexOptions.IgnoreCase);

So the logic follows this procedure:

  1. A user selects a formula
  2. The parser identifies all the IF functions the formula contains, and then figures out which of those is the innermost IF function.
  3. It replaces any literal strings and all built-in or user-defined functions in the formula with tokens, using regular expression pattern matching.
  4. Because IF functions can be nested, the parser also needs to work out the relationship between the various IF functions to indent the pseudo code properly.
  5. Finally, the parser breaks down each IF statement into three parts, the IF part, the Then part and the Else part. As it reconstructs the formula from the tokens, the display code ensures that each part of the "IF" function appears on a separate line, and adds indentation based on where the IF function lies in the whole formula.

Comment and Contribute






(Maximum characters: 1200). You have 1200 characters left.



Thanks for your registration, follow us on our social networks to keep up-to-date