Tip of the Day
Language: C++
Expertise: Advanced
Feb 25, 2013

String Literals and Values with the Backslash Escape Character in C# and C++/CLI

Backslashes in C# and C++/CLI are widely used in Windows file paths and in escape sequences. The use of double backslashes, and of backslashes to perform special functions, is well described in many publications. However, what appears in the source code is not necessarily what you will see in the Visual Studio Debugger. The look-up table below compares the source code, memory representation and visual appearance of strings that include backslashes.

During debugging of a complex application written in C# and C++/CLI, differences were noted in how string values appear in the MS Visual Studio Text Visualizers for the two languages. Tests were created to analyze the compiled output and the values displayed by the Debugger. The results are summarized in the table below, which presents the differences developers will find essential for debugging purposes.

| Language | Source code: string literal | String length | Unicode code units in MS VS Memory window | Text in MS VS Watch window | Remarks |
|---|---|---|---|---|---|
| C# | "abc \\n" | 6 | 61 00 62 00 63 00 20 00 5C 00 6E 00 | "abc \\n" | |
| C++/CLI | "abc \\n" | 6 | 61 00 62 00 63 00 20 00 5C 00 6E 00 | "abc \n" | Only one backslash visible |
| C# | "abc \n" | 5 | 61 00 62 00 63 00 20 00 0a 00 | "abc \n" | |
| C++/CLI | "abc \n" | 5 | 61 00 62 00 63 00 20 00 0a 00 | "abc " | Only four chars visible |
| C# | @"abc \n" | 6 | 61 00 62 00 63 00 20 00 5C 00 6E 00 | "abc \\n" | Using verbatim string literal, OK |
| C++/CLI | @"abc \n" | | | | error C2018: unknown character 0x40 |
| C# | "\/abc" | | | | error CS1009: Unrecognized escape sequence |
| C++/CLI | "\/abc" | 4 | 2F 00 61 00 62 00 63 00 | "/abc" | warning C4129: '/' : unrecognized character escape sequence. The backslash character is removed during compilation. |
| C# | "\x0Axyz" | 4 | 0a 00 78 00 79 00 7a 00 | "\nxyz" | |
| C++/CLI | "\x0Axyz" | 4 | 0a 00 78 00 79 00 7a 00 | "xyz" | Only three chars visible |
| C# | "\x0A0Axyz" | 4 | 0a 0a 78 00 79 00 7a 00 | "xyz" | |
| C++/CLI | "\x0A0Axyz" | | | | error C2022: '2570': too big for character |
| C++/CLI | L"\x0A0Axyz" | 4 | 0a 0a 78 00 79 00 7a 00 | "xyz" | |

Summary of analysis of tests in MS VS 2008/2010 for C# and C++/CLI

Similarities:

  • In both C# and C++/CLI source code, escape sequences (a backslash followed by a single character, or by the character x and a sequence of hexadecimal digits; C++/CLI also accepts a backslash followed by up to three octal digits) are used only in character constants and string literals. The compiled value in memory holds the actual Unicode code units that these sequences denote.
  • Both compilers recognize "\\" and "\n" character escape sequences, as expected.
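
The code-unit column of the table can be reproduced outside the debugger. The following is a minimal sketch in standard C++11, assuming that `char16_t` literals (the `u"..."` prefix) are an acceptable stand-in for the UTF-16 data of a .NET string; the helper name `dump_code_units` is hypothetical, and it prints lowercase hex throughout, whereas the table mixes cases.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of any library): renders each UTF-16 code
// unit as two bytes, low byte first (little-endian), which is the byte order
// shown in the Memory window column of the table.
std::string dump_code_units(const std::u16string& s) {
    std::string out;
    char buf[8];
    for (std::size_t i = 0; i < s.size(); ++i) {
        unsigned unit = static_cast<unsigned>(s[i]);
        // Low byte, then high byte, each as two lowercase hex digits.
        std::snprintf(buf, sizeof buf, "%02x %02x", unit & 0xFFu, (unit >> 8) & 0xFFu);
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}
```

Under these assumptions, `dump_code_units(u"abc \\n")` describes six code units ending in `5c 00 6e 00`, while `dump_code_units(u"abc \n")` describes five code units ending in `0a 00`, matching the first two groups of rows in the table.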

Differences:

  • The verbatim character (@) is not recognized by the C++ compiler, which issues an error, as expected.
  • The underlying value for "\\" is the same after compilation by both compilers, but the Debugger displays it differently for each language.
  • Unrecognized escape sequences are processed differently by the two compilers. The C# compiler issues an error, while the C++ compiler issues a warning (level 1) and continues scanning after removing the backslash character.
  • When hexadecimal digits are used in an escape sequence, the C++ compiler checks whether the specified value fits in a character, or in a wide character if the literal is prefixed with the letter L.
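
The display difference noted above can be modeled with two small functions. This is only a sketch of the observed behavior, not a description of how the Visual Studio Debugger is implemented; both helper names (`escaped_view`, `raw_view`) are hypothetical. One view re-escapes special characters the way the C# rows of the table appear, and the other prints the value unchanged, as in the C++/CLI rows.

```cpp
#include <string>

// Hypothetical helper: shows a value in source-like form, re-escaping
// backslashes, newlines and quotes (the style seen in the C# rows).
std::string escaped_view(const std::string& value) {
    std::string out = "\"";
    for (char c : value) {
        switch (c) {
            case '\\': out += "\\\\"; break;  // one backslash shown as two
            case '\n': out += "\\n";  break;  // newline shown as backslash + n
            case '"':  out += "\\\""; break;
            default:   out += c;      break;
        }
    }
    out += '"';
    return out;
}

// Hypothetical helper: shows the value as-is between quotes, with no
// re-escaping (the style seen in the C++/CLI rows).
std::string raw_view(const std::string& value) {
    return "\"" + value + "\"";
}
```

In this model, `raw_view("abc \\n")` and `escaped_view("abc \n")` produce the same text even though the first value has six characters and the second has five. That is why the displayed text alone cannot identify the stored value; the string length or the Memory window settles it.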

Note:

The hexadecimal value for "\n" is shown as 0a in the Debugger's Memory window, and as 0A when the string is serialized to a file and the file is then opened in the Binary Editor. Both notations denote the same value.
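
As a small illustration that the two notations differ only in formatting, the sketch below prints the same byte in both cases. The helper name `hex_byte` is hypothetical.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats one byte as two hex digits, in upper or
// lower case. The byte itself is unchanged; only its rendering differs.
std::string hex_byte(unsigned char b, bool upper) {
    char buf[3];
    if (upper) {
        std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(b));
    } else {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(b));
    }
    return buf;
}
```

Here `hex_byte('\n', false)` yields `0a` and `hex_byte('\n', true)` yields `0A` for the same newline character.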

Boris Eligulashvili
 