C# String problems

When retrieving HTML or string content from some editors, such as the SharePoint editor, the editor may insert zero-width or control characters into the content. These invisible characters can cause methods like string.IndexOf, string.Compare and string.Replace to return unexpected results.
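As a small illustration of the problem, the sketch below uses a made-up string containing a zero-width space (U+200B). The class name, helper method and sample value are hypothetical, and ordinal comparison is used so the result does not depend on culture settings.

```csharp
using System;

public static class ZeroWidthDemo
{
    // Hypothetical sample: "Share" + zero-width space (U+200B) + "Point".
    // On screen it looks identical to "SharePoint".
    public const string Dirty = "Share\u200BPoint";

    // Ordinal search compares raw UTF-16 code units, so the hidden
    // character breaks the match.
    public static int FindOrdinal(string haystack, string needle)
    {
        return haystack.IndexOf(needle, StringComparison.Ordinal);
    }

    public static void Main()
    {
        Console.WriteLine(Dirty.Length);                     // 11, not 10
        Console.WriteLine(FindOrdinal(Dirty, "SharePoint")); // -1

        // string.Replace(string, string) is also ordinal, so nothing is replaced.
        Console.WriteLine(Dirty.Replace("SharePoint", "SP") == Dirty); // True
    }
}
```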

If the application only uses English, it’s possible to strip all control and non-ASCII characters from the data before doing other string operations.

Use Regex.Replace (from the System.Text.RegularExpressions namespace) to do so:

data = Regex.Replace(data, @"[^\x20-\x7F]", "");

This line removes every character outside the range 0x20 to 0x7F in the ASCII table. Note that this also strips tabs and line breaks, along with any accented or non-English characters.

Then carry out the string operations as normal.
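Putting the steps together, a minimal sketch might look like the following. The class name, method name and sample string are assumptions made for illustration; only the Regex pattern comes from the approach described above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AsciiCleaner
{
    // Matches any character outside the range 0x20-0x7F.
    private static readonly Regex NonAscii = new Regex(@"[^\x20-\x7F]");

    // Removes every character the pattern matches.
    public static string Clean(string data)
    {
        return NonAscii.Replace(data, "");
    }

    public static void Main()
    {
        // Hypothetical input with a zero-width space hidden inside "SharePoint".
        string dirty = "Share\u200BPoint list";
        string clean = Clean(dirty);

        Console.WriteLine(clean); // SharePoint list
        Console.WriteLine(clean.IndexOf("SharePoint", StringComparison.Ordinal)); // 0
    }
}
```

Keeping the Regex in a static field avoids rebuilding the pattern on every call, which may matter if the cleanup runs over many strings.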
