I think every developer has faced the problem of converting a Unicode string to a byte[]. Fortunately, the .NET Framework has rich support for this. It provides the following four encodings, all of which derive from the base class Encoding (for details, see http://msdn2.microsoft.com/en-us/library/system.text.encoding(VS.71).aspx):
1. System.Text.ASCIIEncoding – Encodes characters as 7-bit ASCII.
2. System.Text.UnicodeEncoding – Encodes characters as UTF-16, two consecutive bytes per code unit, in either big-endian or little-endian byte order.
3. System.Text.UTF7Encoding – Encodes characters as UTF-7.
4. System.Text.UTF8Encoding – Encodes characters as UTF-8.
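To make the difference between these encodings concrete, here is a minimal sketch that encodes one short string with three of them and prints the resulting bytes. The sample string and the class name EncodingComparison are made up for illustration. UTF7Encoding is left out because newer .NET versions mark it as obsolete.

```csharp
using System;
using System.Text;

class EncodingComparison
{
    static void Main()
    {
        // Hypothetical sample: U+00E6 (ae ligature), U+2022 (bullet), then "ab".
        // Escapes are used so the result does not depend on the source file's encoding.
        string sample = "\u00E6\u2022ab";

        Encoding[] encodings =
        {
            new ASCIIEncoding(),   // non-ASCII characters are replaced with '?'
            new UnicodeEncoding(), // UTF-16, little-endian by default
            new UTF8Encoding()     // one to four bytes per character
        };

        foreach (Encoding encoding in encodings)
        {
            byte[] bytes = encoding.GetBytes(sample);
            Console.WriteLine("{0,-16} {1} bytes: {2}",
                encoding.GetType().Name, bytes.Length, BitConverter.ToString(bytes));
        }
        // ASCIIEncoding   -> 4 bytes: 3F-3F-61-62
        // UnicodeEncoding -> 8 bytes: E6-00-22-20-61-00-62-00
        // UTF8Encoding    -> 7 bytes: C3-A6-E2-80-A2-61-62
    }
}
```

Note that ASCIIEncoding silently turns the two non-ASCII characters into question marks, so it is only safe for text known to be plain ASCII.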
Now, consider some text that contains non-ASCII characters, such as this XML document that declares UTF-8 as its encoding:
// Requires: using System; using System.Text;
StringBuilder _TextBuilder = new StringBuilder(223);
_TextBuilder.AppendFormat(@"<?xml version=""1.0"" encoding=""UTF-8""?>{0}", Environment.NewLine);
// Append is used where there are no format items to substitute.
_TextBuilder.Append(@"<Contents Type=""string""><![CDATA[• das sfas fdasfs afdasfasd fasd hg kjh klhhjn ");
_TextBuilder.AppendFormat(@"fdhæfdhj fdh.lfjnhfjk.lnh fdæjlf hlæfjhnf læhfj hglæjælfdh{0}", Environment.NewLine);
_TextBuilder.Append(@"hfdj fdklhjfdh]]></Contents>");
string pdfDirectorXml = _TextBuilder.ToString();
The string pdfDirectorXml contains some non-ASCII characters (æ) as well as the classic bullet character (•). The easiest way to convert this text to its UTF-8 binary representation is:
UTF8Encoding encoding = new UTF8Encoding();
byte[] bytes = encoding.GetBytes(pdfDirectorXml);
This gives you the UTF-8 encoded byte[] representation of the string.
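A quick way to check that nothing was lost is to decode the bytes again with GetString and compare the result with the original. The sketch below does that for a short made-up string (not the full XML above) and repeats the round trip with ASCIIEncoding to show where it breaks. The class name RoundTrip is only for illustration.

```csharp
using System;
using System.Text;

class RoundTrip
{
    static void Main()
    {
        // Hypothetical sample containing U+00E6 and the bullet U+2022.
        string original = "fdh\u00E6fdhj \u2022 das";

        // UTF-8 can represent every Unicode character, so the round trip is lossless.
        UTF8Encoding utf8 = new UTF8Encoding();
        byte[] utf8Bytes = utf8.GetBytes(original);
        string fromUtf8 = utf8.GetString(utf8Bytes);
        Console.WriteLine(fromUtf8 == original);  // True

        // ASCII replaces the two non-ASCII characters with '?', so the text changes.
        ASCIIEncoding ascii = new ASCIIEncoding();
        byte[] asciiBytes = ascii.GetBytes(original);
        string fromAscii = ascii.GetString(asciiBytes);
        Console.WriteLine(fromAscii == original); // False
    }
}
```

If the bytes are going to a file or over the wire together with an XML declaration, the encoding used here should match the one named in that declaration.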