Converting String to Byte Array in C# - Stack Overflow

The document discusses a user's issue with converting a string to a byte array in C# after migrating from VB.NET, highlighting specific syntax errors and providing various solutions. It includes multiple responses from other users suggesting casting to byte[], using different encoding methods, and avoiding default encoding due to potential issues. The conversation emphasizes the importance of understanding encoding types and provides code snippets for proper conversion methods.

Uploaded by

su.nats

778 Converting string to byte array in C#


c# string vb.net encoding byte

I'm converting something from VB into C#. Having a problem with the syntax of this statement:

    if ((searchResult.Properties["user"].Count > 0))
    {
        profile.User = System.Text.Encoding.UTF8.GetString(searchResult.Properties["user"][0]);
    }

I then see the following errors:

    Argument 1: cannot convert from 'object' to 'byte[]'

    The best overloaded method match for 'System.Text.Encoding.GetString(byte[])' has some invalid arguments

I tried to fix the code based on this post, but still no success:

    string User = Encoding.UTF8.GetString("user", 0);

Any suggestions?

nouptime asked Apr 18 '13 at 0:50; Jan Turoň edited Dec 6 '19 at 16:43

2 What is the type of searchResult.Properties["user"][0]? Try casting it to byte[] first – mshsayem Apr 18 '13 at 0:54

mshsayem went where I was going. Are you missing a cast to a (byte[]) on the searchResult? – Harrison Apr 18 '13 at 0:56

How would I go about doing that in my case? My knowledge of C# syntax is pretty limited to be honest. – nouptime Apr 18 '13 at 1:26

3 You need to find out what type Properties["user"][0] is. If you're sure it's a byte array then you can cast like this: profile.User = System.Text.Encoding.UTF8.GetString((byte[])searchResult.Properties["user"][0]); – keyboardP Apr 18 '13 at 1:32

2 Turns out there was no need for all that fuss. The username could be fetched without encoding after all. – nouptime Mar 14 '14 at 8:10
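
The cast suggested in the comments is only safe if the value really is a byte[]. A minimal sketch of checking the runtime type first, assuming the property value arrives as a plain object (as the compiler error indicates). The helper name PropertyToString is hypothetical and not part of any library, and UTF-8 is an assumption about how the bytes were produced:

```csharp
using System;
using System.Text;

public static class PropertyValueDemo
{
    // Hypothetical helper: decodes the value if it is a byte[],
    // returns it unchanged if it is already a string.
    public static string PropertyToString(object value)
    {
        if (value is byte[] bytes)
            return Encoding.UTF8.GetString(bytes); // assumption: the bytes are UTF-8
        if (value is string text)
            return text;
        throw new InvalidCastException(
            "Unexpected property type: " + (value?.GetType().FullName ?? "null"));
    }
}
```

With a helper like this, the assignment in the question could read profile.User = PropertyValueDemo.PropertyToString(searchResult.Properties["user"][0]); which also covers the case the asker eventually reported, where the value was already a string.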


17 Answers order by votes

Score: 1398 (+200)

If you already have a byte array then you will need to know what type of encoding was used to make it into that byte array.

For example, if the byte array was created like this:

    byte[] bytes = Encoding.ASCII.GetBytes(someString);

You will need to turn it back into a string like this:

    string someString = Encoding.ASCII.GetString(bytes);

If you can find the encoding that was used to create the byte array in the code you inherited, then you should be set.

Timothy Randall answered Apr 18 '13 at 0:54, edited Sep 12 '18 at 12:52

4 Timothy, I've looked through the VB code and I can't seem to find a byte array as you have mentioned. – nouptime Apr 18 '13 at 1:06

On your search result, what is the type of the Properties property? – Timothy Randall Apr 18 '13 at 1:09

All I can see is that there are a number of items attached to Properties as a string. I'm not sure if that's what you were asking me though. – nouptime Apr 18 '13 at 1:24

25 @AndiAR try Encoding.UTF8.GetBytes(somestring) – OzBob Dec 5 '16 at 4:24

1 For my situation I found that Encoding.Unicode.GetBytes worked (but ASCII didn't) – Jeff May 11 '18 at 16:29
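
The point of the answer above, that decoding must use the same encoding as encoding, can be sketched as a small round trip. The class and method names here are made up for illustration:

```csharp
using System;
using System.Text;

public static class RoundTripDemo
{
    // Encodes the text with one encoding and decodes the bytes with another.
    public static string RoundTrip(string text, Encoding encodeWith, Encoding decodeWith)
    {
        byte[] bytes = encodeWith.GetBytes(text);
        return decodeWith.GetString(bytes);
    }
}
```

RoundTrip("h\u00E9llo", Encoding.UTF8, Encoding.UTF8) gives the original string back, while decoding the same UTF-8 bytes with Encoding.ASCII does not, because the two bytes that encode "é" are not valid ASCII.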


Score: 116

First of all, add the System.Text namespace:

    using System.Text;

Then use this code:

    string input = "some text";
    byte[] array = Encoding.ASCII.GetBytes(input);

Hope this fixes it!

Shridhar answered Dec 18 '15 at 4:40, edited May 24 '17 at 7:50

Score: 48

You can also use an extension method to add a method to the string type, as below:

    static class Helper
    {
        public static byte[] ToByteArray(this string str)
        {
            return System.Text.Encoding.ASCII.GetBytes(str);
        }
    }

And use it like below:

    string foo = "bla bla";
    byte[] result = foo.ToByteArray();

Ali answered Jun 22 '17 at 14:59; Cristian Ciupitu edited Sep 3 '18 at 14:07

17 I'd rename that method to include the fact that it's using ASCII encoding. Something like ToASCIIByteArray. I hate when I find out some library I'm using uses ASCII and I'm assuming it's using UTF-8 or something more modern. – T Blank Sep 8 '17 at 18:10

Score: 43

    var result = System.Text.Encoding.Unicode.GetBytes(text);

Kuganrajh Rajendran answered Sep 29 '17 at 4:31

5 This should be the accepted answer, as the other answers suggest ASCII, but the encoding is either Unicode (which is UTF-16) or UTF-8. – Abel Dec 26 '18 at 21:57

1 Indeed, @Abel. C# currently uses UTF-16 as the default, and such an encoding makes more sense than ASCII. It depends on the project of course, but this is the default. – Angel Oct 7 '20 at 4:56
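
To make the difference between the encodings discussed so far concrete, here is a small sketch that counts the bytes each one produces for the same string. The class name is invented for this example:

```csharp
using System;
using System.Text;

public static class EncodingSizeDemo
{
    // Byte counts for the same text in ASCII, UTF-8 and UTF-16 ("Unicode" in .NET).
    public static int[] ByteCounts(string text)
    {
        return new[]
        {
            Encoding.ASCII.GetByteCount(text),   // non-ASCII chars are replaced, so this is lossy
            Encoding.UTF8.GetByteCount(text),    // 1 to 4 bytes per code point
            Encoding.Unicode.GetByteCount(text), // 2 bytes per UTF-16 code unit
        };
    }
}
```

For "\u0161ar\u017Ee" (the word "šarže") the counts are 5, 7 and 10: ASCII substitutes one byte for each character it cannot represent, UTF-8 spends two bytes on each of the two accented letters, and UTF-16 spends two bytes on every char.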

Score: 39

Encoding.Default should not be used...

@Randall's answer uses Encoding.Default, however Microsoft raises a warning against it:

"Different computers can use different encodings as the default, and the default encoding can change on a single computer. If you use the Default encoding to encode and decode data streamed between computers or retrieved at different times on the same computer, it may translate that data incorrectly. In addition, the encoding returned by the Default property uses best-fit fallback to map unsupported characters to characters supported by the code page. For these reasons, using the default encoding is not recommended. To ensure that encoded bytes are decoded properly, you should use a Unicode encoding, such as UTF8Encoding or UnicodeEncoding. You could also use a higher-level protocol to ensure that the same format is used for encoding and decoding."

To check what the default encoding is, use Encoding.Default.WindowsCodePage (1250 in my case; sadly, there is no predefined class for the CP1250 encoding, but the object can be retrieved as Encoding.GetEncoding(1250)).

...UTF-8 encoding should be used instead...

Encoding.ASCII is 7-bit, so it doesn't work either, in my case:

    byte[] pass = Encoding.ASCII.GetBytes("šarže");
    Console.WriteLine(Encoding.ASCII.GetString(pass)); // ?ar?e

Following Microsoft's recommendation:

    var utf8 = new UTF8Encoding();
    byte[] pass = utf8.GetBytes("šarže");
    Console.WriteLine(utf8.GetString(pass)); // šarže

Encoding.UTF8, recommended by others, is an instance of the UTF-8 encoding and can also be used directly or as

    var utf8 = Encoding.UTF8 as UTF8Encoding;

...but it is not always used

Default encoding is misleading: .NET uses UTF-8 everywhere (including strings hardcoded in the source code), but Windows actually uses 2 other non-UTF-8, non-standard defaults: the ANSI codepage (for GUI apps before .NET) and the OEM codepage (aka DOS standard). These differ from country to country (for instance, the Czech edition of Windows uses CP1250 and CP852) and are oftentimes hardcoded in Windows API libraries. So if you just set UTF-8 for the console with chcp 65001 (as .NET implicitly does, pretending it is the default) and run some localized command (like ping), it works in the English version, but you get tofu text in the Czech Republic.

Let me share my real-world experience: I created a WinForms application customizing git scripts for teachers. The output is obtained asynchronously in the background by a process described by Microsoft as (bold text added by me):

"The word "shell" in this context (UseShellExecute) refers to a graphical shell (ANSI CP) (similar to the Windows shell) rather than command shells (for example, bash or sh) (OEM CP) and lets users launch graphical applications or open documents (with messed output in non-US environment)."

So effectively the GUI defaults to UTF-8, the process defaults to CP1250 and the console defaults to 852. So the output is in 852 interpreted as UTF-8 interpreted as CP1250. I got tofu text from which I could not deduce the original codepage due to the double conversion. I was pulling my hair out for a week before I figured out that I had to explicitly set UTF-8 for the process script and convert the output from CP1250 to UTF-8 in the main thread. Now it works here in Eastern Europe, but Western European Windows uses 1252. The ANSI CP is not determined easily, as many commands like systeminfo are also localized and other methods differ from version to version: in such an environment, displaying national characters reliably is almost unfeasible.

So until the middle of the 21st century, please DO NOT use any "Default Codepage" and set it explicitly (to UTF-8 if possible).

Jan Turoň answered Dec 6 '19 at 11:07, edited Oct 31 '20 at 9:24
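
As one illustration of setting the encoding explicitly for a child process, the sketch below builds a ProcessStartInfo whose redirected output is decoded as UTF-8. This is not the answer author's code. The class and method names are invented, and the setting only controls how .NET decodes the bytes it reads: the child process still has to emit UTF-8 for the text to come out right.

```csharp
using System.Diagnostics;
using System.Text;

public static class ProcessEncodingDemo
{
    // Builds start info for a child process whose standard output
    // will be decoded as UTF-8 instead of the machine's default code page.
    public static ProcessStartInfo CreateStartInfo(string fileName, string arguments)
    {
        return new ProcessStartInfo
        {
            FileName = fileName,
            Arguments = arguments,
            UseShellExecute = false,                // required when redirecting output
            RedirectStandardOutput = true,
            StandardOutputEncoding = Encoding.UTF8, // explicit choice, not a default
        };
    }
}
```

A caller would pass the result to Process.Start and read StandardOutput as usual. If the child writes in a legacy code page instead, the matching Encoding object would have to be supplied here rather than UTF-8.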

Score: 34

    static byte[] GetBytes(string str)
    {
        byte[] bytes = new byte[str.Length * sizeof(char)];
        System.Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);
        return bytes;
    }

    static string GetString(byte[] bytes)
    {
        char[] chars = new char[bytes.Length / sizeof(char)];
        System.Buffer.BlockCopy(bytes, 0, chars, 0, bytes.Length);
        return new string(chars);
    }

Eran Yogev answered Apr 28 '14 at 19:47; JustinStolle edited Aug 31 '14 at 1:09

1 This will fail for characters that fall into the surrogate pair range.. GetBytes will have a byte array that misses one normal char per surrogate pair off the end. The GetString will have empty chars at the end. The only way it would work is if microsoft's default were UTF32, or if characters in the surrogate pair range were not allowed. Or is there something I'm not seeing? The proper way is to 'encode' the string into bytes. – Gerard ONeill Feb 17 '17 at 17:31

Correct, for a wider range you can use something similar to #Timothy Randall's solution: using System; using System.Text; namespace Example{ public class Program { public static void Main(string[] args) { string s1 = "Hello World"; string s2 = "שלום עולם"; string s3 = " "; Console.WriteLine(Encoding.UTF8.GetString(Encoding.UTF8.GetBytes(s1))); Console.WriteLine(Encoding.UTF8.GetString(Encoding.UTF8.GetBytes(s2))); Console.WriteLine(Encoding.UTF8.GetString(Encoding.UTF8.GetBytes(s3))); } } } – Eran Yogev Feb 17 '17 at 20:03

@EranYogev why it should fail? I have tested it for the whole range of System.Int32 and it was correct. Can you please explain here or in this question: stackoverflow.com/questions/64077979/… – astef Sep 26 '20 at 13:05
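
One way to settle the disagreement in the comments above is to run a string containing a surrogate pair through the two methods. The sketch below repeats the BlockCopy approach and adds a hypothetical RoundTrips helper. Because string.Length counts UTF-16 code units rather than characters, a surrogate pair simply occupies two chars (four bytes), so the copy should be lossless:

```csharp
using System;

public static class BlockCopyDemo
{
    public static byte[] GetBytes(string str)
    {
        // Length is in UTF-16 code units, so a surrogate pair contributes 2 chars.
        byte[] bytes = new byte[str.Length * sizeof(char)];
        Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);
        return bytes;
    }

    public static string GetString(byte[] bytes)
    {
        char[] chars = new char[bytes.Length / sizeof(char)];
        Buffer.BlockCopy(bytes, 0, chars, 0, bytes.Length);
        return new string(chars);
    }

    // Hypothetical helper: true if the string survives the byte round trip unchanged.
    public static bool RoundTrips(string text) => GetString(GetBytes(text)) == text;
}
```

For "a\uD83D\uDE00b" (an emoji between two ASCII letters) GetBytes returns 8 bytes and RoundTrips returns true. What the method does not give you is a portable format: the bytes are raw UTF-16 in the machine's byte order, with no validation.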

Score: 13

Building off Ali's answer, I would recommend an extension method that allows you to optionally pass in the encoding you want to use:

    public static class StringExtensions
    {
        /// <summary>
        /// Creates a byte array from the string, using the
        /// System.Text.Encoding.Default encoding unless another is specified.
        /// </summary>
        public static byte[] ToByteArray(this string str, Encoding encoding = Encoding.Default)
        {
            return encoding.GetBytes(str);
        }
    }

And use it like below:

    string foo = "bla bla";

    // default encoding
    byte[] defaultBytes = foo.ToByteArray();

    // custom encoding
    byte[] unicode = foo.ToByteArray(Encoding.Unicode);

Dan Sinclair answered May 10 '19 at 12:35

4 Note that using Encoding encoding = Encoding.Default results in a compile-time error: CS1736 Default parameter value for 'encoding' must be a compile-time constant – Douglas Gaskell Jun 17 '19 at 19:42
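
A common way around CS1736 is to use null as the default and substitute the real fallback inside the method. The sketch below does that. Falling back to UTF-8 rather than Encoding.Default is an assumption made for this example, following the advice elsewhere in this thread, and the class name is invented:

```csharp
using System.Text;

public static class StringByteExtensions
{
    // null is a compile-time constant, so it is allowed as a default value.
    // Assumption: UTF-8 is the fallback when no encoding is passed.
    public static byte[] ToByteArray(this string str, Encoding encoding = null)
    {
        return (encoding ?? Encoding.UTF8).GetBytes(str);
    }
}
```

Calling "bla".ToByteArray() yields 3 bytes, and "bla".ToByteArray(Encoding.Unicode) yields 6.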

Score: 12

This is what worked for me:

    byte[] bytes = Convert.FromBase64String(textString);

And in reverse:

    string str = Convert.ToBase64String(bytes);

Mina Matta answered Dec 2 '19 at 21:28; knocte edited Dec 16 '20 at 14:06

That only works when your string only contains a-z, A-Z, 0-9, +, /. No other characters are allowed: de.wikipedia.org/wiki/Base64 – Blechdose Jan 17 '20 at 7:48
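
As the comment points out, Convert.FromBase64String decodes text that is already Base64. It is not a general string-to-bytes conversion. A sketch of how the two steps usually fit together, with UTF-8 chosen here as an assumption and hypothetical helper names:

```csharp
using System;
using System.Text;

public static class Base64Demo
{
    // Arbitrary text -> UTF-8 bytes -> Base64 text.
    public static string ToBase64(string text) =>
        Convert.ToBase64String(Encoding.UTF8.GetBytes(text));

    // Base64 text -> bytes -> original text (decoded as UTF-8).
    public static string FromBase64(string base64) =>
        Encoding.UTF8.GetString(Convert.FromBase64String(base64));
}
```

ToBase64("bla bla") returns "YmxhIGJsYQ==", and FromBase64 turns that back into "bla bla". Passing ordinary text such as "šarže" straight to Convert.FromBase64String throws a FormatException.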

Score: 11

Use this:

    byte[] myByte = System.Text.ASCIIEncoding.Default.GetBytes(myString);

alireza amini answered Jun 30 '15 at 14:43
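
Note that Default is a static property declared on Encoding, so writing ASCIIEncoding.Default does not select ASCII. It resolves to the same Encoding.Default discussed earlier in the thread. A small check, with invented names, alongside an explicitly ASCII variant for comparison:

```csharp
using System.Text;

public static class DefaultEncodingDemo
{
    // ASCIIEncoding.Default is the static Encoding.Default reached through a derived type.
    public static bool DefaultIsEncodingDefault() =>
        ASCIIEncoding.Default.CodePage == Encoding.Default.CodePage;

    // Explicitly ASCII, regardless of machine settings.
    public static byte[] ToAsciiBytes(string text) => Encoding.ASCII.GetBytes(text);
}
```

DefaultIsEncodingDefault() returns true on any machine. Which code page that is depends on the runtime and system configuration.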

Score: 11

The following approach will work only if the chars are 1 byte. (Default unicode will not work since it is 2 bytes.)

    public static byte[] ToByteArray(string value)
    {
        char[] charArr = value.ToCharArray();
        byte[] bytes = new byte[charArr.Length];
        for (int i = 0; i < charArr.Length; i++)
        {
            byte current = Convert.ToByte(charArr[i]);
            bytes[i] = current;
        }

        return bytes;
    }

Keeping it simple

Mandar Sudame answered Mar 4 '16 at 18:57; Noam M edited Jan 8 '18 at 5:11

char and string are UTF-16 by definition. – Tom Blodget Mar 4 '16 at 23:37

Yes the default is UTF-16. I am not making any assumptions on Encoding of the input string. – Mandar Sudame Mar 6 '16 at 20:06

There is no text but encoded text. Your input is type string and is therefore UTF-16. UTF-16 is not the default; there is no choice about it. You then split into char[], UTF-16 code units. You then call Convert.ToByte(Char), which just happens to convert U+0000 to U+00FF to ISO-8859-1, and mangles any other codepoints. – Tom Blodget Mar 6 '16 at 20:55

Makes sense. Thanks for the clarification. Updating my answer. – Mandar Sudame Mar 8 '16 at 19:56

1 I think you are still missing several essential points. Focus on char being 16 bits and Convert.ToByte() throwing half of them away. – Tom Blodget Mar 9 '16 at 1:23
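
The limitation discussed in these comments can be seen directly: a char is 16 bits wide, and Convert.ToByte(char) only accepts values up to U+00FF, throwing an OverflowException for anything larger. A sketch with a hypothetical TryToByte wrapper:

```csharp
using System;

public static class CharToByteDemo
{
    // Hypothetical wrapper: returns null when the char does not fit in one byte.
    public static byte? TryToByte(char c)
    {
        try
        {
            return Convert.ToByte(c); // throws OverflowException above U+00FF
        }
        catch (OverflowException)
        {
            return null;
        }
    }
}
```

TryToByte('A') is 65 and TryToByte('\u00E9') is 233 (the ISO-8859-1 value of "é"), but TryToByte('\u0161') is null because "š" lies outside the one-byte range.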

Score: 7

A refinement to JustinStolle's edit (Eran Yogev's use of BlockCopy).

The proposed solution is indeed faster than using Encoding. The problem is that it doesn't work for encoding byte arrays of odd length. As given, it raises an out-of-bounds exception. Increasing the length by 1 leaves a trailing byte when decoding from string.

For me, the need came when I wanted to encode from DataTable to JSON. I was looking for a way to encode binary fields into strings and decode from string back to byte[].

I therefore created two classes: one that wraps the above solution (when encoding from strings it's fine, because the lengths are always even), and another that handles byte[] encoding.

I solved the odd-length problem by adding a single character that tells me whether the original length of the binary array was odd ('1') or even ('0').

As follows:

    public static class StringEncoder
    {
        static byte[] EncodeToBytes(string str)
        {
            byte[] bytes = new byte[str.Length * sizeof(char)];
            System.Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);
            return bytes;
        }
        static string DecodeToString(byte[] bytes)
        {
            char[] chars = new char[bytes.Length / sizeof(char)];
            System.Buffer.BlockCopy(bytes

user4726577 answered Mar 29 '15 at 14:31; Ali edited Jul 29 '17 at 5:59
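
The listing above is cut off before the class that handles byte[] input, so the following is only a guess at how the described odd/even marker could work, not the answer author's code. All names are hypothetical. The idea is to pad an odd-length array with one zero byte, pack the bytes into chars, and prefix '1' or '0' so the padding can be removed again:

```csharp
using System;

public static class BinaryStringCodec
{
    // Packs bytes into a string: a '1'/'0' marker (odd/even length) followed by the packed chars.
    public static string Encode(byte[] data)
    {
        bool odd = data.Length % 2 == 1;
        byte[] padded = new byte[data.Length + (odd ? 1 : 0)];
        Buffer.BlockCopy(data, 0, padded, 0, data.Length);

        char[] chars = new char[padded.Length / sizeof(char)];
        Buffer.BlockCopy(padded, 0, chars, 0, padded.Length);
        return (odd ? "1" : "0") + new string(chars);
    }

    // Reverses Encode, dropping the padding byte when the marker says the length was odd.
    public static byte[] Decode(string text)
    {
        bool odd = text[0] == '1';
        char[] chars = text.ToCharArray(1, text.Length - 1);

        byte[] padded = new byte[chars.Length * sizeof(char)];
        Buffer.BlockCopy(chars, 0, padded, 0, padded.Length);

        byte[] data = new byte[padded.Length - (odd ? 1 : 0)];
        Buffer.BlockCopy(padded, 0, data, 0, data.Length);
        return data;
    }
}
```

Strings produced this way can contain unpaired surrogates and control characters, which many JSON serializers escape or reject. For the DataTable-to-JSON use case, Base64 via Convert.ToBase64String is the more conventional choice, at the cost of speed.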

Score: 5

You could use the MemoryMarshal API to perform a very fast and efficient conversion. The string will be implicitly cast to ReadOnlySpan<char>, as MemoryMarshal.Cast accepts either Span<T> or ReadOnlySpan<T> as an input parameter.

    public static class StringExtensions
    {
        // heap allocation, use only when you cannot operate on spans
        public static byte[] ToByteArray(this string s) => s.ToByteSpan().ToArray();
        public static ReadOnlySpan<byte> ToByteSpan(this string s) => MemoryMarshal.Cast<char, byte>(s);
    }

The following benchmark shows the difference:

Input: "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s,"

| Method                       | Mean       | Error     | StdDev    | Gen 0  | Gen 1 | Gen 2 | Allocated |
|----------------------------- |-----------:|----------:|----------:|-------:|------:|------:|----------:|
| UsingEncodingUnicodeGetBytes | 160.042 ns | 3.2864 ns | 6.4099 ns | 0.0780 |     - |     - |
Pawel Maga answered Oct 16 '19 at 13:09, edited Oct 16 '19 at 13:15
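
A self-contained variant of the span approach, using MemoryMarshal.AsBytes on the string's ReadOnlySpan<char> instead of Cast. The class and method names are invented for this sketch. The result is the string's in-memory UTF-16 representation, so it should match Encoding.Unicode.GetBytes only on little-endian machines:

```csharp
using System;
using System.Linq;
using System.Runtime.InteropServices;
using System.Text;

public static class SpanBytesDemo
{
    // Views the string's UTF-16 code units as raw bytes; ToArray makes the only copy.
    public static byte[] ToUtf16Bytes(string s) =>
        MemoryMarshal.AsBytes(s.AsSpan()).ToArray();

    // Compares against Encoding.Unicode (little-endian UTF-16).
    public static bool MatchesEncodingUnicode(string s) =>
        Enumerable.SequenceEqual(ToUtf16Bytes(s), Encoding.Unicode.GetBytes(s));
}
```

On the usual x86, x64 and ARM configurations, MatchesEncodingUnicode("Lorem Ipsum") is true. Unlike Encoding.Unicode, the span view performs no validation of surrogate pairs.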

Score: 4

This question has been answered sufficiently many times, but with C# 7.2 and the introduction of the Span type, there is a faster way to do this in unsafe code:

    public static class StringSupport
    {
        private static readonly int _charSize = sizeof(char);

        public static unsafe byte[] GetBytes(string str)
        {
            if (str == null) throw new ArgumentNullException(nameof(str));
            if (str.Length == 0) return new byte[0];

            fixed (char* p = str)
            {
                return new Span<byte>

Keep in mind that the bytes represent a UTF-16 encoded string (called "Unicode" in C# land).

Some quick benchmarking shows that the above methods are roughly 5x faster than their Encoding.Unicode.GetBytes(...)/GetString(...) implementations for medium-sized strings (30-50 chars), and even faster for larger strings. These methods also seem to be faster than using pointers with Marshal.Copy(..) or Buffer.MemoryCopy(...).

Algemist answered Dec 7 '18 at 14:43

Score: 3

Does anyone see any reason why not to do this?

    mystring.Select(Convert.ToByte).ToArray()

Lomithrani answered Apr 12 '17 at 16:25; shA.t edited Jul 30 '17 at 3:47

10 Convert.ToByte(char) doesn't work like you think it would. The character '2' is converted to the byte 2, not the byte that represents the character '2'. Use mystring.Select(x => (byte)x).ToArray() instead. – Jack Aug 2 '17 at 18:50
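
The Convert.ToByte overloads are easy to mix up, so it may be worth checking which one matches the behavior described in the comment above. As far as the documented behavior goes, the char overload returns the character's numeric value, while the string overload parses the digits. A sketch with made-up method names:

```csharp
using System;
using System.Linq;

public static class ConvertOverloadDemo
{
    // char overload: numeric value of the character (throws above U+00FF).
    public static byte FromChar(char c) => Convert.ToByte(c);

    // string overload: parses the text as a number.
    public static byte FromString(string s) => Convert.ToByte(s);

    // Plain cast: keeps the low 8 bits of each char (in the default unchecked context).
    public static byte[] CastEach(string s) => s.Select(ch => (byte)ch).ToArray();
}
```

FromChar('2') is 50, FromString("2") is 2, and CastEach("2")[0] is 50. The practical difference between Convert.ToByte(char) and the cast shows up above U+00FF, where the former throws and the latter silently truncates.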

Score: 3

If the result of searchResult.Properties [ "user" ] [ 0 ] is a string:

    if ( ( searchResult.Properties [ "user" ].Count > 0 ) ) {

        profile.User = System.Text.Encoding.UTF8.GetString ( searchResult.Properties [ "user" ] [ 0 ].ToCharArray ().Select ( character => ( byte ) character ).ToArray () );

    }

The key point being that converting a string to a byte [] can be done using LINQ:

    .ToCharArray ().Select ( character => ( byte ) character ).ToArray () )

And the inverse:

    .Select ( character => ( char ) character ).ToArray () )
