FileStream fs = new FileStream("d:\\aa.txt", FileMode.Open);
byte[] bb = new byte[fs.Length];
fs.Read(bb, 0, bb.Length);
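aa.txt contains Chinese text. I now need to split bb into pieces, and the text must still be correct after splitting. How should I do this?
Can't you just read it line by line?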
// Assumes the file is saved as UTF-8; Encoding.ASCII would turn every Chinese character into '?'.
using (StreamReader srReadLine = new StreamReader(
    File.OpenRead("C:\\Temp\\Test.txt"),
    System.Text.Encoding.UTF8))
{
    string line;
    // ReadLine returns null at the end of the file.
    while ((line = srReadLine.ReadLine()) != null)
    {
        Console.WriteLine(line);
    }
}
If you really have to do it that way, try this method:
Reads at most count characters from the current stream into buffer, beginning at index.
public override int Read(
char[] buffer,
int index,
int count
);
// Assumes the file is saved as UTF-8; Encoding.ASCII cannot represent Chinese characters.
using (StreamReader srRead = new StreamReader(
    File.OpenRead("C:\\Temp\\Test.txt"),
    System.Text.Encoding.UTF8))
{
    char[] buffer = new char[1];
    // Read returns 0 at the end of the stream. The reader buffers and decodes
    // internally, so BaseStream.Position must not be advanced by hand.
    while (srRead.Read(buffer, 0, 1) > 0)
    {
        Console.Write(buffer[0]);
    }
}
The requirement is to split it into arrays.
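Suggestion: first use a regular expression to separate the Chinese and English text read from the file, then process each part separately. A regular expression that matches Chinese characters is: [\u4E00-\u9FA0]+.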
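To make the regular-expression suggestion concrete, here is a minimal sketch. The class name ChineseSplitter and the method Split are made up for illustration; only the pattern comes from the post above. It assumes the text has already been decoded into a string.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ChineseSplitter
{
    // Pattern taken from the post above; it matches runs of common Chinese characters.
    private static readonly Regex ChineseRun = new Regex(@"[\u4E00-\u9FA0]+");

    // Appends each Chinese run to 'chinese' and each run in between to 'other'.
    public static void Split(string text, List<string> chinese, List<string> other)
    {
        int last = 0;
        foreach (Match m in ChineseRun.Matches(text))
        {
            if (m.Index > last)
            {
                other.Add(text.Substring(last, m.Index - last));
            }
            chinese.Add(m.Value);
            last = m.Index + m.Length;
        }
        if (last < text.Length)
        {
            other.Add(text.Substring(last));
        }
    }

    public static void Main()
    {
        var chinese = new List<string>();
        var other = new List<string>();
        Split("abc中文def汉字", chinese, other);
        Console.WriteLine("Chinese runs: " + chinese.Count);
        Console.WriteLine("Other runs: " + other.Count);
    }
}
```

Note that this only works on a decoded string; it does not help if the byte array itself has already been cut in the middle of a character.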
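You can open the file with the encoding it was saved in:
StreamReader srReadLine = new StreamReader(
File.OpenRead("C:\\Temp\\Test.txt"),
System.Text.Encoding.UTF8); // or System.Text.Encoding.Unicode
Then read with the StreamReader.Read method. It returns decoded characters rather than raw bytes, so the bytes of a Chinese character are not cut apart.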
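Putting the StreamReader idea together with the original requirement (split into arrays), one possible approach is to decode the text first, cut it by character count, and encode each piece back into its own byte array. This is only a sketch: SafeChunker, SplitText, and charsPerChunk are names invented here, and it assumes you know the file's encoding and that the whole file fits in memory.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class SafeChunker
{
    // Decodes the stream, then returns byte arrays holding about charsPerChunk characters each.
    public static List<byte[]> SplitText(Stream input, Encoding encoding, int charsPerChunk)
    {
        if (charsPerChunk < 1) throw new ArgumentOutOfRangeException("charsPerChunk");

        string text;
        using (var reader = new StreamReader(input, encoding))
        {
            text = reader.ReadToEnd();
        }

        var chunks = new List<byte[]>();
        int i = 0;
        while (i < text.Length)
        {
            int len = Math.Min(charsPerChunk, text.Length - i);
            // Do not end a chunk between the two halves of a surrogate pair;
            // take one extra char instead.
            if (i + len < text.Length && char.IsHighSurrogate(text[i + len - 1]))
            {
                len++;
            }
            chunks.Add(encoding.GetBytes(text.Substring(i, len)));
            i += len;
        }
        return chunks;
    }

    public static void Main()
    {
        byte[] raw = Encoding.UTF8.GetBytes("中文abc汉字");
        List<byte[]> parts = SplitText(new MemoryStream(raw), Encoding.UTF8, 2);

        // Every part decodes on its own, and joined together they give back the text.
        var rebuilt = new StringBuilder();
        foreach (byte[] part in parts)
        {
            rebuilt.Append(Encoding.UTF8.GetString(part));
        }
        Console.WriteLine(rebuilt.ToString() == "中文abc汉字");
    }
}
```

Each array then holds only whole characters, so it can be decoded with the same encoding without producing garbage. For very large files, reading in blocks with StreamReader.Read instead of ReadToEnd would be the more memory-friendly variant.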