I'm new to C#. I have researched this problem but couldn't find anything, maybe because I lack the vocabulary.
My task is to read a huge file and extract only the lines that match certain conditions.
This is the code I'm using to test things:
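    // Assumed declarations, added so the snippet is self-contained.
    string line;
    string data;
    List<string> Data = new List<string>();

    using (StreamReader sr = new StreamReader("SPDS_Test.doc"))
    {
        while ((line = sr.ReadLine()) != null)
        {
            try
            {
                // Keep only the lines that contain an "R " or "E " marker.
                if (line.Contains("R ") || line.Contains("E "))
                {
                    data = line;
                    // Drop the leading marker letter, then strip or normalize separators.
                    data = data.Remove(0, 1);
                    data = data.Replace(" ", "").Replace("N", "").Replace("+", ",").Replace("·", ",").Replace("?", ",").Replace("(", "").Replace(")", "");
                    Data.Add(data);
                }
            }
            catch (Exception e)
            {
                // Print the exception details along with the separator.
                Console.WriteLine("-------- {0}", e);
                Console.WriteLine("--------Press any key to continue---------");
                Console.ReadKey();
            }
        }
        // Show each extracted entry, pausing for a key press after each one.
        foreach (string d in Data)
        {
            Console.WriteLine(d);
            Console.ReadKey();
        }
    }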
This is part of the file:
R XRPA168VC
B A
L 手动紧急停堆
E XRPA300KS
A 反应堆停堆 汽轮机停机
R XRPR111VR
B IP
E F2/3(XRPR144KS, XRPR145KS, XRPR146KS)
What I noticed is that the letters don't seem to be read as plain letters when there is Chinese text around them. For example, I tried the condition line.Substring(0,1) == "R", and it couldn't find those lines.
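One way to check what the reader actually returns is to print the code points at the start of each line. The sketch below is only a diagnostic aid; `LineInspector` and `DescribeStart` are hypothetical names that are not part of the test code. A line that really starts with "R " should show U+0052 U+0020, and anything else (for example U+0000, U+FFFD, or a full-width U+FF32) would suggest that the file is not being decoded the way the code expects.

```csharp
using System;
using System.Linq;

static class LineInspector
{
    // Returns the UTF-16 code units of the first `count` characters of a line,
    // formatted as "U+XXXX" values separated by spaces.
    public static string DescribeStart(string line, int count = 4)
    {
        return string.Join(" ",
            line.Take(count).Select(c => "U+" + ((int)c).ToString("X4")));
    }
}
```

Calling Console.WriteLine(LineInspector.DescribeStart(line)) inside the while loop would print this for every line read.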
No matter what I do, my code only returns this:
XPR111VR
F2/3XRPR144KS, XRPR145KS, XRPR146KS
I really need to be able to extract every R and E line.
I just tried copying the whole document into Notepad and saving it with UTF-8 encoding. That seems to work, but I'm not sure it's reliable.
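For reference, here is a minimal sketch of that workaround expressed in code. It assumes the content has first been exported as a plain-text UTF-8 file (the original .doc may not be plain text at all). `RecordFilter`, `ExtractRecords` and `ExtractRecordsFromFile` are hypothetical names. It also uses StartsWith instead of Contains, so only lines that begin with the marker are kept, and it leaves out the character replacements from the test code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class RecordFilter
{
    // Keeps only the lines that begin with "R " or "E " and returns
    // the text after that two-character prefix, trimmed.
    public static List<string> ExtractRecords(TextReader reader)
    {
        var result = new List<string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.StartsWith("R ", StringComparison.Ordinal) ||
                line.StartsWith("E ", StringComparison.Ordinal))
            {
                result.Add(line.Substring(2).Trim());
            }
        }
        return result;
    }

    // Assumption: `path` points to a plain-text UTF-8 export of the document.
    public static List<string> ExtractRecordsFromFile(string path)
    {
        using (var reader = new StreamReader(path, Encoding.UTF8))
        {
            return ExtractRecords(reader);
        }
    }
}
```

Passing the encoding explicitly means the result does not depend on what StreamReader guesses from the first bytes of the file, but it only helps if the file really is stored in that encoding.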