How to extract stock code NUMBER from news summary?


I have a Pandas DataFrame and need to extract the stock codes '00981' and '00823' from text stored in a column. Each code appears in the (00000) format, but its position varies from one text summary to the next. Please advise.

News
1 example(00981)example example example。 
2 example example example (00823)text text text 

desired output:

Code column
981
823

s = TABLE['News'].str.find('(')
e = s + 5
c = TABLE['News'].str[s:e]
TABLE["Code"] = c

There are 2 answers

Umar.H (best answer)

This works for me:

print(df)
           News
0          1 example(00981)example example example。 
1      2 example example example (00823)text text...
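
# Raw string avoids invalid-escape warnings; the escaped parentheses restrict the match to codes written as (00000).
df['stock_num'] = df['News'].str.extract(r'\((\d{5})\)').astype(int)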
print(df)
                                                    News stock_num
0          1 example(00981)example example example。      981
1      2 example example example (00823)text text...     823

To change the string into a number, you can use either the .astype() method or pd.to_numeric(df['stock_num']).
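
A minimal sketch of the same idea that also handles rows with no code, in case some summaries lack one. The sample DataFrame, the third row and the column names `Code` and `Code_str` are illustrative assumptions, not part of the original question.

```python
import pandas as pd

# Hypothetical sample data modelled on the question; the third row has no code.
df = pd.DataFrame({
    "News": [
        "1 example(00981)example example example。",
        "2 example example example (00823)text text text",
        "3 no code in this row",
    ]
})

# Capture exactly five digits enclosed in parentheses.
# expand=False returns a Series instead of a one-column DataFrame.
codes = df["News"].str.extract(r"\((\d{5})\)", expand=False)

# Keep the zero-padded text form as well, since leading zeros are lost as integers.
df["Code_str"] = codes

# Rows without a match are missing values, so use the nullable Int64 dtype
# rather than plain int, which cannot hold missing values.
df["Code"] = pd.to_numeric(codes).astype("Int64")

print(df[["Code_str", "Code"]])
```

Whether to keep the code as text or as an integer depends on how it is used later: the integer form drops the leading zeros, as in the desired output above.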

Simon Crane

This will find all occurrences of 5 digits surrounded by parentheses:

import re

x = re.findall(r'\(\d{5}\)', my_string)
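
Note that the pattern above returns each match with its parentheses included, e.g. '(00981)'. If only the digits are wanted, one option is to add a capture group, as in the sketch below. The helper name `find_codes` and the sample string are assumptions made for illustration.

```python
import re

# Five digits inside literal parentheses; the inner group captures only the digits.
CODE_PATTERN = re.compile(r"\((\d{5})\)")

def find_codes(text):
    """Return every five-digit code that appears inside parentheses in text."""
    # With exactly one capture group, findall returns the captured digits only.
    return CODE_PATTERN.findall(text)

# Hypothetical sample: two codes in parentheses, plus five digits outside parentheses.
sample = "example(00981)example and (00823)text, plus 12345 outside parentheses"

codes = find_codes(sample)          # digit strings, leading zeros preserved
numbers = [int(c) for c in codes]   # integer form, leading zeros dropped

print(codes)
print(numbers)
```

To apply this to the DataFrame column, the same helper could be passed to `TABLE['News'].apply(find_codes)`, which would give a list of codes per row.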