Save (pre-2012) 13-F filings to a pandas DataFrame using Python


I saw the question on pre-2013 13-F filings, but noticed that filings from before 2012 use yet another format. This is the original question: Extracting table of holdings from (Edgar 13-F filings) TXT (pre-2013) with python

Pre-2013 but post-2012 example:

https://www.sec.gov/Archives/edgar/data/1067983/000119312512470800/d434976d13fhr.txt

Pre-2012 example:

https://www.sec.gov/Archives/edgar/data/1067983/000095012905008251/0000950129-05-008251.txt

Before 2012, filers did not fill in every company name, title of class and CUSIP number. Where those fields are blank, the remaining columns shift to the left. (See the pre-2012 format in the picture: Pre-2012 13-F Filing.)
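To make the shift concrete, here is a minimal sketch on toy data. The rows, the column boundaries in `COLSPECS`, and the helper names `make_row`, `split_naive` and `split_fixed` are all invented for illustration and are not taken from any real filing. It only shows that splitting on whitespace drops blank fields, while slicing by character position keeps them in place.

```python
# Column boundaries (start, end) are assumptions for this toy example,
# not the positions used in any real filing.
COLSPECS = {"issuer": (0, 20), "title": (20, 30), "cusip": (30, 42), "value": (42, 52)}


def make_row(issuer, title, cusip, value):
    """Build one fixed-width line matching COLSPECS (toy data only)."""
    return f"{issuer:<20}{title:<10}{cusip:<12}{value:>10}"


def split_naive(line):
    """Whitespace split: blank leading fields vanish, so values shift left."""
    return line.split()


def split_fixed(line):
    """Slice by character position: blank fields stay in place as ''."""
    return {name: line[start:end].strip() for name, (start, end) in COLSPECS.items()}


full = make_row("Acme Corp", "Com", "000000000", "1,000")   # all fields present
blank = make_row("", "", "", "2,500")                       # name, class, CUSIP left blank

print(split_naive(blank))   # the value ends up in the first position
print(split_fixed(blank))   # the value stays under "value"
```

With real filings the boundaries would have to be read off the table header or the ruler line of each table, which may differ from filing to filing.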

Adapting the code from NoobFin and Jack Fleeting's question gives me this:

Code:

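import requests
from bs4 import BeautifulSoup as bs

endpoint = r"https://www.sec.gov/Archives/edgar/data/1067983/000095012905008251/0000950129-05-008251.txt"
headers = {'User-Agent': 'Mozilla/5.0'}
response = requests.get(url=endpoint, headers=headers)

def lst_bunch(l, lenth=4):
    # Join every line shorter than `lenth` with the line after it (in place).
    # Repeat full passes until a pass merges nothing; a short final line is
    # left alone instead of raising IndexError on l.pop(i + 1).
    merged = True
    while merged:
        merged = False
        i = 0
        while i < len(l) - 1:
            if len(l[i]) < lenth:
                l[i] += l.pop(i + 1)
                merged = True
            i += 1
    return l

# Put a marker in front of every <TABLE> so the text can be split per table.
tabs = response.text.replace('<TABLE>', 'xxx<TABLE>').split('xxx')
for tab in tabs[1:]:
    soup = bs(tab, 'html')  # 'html' lets BeautifulSoup pick an installed HTML parser
    table = soup.select_one('table')
    lines = table.text.splitlines()
    lst_bunch(lines, 50)
    for line in lines:
        print(line.strip())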

Output (image): Jack Fleeting's code applied

What I am looking for is a DataFrame which I can export to CSV (or SQL or whatever) that looks like this:

Quick Excel file to show desired result.
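One possible direction, sketched on toy data rather than on the actual filing: read the table as fixed-width text with `pandas.read_fwf`, then forward-fill the blank identifier columns. The rows, the `COLSPECS` positions, the column names and the function `parse_holdings` are hypothetical, and the forward fill assumes that a blank company name, class or CUSIP means "same as the row above", which would need to be checked against each filing.

```python
import io

import pandas as pd

# Toy fixed-width text; rows and column positions are invented for illustration.
ROWS = [
    ("Acme Corp", "Com", "000000000", "1,000"),
    ("", "", "", "2,500"),                      # identifiers left blank
    ("Example Inc", "Com", "111111111", "300"),
]
TEXT = "\n".join(f"{i:<20}{t:<10}{c:<12}{v:>10}" for i, t, c, v in ROWS)

# Assumed (start, end) character positions of each column.
COLSPECS = [(0, 20), (20, 30), (30, 42), (42, 52)]
NAMES = ["issuer", "title", "cusip", "value"]


def parse_holdings(text):
    """Parse fixed-width holdings text into a DataFrame (hypothetical helper)."""
    # dtype=str keeps leading zeros in the CUSIP; blank fields become NaN.
    df = pd.read_fwf(io.StringIO(text), colspecs=COLSPECS, names=NAMES,
                     header=None, dtype=str)
    # Assumption: a blank identifier repeats the value from the row above.
    id_cols = ["issuer", "title", "cusip"]
    df[id_cols] = df[id_cols].ffill()
    # Strip thousands separators before converting the value to an integer.
    df["value"] = df["value"].str.replace(",", "", regex=False).astype(int)
    return df


df = parse_holdings(TEXT)
print(df.to_csv(index=False))  # pass a path to to_csv to write a file instead
```

The same DataFrame could be written to SQL with `DataFrame.to_sql`; the hard part for real filings remains finding the column positions for each table.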

I was thinking of preparing one good example by hand and putting it through some ML tooling, but maybe I am missing something.

Thanks!
