with open("/home/xxxx/Downloads/DataEnginner9.txt", "r") as f:
    for line in f:
        print(line)
When I run this code, I only get the file printed line by line.
The code above reads the file and prints each line, but I want it to identify paragraphs across multiple files and build a data-frame with the file name in the first column and the entire content of that file, split into paragraphs, in the second column of the same row. Example data-frame:
[file1, content of file1 split into paragraphs; file2, content of file2 split into paragraphs; ...]
Below is the sample output generated by the above script from one file.
Job description
Responsibilities
Work collaboratively with a global team to design, develop
scalable, maintainable and reliable services that process very large quantities
data using Big Data technologies (100 billion daily indicators, 6 TB/day before
compression).
Familiar with Object oriented development, with specific experience
in at least one major OO language(knowledge of Java is mandatory and if
possible java 8). Nice to have: Knowledge of functional programming.
Perform end-to-end software development life cycle functions
including Design, Development, Performance Analysis & Tuning, Optimization,
Testing and Product Maintenance.
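
For the data-frame described above, here is a minimal sketch of one possible approach, not a definitive solution. It assumes that paragraphs are separated by one or more blank lines (the sample output does not make the separator clear), that the files are UTF-8 text, and that pandas is installed. The helper names split_paragraphs, build_rows and to_dataframe, and the column names "file" and "content", are made up for illustration.

```python
import re
from pathlib import Path


def split_paragraphs(text):
    """Split text into paragraphs, assuming blank lines separate them."""
    blocks = re.split(r"\n\s*\n", text.strip())
    # Join the wrapped lines of each paragraph into a single line.
    return [" ".join(block.split()) for block in blocks if block.strip()]


def build_rows(paths):
    """Return one row per file: the file name and its list of paragraphs."""
    rows = []
    for path in map(Path, paths):
        text = path.read_text(encoding="utf-8")  # encoding is an assumption
        rows.append({"file": path.name, "content": split_paragraphs(text)})
    return rows


def to_dataframe(rows):
    """Build the data-frame; pandas is imported here so the helpers above run without it."""
    import pandas as pd

    return pd.DataFrame(rows, columns=["file", "content"])
```

It could be called as to_dataframe(build_rows(sorted(Path("/home/xxxx/Downloads").glob("*.txt")))), where the "*.txt" pattern is also an assumption. Each row then holds the file name and a list of that file's paragraphs. If the files mark paragraphs some other way (for example by indentation), only split_paragraphs would need to change.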