Python - creating a text file with the 5th line of each article


I want to create a text file containing the 5th line of each of the 32 articles in a text file called "aberdeen2005.txt". I have separated the articles in the file using:

    import re

    sections = []
    current = []
    with open("aberdeen2005.txt") as f:
        for line in f:
            if re.search(r"(?i)\d+ of \d+ documents", line):
                sections.append("".join(current))
                current = [line]
            else:
                current.append(line)

    print(len(sections))

In order to do this, I am trying the following code:

    for i in range(1,500):
        print(sections[i].readline(5))

But it is not working. Any ideas?

Kind regards!

First, when you use range(1,500) the index can run past the end of sections, raising an IndexError. It is safer to use range(len(sections)), which is always the right size.
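To illustrate the point about ranges, here is a minimal sketch. The three-element list is a made-up stand-in for sections, not data from the file.

```python
# Hypothetical stand-in for the real sections list.
sections = ["article one", "article two", "article three"]

# A hard-coded upper bound runs past the end of the list.
try:
    for i in range(1, 500):
        sections[i]
    overrun = False
except IndexError:
    overrun = True

# range(len(sections)) yields exactly the valid indices 0, 1, 2.
visited = [sections[i] for i in range(len(sections))]

print(overrun)
print(visited)
```

Note that starting the range at 1 also skips sections[0], which may or may not be what you want.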

It may be more beneficial to keep current as a list, since you would have to split it into lines again anyway:

sections.append(current) 

Then change .readline(5) to [4] to get the element at index 4 of the list (since indices start at 0, index 4 is line 5), like this:

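    import re

    sections = []
    current = []
    with open("aberdeen2005.txt") as f:
        for line in f:
            if re.search(r"(?i)\d+ of \d+ documents", line):
                if current:  # skip the empty section before the first header
                    sections.append(current)  # removed "".join() to keep the lines split
                current = [line]
            else:
                current.append(line)
    if current:
        sections.append(current)  # the last article has no header after it

    print(len(sections))

    for i in range(len(sections)):  # range(len(...)) matches the list size
        print(sections[i][4])  # changed .readline(5) to [4], since .readline() only works on files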
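One caveat with indexing by [4]: a section with fewer than five lines raises an IndexError. The sketch below shows one way to guard against that. The helper name fifth_line and the sample lists are hypothetical, chosen only for illustration.

```python
def fifth_line(lines, default=""):
    """Return the line at index 4, or default if the section is too short."""
    return lines[4] if len(lines) > 4 else default

# Hypothetical sections: one long enough, one too short.
long_section = ["l1\n", "l2\n", "l3\n", "l4\n", "l5\n", "l6\n"]
short_section = ["only\n", "two\n"]

print(repr(fifth_line(long_section)))
print(repr(fifth_line(short_section)))
```

Returning a default rather than raising is a design choice. If a short article signals a problem in the input, letting the IndexError surface may be preferable.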

The reason you are running into problems is that .readline() is a method on file objects, but by the time you process the sections they are strings, which raises an AttributeError since str doesn't have a .readline method. Instead, you can split a section into lines with:

sections[i].split("\n")[4] 

"\n" is the newline character. It may not appear at the end of every line, depending on the OS or on other operations (for example, if you .strip() each line). If you would rather have sections contain strings, this may be more to your liking:

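    import re

    sections = []
    current = []
    with open("aberdeen2005.txt") as f:
        for line in f:
            if re.search(r"(?i)\d+ of \d+ documents", line):
                if current:  # skip the empty section before the first header
                    sections.append("".join(current))
                current = [line]
            else:
                current.append(line)
    if current:
        sections.append("".join(current))  # the last article has no header after it

    print(len(sections))

    for i in range(len(sections)):  # range(len(...)) matches the list size
        print(sections[i].split("\n")[4])  # changed .readline(5) to .split("\n")[4]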
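Since the goal in the question was a new text file rather than printed output, here is a rough sketch of how the pieces could be combined. The function names split_sections and write_fifth_lines and the output name "fifth_lines.txt" are made up for illustration. It assumes every article starts with an "N of M DOCUMENTS" line. Any text before the first such line is treated as its own section, and sections with fewer than five lines are skipped.

```python
import re

HEADER = re.compile(r"(?i)\d+ of \d+ documents")

def split_sections(lines):
    """Group lines into sections, starting a new one at each header line."""
    sections, current = [], []
    for line in lines:
        if HEADER.search(line) and current:
            sections.append(current)
            current = []
        current.append(line)
    if current:
        sections.append(current)  # keep the last article too
    return sections

def write_fifth_lines(in_path, out_path):
    """Write line 5 of every section of in_path to out_path; return the count."""
    with open(in_path) as src:
        sections = split_sections(src)
    # Skip sections that are too short to have a fifth line.
    fifth = [s[4].rstrip("\n") for s in sections if len(s) > 4]
    with open(out_path, "w") as dst:
        dst.writelines(line + "\n" for line in fifth)
    return len(fifth)

# Example call; "fifth_lines.txt" is a made-up output name:
# write_fifth_lines("aberdeen2005.txt", "fifth_lines.txt")
```

Keeping each section as a list of lines avoids joining and re-splitting the text, which is the same trade-off discussed above.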
