python - How to match fields from two lists and further filter based upon the values in subsequent fields?


Edit: this question has been answered on reddit. Here is the link if you are interested in the answer to this problem: https://www.reddit.com/r/learnpython/comments/42ibhg/how_to_match_fields_from_two_lists_and_further/

I am attempting to match the pos and alt strings from file1 against file2, which is simple. However, file2 has values from the 17th split element/column through the last element/column (the 340th), each containing a string such as 1/1:1.2.2:51:12, which I want to filter on.

I want to extract the rows in file2 that contain/match the pos and alt from file1. Thereafter, I want to further filter the matched results based on the values in the 17th split element/column onwards. To do this, the values have to be split on ":" so that I can filter for split[0] == "1/1" and split[2] > 50. The problem is that I have no idea how to do this.

I imagine I will have to iterate over these columns and split them, but I am not sure how to do this, as the code is presently in a loop and the values I want to filter on are in columns, not rows.

Any advice is appreciated. I have sat with this problem since Friday and have yet to find a solution.

    import os, itertools, re

    file1 = open("file1.txt", "r")
    file2 = open("file2.txt", "r")

    matched = []

    for (x), (y) in itertools.product(file2, file1):
        if not x.startswith("#"):
            cells_y = y.split("\t")
            pos_y = cells_y[0]
            alt_y = cells_y[3]

            cells_x = x.split("\t")
            pos_x = cells_x[0] + ":" + cells_x[1]
            alt_x = cells_x[4]

            if pos_y in pos_x and alt_y in alt_x:
                matched.append(x)

    for z in matched:
        cells_z = z.split("\t")
        if cells_z[16:len(cells_z)]:
            pass  # unsure how to filter on these columns

Your requirement is not entirely clear, but you might mean something like this:

    for (x), (y) in itertools.product(file2, file1):
        if x.startswith("#"):
            continue

        cells_y = y.rstrip("\n").split("\t")
        pos_y = cells_y[0]
        alt_y = cells_y[3]

        cells_x = x.rstrip("\n").split("\t")
        pos_x = cells_x[0] + ":" + cells_x[1]
        alt_x = cells_x[4]

        if pos_y != pos_x: continue
        if alt_y != alt_x: continue

        # Columns 17 onwards (index 16) of the file2 row hold the values to filter on.
        extra_match = False
        for f in range(16, len(cells_x)):
            x_extra = cells_x[f].split(":")

            if x_extra[0] != "1/1": continue
            # Compare as a number, not a string (assumes the third sub-field is an integer).
            if int(x_extra[2]) <= 50: continue
            extra_match = True
            break

        if not extra_match: continue

        xy = x + y
        matched.append(xy)

I chose to concatenate x and y in the matched array, since I wasn't sure whether or not you want all of the data. If not, feel free to go back to appending just x or y.
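As a variation, here is a minimal sketch that avoids pairing every line of file2 with every line of file1: it first collects the (pos, alt) pairs from file1 into a set, then scans file2 once. The names sample_passes, filter_rows, first_sample and min_depth are made up for illustration. It assumes exact equality on pos and alt (the question's code used a substring test with "in"), and that a row should be kept if at least one column from the 17th onwards passes the filter, which may not be what you need.

```python
def sample_passes(field, genotype="1/1", min_depth=50):
    """Return True if a colon-separated field such as '1/1:1.2.2:51:12'
    has the wanted first sub-field and a third sub-field above min_depth."""
    parts = field.strip().split(":")
    if len(parts) < 3 or parts[0] != genotype:
        return False
    try:
        return float(parts[2]) > min_depth
    except ValueError:
        # Non-numeric value (for example a placeholder such as "."): treat as failing.
        return False


def filter_rows(file1_lines, file2_lines, first_sample=16):
    """Keep file2 rows whose pos/alt appear in file1 and where at least one
    column from index first_sample onwards passes sample_passes()."""
    # Assumed layout of file1: "chrom:pos" in column 0, alt in column 3.
    wanted = set()
    for line in file1_lines:
        cells = line.rstrip("\n").split("\t")
        if len(cells) > 3:
            wanted.add((cells[0], cells[3]))

    matched = []
    for line in file2_lines:
        if line.startswith("#"):
            continue
        cells = line.rstrip("\n").split("\t")
        if len(cells) <= first_sample:
            continue
        # Assumed layout of file2: chrom in column 0, pos in column 1, alt in column 4.
        key = (cells[0] + ":" + cells[1], cells[4])
        if key not in wanted:
            continue
        if any(sample_passes(c) for c in cells[first_sample:]):
            matched.append(line)
    return matched


# Possible usage with the files from the question:
# with open("file1.txt") as f1, open("file2.txt") as f2:
#     matched = filter_rows(f1, f2)
```

If every column has to pass rather than just one, swapping any() for all() in filter_rows should be enough.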

