bash - Filtering input files
I'm trying to filter 'duplicate' results out of a file. The file looks like this:

    7 14 35 35 4 23
    23 53 85 27 49 1
    35 4 23 27 49 1
    ...

Each line can mentally be divided into item 1 and item 2: item 1 is the first 3 numbers on the line, item 2 is the last 3 numbers.
I've also got a list of 'items':

    7 14 35
    23 53 85
    35 4 23
    27 49 1
    ...
At some point in the items file, let's say at line number 3 (this number is arbitrary, just an example), the items can be separated: let's call lines 1 and 2 red, and lines 3 and 4 blue.

I want to make sure that in the original file there are no red-red or blue-blue lines, only red-blue or blue-red, while retaining the original numbers. Ideally the file would go from:

    7 14 35 35 4 23    (red blue)
    23 53 85 27 49 1   (red blue)
    35 4 23 27 49 1    (blue blue)
    ...
to
    7 14 35 35 4 23    (red blue)
    23 53 85 27 49 1   (red blue)
    ...
I'm having trouble thinking of an elegant (or any) way to do it. Any help is appreciated.
Edit: a filtering script I have that grabs lines if they have a blue or red item on them:
    #!/bin/bash
    # keep the lines of twoitems that contain a blue item
    while read name; do
        grep "$name" twoitems
    done < itemblue > filtered
    # of those, keep the lines that also contain a red item
    while read name2; do
        grep "$name2" filtered
    done < itemred > doublefiltered
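
Note: since grep can also read its patterns from a file via -f, the same filtering can be sketched as one pipeline, assuming (as above) that itemblue and itemred hold one item per line:

    grep -f itemblue twoitems | grep -f itemred > doublefiltered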
Edit 2: example input item files:
Answer:

This is pretty easy using grep with its -f option.
First of all, generate four 'pattern' files out of the items file. I'm using awk here, but you might use perl or whatever you like. In the following example I put the 'split' between lines 2 and 3; please adjust where necessary.
    awk 'NR <= 2 {print "^" $0 " "}' items.txt > starts_red.txt
    awk 'NR <= 2 {print " " $0 "$"}' items.txt > ends_red.txt
    awk 'NR >= 3 {print "^" $0 " "}' items.txt > starts_blue.txt
    awk 'NR >= 3 {print " " $0 "$"}' items.txt > ends_blue.txt
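
To make the anchoring concrete: with the four items listed in the question and the split after line 2, the generated pattern files would look like this (note the trailing and leading spaces, which pin each item to the start or the end of a line):

    $ cat starts_red.txt
    ^7 14 35 
    ^23 53 85 
    $ cat ends_blue.txt
     35 4 23$
     27 49 1$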
Next, use a grep pipeline with those pattern files (again option -f) to filter the appropriate lines out of the input file.
    grep -f starts_red.txt input.txt | grep -f ends_blue.txt > red_blue.txt
    grep -f starts_blue.txt input.txt | grep -f ends_red.txt > blue_red.txt
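
Run against the three-line example from the question, red_blue.txt would pick up the two (red blue) lines, blue_red.txt would stay empty, and the (blue blue) line, matching neither pipeline, is dropped:

    $ cat red_blue.txt
    7 14 35 35 4 23
    23 53 85 27 49 1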
Finally, concatenate the two output files. Of course, you might use >> to let the second grep pipeline append its output to that of the first.
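
The appending variant would look like this (same pipelines, one output file; filtered.txt is just a placeholder name):

    grep -f starts_red.txt input.txt | grep -f ends_blue.txt > filtered.txt
    grep -f starts_blue.txt input.txt | grep -f ends_red.txt >> filtered.txt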