split, no repeat - Regular expression



I have a file whose content looks like this:

ATATTTGATTGGCCAGCCCTGCGTTTGCGGTTTTTTTTTGTTTTTTTATTTCCTGTATTTTTTTTGGGGGGGAAAAATTGCAGTTCCACGGA
4f-rnp Gene 204:267
ACCTTATCGACTAGTATAAAAGGCACTGTCAGCTCTCCAGCCCGAACAAAATCGATCAAAATGCGCCCGCAATCAGCTGCGTGTCTATTACT
44D JMB 166:101
ATGGGAGCGGTATGCTTAAATAGGGGCACCTTTTAATCCCTCTGGCCATTGGCAATCGATCCATTTAGTGGGAGCCATGTTCAAGTTGCTGG
44L JMB 166:101
AACTTATGTAATCATATAGATTCTATAATAAACAAAGAAACAAAACTAGTTGTAAAACAAACACGATTCCTGTGTGTCATTGCGGGATATGG
74F EMBO 3:289
TTTCCACACGATCGTGCTGCCTCCCAATAAACCCGGTGCAGTGAGTCAGTGTGTTGTGTGCCCCAGTCGCGAGCGGACGATCCGTGGAGATC
Abdb EMBO 7:3223
TGCGGATCAATTAAACCGTAAAAAACAGAGCAGGCGAGCGTAAGCAAGAGAGAGAGGTGAAGCCAGAGGCGGAGGCGCAAGACAAAGTGCAT
abl p1 Oncogene 3:33
AAAAAACAGAGCAGGCGAGCGTAAGCAAGAGAGAGAGGTGAAGCCAGAGGCGGAGGCGCAAGACAAAGTGCATTTTCAGGGCGTGTTTTTGA
abl p2 Oncogene 3:33
TAATAGTCGCTCAAAAGCTGTCGAGAGAGAGGGAGAGAAAAGAGAGAGTGAAAGCATAGTCCCGCTATTTTGCCGAGAGAAATAAAGAGCAG
ace JMB 210:15

For example, for the first sequence, what I want is the name on the line after the sequence: 4f-rnp.
Then I want to collect all of these names into a new file,
so that the new file looks like this:
4f-rnp
44D
44L
74F
Abdb
abl
*here I don't want another abl, so the output should not contain repeats.*
ace

I know how to write a script to split the lines and get the names, but how can I
avoid these repeats?

Thanks!
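
One possible way to drop the repeats, sketched in Python: remember every name already seen in a set and keep a name only the first time it appears. This is only a sketch under some assumptions. It assumes a sequence line contains nothing but the letters A, C, G and T, and that the name is the first whitespace-separated field of every other non-blank line. The function name unique_names and the shortened sample lines are made up for illustration.

```python
import re

# Assumption: a sequence line consists only of the letters A, C, G, T.
SEQUENCE_LINE = re.compile(r"[ACGT]+")


def unique_names(lines):
    """Return the first field of each name line, without repeats, in file order."""
    seen = set()   # names already collected
    names = []     # result, in order of first appearance
    for line in lines:
        line = line.strip()
        if not line or SEQUENCE_LINE.fullmatch(line):
            continue  # skip blank lines and sequence lines
        name = line.split()[0]  # e.g. "abl" from "abl p1 Oncogene 3:33"
        if name not in seen:
            seen.add(name)
            names.append(name)
    return names


# Shortened sample lines in the same layout as the file above.
sample = [
    "AAAAAACAGAGC",
    "abl p1 Oncogene 3:33",
    "TTTTCAGGGCGT",
    "abl p2 Oncogene 3:33",
    "TAATAGTCGCTC",
    "ace JMB 210:15",
]
print("\n".join(unique_names(sample)))
```

For a real file, the open file object can be passed directly as lines, and the returned list written to the new file one name per line.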
