Another shameless rip from this link, where this simple nawk command deletes duplicate lines in a file while keeping the original order (the first occurrence of each line is the one that stays):
nawk ' !x[$0]++' file_to_process > name_of_new_processed_file
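For anyone puzzling over how the one-liner works: x is an associative array keyed on the whole line ($0). The first time a line shows up, x[$0] is 0, so !x[$0] is true and awk prints the line; the ++ then bumps the count so later copies are skipped. Below is a rough sketch of the same logic spelled out. It assumes a plain awk binary (use nawk where that is what you have), and the function name dedupe_keep_order and the array name seen are made up for illustration.

```shell
# Hypothetical helper: same logic as the one-liner, written long-hand.
# Reads the files given as arguments, or stdin if there are none.
dedupe_keep_order() {
  awk '{
    if (!($0 in seen)) {   # first time this exact line appears
      print                # keep it, in input order
    }
    seen[$0] = 1           # remember the line for later lookups
  }' "$@"
}

# Small demo: duplicates of "b" and "a" are dropped, order is kept.
printf 'b\na\nb\nc\na\n' | dedupe_keep_order
```

One thing to keep in mind: every distinct line is held in memory, so this trades RAM for speed on files with many unique lines.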
An added benefit: it's substantially faster than your traditional sort & uniq combo. Case in point, on a 28 MB file the sort & uniq combo yields the following processing time:
real 0m22.53s user 0m2.16s sys 0m11.51s
Whereas using nawk on the same file yields the following processing time:
real 0m1.39s user 0m1.33s sys 0m0.06s
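If you want to try the comparison on your own data, something along these lines should do. It is only a sketch: the function names and big_file.txt are placeholders, plain awk stands in for nawk, and the exact sort & uniq invocation behind the numbers above isn't shown, so sort piped into uniq is an assumption.

```shell
# Hypothetical wrappers around the two approaches being compared.
dedupe_sort_uniq() {
  sort "$1" | uniq       # output comes back sorted, original order is lost
}

dedupe_awk() {
  awk '!x[$0]++' "$1"    # output keeps the order of first occurrences
}

# To time them in bash (big_file.txt is a placeholder name):
#   time dedupe_sort_uniq big_file.txt > /dev/null
#   time dedupe_awk big_file.txt > /dev/null
```

Your timings will depend on the machine, the sort implementation, and how many distinct lines the file has.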
I think we know who the clear winner is.