[nSLUG] sed question

Rich budman85 at eastlink.ca
Fri Dec 30 04:45:43 AST 2005


On Thu, 2005-12-29 at 04:08 -0400, Jason Kenney wrote:
> > While sed is good I have found awk to be better. Most problems solved
> > with sed can be solved with awk just as easily. It is still good to
> > know sed syntax as the vi/vim ex mode (: mode) syntax is similar.
> 
> Well, they are intended for different things... Awk is very good for 
> processing delimited files, as here.
> 
> Some might even suggest perl makes them both obsolete?
> 
> Herb: You need to use the -e switch I think to run on the command line.
> 
> Also, I don't think sed understands \s, like perl does.
> 

sed's command-line regex isn't as feature-rich as perl's,
but for basic things, sed is fast.
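For instance, a couple of everyday sed one-liners (my own illustration, not from the thread):

```shell
# substitute every occurrence of "foo" on each line
printf 'foo bar\nbaz foo\n' | sed 's/foo/FOO/g'

# delete blank lines from a stream
printf 'a\n\nb\n' | sed '/^$/d'
```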

At work, I used a perl script to generate an awk script.
Perl and awk are about the same when file sizes are less than 2GB.
Over 2GB, perl has some issues.  I think it depends on how it was
compiled.  Even with optimizations, I still see awk flying thru huge
files.

awk takes about 45 min to go thru 50 million rows;
perl was taking 3 to 4 hours.
No matter what I did, even using the fewest conditions possible,
awk was doing about 20 compares and was still faster.
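As a sketch of the kind of per-field awk tests I mean (the data here is made up, not from the real job):

```shell
# print column 1 of pipe-delimited rows where column 3 is non-empty;
# a single-character -F value like '|' is taken as a literal separator
printf 'a|b|c\nd|e|\n' | awk -F'|' '$3 != "" { print $1 }'
```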

It's worth learning awk and sed to see where perl got its roots. :)
I was having a problem in perl reading pipe-delimited files where the
columns were null (||||): split would read it as 1 field instead of 4.
I used the 3rd arg to force the number of columns returned:
@x=split("|",$y,10)
Experimenting, I found that changing "|" to /\|/ solves the split issue:
@x=split(/\|/,$y)
works correctly.


Pick up some O'Reilly books :)
Worth the bucks.


Rich

 






More information about the nSLUG mailing list