Can sed do this? Or am I barking up the wrong tree?
Eddie Fantastic
08-18-2005, 11:47 AM
Hi,
I regularly do product updates to our aging system. A section looks something like this:
025833 3M Post It Note Pad 1.1/2x2 653YE 1 15060 583 388 12 12000242 3M01401 653YE NNNNNN 767A 188A C 12180 N48201090 N3BN
025841 3M Post-It Note Pad 3x3-76x76mm 654YE 1 15060 1188 791 12 12000242 3M01403 654YE NNNNNN 767A 188A C 12180 N48201090 N3BN
02585X 3M Post-It Note Yellow 3x5 655YE 1 15060 1591 1059 12 12000242 3M01417 655YE NNNNNN 767A 188A C0400189595949312180 N48201090 Y3BN
This example contains 3 lines of the text file (formatting is lost here). I'm interested in the fourth character from the end of each line (character number 210 on the line). If this is a Y (as it is with the third product), I would like to change characters 72-76 inclusive to 01001 (for the third product here, that means changing 15060 to 01001). If it is not a Y, the line can be left as it is.
This shows 3 products; our update file contains 20,000.
I've been messing around with sed and have got to grips with the basics, but I can't find a way to specify a character number on a line to query on.
Can this be done in sed? If not, is there an alternative?
Thanks for any help.
bwkaz
08-18-2005, 07:26 PM
Hmm.... you could probably do something like this:
sed -r -e '/^.{209}Y/ s/15060/01001/g' <filename >filename.new
Sed has a way to only do substitutions on lines that match a regex (in this case, the regex is "^.{209}Y", i.e. start at the beginning of the line, match 209 characters (any characters), and then a single Y). On those lines, it'll do the substitution "s/15060/01001/g" (change all instances of 15060 to 01001).
The -r is there to enable extended regular expressions, so that {209} is read as a repeat count. In basic syntax the same thing is written \{209\}. -r is a GNU sed option, so with another sed you may need -E or the backslash form instead.
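If the five digits in columns 72-76 are not always 15060, the substitution can be keyed on position instead of on the old value. What follows is only a minimal sketch: it assumes a sed that accepts -E (GNU and BSD sed both do), and the function name and the sample record are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: reads fixed-width records on stdin, writes them to stdout.
set_range_if_flagged() {
    # Address:      only lines whose 210th character is Y.
    # Substitution: keep the first 71 characters (\1) and overwrite the next
    #               five with 01001. sed back-references are single-digit,
    #               so the digits after \1 are literal text.
    sed -E '/^.{209}Y/ s/^(.{71}).{5}/\101001/'
}

# Made-up sample record: 71 columns of description, a 5-digit range,
# padding up to column 209, the flag in column 210, then "3BN".
record=$(printf '%-71s%s%-133s%s%s' "02585X 3M Post-It Note Yellow 3x5" "10060" "" "Y" "3BN")

# Show columns 72-76 after processing.
printf '%s\n' "$record" | set_range_if_flagged | cut -c72-76
```

Because the pattern counts characters rather than looking for 15060, it cannot accidentally change a 15060 that appears elsewhere on the line.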
Eddie Fantastic
08-19-2005, 05:09 AM
Thanks for your help.
It's halfway there. But I didn't give a good example here.
Although these three products have a range of 15060 (the text to be modified), not all do. 15060 applies to sticky pads. For paper it is 10060; for envelopes it's 10050. There are dozens of variations. Always five digits. Always characters 72-76 on the line.
So can your code be modified to accomplish this?
hlrguy
08-19-2005, 11:05 AM
I can usually figure out the sed syntax using this as a reference.
http://www-h.eng.cam.ac.uk/help/tpl/unix/sed.html
hlrguy
DrChuck
08-19-2005, 02:13 PM
Sed is best for pattern matching and replacement, but you don't really have patterns here.
Awk is good for splitting lines based on unique delimiters. Again, that doesn't help you here.
Cut could work, as you can select a substring based on columns, exactly as you wish. But if you are writing a shell script anyway to implement your algorithm, you might as well use bash's built-in string manipulation.
Here is how I would isolate character #210:
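#!/bin/bash
first_column=209   # zero-based offset of character 210
num_columns=1
# IFS= and -r stop read from trimming whitespace or interpreting backslashes
while IFS= read -r input_line
do
    substring=${input_line:$first_column:$num_columns}
    echo "${#input_line}"   # line length, as a sanity check
    echo "$substring"       # the character in column 210
done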
Oh yeah, read works from standard input, so run it like:
cat original.txt | ./script.sh > modified.txt
Sorry ... posted too early. The other piece is to replace the substring from columns 72-76. Using the same variable expansion method, break the string into pieces and splice in "01001":
beginning=${input_line:0:71}
middle=${input_line:71:5}
end=${input_line:76}
new=01001
output_line="$beginning$new$end"
drChuck
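The offsets in the snippets above follow from bash counting substring positions from 0, so 1-based column N is offset N-1. Here is a small sketch of the same expansions on a short made-up string, where the arithmetic is easier to check by eye:

```shell
#!/bin/bash
# ${var:offset:length} -- offset is counted from 0; omit length to take the rest.
sample="ABCDEFGHIJ"

col4=${sample:3:1}        # column 4          -> D
cols2to4=${sample:1:3}    # columns 2-4       -> BCD
rest=${sample:4}          # column 5 onwards  -> EFGHIJ

# Splice "xyz" over columns 2-4, the same way 01001 replaces columns 72-76.
spliced="${sample:0:1}xyz${sample:4}"   # -> AxyzEFGHIJ

printf '%s\n' "$col4" "$cols2to4" "$rest" "$spliced"
```

By the same arithmetic, columns 72-76 are offset 71 with length 5, and the remainder of the line starts at offset 76.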
bwkaz
08-19-2005, 06:57 PM
Ah.
Yeah, I'd do what DrChuck said, in that case. ;)
Eddie Fantastic
08-22-2005, 08:51 AM
Many thanks for your help.
Things are coming along well.
I have built a small script incorporating your code.
I must stress I am in no way a programmer, this is my first real dabble. So please ignore my simplistic approach.
I had to add another check: even if the item is a 5 star, the range must not be changed if column 150 (the last character in that cluster of N's and Y's) is Y (a discount exception item).
All the processing is working a treat, and only those products that need to be altered are being altered.
The problem I have is that the file that is output loses all extra whitespaces. That is, if there is more than one space, the process will reduce it to one space, thus losing the required formatting.
I have looked yet cannot find a way to keep the original formatting.
Included is my script and the original and new file.
#!/bin/bash
while read input_line
do
    first_column=209
    num_columns=1
    substring=${input_line:$first_column:$num_columns}
    if [ $substring = "N" ] #test if item is 5 star, if not leave intact
    then
        echo $input_line
    else
        first_column=149
        num_columns=1
        substring=${input_line:$first_column:$num_columns}
        if [ $substring = "Y" ] #test if item is discount Exception, if Y leave intact
        then
            echo $input_line
        else #whatever's left change range to 01001
            beginning=${input_line:0:71}
            middle=${input_line:71:5}
            end=${input_line:76}
            new=01001
            output_line="$beginning$new$end"
            echo $output_line
        fi
    fi
done
Original file:
O25833 3M Post It Note Pad 1.1/2x2 653YE 1 15060 583 388 12 12000242 3M01401 653YE NNNNNN 767A 188A C 12180 N48201090 Y3BN
025841 3M Post-It Note Pad 3x3-76x76mm 654YE 1 15060 1188 791 12 12000242 3M01403 654YE NNNNNY 767A 188A C 12180 N48201090 Y3BN
02585X 3M Post-It Note Yellow 3x5 655YE 1 15060 1591 1059 12 12000242 3M01417 655YE NNNNNN 767A 188A C0400189595949312180 N48201090 N3BN
025868 3M Post-It Note Tape-658H 1 10022 5280 3520 12 12000242 3M72948 658CT NNNNNN 770C H 15170 N38140010 N4 N
Output file:
O25833 3M Post It Note Pad 1.1/2x2 653YE 1 01001 583 388 12 12000242 3M01401 653YE NNNNNN 767A 188A C 12180 N48201090
025841 3M Post-It Note Pad 3x3-76x76mm 654YE 1 15060 1188 791 12 12000242 3M01403 654YE NNNNNY 767A 188A C 12180 N4820
02585X 3M Post-It Note Yellow 3x5 655YE 1 15060 1591 1059 12 12000242 3M01417 655YE NNNNNN 767A 188A C0400189595949312
025868 3M Post-It Note Tape-658H 1 10022 5280 3520 12 12000242 3M72948 658CT NNNNNN 770C H 15170 N38140010 N4 N
DrChuck
08-22-2005, 01:02 PM
Eddie,
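Double quotes preserve the whitespace. Without them, the shell splits the variable into words and echo joins the words back together with single spaces.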
echo "$input_line"
echo "$output_line"
drChuck
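The effect of the quotes is easy to reproduce on a single variable. This is a small sketch using a made-up line with runs of spaces:

```shell
#!/bin/bash
line="025868 3M   Tape     1"

# Unquoted: the shell splits the value on whitespace into four words,
# and echo prints them separated by single spaces.
unquoted=$(echo $line)

# Quoted: the value reaches echo as one argument, spacing intact.
quoted=$(echo "$line")

printf '[%s]\n' "$unquoted" "$quoted"
# prints:
# [025868 3M Tape 1]
# [025868 3M   Tape     1]
```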
Eddie Fantastic
08-23-2005, 10:37 AM
Many thanks, both of you.
It worked an absolute treat.
Ed
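For reference, the pieces from this thread can be combined into one compact loop. This is only a sketch under stated assumptions, not the script Eddie ended up with: the function name update_ranges is hypothetical, and it treats only a Y in column 210 as the five-star flag (as in the original question), where Eddie's script treats anything other than N that way.

```shell
#!/bin/bash
# Hypothetical consolidation of the thread's logic; reads records on stdin.
update_ranges() {
    local line five_star exception
    # IFS= and -r keep leading/trailing whitespace and backslashes intact;
    # the extra -n test handles a last line with no trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        five_star=${line:209:1}    # column 210
        exception=${line:149:1}    # column 150
        if [ "$five_star" = "Y" ] && [ "$exception" != "Y" ]; then
            # Overwrite columns 72-76 with 01001, leaving everything else as is.
            line="${line:0:71}01001${line:76}"
        fi
        # Quoted, so runs of spaces in the record survive.
        printf '%s\n' "$line"
    done
}
```

It would be run the same way as the earlier scripts, for example by calling `update_ranges < original.txt > modified.txt` at the end of the script file.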