Thursday, May 5, 2011

How do I use regex to replace non-word chars in a specific piece of string?

I have a text file with a row like this:

SendersTimeSeriesIdentification: COMPANY_A/COMPANY_B/REF_7/20090505

I'd like to replace all non-word chars in the value part with the character n, like this:

SendersTimeSeriesIdentification: COMPANYnAnCOMPANYnBnREFn7n20090505

But there are similar strings all over the file that must remain intact. For example:

MessageIdentification: REF_7/VER_1/20090505

I think I must use a lookbehind, and I came up with this attempt (VB.NET):

Regex.Replace(fileContentString, "(?<=SendersTimeSeriesIdentification: )(\W)", "0")

This doesn't work as I'd like it to. So my questions are:
Is it possible to replace all non-word characters in a specific part of a string with just one Regex.Replace call? If so, how?
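
For context, a lookbehind only tests the text immediately before the candidate match, so the attempt above can only match a non-word character sitting directly after the prefix; in the sample row that character is "C". The underscore also counts as a word character, so \W alone never matches it. A minimal sketch of that behaviour, written in Python rather than VB.NET (re.sub stands in for Regex.Replace, and the variable names are illustrative):

```python
import re

PREFIX = "SendersTimeSeriesIdentification: "
sample = PREFIX + "COMPANY_A/COMPANY_B/REF_7/20090505"

# Same pattern as the attempt above. The lookbehind is satisfied only at the
# position directly after the prefix, and the character there is "C".
attempt = re.sub(r"(?<=SendersTimeSeriesIdentification: )(\W)", "0", sample)

# No character qualified, so the row comes back unchanged.
unchanged = attempt == sample
```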

From stackoverflow
  • Rather than doing this as a single regex replace, I'd split the file into lines, then only process lines that start with "SendersTimeSeriesIdentification: ". That way the regex replacement is nice and simple.

    Alan Moore : Kamarey's answer is correct, but I would take this approach if I could.
    SinkovecJ : @Alan M: I agree. The "if I could" is the key here. :-)
  • Try this one:

    Regex.Replace(fileContentString, "(?<=SendersTimeSeriesIdentification:\s.*)[_\W]", "0")
    

    This replaces every \W character and every underscore with "0" when it follows "SendersTimeSeriesIdentification: " on the same line (the . in the lookbehind does not match a newline).

    SinkovecJ : I'll use this solution, just because it is easier in my situation. I guess this would not work if the line had a comment at the end (// some comment), because the two forward slashes would get replaced, even though they shouldn't be.
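
The line-by-line approach from the first answer can also cover the trailing-comment concern raised in the last comment. The following is a minimal sketch in Python rather than VB.NET (Python's re module does not accept the variable-length lookbehind that the single-call pattern relies on, so only the line-based variant is shown). The function names replace_in_value and replace_in_text are hypothetical, and the code assumes that a value never contains "//" itself, so everything from the first "//" onward is treated as a comment.

```python
import re

PREFIX = "SendersTimeSeriesIdentification: "  # key taken from the question


def replace_in_value(line: str, replacement: str = "0") -> str:
    """Replace non-word characters and underscores in the value of a matching line."""
    if not line.startswith(PREFIX):
        return line  # other rows, e.g. MessageIdentification, stay intact

    # Assumption: the first "//" starts a trailing comment that must be kept.
    body, sep, comment = line[len(PREFIX):].partition("//")

    # Keep trailing whitespace (including the line terminator) out of the
    # replacement, since \W would otherwise match it too.
    stripped = body.rstrip()
    trailing = body[len(stripped):]

    return PREFIX + re.sub(r"[\W_]", replacement, stripped) + trailing + sep + comment


def replace_in_text(text: str, replacement: str = "0") -> str:
    """Apply replace_in_value to every line, preserving the original line endings."""
    lines = text.splitlines(keepends=True)
    return "".join(replace_in_value(line, replacement) for line in lines)
```

Processing each line separately also keeps line terminators away from \W. With the single-call pattern it may be worth checking whether the \r and \n at the end of the matching row get replaced as well, since both are non-word characters that follow the prefix.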
