regex delete duplicate chars from string.

  • Does anyone know how to delete duplicate numbers and letters from a string using regex?

    For example, if my string looks like this

    "112, 113, 112, 114, 113, 112,"

    it would be turned into this

    "112, 113, 114".

    I've been looking into regex but it's over my head.


  • I don't think regex is suited for that. A better way would be to add each number to a dictionary to eliminate duplicates. For example:

    global string list1 = "112, 113, 112, 114, 113, 112,"

    global string list2 = ""

    repeat tokencount(list1, ",") times

    --- Dictionary: add key trim(tokenat(list1, loopindex, ",")) with value 0

    Dictionary: for each key

    --- add Dictionary.CurrentKey & "," to list2
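
For readers outside the event sheet, the same idea can be sketched in plain JavaScript. This is only an illustration of the dictionary approach, not Construct code: `dedupeTokens` is a hypothetical helper name, and a `Set` stands in for the Dictionary object because only the keys matter here.

```javascript
// Hypothetical helper (not a Construct API): keep the first occurrence of each token.
function dedupeTokens(list, separator = ",") {
  const seen = new Set(); // plays the role of the Dictionary's keys
  for (const raw of list.split(separator)) {
    const token = raw.trim(); // " 112" and "112" must count as the same key
    if (token !== "") {
      seen.add(token); // adding an existing key is a no-op
    }
  }
  // A Set iterates in insertion order, so first occurrences keep their order.
  return [...seen].join(", ");
}

console.log(dedupeTokens("112, 113, 112, 114, 113, 112,")); // 112, 113, 114
```

The trim step matters: without it, tokens that differ only by a leading space would be stored as separate keys.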

  • Ok i'll try that thanks for your help.

  • I just remembered this topic... and I really do not like Dictionary...


  • Since OP asked for a Regex, here's how to do it :

    Matching pattern : (?:(\w)(?:\1)*)

    Flags : g

    Substitution pattern : $1

    See the witchcraft in action in this capx.

    Note that the (\w) in the matching pattern can be changed to something else to accommodate accented letters and other symbols. Replacing it with (.) ensures that no character appears twice in a row, except for the newline character. (You'd have to use ([\s\S]) to include it.)
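
As a rough sketch of what this pattern does outside Construct, here is the same replacement in plain JavaScript. `collapseRuns` is a hypothetical name used only for illustration. Note that it collapses runs of the same consecutive character and does not compare whole numbers or words.

```javascript
// Hypothetical helper: collapse each run of a repeated word character to one character.
function collapseRuns(text) {
  // (\w) captures one character, (?:\1)* consumes its immediate repeats,
  // and "$1" writes the captured character back once.
  return text.replace(/(?:(\w)(?:\1)*)/g, "$1");
}

console.log(collapseRuns("aaabbbc")); // abc
console.log(collapseRuns("112, 113, 112, 114, 113, 112,")); // 12, 13, 12, 14, 13, 12,
```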

  • Magistross

    Your example doesn't work as the OP requires.

    Using his string as an example, feeding the following into your .capx - "112, 113, 112, 114, 113, 112," - would produce "12, 13, 12, 14, 13, 12," which is obviously not correct.

    Unfortunately, I'm useless at RegEx so am unable to remedy.

  • It seems I misread his need. I stopped at the first sentence, while what he wants is to remove duplicate "words" from a string, as he explains further down. That's definitely not the same thing. I fear a single RegexReplace won't quite cut it. What could work is to use a "lookaround" so that a word only produces a match if it doesn't appear again later. Then you can concatenate all the matches in a loop.

    Since JavaScript only supports lookahead at the time of writing, this has the drawback of losing the original order in which the words appear: only the last occurrence of each word creates a match. If lookbehind were supported, only the first occurrence would be matched instead. That's too bad if the original order is needed. For the sake of showcasing the power of Regex, here's how to do it :

    Matching pattern : (\b\w+\b)(?!.*\b\1\b)

    Flags : g
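
A minimal JavaScript sketch of the match-and-concatenate idea, assuming the pattern above is used as-is. `lastOccurrences` is a hypothetical helper name, and `String.prototype.match` with the g flag stands in for looping over the matches in events.

```javascript
// Hypothetical helper: return each word only at its last occurrence.
function lastOccurrences(text) {
  // (\b\w+\b) captures a whole word; the negative lookahead (?!.*\b\1\b)
  // rejects the match if the same word appears again later on the line.
  const pattern = /(\b\w+\b)(?!.*\b\1\b)/g;
  return text.match(pattern) || []; // match() returns null when nothing matches
}

const words = lastOccurrences("112, 113, 112, 114, 113, 112,");
console.log(words.join(", ")); // 114, 113, 112
```

The result contains every number exactly once, but in the order of the last occurrences rather than the first.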


  • Thanks everyone.

    I used R0j0's method shortly after he suggested it, and it works fine, but I'll certainly look at yours, korbaach. Magistross, thanks for taking the time to show how to do it with Regex as originally requested. It's really impressive, and it might be a quicker and neater alternative. The reason I wanted to use Regex was that I'm used to using tokens to search strings, and I could have done it that way if the replace function could target a particular token index instead of replacing all occurrences.
