Friday, May 20, 2011

SQL Server: How to Remove Special Characters


One more useful function from my query bank, which I like to use to remove special characters from a string.

CREATE FUNCTION dbo.RemoveSpecialChars ( @InputString VARCHAR(8000) )
RETURNS VARCHAR(8000)  
BEGIN
    IF @InputString IS NULL
        RETURN NULL
    DECLARE @OutputString VARCHAR(8000)
    SET @OutputString = ''
    DECLARE @l INT
    SET @l = LEN(@InputString)
    DECLARE @p INT
    SET @p = 1
    WHILE @p <= @l
        BEGIN
            DECLARE @c INT
            SET @c = ASCII(SUBSTRING(@InputString, @p, 1))
            IF @c BETWEEN 48 AND 57
                OR @c BETWEEN 65 AND 90
                OR @c BETWEEN 97 AND 122
                  --OR @c = 32
                SET @OutputString = @OutputString + CHAR(@c)
            SET @p = @p + 1
        END
    IF LEN(@OutputString) = 0
        RETURN NULL
    RETURN @OutputString
END

How to use it.
SELECT dbo.RemoveSpecialChars ('This string contains special chracters:/ Which * & we % need @ to #remove')

According to this function space is a special character and if you want to ignore spaces then uncomment “OR @c = 32”.

4 comments:

  1. Very nice and timely for me. Works perfectly. Thank you.

    R Baum

    ReplyDelete
  2. Create Function dbo.[fn_RemoveUnwantedChars]
    (
    @InputString varchar(max)
    )
    Returns varchar(max)
    Begin

    If @InputString is null Return Null;

    Declare @CHARSET varchar(100)
    Set @CHARSET = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'


    Declare @NewMSG varchar(max)
    Set @NewMSG = ''
    Declare @CH char(1)
    Declare @Len1 int

    Set @Len1 = Len(@InputString)

    While @Len1>0
    Begin
    Set @CH = Left(@InputString,1)
    if CHARINDEX(@CH,@CHARSET)>=1 Set @NewMSG = @NewMSG + @CH
    Set @Len1 = @Len1 - 1
    Set @InputString = Right(@InputString,@Len1)

    End

    Return @NewMsg
    End

    -- Above is the method I use and does not require looking up ASCII codes.
    -- In my version I don't want the SPACE (ASCII 32, hex 0x20)
    -- To add the space use this
    -- Set @CHARSET = ' 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
    -- To check for email @ and . use
    -- Set @CHARSET = ' @.0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'

    ReplyDelete
    Replies
    1. the immediate above function is not working fine. it's not considering the lower case letters

      Delete
  3. I found the second version works great and is easier to follow. It allows for easy inclusion or exclusion of any character, even international. The inclusion of the lower case letters only has significance if the collation sequence is set to case sensitive, otherwise the inclusion of lower case letters is redundant. Consider if for some reason you want to omit vowels or certain letters, or include other characters. The second is much cleaner and definitely more versatile.

    ReplyDelete

All suggestions are welcome