PHP preg_replace: wrong charset or encoding

I have this simple code:

function getCleanText($rawText) //removes doublespace and punctuation
{
    return strtolower(preg_replace("/[\s\t]+/u", " ", 
        preg_replace("/[^a-zA-Z1-9àèéìòù]+/u", " ", $rawText)));
}

echo getCleanText("uscì"). " uscì <br>";

The function just removes punctuation and double spaces.
Why do I get this output?

usc�� uscì 

I mean, “uscì” doesn’t contain any punctuation, so the function should return it unchanged. Yet I have the same problem with all accented letters. The web page is encoded in UTF-8. If I try utf8_encode, like this:

return utf8_encode(strtolower(preg_replace("/[\s\t]+/u", " ", 
        preg_replace("/[^a-zA-Z1-9àèéìòù]+/u", " ", $rawText))));

the output is:

usc㬠uscì 

Any ideas? Where can I find some documentation to understand my error?
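
A likely culprit is strtolower() rather than preg_replace(). strtolower() works on single bytes and, in older PHP versions, follows the current locale, so it can alter one byte of the two-byte UTF-8 sequence for “ì” and leave an invalid sequence behind. That would also fit the utf8_encode() result, which reinterprets those bytes as Latin-1. Below is a minimal sketch of a multibyte-safe variant. It assumes the mbstring extension is loaded, that the source file is saved as UTF-8, and that the input is valid UTF-8. The name getCleanTextUtf8 is hypothetical, chosen only for this illustration.

```php
<?php
// Hypothetical variant of getCleanText(); assumes mbstring is available
// and that $rawText is valid UTF-8.
function getCleanTextUtf8(string $rawText): string
{
    // Replace every run of characters outside the allowed set with one space.
    $lettersOnly = preg_replace('/[^a-zA-Z1-9àèéìòù]+/u', ' ', $rawText);
    if ($lettersOnly === null) {
        // With the /u modifier, preg_replace() returns null on invalid UTF-8.
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    // Collapse runs of whitespace (\s already matches tabs).
    $singleSpaced = preg_replace('/\s+/u', ' ', $lettersOnly);

    // Lowercase by code point instead of byte by byte.
    return mb_strtolower($singleSpaced, 'UTF-8');
}

echo getCleanTextUtf8("uscì"), "\n"; // prints "uscì"
```

The regular expressions are kept as in the question, except that "\t" is dropped from the whitespace class because "\s" already covers it. The only behavioral change is mb_strtolower() with an explicit 'UTF-8' encoding in place of strtolower().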
