Zend_Filter

fire-eyed-boy

Hi all,

For a CMS I'm building, I need to filter the input of a 'page path'
field. This field lets the CMS user assign custom, SEO-friendly URI
segments to the page.

I'm thinking about creating my own filter for this, but before I do, I
wanted to consult this group.

What I usually do for this kind of thing is replace all diacritical
characters with their non-diacritical equivalents, and special characters
with their expanded counterparts. This is what I have been using (found
some time ago on the internet):

// Method of a helper/filter class. Assumes $string is ISO-8859-1
// (Latin-1): the byte values below are Latin-1 code points and will
// not match UTF-8 input.
public function replaceDiacriticalChars( $string )
{

  // One-to-one replacements: each byte in the first string maps onto
  // the character at the same position in the second string.
  $string = strtr(
   $string,
   "\xA1\xAA\xBA\xBF\xC0\xC1\xC2\xC3\xC5\xC7\xC8\xC9\xCA\xCB\xCC"
   . "\xCD\xCE\xCF\xD0\xD1\xD2\xD3\xD4\xD5\xD8\xD9\xDA\xDB\xDD\xE0\xE1"
   . "\xE2\xE3\xE5\xE7\xE8\xE9\xEA\xEB\xEC\xED\xEE\xEF\xF0\xF1\xF2\xF3"
   . "\xF4\xF5\xF8\xF9\xFA\xFB\xFD\xFF",
   "!ao?AAAAACEEEEIIIIDNOOOOOUUUYaaaaaceeeeiiiidnooooouuuyy"
  );

  // One-to-many replacements: characters that expand to two letters.
  $string = strtr(
   $string,
   array(
    "\xC4" => "Ae",
    "\xC6" => "AE",
    "\xD6" => "Oe",
    "\xDC" => "Ue",
    "\xDE" => "TH",
    "\xDF" => "ss",
    "\xE4" => "ae",
    "\xE6" => "ae",
    "\xF6" => "oe",
    "\xFC" => "ue",
    "\xFE" => "th"
   )
  );

  return $string;

}

This does things like:

input: éèïß
output: eeiss
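
Note that the byte values in the function are ISO-8859-1 code points, so the result above only holds for Latin-1 input. If the form data arrives as UTF-8, it would have to be converted first. A minimal sketch, assuming the iconv extension is available (the helper name latin1FromUtf8 is made up for illustration, and only a few entries from the tables are repeated here):

```php
<?php
// Hypothetical helper: convert UTF-8 input to ISO-8859-1 so that the
// single-byte strtr() tables apply. Returns false if iconv() fails,
// for example when a character has no ISO-8859-1 equivalent.
function latin1FromUtf8($utf8)
{
    return @iconv('UTF-8', 'ISO-8859-1', $utf8);
}

// "éèïß" written out as UTF-8 bytes.
$latin1 = latin1FromUtf8("\xC3\xA9\xC3\xA8\xC3\xAF\xC3\x9F");

// A few entries taken from the tables above, enough to show the idea.
$ascii = strtr($latin1, "\xE9\xE8\xEF", "eei");
$ascii = strtr($ascii, array("\xDF" => "ss"));

echo $ascii, "\n";
```

Characters outside ISO-8859-1 make the conversion fail in this sketch, so a real filter would need to decide what to do with them.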

This does the job pretty well. I usually replace spaces, underscores and
plus signs with dashes too, etc. But I was wondering:

1. What should I or should I not be concerned about when creating valid
URIs? I mean, is this still a concern, or do URIs allow a much broader
spectrum of characters nowadays?

2. Is there already a component or filter in ZF that can create this
kind of normalized URI from some input?
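
For reference, a custom filter of the kind described above might look roughly like the sketch below. The class name PagePathFilter is hypothetical. The stand-in interface declaration is there only so the snippet runs without ZF on the include path. The allowed character set (lowercase ASCII letters, digits and dashes) is an assumption chosen to be conservative, not something ZF or the URI specification requires. The input is expected to be transliterated to ASCII already.

```php
<?php
// Stand-in so the sketch runs without Zend Framework loaded; in a real
// ZF 1 project Zend_Filter_Interface comes from the framework itself.
if (!interface_exists('Zend_Filter_Interface')) {
    interface Zend_Filter_Interface
    {
        public function filter($value);
    }
}

// Hypothetical filter: turns already-transliterated ASCII text into one
// URI path segment made of lowercase letters, digits and dashes.
class PagePathFilter implements Zend_Filter_Interface
{
    public function filter($value)
    {
        $value = strtolower((string) $value);
        // Spaces, underscores and plus signs become dashes.
        $value = strtr($value, ' _+', '---');
        // Drop everything outside the assumed whitelist.
        $value = preg_replace('/[^a-z0-9-]/', '', $value);
        // Collapse runs of dashes and trim them from both ends.
        $value = preg_replace('/-{2,}/', '-', $value);
        return trim($value, '-');
    }
}

$filter = new PagePathFilter();
echo $filter->filter('About_Us + Team'), "\n";
```

Because it implements the filter interface, a class like this could be added to a Zend_Filter chain after a transliteration step.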

Thank you in advance for any insights.