Zend_Http_Client and utf-8

umpirsky

Zend_Http_Client and utf-8

Hi.

I'm crawling the site http://www.mojauto.rs using Zend_Http_Client. The strange thing is that Serbian characters (č, ć, ž...) come out garbled, probably because of an encoding problem. I tried to set UTF-8 explicitly with
$client->setHeaders('Content-type: text/html; charset=utf-8'); but the same problem occurs.

Any ideas?

Regards,
Saša Stamenković
umpirsky

Re: Zend_Http_Client and utf-8

I monitored Firefox's request headers with Live HTTP Headers while visiting this site, and it sends
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
When I send the same header from Zend_Http_Client, I still get the wrong characters :(

Regards,
Saša Stamenković


On Tue, Sep 22, 2009 at 10:49 AM, umpirsky <[hidden email]> wrote:
> Hi.
>
> I'm crawling this site http://www.mojauto.rs using Zend_Http_Client. Strange
> thing is that I get messed Serbian characters (č.ć,ž...), probably encoding
> problem. I tried to explicitly set utf-8 with
> $client->setHeaders('Content-type: text/html; charset=utf-8'); but same
> problem occurs.
>
> Any idea?
>
> Regards,
> Saša Stamenković
> --
> View this message in context: http://www.nabble.com/Zend_Http_Client-and-utf-8-tp25530629p25530629.html
> Sent from the Zend Framework mailing list archive at Nabble.com.


umpirsky

Re: Zend_Http_Client and utf-8

Hmm, in Zend_Dom_Query (#177) there is
$domDoc = new DOMDocument;
Shouldn't it be
$domDoc = new DOMDocument('1.0', 'utf-8');
or something like that? Am I right?

I noticed that the characters get garbled after I call Zend_Dom_Query::query() on the response body, so the problem is probably in Zend_Dom_Query.

Anyway, fixing that didn't help :(

Regards,
Saša Stamenković
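
A possible reason the constructor change alone did not help: as far as I know, the encoding argument of DOMDocument's constructor does not tell loadHTML() how to decode its input, and loadHTML() falls back to a single-byte encoding unless the markup itself declares one. Below is a minimal sketch of a commonly used workaround, prefixing an XML encoding declaration. The helper name firstParagraphText and the sample markup are made up for illustration and are not part of Zend_Dom_Query.

```php
<?php
// Assumption: this file is saved as UTF-8, so the literal holds UTF-8 bytes.
$html = '<html><body><p>čćž</p></body></html>';

// Hypothetical helper (not part of Zend_Dom_Query): parse the markup and
// return the text of the first <p> element.
function firstParagraphText($html, $hintUtf8)
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $previous = libxml_use_internal_errors(true); // keep parser warnings quiet
    // The encoding declaration prefixed here is what makes the parser
    // treat the input bytes as UTF-8.
    $doc->loadHTML($hintUtf8 ? '<?xml encoding="UTF-8">' . $html : $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $doc->getElementsByTagName('p')->item(0)->textContent;
}

// Without the hint the text may come out garbled, depending on the libxml2 version.
var_dump(firstParagraphText($html, false) === 'čćž');

// With the hint the original characters survive.
echo firstParagraphText($html, true), "\n"; // prints čćž
```

If the page already carries a meta tag with the right charset, the hint should not be needed; this only matters for markup that arrives without a usable declaration.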


On Tue, Sep 22, 2009 at 2:27 PM, Саша Стаменковић <[hidden email]> wrote:
> I monitored firefox headers with live HTTP headers when visiting this site, and it sends
> Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
> when I do the same, I still get wrong chars :(
>
> Regards,
> Saša Stamenković






umpirsky

Re: Zend_Http_Client and utf-8

There is already an issue for this: http://framework.zend.com/issues/browse/ZF-3938...

Regards,
Saša Stamenković


On Tue, Sep 22, 2009 at 3:08 PM, Саша Стаменковић <[hidden email]> wrote:
> Hehum, in Zend_Dom_Query #177
> $domDoc = new DOMDocument;
> it should be
> $domDoc = new DOMDocument('1.0''utf-8'); or sth like that. Am I right?!?!?
>
> Noticed that characters are messed after I do Zend_Dom_Query::query() on response body, so, problem is probably in Zend_Dom_Query.
>
> Anyway, fixing that didn't help :(
>
> Regards,
> Saša Stamenković







tfk

Re: Zend_Http_Client and utf-8

On Tue, Sep 22, 2009 at 3:18 PM, Саша Стаменковић <[hidden email]> wrote:
> There is already issue http://framework.zend.com/issues/browse/ZF-3938...
> Regards,
> Saša Stamenković
>

Maybe you can write a patch and attach it to the issue. That generally
helps to get it fixed ASAP.

Till
umpirsky

Re: Zend_Http_Client and utf-8

I'm trying... :P

Regards,
Saša Stamenković


On Tue, Sep 22, 2009 at 3:40 PM, till <[hidden email]> wrote:
> On Tue, Sep 22, 2009 at 3:18 PM, Саша Стаменковић <[hidden email]> wrote:
> > There is already issue http://framework.zend.com/issues/browse/ZF-3938...
> > Regards,
> > Saša Stamenković
> >
>
> Maybe you can write a patch and attach it to the issue. That generally
> helps getting it fixed ASAP.
>
> Till

umpirsky

Re: Zend_Http_Client and utf-8

The problem is in the HTTP client, since it occurs with some servers and not with others, probably depending on the headers they send by default.

Regards,
Saša Stamenković
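
If the behaviour really depends on what each server sends, one thing worth checking is the charset in the response's Content-Type header: a Content-Type set on the request only describes the request body and does not change how the server encodes its reply. Here is a rough sketch of normalising a body to UTF-8 based on that header. The helpers charsetFromContentType and bodyToUtf8 are hypothetical names, not Zend Framework API, and the ISO-8859-1 fallback is an assumption taken from the HTTP/1.1 default for text types; the header value and body would come from the client's response object.

```php
<?php
// Hypothetical helper: extract the charset parameter from a Content-Type
// header value, falling back to an assumed default when none is given.
function charsetFromContentType($contentType, $default = 'ISO-8859-1')
{
    if (preg_match('/charset\s*=\s*"?([\w.:-]+)"?/i', $contentType, $m)) {
        return strtoupper($m[1]);
    }
    return $default;
}

// Hypothetical helper: convert a response body to UTF-8 according to the
// charset named in the response's Content-Type header.
function bodyToUtf8($body, $contentType)
{
    $charset = charsetFromContentType($contentType);
    return $charset === 'UTF-8' ? $body : iconv($charset, 'UTF-8', $body);
}

// "\xE8\xE6\xBE" are the bytes for č, ć, ž in ISO-8859-2.
echo bodyToUtf8("\xE8\xE6\xBE", 'text/html; charset=iso-8859-2'), "\n"; // prints čćž
```

A server that sends no charset at all, or one that contradicts the page's own meta tag, would still need separate handling; the sketch only covers the case where the header is present and correct.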


On Tue, Sep 22, 2009 at 2:42 PM, Саша Стаменковић <[hidden email]> wrote:
> I'm trying... :P
>
> Regards,
> Saša Stamenković


