URL encoding

Author: hansen@ahp-gmbh.de (-GHAN-)

Hi Uniface.info,
While building server pages I ran into a URL that has to be URL-encoded in order to be recognized correctly. This can be anything with special characters, such as the following:

"ANGEFÜGT", which needs to be encoded so that it does not become "ANGEFÃœGT"

 

Is there a simple non-3GL way to do this? Or a nice workaround?

 

Cheers,

-GHAN-
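The garbled form quoted in the question can be reproduced outside Uniface. The following is a minimal Python sketch, not Uniface code; the assumption that the receiving side reads the bytes as Windows-1252 is an illustration only, chosen because it yields exactly the string shown above.

```python
# Sketch: how "ANGEFÜGT" turns into "ANGEFÃœGT".
# Assumption: the sender writes UTF-8 and the receiver wrongly
# interprets those bytes as Windows-1252 (cp1252).

word = "ANGEFÜGT"

# In UTF-8 the single character Ü (U+00DC) becomes two bytes: C3 9C.
utf8_bytes = word.encode("utf-8")
print(utf8_bytes)

# Read byte by byte as cp1252, C3 and 9C show up as two separate characters.
garbled = utf8_bytes.decode("cp1252")
print(garbled)

# The damage is reversible as long as no byte was dropped on the way.
restored = garbled.encode("cp1252").decode("utf-8")
print(restored)
```

The two junk characters are simply the two UTF-8 bytes of Ü shown one at a time, which is why percent-encoding those bytes avoids the problem.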

8 Comments

  1. Using $replace?


    Author: ulrich-merkel (ulrichmerkel@web.de)
  2. As mentioned above, $replace is the key to success.

    This real-world URL: http://vimeo.com/moogaloop.swf?clip_id=3669333&server=vimeo.com&show_title=1&show_byline=1&show_portrait=0&color=00ADEF&fullscreen=1

    uses "entities" such as "&amp;"

    Every beginner's HTML manual has a table of these entities, including "&Auml;" and "&auml;" for the uppercase and lowercase a-umlaut.

    So all you have to do is replace "Ä" with "&Auml;" to solve the "problem"

    Success, Uli


    Author: ulrich-merkel (ulrichmerkel@web.de)
  3. NOTE:
    The &-codes won't do, since this is not about HTML tags but about URLs. Those are encoded via %-hex codes (%20 => SPACE).


    Author: -GHAN- (hansen@ahp-gmbh.de)
  4. See my URL example above: they USE entities.


    Author: ulrich-merkel (ulrichmerkel@web.de)
  5. Hi,

    Is a 3GL solution unworkable for you in this case? A quick search reveals many examples that are only a few lines of code. I've used these many times in the past. The bit-level processing in 3GL makes this quite straightforward. What can be tricky is character set handling. This may be a problem if, say, you generate a page on a UNIX server and process it later on a Windows machine.

    Also, have a look at http://en.wikipedia.org/wiki/Percent-encoding. It has an interesting bit about Unicode (UTF-16) encoding.

    Cheers,

    J.


    Author: Jason Huggins (jason.huggins@uniface.com)
  6. Hi,

    What you see is, as far as I know, Uniface mapping the character Ü to its own character set. It's a multibyte character, which is why you see two characters. We had to struggle with this a few times. It's possible to stress Uniface here. I have not tried it since U8, but I was hoping that Unicode would solve this problem. Did you try this? Assuming you're on U9.

    Kind regards,

    Peter


    Author: lammersma (lammersma@hotmail.com)
  7. Uniface behaves fine in this "game". The problems bubble up when you try to put those characters into an HTTP link! The link is displayed correctly and also received that way. But the web server doesn't get it right, since the link isn't URL-encoded (and we use UTF-8 here) :) So it only sees the character in its double-byte coding and delivers junk.

    So, in order to be smarter than the server, we will have to help it out a bit. I did some investigation on this. URL encoding replaces each unsafe character with "%" followed by two hexadecimal digits per byte of the character's encoding (one byte for Ü in ISO-8859-1, two bytes in UTF-8). Time to make a tiny include for that :) I will contribute it in the next part of my HOW-TO.

    Cheers,
    -GHAN-

     

     


    Author: -GHAN- (hansen@ahp-gmbh.de)
  8. A reason to migrate to Uniface 9.4, where the $encode function provides a URL-encode feature that turns "ANGEFÜGT" into "ANGEF%C3%9CGT", reversible with $decode.


    Author: George Mockford (palgam0@hotmail.com)
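
For readers without Uniface 9.4, the percent-encoding discussed in the comments can be sketched in a few lines. This is a hedged Python illustration of the general technique, not the include promised in comment 7 and not the internals of $encode; the helper name percent_encode and the UNRESERVED constant are hypothetical, while quote and unquote come from Python's standard urllib.parse module.

```python
from urllib.parse import quote, unquote

# Characters RFC 3986 calls "unreserved"; they never need escaping.
UNRESERVED = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)


def percent_encode(text, charset="utf-8"):
    """Hypothetical helper: escape every byte outside the unreserved set."""
    out = []
    for byte in text.encode(charset):
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)                       # safe ASCII stays as-is
        else:
            out.append("%{:02X}".format(byte))   # anything else becomes %XX
    return "".join(out)


word = "ANGEFÜGT"

# UTF-8: Ü is two bytes, so it becomes two %XX groups.
utf8_form = percent_encode(word)
print(utf8_form)

# ISO-8859-1: Ü is a single byte, so only one %XX group appears.
latin1_form = percent_encode(word, "iso-8859-1")
print(latin1_form)

# The standard library produces the same UTF-8 form and can reverse it.
print(quote(word, safe=""))
print(unquote(utf8_form))
```

The charset matters: the server has to decode with the same character set that was used for encoding, which is why the UTF-8 form matches the result quoted in comment 8 while the ISO-8859-1 form does not.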