Commit 8293a4c9 authored by Matthias Sommerfeld

Finish a stable version 1.0

Refactored some of the code, updated the documentation, pushed the required
PHP version to 5.6, some polishing here, some cleanup there.
parent 4c3baae7
......@@ -4,6 +4,7 @@
* @package IDNA Convert
* @subpackage charset transcoding
* @author Matthias Sommerfeld, <mso@phlylabs.de>
* @copyright 2003-2016 phlyLabs Berlin, http://phlylabs.de
* @version 1.0.0 2016-01-08
*/
......@@ -30,6 +31,7 @@ class EncodingHelper
if (strtoupper($encoding) == 'ISO-8859-1') {
return \utf8_encode($string);
}
if (strtoupper($encoding) == 'WINDOWS-1252') {
return \utf8_encode(self::map_w1252_iso8859_1($string));
......
......@@ -99,6 +99,24 @@ class IdnaConvert {
if (self::$isMbStringOverload === null) {
self::$isMbStringOverload = (extension_loaded('mbstring') && (ini_get('mbstring.func_overload') & 0x02) === 0x02);
}
// Kept for backwards compatibility. Consider using the setter methods instead.
if (!empty($params) && is_array($params)) {
if (isset($params['encoding'])) {
$this->setEncoding($params['encoding']);
}
if (isset($params['allow_overlong'])) {
$this->setAllowOverlongUtf8($params['allow_overlong']);
}
if (isset($params['idn_version'])) {
$this->setIdnVersion($params['idn_version']);
}
if (isset($params['strict_mode'])) {
$this->setStrictMode($params['strict_mode']);
}
}
}
public function getClassVersion()
......
# IDNA Convert (idna_convert.class.php)
# IDNA Convert - pure PHP IDNA converter
<http://idnaconv.phlymail.de>
<phlymail@phlylabs.de>
(c) 2004-2015 phlyLabs, Berlin
<http://idnaconv.net>
by Matthias Sommerfeld <mso@phlylabs.de>
&copy; 2004-2016 phlyLabs, Berlin
## Introduction
......@@ -27,17 +27,16 @@ In older builds "ß" was mapped to "ss". Should you still need this behaviour, s
**ATTENTION:** As of version 0.8.0 the class fully supports IDNA 2008. Thus the aforementioned parameter is deprecated and replaced by a parameter to switch between the standards. See the updated example 5 below.
**ATTENTION:** BC break: As of version 1.0.0 the class closely follows the PSRs PSR-1, PSR-2 and PSR-4 of the PHP-FIG. As such the classes' naming has been changed, a namespace has been introduced and the default IDN version has changed from 2003 to 2008.
## Files
- **idna_convert.class.php** - The actual class
- **example.php** - An example web page for converting
- **transcode_wrapper.php** - Convert various encodings, see below
- **uctc.php** - phlyLabs' Unicode Transcoder, see below
- **ReadMe.md** - This file
- **IdnaConvert.php** - The actual class
- **EncodingHelper.php** - Convert various encodings to and from UTF-8, see below
- **UnicodeTranscoder.php** - Transcode between various Unicode representations, see below
- **README.md** - This file
- **LICENCE** - The LGPL licence file
The class is contained in idna_convert.class.php.
## Installation
......@@ -46,7 +45,7 @@ The class is contained in idna_convert.class.php.
```php
{
"require" : {
"mso/idna-convert" : "0.9.*"
"mso/idna-convert" : "1.*"
}
}
```
......@@ -64,10 +63,10 @@ Say we wish to encode the domain name nörgler.com:
```php
<?php
// Include the class
require_once('idna_convert.class.php');
// Instantiate it
$IDN = new idna_convert();
// Include the class
use Mso\IdnaConvert\IdnaConvert;
// Instantiate it
$IDN = new IdnaConvert();
// The input string; if it is not UTF-8 or UCS-4, it must be converted first
$input = utf8_encode('nörgler.com');
// Encode it to its punycode presentation
......@@ -83,10 +82,10 @@ We received an email from a punycoded domain and are willing to learn, how the d
```php
<?php
// Include the class
require_once('idna_convert.class.php');
// Instantiate it
$IDN = new idna_convert();
// Include the class
use Mso\IdnaConvert\IdnaConvert;
// Instantiate it
$IDN = new IdnaConvert();
// The input string
$input = 'andre@xn--brse-5qa.xn--knrz-1ra.info';
// Encode it to its punycode presentation
......@@ -103,10 +102,10 @@ The input is read from a UCS-4 coded file and encoded line by line. By appending
```php
<?php
// Include the class
require_once('idna_convert.class.php');
// Instantiate it
$IDN = new idna_convert();
// Include the class
use Mso\IdnaConvert\IdnaConvert;
// Instantiate it
$IDN = new IdnaConvert();
// Iterate through the input file line by line
foreach (file('ucs4-domains.txt') as $line) {
echo $IDN->encode(trim($line), 'ucs4_string');
......@@ -121,14 +120,14 @@ We wish to convert a whole URI into the IDNA form, but leave the path or query s
```php
<?php
// Include the class
require_once('idna_convert.class.php');
// Instantiate it
$IDN = new idna_convert();
// Include the class
use Mso\IdnaConvert\IdnaConvert;
// Instantiate it
$IDN = new IdnaConvert();
// The input string, a whole URI in UTF-8 (!)
$input = 'http://nörgler:secret@nörgler.com/my_päth_is_not_ÄSCII/';
// Encode it to its punycode presentation
$output = $IDN->encode_uri($input);
$output = $IDN->encodeUri($input);
// Output, what we got now
echo $output; // http://nörgler:secret@xn--nrgler-wxa.com/my_päth_is_not_ÄSCII/
```
......@@ -136,54 +135,55 @@ echo $output; // http://nörgler:secret@xn--nrgler-wxa.com/my_päth_is_not_ÄSCI
### Example 5.
To support IDNA 2008, the class needs to be invoked with an additional parameter. This can also be achieved on an instance.
By default, the class converts strings according to IDNA version 2008. To use IDNA 2003 instead, pass an additional parameter to the constructor. The version can also be changed on an existing instance.
```php
<?php
// Include the class
require_once('idna_convert.class.php');
// Instantiate it
$IDN = new idna_convert(array('idn_version' => 2008));
use Mso\IdnaConvert\IdnaConvert;
// Instantiate it, switching to IDNA 2003, the original, now outdated standard
$IDN = new IdnaConvert(['idn_version' => 2003]);
// Sth. containing the German letter ß
$input = 'meine-straße.de';
// Encode it to its punycode presentation
$output = $IDN->encodeUri($input);
// Output, what we got now
echo $output; // meine-strasse.de
// Switch back to old IDNA 2003, the original standard
$IDN->set_parameter('idn_version', 2003);
// Switch back to IDNA 2008
$IDN->setIdnVersion(2008);
// Sth. containing the German letter ß
$input = 'meine-straße.de';
// Encode it to its punycode presentation
$output = $IDN->encode_uri($input);
$output = $IDN->encodeUri($input);
// Output, what we got now
echo $output; // xn--meine-strae-46a.de
```
## Transcode wrapper
## Encoding helper
In case you have strings in different encoding than ISO-8859-1 and UTF-8 you might need to translate these strings to UTF-8 before feeding the IDNA converter with it.
If you have strings in encodings other than ISO-8859-1 and UTF-8, you might need to convert them to UTF-8 before feeding them to the IDNA converter.
PHP's built in functions `utf8_encode()` and `utf8_decode()` can only deal with ISO-8859-1.
Use the file transcode_wrapper.php for the conversion. It requires either iconv, libiconv or mbstring installed together with one of the relevant PHP extensions. The functions you will find useful are
`encode_utf8()` as a replacement for `utf8_encode()` and
`decode_utf8()` as a replacement for `utf8_decode()`.
Use the encoding helper class supplied with this package for the conversion. It requires either iconv, libiconv or mbstring installed together with one of the relevant PHP extensions. The functions you will find useful are
`toUtf8()` as a replacement for `utf8_encode()` and
`fromUtf8()` as a replacement for `utf8_decode()`.
Example usage:
```php
<?php
require_once('idna_convert.class.php');
require_once('transcode_wrapper.php');
use Mso\IdnaConvert\IdnaConvert;
use Mso\IdnaConvert\EncodingHelper;
$mystring = '<something in e.g. ISO-8859-15>';
$mystring = encode_utf8($mystring, 'ISO-8859-15');
$mystring = EncodingHelper::toUtf8($mystring, 'ISO-8859-15');
$IDN = new IdnaConvert();
echo $IDN->encode($mystring);
```
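To illustrate why the source encoding has to be named correctly, here is a minimal sketch that does not use this package at all. It assumes the mbstring extension is loaded and calls PHP's own `mb_convert_encoding()`; the function name `latin9ToUtf8()` is made up for this illustration. Byte 0xA4 is the euro sign in ISO-8859-15 but the generic currency sign in ISO-8859-1, so choosing the wrong source encoding silently yields a different character.

```php
<?php
// Hypothetical helper for illustration only; it is not part of IdnaConvert.
// Assumes the mbstring extension is available.
function latin9ToUtf8($bytes)
{
    return mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-15');
}

$raw = "\xA4"; // a single byte whose meaning depends on the encoding

// Interpreted as ISO-8859-15 the byte is U+20AC (euro sign)
$asLatin9 = latin9ToUtf8($raw);
// Interpreted as ISO-8859-1 the same byte is U+00A4 (currency sign)
$asLatin1 = mb_convert_encoding($raw, 'UTF-8', 'ISO-8859-1');

echo bin2hex($asLatin9), "\n"; // e282ac
echo bin2hex($asLatin1), "\n"; // c2a4
```

Whether `EncodingHelper::toUtf8()` goes through mbstring or iconv internally is not shown here; the sketch only demonstrates the effect of the encoding argument.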
## UCTC - Unicode Transcoder
Another class you might find useful when dealing with one or more of the Unicode encoding flavours. The class is static, it requires PHP5. It can transcode into each other:
Another class you might find useful when dealing with one or more of the Unicode encoding flavours. It can transcode between the following representations:
- UCS-4 string / array
- UTF-8
- UTF-7
......@@ -194,17 +194,21 @@ Example usage:
```php
<?php
require_once('uctc.php');
use Mso\IdnaConvert\UnicodeTranscoder;
$mystring = 'nörgler.com';
echo uctc::convert($mystring, 'utf8', 'utf7imap');
echo UnicodeTranscoder::convert($mystring, 'utf8', 'utf7imap');
```
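For comparison, the same kind of UTF-8 to IMAP-style UTF-7 conversion can be sketched with PHP's mbstring extension alone. This is an illustration under the assumption that mbstring is loaded and supports the `UTF7-IMAP` encoding name; it does not call `UnicodeTranscoder`, and it makes no claim that both produce byte-identical output for every input.

```php
<?php
// Illustration only: uses PHP's mbstring extension, not UnicodeTranscoder.
$utf8 = "n\xC3\xB6rgler.com"; // "nörgler.com" written as explicit UTF-8 bytes

// Modified UTF-7 (as used for IMAP mailbox names) wraps non-ASCII runs in "&...-"
$imap = mb_convert_encoding($utf8, 'UTF7-IMAP', 'UTF-8');
echo $imap, "\n"; // n&APY-rgler.com

// Converting back should restore the original UTF-8 bytes
$back = mb_convert_encoding($imap, 'UTF-8', 'UTF7-IMAP');
var_dump($back === $utf8); // bool(true)
```

Writing the input as escaped bytes keeps the example independent of the encoding the script file itself is saved in.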
## Contact us
## Contact me
For questions, bug reports and security issues just send me an email.
In case of errors, bugs, questions, wishes, please don't hesitate to contact us under the email address below.
phlyLabs
c/o Matthias Sommerfeld
Am Großen Rohrpfuhl 11
D-12355 Berlin
Germany
The team of phlyLabs
http://phlylabs.de
mailto:phlymail@phlylabs.de
mailto:mso@phlylabs.de
......@@ -12,7 +12,7 @@
*
* @package IdnaConvert
* @author Matthias Sommerfeld <mso@phlyLabs.de>
* @copyright 2003-2009 phlyLabs Berlin, http://phlylabs.de
* @copyright 2003-2016 phlyLabs Berlin, http://phlylabs.de
* @version 0.1.0 2016-01-08
*/
......
......@@ -19,7 +19,7 @@
},
"require": {
"ext-pcre": "*",
"php": ">=5.0.0"
"php": ">=5.6.0"
},
"repositories": {
"type": "vcs",
......