
Every PHP developer who works in Japan has realized that the mb_* functions don't work that well with SJIS-encoded strings. Some characters are counted as 1 byte long, some as 2 bytes and some as 3 bytes |-| ; not really useful in the end. I recently discovered this article (in Japanese: http://phpspot.org/blog/archives/2005/11/php_17.html). The regex it presents is supposed to split a Japanese string into words. Linguistically, I completely disagree with the results, but it was an interesting starting point for Japanese string manipulation, especially for SJIS-encoded strings :> .
I then changed the regex a little to turn it into a one-character tokenizer. That regex let me reimplement the basic but really useful string manipulation functions such as substr, str_replace, strpos, strlen, etc.

Here is the code of the class:


<?php
// Implementation of the most basic string manipulation functions for Japanese; they work the same way as the single-byte ones.

class processJ
{
    public function __construct()
    {
        mb_regex_encoding("SJIS");
    }

    public function __destruct()
    {

    }

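    // Splits a SJIS-encoded Japanese string by character and returns an array.
    // Only kanji, hiragana, katakana and alphanumerics are matched; any other character is skipped.
    public function charSpliter($str)
    {
        $token = array();
        // The second alphanumeric class is presumably the full-width range.
        // This file has to be saved as SJIS for the pattern literals to match.
        $pattern = "[一-龠]|[ぁ-ん]|[ァ-ヴー]|[a-zA-Z0-9]|[ａ-ｚＡ-Ｚ０-９]";

        while (true)
        {
            // mb_ereg() returns false when nothing matches. Since PHP 8 it returns
            // true instead of the match length, so the length is taken from the match itself.
            if (mb_ereg($pattern, (string) $str, $match) === false) {
                break;
            }
            $char = $match[0];
            $token[] = $char;
            // strpos() searches byte-wise; this assumes the first byte-level hit is the matched character.
            $pos = strpos($str, $char);
            $str = substr($str, $pos + strlen($char));
        }
        return $token;
    }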

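    // Character-based substr(): $length counts characters from $start (negative values are not handled).
    public function substrJ($str, $start, $length = null)
    {
        $strToken = $this->charSpliter($str);
        $count = count($strToken);
        // Stop at the end of the string when $length is omitted or too large.
        $end = ($length === null) ? $count : min($start + $length, $count);
        $substr = "";
        for ($i = $start; $i < $end; $i++)
        {
            $substr .= $strToken[$i];
        }
        return $substr;
    }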

    // Character-based strlen().
    public function strlenJ($str)
    {
        $strlen = $this->charSpliter($str);
        return count($strlen);
    }

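    // Character-based strpos(): returns the character index of the first match, or false if there is none.
    public function strposJ($haystack, $needle)
    {
        $strToken = $this->charSpliter($haystack);
        $needleToken = $this->charSpliter($needle);
        $tokenLen = count($strToken);
        $needleLen = count($needleToken);

        if ($needleLen === 0) {
            return false;
        }

        // Stop early enough that $i + $j never runs past the end of the haystack.
        for ($i = 0; $i + $needleLen <= $tokenLen; $i++)
        {
            for ($j = 0; $j < $needleLen; $j++)
            {
                if ($needleToken[$j] !== $strToken[$i + $j])
                {
                    continue 2;
                }
            }

            return $i;
        }

        return false;
    }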

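    // Replaces the first occurrence of $search only (unlike str_replace(), which replaces all of them).
    public function str_replaceJ($search, $replace, $subject)
    {
        $pos = $this->strposJ($subject, $search);
        if (!is_int($pos)) {
            // Nothing to replace.
            return $subject;
        }
        $tokens = $this->charSpliter($subject);
        $head = implode("", array_slice($tokens, 0, $pos));
        $tail = implode("", array_slice($tokens, $pos + $this->strlenJ($search)));
        return $head . $replace . $tail;
    }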

}

$test = new processJ();

print($test->substrJ("ようこそcyberblogへ漢字", 0, 4)); // > "ようこそ"
print($test->strlenJ("ようこそcyberblogへ漢字")); // > 16
print($test->strposJ("ようこそcyberblogへ漢字", "cyber")); // > 4
print($test->str_replaceJ("cyberblog", "サイバーブログ", "ようこそcyberblogへ漢字")); // > "ようこそサイバーブログへ漢字"
?>
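
For comparison, here is a different technique from the regex tokenizer above: convert the SJIS string to UTF-8, split it there, and convert each character back. This is only a minimal sketch. The helper names sjisChars() and sjisSubstr() are made up for the illustration, and it assumes mb_convert_encoding() handles your data correctly, which is worth checking given the mb_* problems mentioned at the top.

```php
<?php
// Sketch only: sjisChars() and sjisSubstr() are hypothetical helpers, not part of processJ.

// Split a SJIS string into characters by round-tripping through UTF-8.
function sjisChars(string $sjis): array
{
    $utf8 = mb_convert_encoding($sjis, 'UTF-8', 'SJIS');
    // The /u flag makes PCRE treat the subject as UTF-8, so the empty
    // pattern splits between code points rather than between bytes.
    $chars = preg_split('//u', $utf8, -1, PREG_SPLIT_NO_EMPTY);
    // Convert every character back so callers keep working in SJIS.
    return array_map(
        function ($c) { return mb_convert_encoding($c, 'SJIS', 'UTF-8'); },
        $chars
    );
}

// Character-based substring built on top of sjisChars().
function sjisSubstr(string $sjis, int $start, ?int $length = null): string
{
    return implode('', array_slice(sjisChars($sjis), $start, $length));
}

// The sample is a UTF-8 literal converted to SJIS, so this file can be saved as UTF-8.
$sample = mb_convert_encoding('ようこそcyberblogへ漢字', 'SJIS', 'UTF-8');

echo count(sjisChars($sample)), "\n"; // number of characters
echo strlen($sample), "\n";           // number of bytes
echo mb_convert_encoding(sjisSubstr($sample, 0, 4), 'UTF-8', 'SJIS'), "\n";
```

Unlike charSpliter, this version keeps punctuation and spaces, because nothing is filtered by a character class.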

have fun

Categories: Focus Web, Geek Talk — @ 12:22 pm


I have been testing Microsoft Silverlight for two days and I wanted to share my opinion of it. It is simply the end of HTML. This is what developers, designers and project managers have been waiting for: a finally clear-cut separation of tasks that improves coordination in the creation flow. But first, a short presentation.

  1. Presentation.
    There are currently two versions of Silverlight: 1.x and the 2 beta.

    • version 1.x: a XAML-based application, from the huge SGML family. The interactions are coded in JavaScript and it is interpreted on the client side. An interpreted Flex look-alike.
    • version 2: still based on the XAML representation, but it integrates the .NET framework. That means you can code in the language you like (C#, VB, PHP, …), it is compiled, and it gets the best of the well-known framework.
  2. Why Silverlight 1.1 was a disappointment.
    Maybe it is because I expected too much of it. When I learned that it was an interpreted SGML (XML) dialect, my first thought was to generate it dynamically with a server-side language (PHP, Perl), and it works (see this article). One funny trick leading to another, I tried a little DOM scripting. It was a failure. So I was left with a Flash/Flex look-alike. But Flash Player is installed on around 90% of the computers that go online; Adobe won.
  3. Why Silverlight 2 makes me dream.
    By integrating the .NET framework, it is immediately a step ahead, both pragmatically and production-wise. Those who are already familiar with the .NET environment can start project production right away, the way they are used to. At that moment, my craziest dream was in front of my eyes: the end of this good but too old HTML. I had already glimpsed this dream in Flex, but Flex is just a 0Kal Flash 9 version, whereas Silverlight 2 can be used as a graphic protocol.
  4. Conclusion
    I'm very interested in this technology, because I'm pretty sure that if development continues this way, it's the end of HTML. All the most important interface design software, such as Fireworks, can export to XAML. That means that once the designers have finished the mock-up, the whole design is finished at the same time. On top of that, when you create a Silverlight project in Visual Studio 2008, the software generates a class skeleton, like the Poseidon module for Java/Eclipse. No more designers touching the JS or server-side code, or devs playing around with the CSS or the divs ;)
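
The idea from point 2, generating XAML dynamically with a server-side language, could look roughly like the sketch below. It is not the code from the article mentioned there: renderXaml() is a made-up helper, and the namespace URIs are assumed to be the ones Silverlight 1.0 documents use.

```php
<?php
// Sketch only: renderXaml() is a hypothetical helper; the namespace URIs are
// assumed to be the ones used by Silverlight 1.0 documents.
function renderXaml(array $labels): string
{
    $xaml = '<Canvas xmlns="http://schemas.microsoft.com/client/2007"'
          . ' xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml">' . "\n";

    foreach (array_values($labels) as $i => $text) {
        // Escape the text so dynamic data cannot break the XML.
        $safe = htmlspecialchars($text, ENT_QUOTES | ENT_XML1, 'UTF-8');
        // Stack the labels vertically, 30 units apart.
        $top = 10 + 30 * $i;
        $xaml .= sprintf('  <TextBlock Canvas.Left="10" Canvas.Top="%d" Text="%s" />', $top, $safe) . "\n";
    }

    return $xaml . '</Canvas>' . "\n";
}

// When served over HTTP, a header such as
// header('Content-Type: text/xml; charset=utf-8') would precede the output.
echo renderXaml(array('Hello', 'Generated by PHP & XAML'));
```

The point is only that XAML is plain XML text, so any language that can print a string can produce it.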
Categories: Focus Web, Geek Talk — @ 2:14 pm

The last two days, Tokyo witnessed the first MySQL users conference in Asia. Held in the National Museum of Emerging Science and Innovation (Miraikan) in Odaiba, it was a very informative event in a relaxed atmosphere.


Categories: Focus Web, Geek Talk, News & Innovations — @ 12:04 pm
