Arabic Parsing
An Arabic parsing script that renders Arabic text properly on Flash Player 8 and 9 without the Adobe Text Layout Framework. It runs on the client side, so older projects do not need to be migrated to Flash Player 10.

You can freely change the font, color, size, and so on from within your own software.


Latest Version: 3.2

You can integrate my parser into your library as open source under the GNU GPL license.

ActionScript 2.0 open-source files available here
ActionScript 3.0 open-source files available here
Action Input source files available here
Action for iOS source files available here

Older Versions:

1.9 (AS2), 1.9 (AS3)
1.8, 1.7, 1.6, 1.5, 1.4, 1.3, 1.2, 1.1, 1.0

ActionScript 2.0 Usage:

import net.ixdc.utils.StringUtils;

// Create the parser; wrapFactor adjusts the wrapping correction
var utils:StringUtils = new StringUtils();
utils.wrapFactor = .8; // optional

// The parser requires a pre-assigned TextFormat
var format:TextFormat = new TextFormat();
format.font = "Arial";
format.size = 14;
format.align = "right";
format.rightMargin = 4;

// Dynamic, HTML-enabled text field that receives the parsed text
var output:TextField = this.createTextField("output", 1, 10, 10, Stage.width-20, format.size);
output.autoSize = true;
output.embedFonts = true;
output.wordWrap = true;
output.multiline = true;
output.html = true;

// Load the Arabic text at run time and parse it once it has arrived
var xml:XML = new XML();
xml.ignoreWhite = true;
xml.onLoad = function(done:Boolean) {
	if (done) {
		// The CDATA text is the first child of the <arabic> root element
		output.htmlText = utils.parseArabic(this.firstChild.firstChild.nodeValue, output._width, format);
	}
};
xml.load("arabic.xml");

ActionScript 3.0 Usage:

package {
	import flash.display.Sprite;
	import flash.events.Event;
	import flash.net.URLLoader;
	import flash.net.URLRequest;
	import flash.text.TextField;
	import flash.text.TextFieldAutoSize;
	import flash.text.TextFormat;
	import flash.events.MouseEvent;
	import net.ixdc.utils.StringUtils;
	public class ProperArabicTextAS3 extends Sprite {
		public var utils:StringUtils;
		public var format:TextFormat;
		public var output:TextField;
		public function ProperArabicTextAS3() {
			utils = new StringUtils();
			utils.wrapFactor = .8; // optional
			output = new TextField();
			output.width = stage.stageWidth-20;
			output.autoSize = TextFieldAutoSize.LEFT;
			output.embedFonts = true;
			output.wordWrap = true;
			output.multiline = true;
			addChild(output);
			format = new TextFormat();
			format.font = "Arial";
			format.size = 14;
			format.align = "right";
			format.rightMargin = 4;
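			// Load the Arabic text at run time; showXML runs once it has arrived
			var xmlLoader:URLLoader = new URLLoader();
			xmlLoader.addEventListener(Event.COMPLETE, showXML);
			xmlLoader.load(new URLRequest("arabic.xml"));
		}
		public function showXML(event:Event):void {
			// Ignore white space at the beginning and end of text nodes
			XML.ignoreWhitespace = true;
			var arabic:XML = new XML(event.target.data);
			// parseArabic returns HTML markup, so assign it to htmlText
			output.htmlText = utils.parseArabic(arabic.text(), output.width, format);
		}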
	}
}

Sample XML:

<?xml version="1.0" encoding="utf-8"?>
<arabic>
	<![CDATA[الهدف من لعبة الكلمات المتقاطعة هو ملء المربعات البيضاء، وتشكيل الكلمات أو العبارات، عن طريق حل القرائن التي تؤدي إلى إجابات. والمربعات السوداء تستخدم لفصل الكلمات أو العبارات. ويوضع لكل صف أو عمود رقم، ثم يكتب أمام الرقم ما يشابه الكلمة المطلوب كتابتها في المربعات، وتحوي المجلات أنواعا من هذه اللعبة؛ فمنها سهل وآخر صعب، والكلمات المتقاطعة في مجلة معينة تتبع أسلوب يختلف عن الأخرى، تبعا لأسلوب اللغة المستعملة والبلد، وتقوم المجلات والصحف بوضع هذه اللعبة لتسلية القراء. ظهرت أول لعبة للكلمات المتقاطعة في صحيفة newyork world وذلك في 21 ديسمبر عام 1913، وأصبحت من الألعاب الرائجة في الولايات المتحدة، ومنها انتقلت إلى بقية دول العالم، وبمختلف اللغات، وكان أول من أدخلها إلى الصحافة هو آرثر وين.]]>
</arabic>

Features Supported:

  1. Embedding Fonts (just put a dynamic textfield on-stage and select at least Basic Latin (95 glyphs) and Arabic (1088 glyphs) from the Character Embedding menu).
  2. Arabic Ligatures.
  3. Arabic Diacritics. (Fonts must support diacritics)
  4. Word Wrapping.
  5. Bi-Directional text.
  6. HTML text. (Optional, through the StringUtils class; otherwise, use the lighter ArabicLight class. Both classes are included in the sample files above.)
  7. Loading external text at run time.
  8. Windows/Mac/Linux support.
  9. Arabic-enabled input fields through the ActionScript method createArabicInput, with the help of JavaScript (arabicinput.js). Beta.
  10. Urdu support.

Requirements:

  1. Dynamic TextField.
  2. HTML enabled TextField.
  3. Pre-assigned TextFormat.
  4. Arabic fonts, which must include the Unicode Arabic Presentation Forms-B glyphs.
  5. Urdu fonts, which must include the Unicode Arabic Presentation Forms-A glyphs.

Properties:

  1. data: for referencing original input string.
  2. wrapFactor: optional reduction value (Mac only) for wrapping correction when using bi-directional text (default = 0.9).
  3. htmlLines: array of the HTML lines that the text block was split into.
  4. embedFonts: boolean (default = true).
  5. hindiNumeralsOnly: boolean (default = false).
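
As a rough illustration of how these properties might be combined, here is a minimal ActionScript 3.0 frame-script sketch. It uses only the names listed above plus parseArabic as shown in the usage samples. Two things are assumptions, not confirmed behaviour: that every property is set before parseArabic is called (as wrapFactor is in the samples), and that data and htmlLines can be read afterwards. The 300-pixel width and the sample string are arbitrary.

```actionscript
import flash.text.TextField;
import flash.text.TextFormat;
import net.ixdc.utils.StringUtils;

var utils:StringUtils = new StringUtils();
utils.wrapFactor = 0.9;          // documented default, set explicitly here
utils.embedFonts = true;         // documented default
utils.hindiNumeralsOnly = true;  // documented default is false

var format:TextFormat = new TextFormat();
format.font = "Arial";
format.size = 14;
format.align = "right";

// embedFonts = true only shows text if the font is embedded in the SWF
var output:TextField = new TextField();
output.width = 300;              // arbitrary width for this sketch
output.embedFonts = true;
output.wordWrap = true;
output.multiline = true;
addChild(output);

// "\u0639\u0627\u0645 2012" is the Arabic word for "year" followed by 2012
output.htmlText = utils.parseArabic("\u0639\u0627\u0645 2012", output.width, format);

// Assumption: both properties are populated once parseArabic has run
trace(utils.data);
trace(utils.htmlLines.length);
```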

Omitted Properties:

  1. latinOnly
  2. americanFormat

To embed fonts while keeping the file size down, you can use shared fonts and load them from a different server; follow this guide ..

detailed case study

Problem:

This issue is a very old one. I have always run into annoying problems whenever I tried to render Arabic text at run time using Flash and ActionScript 2.0. Flash doesn’t support right-to-left languages, and for Arabic that means both proper character joining and word wrapping.

Analysis of Alternatives:

I never stopped tracking this issue in case a new solution had been developed somewhere, but for a long time the only one I could find was this paid solution.

Its author had the problem perfectly solved. At some point he had an ActionScript 2.0 implementation, but he stopped publishing it and kept only his latest version, which works with ActionScript 3.0.

Recently, Adobe announced that they finally support right-to-left languages, along with bi-directional and complex-script ones, but only in ActionScript 3.0 and only on Flash Player 10 or above. They even have an open-source framework called Text Layout Framework. It’s actually very good and should put an end to this issue, but ..

I have my own objections; I ran a quick test here to check their new technique.

Arabic text is finally rendered properly, but why do we have to add so many children to the display list, from different TextLines and TextBlocks, instead of a single TextField?
We also have to set a maximum width, and I couldn’t find a way to scroll through the text without building a custom scroller and masking the text like any other MovieClip or DisplayObject. Personally, I find this far too complicated a solution just to display a simple Arabic string properly!

On the other hand, others have built complex ActionScript 2.0 solutions for Arabic speakers, and they surely have a hard time convincing their clients to migrate those projects to ActionScript 3.0. An easy solution for ActionScript 2.0 would save the day.

I like ActionScript 2.0 a lot more, not just because it handles text easily but also for one other important reason: memory management!
The new DisplayList structure in ActionScript 3.0 has many features, yes, but it kills innovation by leaving object disposal to Flash Player’s garbage collector alone. Unlike ActionScript 2.0, there’s no supported way to destroy objects manually at run time, and I find that a serious obstacle when using ActionScript 3.0.

Implementation:

In a nutshell, ActionScript 2.0 does not handle right-to-left languages, so we can’t embed fonts or wrap text properly, and with Arabic text the characters also appear split apart, which is unacceptable for Arabic readers. The old workaround was to load UTF-8 encoded text from external text/XML files, but that still gave no word wrapping, Arabic letters were still split on Mac machines, and although the text could be aligned, its direction was still not right-to-left.

So, logically, I knew that I had to construct my Arabic text manually. To achieve that I had to analyze the Arabic rules for character joining, with all their special cases, and then find a way to force the text direction to be left-to-right. The word wrapping part is not that hard after all. This is basically what I did!
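
The wrapping and direction steps can be sketched in a few lines of ActionScript 3.0. This is only an illustration of the general idea, not the parser's code: the function names are made up, maxChars is a stand-in for a real pixel-width measurement, and Latin letters are used as placeholders so the reordering is easy to see. The point it shows is that each line has to be wrapped first and reversed afterwards; reversing the whole paragraph and letting Flash wrap it would put the first words on the last line.

```actionscript
// Reverses a string so that a left-to-right field displays it right-to-left.
function reverseChars(s:String):String {
	return s.split("").reverse().join("");
}

// Greedy word wrap in logical order, then per-line reversal.
// maxChars is a hypothetical stand-in for measuring the rendered width.
function wrapAndReverse(text:String, maxChars:int):Array {
	var words:Array = text.split(" ");
	var lines:Array = [];
	var current:String = "";
	for each (var word:String in words) {
		var candidate:String = (current == "") ? word : current + " " + word;
		if (candidate.length > maxChars && current != "") {
			// The line is full: store it reversed and start a new one
			lines.push(reverseChars(current));
			current = word;
		} else {
			current = candidate;
		}
	}
	if (current != "") {
		lines.push(reverseChars(current));
	}
	return lines;
}

trace(wrapAndReverse("abc de fgh", 6)); // ed cba,hgf
```

Diacritics and embedded Latin words or numbers are deliberately ignored here; both need extra handling on top of a plain reversal.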

My only problem was finding a way to enter every Arabic character in each of its positional forms. For a long time I found that very difficult: Flash doesn’t let me copy those from other text processing tools, and of course there’s no way to type each character alone in its different forms using the keyboard.

Finally, I found a way to do that, and it was very simple: the character code charts on the official Unicode website.

That’s it! Now I can instruct ActionScript 2.0 to handle my Arabic characters properly by looping through the input string and replacing each Arabic character with its proper glyph according to its position within the word, with a few conditions to respect the Arabic joining rules.
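
The replacement loop can be sketched as follows in ActionScript 3.0. This is a simplified illustration and not the parser's actual code: the table covers only four letters, the names FORMS, RIGHT_ONLY, shapeWord and toHex are invented for the example, and the lam-alef ligature and diacritics are left out. The code points come from the Unicode Arabic Presentation Forms-B block.

```actionscript
// Presentation Forms-B code points per letter: [isolated, final, initial, medial]
var FORMS:Object = {};
FORMS[0x0628] = [0xFE8F, 0xFE90, 0xFE91, 0xFE92]; // beh
FORMS[0x0643] = [0xFED9, 0xFEDA, 0xFEDB, 0xFEDC]; // kaf
FORMS[0x0644] = [0xFEDD, 0xFEDE, 0xFEDF, 0xFEE0]; // lam
FORMS[0x0627] = [0xFE8D, 0xFE8E];                 // alef: isolated and final only

// Letters that join to the previous letter but never to the next one
var RIGHT_ONLY:Object = {};
RIGHT_ONLY[0x0627] = true;

function shapeWord(word:String):String {
	var out:String = "";
	for (var i:int = 0; i < word.length; i++) {
		var code:Number = word.charCodeAt(i);
		var forms:Array = FORMS[code];
		if (forms == null) {
			out += word.charAt(i); // not in the table: copy unchanged
			continue;
		}
		var prev:Number = (i > 0) ? word.charCodeAt(i - 1) : 0;
		var next:Number = (i < word.length - 1) ? word.charCodeAt(i + 1) : 0;
		// The previous letter connects forward unless it is right-joining only
		var joinsPrev:Boolean = FORMS[prev] != null && !RIGHT_ONLY[prev];
		// This letter connects forward only if it is allowed to
		var joinsNext:Boolean = FORMS[next] != null && !RIGHT_ONLY[code];
		var index:int = joinsPrev ? (joinsNext ? 3 : 1) : (joinsNext ? 2 : 0);
		out += String.fromCharCode(forms[index]);
	}
	return out;
}

// Prints the code points of a string in hexadecimal
function toHex(s:String):String {
	var parts:Array = [];
	for (var i:int = 0; i < s.length; i++) {
		parts.push(s.charCodeAt(i).toString(16));
	}
	return parts.join(" ");
}

trace(toHex(shapeWord("\u0643\u0644\u0628"))); // kaf lam beh: fedb fee0 fe90
trace(toHex(shapeWord("\u0628\u0627\u0628"))); // beh alef beh: fe91 fe8e fe8f
```

In the second word the last beh stays isolated because the alef before it never connects forward, which is the kind of special case the joining rules require.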

The task is basically easy, but turning it into a complete bi-directional solution made things a little more complicated. Still, the results were very good.
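
To give an idea of why bi-directional text complicates things, here is a minimal ActionScript 3.0 sketch of one simple approach: reverse the whole line, then reverse runs of Latin letters and digits back so that they stay readable. It is an assumption-laden illustration rather than the parser's method; toVisualOrder is a made-up name, and the approach is much cruder than the Unicode Bidirectional Algorithm (for example, it keeps two space-separated numbers in their typed order).

```actionscript
function reverseChars(s:String):String {
	return s.split("").reverse().join("");
}

// Reverses a line for display in a left-to-right field while keeping
// Latin words and numbers in their natural reading order.
function toVisualOrder(line:String):String {
	var reversed:String = reverseChars(line);
	// A run is one or more Latin/digit words separated by spaces
	var ltrRun:RegExp = /[A-Za-z0-9]+(?: +[A-Za-z0-9]+)*/g;
	return reversed.replace(ltrRun, function(...args):String {
		// args[0] is the matched run; flip it back
		return reverseChars(args[0]);
	});
}

// Input: the Arabic word for "year" followed by 1913
var visual:String = toVisualOrder("\u0639\u0627\u0645 1913");
trace(visual == "1913 \u0645\u0627\u0639"); // true
```

The number ends up on the left of the line with its digits intact, and the Arabic word is stored reversed so that it reads from the right.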

Results:

My favorite showcase is the site of one of my clients, Hafez Avocats.
It is a data-driven, multilingual, Flash-based website developed in ActionScript 2.0 using this solution, and the performance is stunning.

A JavaScript version of this solution is also available; it renders custom Arabic fonts on mobile Safari for iOS.





22 Advanced Comments

October 22nd, 2012

Dear Ahmed, can we please find a solution for the character ﭪ, as in ﭪنزويلا (for the Flash version)? Best regards.



November 1st, 2012

@Nohra

I just posted an update, v2.7, which enables both characters, ڤ and چ.
If you require additional characters, please let me know.

November 2nd, 2012

Dear Ahmed, I want to thank you for your cooperation and for your useful Arabic script. Many thanks.

November 2nd, 2012

http://www.clickbeyonds.com/test.zip
Dear Ahmad, please download this example; I need to know how to fix it. Also, how can I write the numbers in Hindi format? Many thanks in advance.



hamed
November 3rd, 2012

Hello Sir, could you add پ, گ, and ژ?

November 5th, 2012

Sorry for so many amendments and requests. Can we add all the diacritics (“التحريكات”), such as ً, ُ, ٍ, etc., to the Arabic text? Again, many thanks.



November 5th, 2012

@hamed

I just posted another update, v2.8, which enables the characters پ, گ, and ژ.
If you require additional characters, please let me know.



November 5th, 2012

@Nohra

Arabic diacritics are supported; if you experience problems with them, please send me a sample file.
For the Hindi numerals, please use the latest version, 2.8. I reactivated the hindiNumeralsOnly property; for usage, check the sample files.
Hope this helps.
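
For readers wondering what a Hindi numeral conversion involves, here is a small stand-alone ActionScript 3.0 sketch that maps the ASCII digits 0 to 9 onto the Arabic-Indic digits U+0660 to U+0669. It is an illustration only: toArabicIndicDigits is a made-up helper, and whether hindiNumeralsOnly works this way internally is an assumption. The font in use must also contain these glyphs.

```actionscript
// Replaces ASCII digits with Arabic-Indic digits; other characters are kept.
function toArabicIndicDigits(s:String):String {
	var out:String = "";
	for (var i:int = 0; i < s.length; i++) {
		var code:Number = s.charCodeAt(i);
		if (code >= 0x30 && code <= 0x39) {
			// '0' is 0x30, so the offset from it selects the matching digit
			out += String.fromCharCode(0x0660 + (code - 0x30));
		} else {
			out += s.charAt(i);
		}
	}
	return out;
}

trace(toArabicIndicDigits("2008") == "\u0662\u0660\u0660\u0668"); // true
```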

November 5th, 2012

The Arabic diacritics work fine with the Traditional Arabic font that you used in the example, but if I use Times New Roman, for example, the diacritics disappear, which causes a problem when I write something like “الليلُ”. In this file I also need to fix the “?” at the end of the line: it appears as “2008 ?” when it should appear as “? 2008”. http://www.clickbeyonds.com/test.zip



November 5th, 2012

@Nohra

You need to use Unicode Arabic fonts (e.g. Arial Unicode MS, Andalus, KacstOnce, Traditional Arabic, etc.).
My parser requires the Form-B representations for proper character mapping.
As for the “?” issue, I fixed that for you; please use the latest version, 2.9.

November 7th, 2012

You are great, Ahmed. I have two more issues to fix, and sorry for disturbing you. You can download the example from this link: http://www.clickbeyonds.com/test2.zip
- In the first text, the problem is clear.
- In the second text, the “?” moves to a second line although it could fit on the same line, and the height increases without any empty text (I think it comes from the <br> and <p> in the class).
Thank you.



November 8th, 2012

@Nohra

There were a few bugs related to Arabic diacritics; I fixed those in the latest version, 3.0.
Your “?” line-break issue is indeed caused by the script’s wrapping formula, but this one is a bit tricky, since it varies with font size and font family. Try setting the wrapFactor property to 1.0, or even higher, such as 1.1, before you call the parseArabic method:

var utils:StringUtils = new StringUtils();
utils.wrapFactor = 1.1;
 
...
 
output.htmlText = utils.parseArabic(inputString, output.width, format);

November 10th, 2012

Dear Ahmed, I uploaded a new FLA file that shows some bugs with Arabic diacritics, plus an error in the display of the text (the first one). P.S.: the font used is attached inside the folder. http://www.clickbeyonds.com/test3.zip Regards.



November 10th, 2012

@Nohra

Fixed at version 3.1

November 12th, 2012

Great job. thank you Ahmed.



November 12th, 2012

You’re most welcome ..

December 5th, 2012

Fantastic work. I’ve been using the 2.0 version for a while. What’s the difference between StringUtils.as, and the new ArabicLight.as? Also, do I need to import the RegExp.as? I can’t upgrade current project to version 3.1 yet, but in 2.0 everything seems to be working great, but when I pass in something with bi-directional languages inside a (), then it throws it off unless I put space padding between the parenthesis and the text? Any thoughts as to why? Same thing for periods (.). Overall, thanks much for writing these useful classes.



December 5th, 2012

@Tou Lee

ArabicLight.as is much faster since it skips all inline HTML parsing, and yes, it requires RegExp.as (only the ArabicLight version does). You don’t have to import it yourself, because it’s already imported in ArabicLight.as; just keep the RegExp.as file in place alongside ArabicLight.as.

Bi-directional text had many issues throughout the past versions; those should have been fixed in the latest version, 3.1. If you still have access to your source files, I recommend using the latest version: just replace the package files. Usage should be pretty much the same.

If you have any more questions, you’re most welcome to ask.
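
As general background on the parentheses question: when text is reversed by hand, paired punctuation has to be mirrored as well, otherwise “(abc)” comes out as “)cba(”. The ActionScript 3.0 sketch below shows that idea in isolation. Whether this is what went wrong in version 2.0 is not known; MIRROR and reverseWithMirroring are names invented for the example, and angle brackets are left out on purpose so that HTML tags are not affected.

```actionscript
// Pairs that must be swapped when a string is reversed
var MIRROR:Object = {};
MIRROR["("] = ")";
MIRROR[")"] = "(";
MIRROR["["] = "]";
MIRROR["]"] = "[";
MIRROR["{"] = "}";
MIRROR["}"] = "{";

// Reverses a string and swaps each bracket for its mirror image.
function reverseWithMirroring(s:String):String {
	var out:String = "";
	for (var i:int = s.length - 1; i >= 0; i--) {
		var ch:String = s.charAt(i);
		out += (MIRROR[ch] != null) ? MIRROR[ch] : ch;
	}
	return out;
}

trace(reverseWithMirroring("(abc) d")); // d (cba)
```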

Tou Lee
May 29th, 2013

I’ve finally upgraded to the new v3.1. I’ve noticed that the Arabic isn’t parsed or reversed correctly if the text field is set to “single line”, but it is when the field is set to “multiline”. Any idea why it would do this, and is there any way I could keep the text field as single line? Thanks again for the parser!

Asker
July 22nd, 2013

hey guys. Do you have some issues with multiline text?

C Khoury
November 19th, 2013

Hi, we’re using your brilliant .js solution to deal with Arabic script on iOS Safari. All is working well, BUT there seems to be no way to get numerals (1-10) parsed along with the ‘normal’ text. Everything works 100% in Chrome, Safari, etc. on desktop, but on iOS any numbers behave as if no parser were applied, relying instead on whatever default font is installed. Has anyone experienced this, and might anyone have a solution? I am guessing(?) there may be scope for extending the parser itself, but I would have thought that numbers were included already. Thanks for any input …

کهف
April 5th, 2014

السلام علیکم (peace be upon you). In the case of “ی”, it behaves like this: السلام عل ک ی م. Why? Can this be resolved, and how? (I added spaces to show the result.) Please answer me!