Note: This document is in utf-8 (Unicode). See bottom of page. http://www.cs.um.edu.mt/~srl/SMC/
![SMC Logo](smclogo.jpg)
An organization dedicated to the Maltese Language in electronic form
1. Having standards would encourage use, by adding ‘weight’ to the idea of commonplace Maltese text processing.
2. Standards allow data interchange between organizations.
3. Standards allow compliance and consistency testing. One could have a stamp, “This product passes these well known tests, and is therefore ready for Maltese Computing”.
4. Once accepted in the minds of relevant parties, standards provide easy labelling of a set of features.
5. Examples
ASCII [ISO-646] - has served the world well since the 1960's or so, allowing disparate computers to communicate -- in English and English-like languages.
6. It's true that industry and common usage may take some time to catch up to 'ideal' Maltese usage, so short-term and phase-in plans may need to be considered. However, we should design first for the ideal that's best for the language and the people, and only secondarily consider technical aspects. Maltese is not a difficult language for the computer to support! Consider Japanese, with its two alphabets and tens of thousands of symbols.
1. Draw upon common practice in Malta
2. Make use of existing global industry standards as much as possible, e.g. ISO
3. Tight integration with official bodies such as MSA
1. How are members added?
2. Voting?
3. Adoption of standards?
4. Should the organization be formalized?
1. These standards should be made freely available to the public, to encourage adoption.
1. Please note: this list is a wish list. No endorsement is implied!
2. Akkademja tal-Malti - writers' organization
3. Għaqda ta’ l-Għalliema tal-Malti (teachers' organization)
4. MITTS – Malta Information Technology and Training Services Ltd.
5. MSA - Maltese Standards Authority. Legislation is currently pending.
6. Finance/Bank Standards association/Central bank (for currency related issues)
7. Business associations /Chambers of Commerce
8. University of Malta, and other schools
9. ISO - for text (latin3/Unicode) and possibly standardized date/time?
10. Unicode - for text processing
11. Maltese Internet Society - http://www.isoc.org.mt/ - no response
12. ISVs - might have layouts or other technology
1. Angelo Dalli has produced a document on “Data Representation Formats in Maltese (v2.1)” (or try elsewhere in the maltese-computing archives)
2. Today, most software is oriented towards English and its 26-letter alphabet, with a fair amount of support for ‘Western European’ characters (i.e. ISO 8859-1, Latin-1).
3. The Maltese alphabet requires the characters ċ (C Dot), ġ (G Dot), ħ (H Cross), and ż (Z Dot). Each has uppercase and lowercase variations, for 8 characters in total. In addition, ‘għ’ and ‘ie’ are digraphs, which are probably best encoded as a combination of existing characters (‘g’ followed by ‘ħ’, and ‘i’ followed by ‘e’). In keyboard input and in sorting, they need to be treated as single characters.
* Q: What other languages use these characters, for reference? According to Unicode 3.0:
* ċ, ġ are used in old Irish Gaelic orthography
* ħ is similar to Cyrillic small letter 'tshe'
* ż is used in Polish
4. Two encodings (both ISO) that support these letters are ISO-10646-1 (Unicode) and ISO-8859-3 (Latin-3).
* ISO-10646-1 is a character set more commonly referred to as ‘Unicode’. Unicode encodes these eight characters in the Latin Extended-A block. It is quite widespread in its adoption. UTF-8, a simple encoding of Unicode built on 8-bit units, is recognized by almost all browsers.
* ISO-8859-3, or Latin-3, seems much less well known. It does contain all eight needed characters. Since this is a simple 8-bit encoding, ISO-8859-3 could be used as an interchange format to pass text through non-compliant software that expects Latin-1. In addition, it may be easier to make keyboard layouts for Windows 9x and the Macintosh using Latin-3, as these systems do not support Unicode as their underlying format. See: http://czyborra.com/charsets/iso8859.html#ISO-8859-3 It should be codepage # 28593 on Windows, but does not seem to be well supported there.
* A variety of ad-hoc encodings have been developed, some built around reusing the characters []{};': for easy keyboard entry, for instance. Two of the best-known encodings seem to be those for the fonts Tornado and MalteseTimes.
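The difference between the two ISO encodings can be checked with a short Python sketch. It assumes Python's standard codec names `iso8859_3`, `utf-8` and `latin-1`; the helper names `encoded_sizes` and `fits_in` are made up for this illustration.

```python
# The eight Maltese letters that fall outside ASCII and Latin-1.
MALTESE_LETTERS = "ċĊġĠħĦżŻ"


def encoded_sizes(text, encoding):
    """Return the number of bytes each character needs in the given encoding."""
    return [len(ch.encode(encoding)) for ch in text]


def fits_in(text, encoding):
    """Report whether every character of text can be represented in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    print("Latin-3 bytes per letter:", encoded_sizes(MALTESE_LETTERS, "iso8859_3"))
    print("UTF-8 bytes per letter:  ", encoded_sizes(MALTESE_LETTERS, "utf-8"))
    print("Representable in Latin-1?", fits_in(MALTESE_LETTERS, "latin-1"))
```

Latin-3 keeps each letter in a single byte, UTF-8 spends two bytes on each, and Latin-1 cannot represent them at all, which is why text passed through Latin-1-only software needs one of the other two encodings declared explicitly.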
Sorting of text (collation)
1. Besides the additional single characters, the two digraphs (ie and għ) need to be sorted in their proper order.
2. a b ċ [c] d e f ġ g għ h ħ i ie j k l m n o p q r s t u v w x [y] ż z
3. a b c. [c] d e f g. g gh/ h h/ i ie j k l m n o p q r s t u v w x [y] z. z (if your browser cannot read Maltese text at present)
4. Note: 'c' and 'y' are not used in Maltese; however, they are included here for completeness, for when foreign words are included, especially Maltese names which do not use the same orthography.
5. Issue: What about English or other-language words mixed in? Should they be collated under Maltese rules? For example, should ‘ie’ in English “friend” cause the word to follow “frisk” in a Maltese language book’s index? It seems that a reader would expect a book’s index or other alphabetization to have a single primary emphasis (i.e. English or Maltese ordering).
6. For a snapshot of Maltese data, see ICU's Maltese Data. Comments, corrections, etc. are encouraged. (This is a viewer onto an open source internationalisation dataset.)
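As a rough illustration of the ordering listed above, here is a minimal Python sort key that treats ‘għ’ and ‘ie’ as single letters. It is a sketch only, not ICU's collation algorithm: the names `ALPHABET`, `RANK` and `maltese_key` are invented here, case is simply folded away, and any character outside the alphabet is sorted after it.

```python
# Alphabet in the order proposed above; 'c' and 'y' are kept for foreign words.
ALPHABET = "a b ċ c d e f ġ g għ h ħ i ie j k l m n o p q r s t u v w x y ż z".split()
RANK = {letter: position for position, letter in enumerate(ALPHABET)}


def maltese_key(word):
    """Build a list of ranks, consuming the digraphs 'għ' and 'ie' as one letter."""
    text = word.lower()
    key = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in RANK:
            key.append(RANK[pair])
            i += 2
        else:
            # Unknown characters sort after every letter of the alphabet.
            key.append(RANK.get(text[i], len(ALPHABET) + ord(text[i])))
            i += 1
    return key


if __name__ == "__main__":
    words = ["żarbun", "iebes", "ħobż", "għasfur", "ikel", "gżira", "ċavetta"]
    print(sorted(words, key=maltese_key))
    print(sorted(["friend", "frisk"], key=maltese_key))
```

Under this key, English “friend” does sort after “frisk”, which is exactly the mixed-language issue raised in item 5: a purely letter-based rule cannot tell an English ‘ie’ from a Maltese one.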
Transliteration
1. Latinise - one-way. wiċċ -> wicc
2. c. gh <-> ċ għ (A way of typing in Maltese?) (This would be useful if Maltese needs to be converted into pure 'English' letters, with the ability to convert it back again.)
3. Importing of existing documents (Tornado, etc..)?
4. Għarbi (Arabic) <-> Maltese? Unknown usefulness. Perhaps for converting place-names into Maltese.
5. Other languages?
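The first two transliterations can be sketched in a few lines of Python. The reversible form below assumes the ASCII fallback notation used in the sorting section (`c.`, `g.`, `h/`, `z.`); that choice and the function names are assumptions made for illustration, not an agreed standard.

```python
# One-way latinisation: drop the dot or bar, e.g. wiċċ -> wicc.
LATINISE = str.maketrans("ċġħżĊĠĦŻ", "cghzCGHZ")

# Reversible ASCII notation (assumed here): letter followed by '.' or '/'.
TO_ASCII = {
    "ċ": "c.", "ġ": "g.", "ħ": "h/", "ż": "z.",
    "Ċ": "C.", "Ġ": "G.", "Ħ": "H/", "Ż": "Z.",
}


def latinise(text):
    """Replace each Maltese letter with its plain Latin base letter (lossy)."""
    return text.translate(LATINISE)


def to_ascii(text):
    """Spell each Maltese letter as a two-character ASCII sequence."""
    return "".join(TO_ASCII.get(ch, ch) for ch in text)


def from_ascii(text):
    """Turn the two-character ASCII sequences back into Maltese letters."""
    for maltese, ascii_form in TO_ASCII.items():
        text = text.replace(ascii_form, maltese)
    return text


if __name__ == "__main__":
    print(latinise("wiċċ"))
    print(to_ascii("għażiż"))
    print(from_ascii(to_ascii("il-Ħadd")))
```

The round trip is only safe when the input to `from_ascii` was produced by `to_ascii`. Ordinary text where a sentence ends in c, g or z followed by a full stop would be converted by mistake, so a real scheme would need an escape mechanism.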
Spell checking
* Need to find a provider of source data (a dictionary?)
* Needs linguistic analysis - word endings, vowel placement, silent letter [Gh] placement.
* GNU ISPELL might be a good interchange format? [www.gnu.org]
* OCR - optical character recognition
* Find out about university projects
Hyphenation
This and Punctuation should probably follow a style book.
* Need list of words which can and cannot be broken
* Look into Arabic usage
* Todo: look up "Thesis it-tagħrif" by Carmel Azzopardi [ask for it at the UM library]
* Examples:
destinazz|joni
nirk|bu
mir|kbu
..z|zjoni < maybe break it this way
ipar|ttu
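One possible way to store such a word list is to keep the `|` notation used in the examples above and parse it into a word plus break offsets. The sketch below assumes that notation; the function names are hypothetical and no hyphenation rules are implied.

```python
def parse_break_pattern(pattern, marker="|"):
    """Split 'nirk|bu' into the plain word and the offsets where it may break."""
    word = pattern.replace(marker, "")
    breaks = []
    markers_seen = 0
    for index, ch in enumerate(pattern):
        if ch == marker:
            # Offset is counted in the word without markers.
            breaks.append(index - markers_seen)
            markers_seen += 1
    return word, breaks


def show_breaks(word, breaks, hyphen="-"):
    """Render a word with a hyphen at every permitted break offset."""
    pieces = []
    start = 0
    for offset in breaks:
        pieces.append(word[start:offset])
        start = offset
    pieces.append(word[start:])
    return hyphen.join(pieces)


if __name__ == "__main__":
    word, breaks = parse_break_pattern("destinazz|joni")
    print(word, breaks, show_breaks(word, breaks))
```

A word with no marker parses to an empty list of offsets, which could serve as the "cannot be broken" case.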
Punctuation
* For grammar checkers?
* (I'm told: put the period outside the quotation, British style.)
Keyboard Layout
* This is quite a hot topic (to say the least) so I have moved it to a separate page.
Date and Time formatting
* Days of the week: il-Ħadd, it-Tnejn, it-Tlieta,...
* Months: Jannar, Frar,..
* Question: how to abbreviate into two or three letters?
* t’ Awissu, not ta’ Awissu
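The elision in the last point can be expressed as a small rule. This Python sketch assumes that the shortened form applies before any month name starting with a plain vowel (the text above only confirms it for Awissu), and that a date is written as day, particle, month; the function names are made up for illustration.

```python
VOWELS = "aeiou"


def of_month(month):
    """Return the month preceded by ta’, shortened to t’ before a vowel."""
    particle = "t’" if month[0].lower() in VOWELS else "ta’"
    return f"{particle} {month}"


def format_day_month(day, month):
    """Format a day and month, e.g. 15 + Awissu (assumed layout)."""
    return f"{day} {of_month(month)}"


if __name__ == "__main__":
    print(format_day_month(15, "Awissu"))
    print(format_day_month(3, "Frar"))
```

Month names are passed in as plain strings so that the sketch does not commit to any particular spelling or abbreviation list.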
Number Handling
* Decimal separator - '.'
* Thousands separator - ','
* Number spell-out – i.e. cheque format "wieħed u għoxrin"
2. Currency handling
* "Lm1,000.45"?
* How about handling of foreign formats: EUR, FF, USD, etc.?
* For mils: 3c5, etc.
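The separators and the "Lm1,000.45" layout proposed above happen to match what Python's format mini-language produces by default, so a minimal sketch is short. The function names are invented here, two decimal places are assumed, and negative amounts and mils are deliberately not handled.

```python
from decimal import Decimal


def format_number(value, places=2):
    """Format with ',' as thousands separator and '.' as decimal separator."""
    return format(value, f",.{places}f")


def format_lm(amount):
    """Format a non-negative amount in the proposed 'Lm1,000.45' style."""
    return "Lm" + format_number(Decimal(amount))


if __name__ == "__main__":
    print(format_number(1234567.5))
    print(format_lm("1000.45"))
```

Amounts are converted to `Decimal` so that currency values given as strings are not distorted by binary floating point.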
* Customer and place names
* Bills
* Advertising
* Businesses which would make heavy use of Maltese.
1. Legal Profession
2. Insurance
* Publications [non-web]
1. Newspapers
2. Magazines
3. Books
* Web sites, email, chat..
1. Especially need to standardize on encodings for delivery and input.
2. Home
* Web sites and email
* SMS
* Personal (written) correspondence
* Personal records/finance
* Spell checking.
* Maltese Standards Authority (MSA)
* ID card records
* DMV records
* Local Councils (publications, records)
* Elections
* Courts? see Legal..
* Textbooks
* Records
* Library system
* Distance instruction
1. Computer related
* Inclusion of data (locales, fonts, ..)
1. Input method/keyboard support
2. When there is an agreed-upon standard that has backing from the sectors this document hopes for, OS vendors should listen.
3. Specific Vendors:
* MacOS - has verMalta, which is 22, as the region code. The script code should be smRoman, because Maltese uses a Roman-based script. See Region Codes.
* Windows - 0x043A is the locale code for Maltese/Malta, and 0x3A is the code for Maltese by itself.
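The relationship between the two Windows codes follows from how a language identifier is packed: the low 10 bits hold the primary language and the next 6 bits hold the sublanguage (region). A small Python sketch, with a made-up function name, shows the split:

```python
def split_lcid(lcid):
    """Split a Windows locale ID into (primary language, sublanguage)."""
    langid = lcid & 0xFFFF    # low 16 bits are the language identifier
    primary = langid & 0x3FF  # low 10 bits: primary language
    sublanguage = langid >> 10  # high 6 bits: sublanguage / region
    return primary, sublanguage


if __name__ == "__main__":
    primary, sublanguage = split_lcid(0x043A)
    print(hex(primary), sublanguage)
```

Applied to 0x043A this yields primary language 0x3A (Maltese) and sublanguage 1.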
1. Maltese (and generically, multilingual) enablement
2. Use of available packages for custom work
* ISV/VAR/resellers/retailers
1. Bundling of appropriate software - and key stickers
http://www.cs.um.edu.mt/~srl/SMC
srl (at) um.edu.mt - srl (at) monkey.sbay.org
Note: The opinions stated in this page are my responsibility
and should not be construed as endorsement or policy on the part of any other
organisation or agency.
ċ=c with dot,
Ċ=C with dot,
ġ=g with dot,
Ġ=G with dot,
ħ=h with slash,
Ħ=H with slash,
ż=z with dot,
Ż=Z with dot.
Steven R. Loomis, $Id: charter.html,v 1.1 2001/01/15 08:49:38 srl Exp $