Text (Data) and Fonts in digital Malayalam
Rajeesh K V
October 31, 2019
CTO, TEXByte Solutions
1
-
About the speaker
• Entrepreneur
• ERP software architect
• Free (സ്വതന്ത്ര, "free as in freedom") software developer & user
• Programmer
• Jack of many trades, master of some
CC BY-NC © 2019 Rajeesh K V
2
-
Data v/s Presentation
-
Text/Data/പാഠം/വിവരം
• Computers don’t recognize ‘A’ or ‘ക’. They know only 0x41 or 0x0D15 instead
• Text/data are stored as ‘code points’ (binary data according to an agreed ‘standard’): the ‘encoding’
• ASCII: 0x41→ ‘A’, 0x42→ ‘B’,… but — no code for ‘ക’!
3
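A minimal Python sketch of the slide above, using only the standard library. The helper names `code_point` and `fits_in_ascii` are made up for this illustration.

```python
def code_point(ch: str) -> str:
    """Return the code point of a single character as 0x-prefixed hex."""
    return f"0x{ord(ch):04X}"


def fits_in_ascii(text: str) -> bool:
    """True if every character of the text has a code in ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(code_point("A"))        # 0x0041
print(code_point("\u0D15"))   # 0x0D15 (Malayalam letter KA)
print(fits_in_ascii("A"), fits_in_ascii("\u0D15"))  # True False
```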
-
ASCII — Data
t    e    x    t     ← ASCII (encoding)
↑    ↑    ↑    ↑
0x74 0x65 0x78 0x74  ← data
4
-
ASCII — Presentation
t    e    x    t     ← font
↑    ↑    ↑    ↑
t    e    x    t     ← ASCII
↑    ↑    ↑    ↑
0x74 0x65 0x78 0x74  ← data
5
-
ASCII — Presentation
Change the font, and…
t    e    x    t     ← font →   t    e    x    t
↑    ↑    ↑    ↑                ↑    ↑    ↑    ↑
t    e    x    t     ← ASCII →  t    e    x    t
↑    ↑    ↑    ↑                ↑    ↑    ↑    ↑
0x74 0x65 0x78 0x74  ← data →   0x74 0x65 0x78 0x74
7
-
Font
8
-
Font
9
-
ASCII chart
Source: Shawn Hymel
10
-
Malayalam
-
Code point for Malayalam
• No code points for Malayalam characters in ASCII!
• Then how did ISM and PageMaker work?
or
What you see is not what you have
11
-
ASCII font glyphs — ML TT Revathi
12
-
ISCII Malayalam
t    e    x    t     ← font →   I¯ p I
↑    ↑    ↑    ↑                ↑    ↑    ↑    ↑
t    e    x    t     ← ASCII →  t    e    x    t
↑    ↑    ↑    ↑                ↑    ↑    ↑    ↑
0x74 0x65 0x78 0x74  ← data →   0x74 0x65 0x78 0x74
13
-
ISCII Malayalam
Change the font, and…
t    e    x    t     ← font →   ç    «    õ    ç
↑    ↑    ↑    ↑                ↑    ↑    ↑    ↑
t    e    x    t     ← ASCII →  t    e    x    t
↑    ↑    ↑    ↑                ↑    ↑    ↑    ↑
0x74 0x65 0x78 0x74  ← data →   0x74 0x65 0x78 0x74
14
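The two diagrams above can be sketched in Python as follows. The glyph tables are invented for illustration and are not the real layout of ML-TT Revathi or any other font. The point is only that the stored bytes stay Latin while the font decides what is drawn.

```python
# Stored data: the four ASCII bytes for "text".
data = bytes([0x74, 0x65, 0x78, 0x74])

# Hypothetical glyph tables: each "font" maps a byte value to a drawing.
# These names are placeholders, NOT the contents of any real font.
latin_font = {0x74: "t", 0x65: "e", 0x78: "x"}
legacy_malayalam_font = {
    0x74: "<glyph-A>",
    0x65: "<glyph-B>",
    0x78: "<glyph-C>",
}


def render(data: bytes, font: dict) -> list:
    """Look up the glyph that each stored byte selects in the given font."""
    return [font[b] for b in data]


print(render(data, latin_font))             # what a Latin font draws
print(render(data, legacy_malayalam_font))  # what the legacy font draws
print(data.decode("ascii"))                 # the stored text is still "text"
```

Changing the font changes only the output of `render`; searching or sorting still operates on `data`, which is why what you see is not what you have.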
-
Source – Ashok Kumar P K
15
-
ASCII — problems
• Only 8 bits (1 byte) to represent any character: at most 256 characters
• Text/Data (പാഠം) is still Latin, not Malayalam
• Information interchange: ⟨document + font⟩
• Sorting (അകാരാദിക്രമം, alphabetical order)
• Searching (information retrieval)
16
-
Data outlive software
Source — Martin Malmsten, National Library of Sweden
17
-
Unicode*
• Unique code points for (almost) every writing system (‘script’/ലിപി) in the world
• Text/Data (പാഠം) always represents one particular script (Devanagari: Hindi/Sanskrit)
• Only basic characters are encoded, not conjuncts (usually)
• Standard agreed on and supported by many operating systems and application software
• Preferred encoding for data interchange, the Web, Govt. documents…
* www.unicode.org
18
-
Unicode Malayalam†
† www.unicode.org/charts/PDF/U0D00.pdf
19
-
Unicode Malayalam
• ‘ക’ → 0D15
• ‘കൊ’ → ക + ◌ൊ → 0D15 0D4A
• ‘ക്ക’ → ക + ◌് + ക → 0D15 0D4D 0D15
• Use any Unicode Malayalam font to display data
or
What you see is what you have
20
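The code point sequences on this slide can be checked with a short Python sketch using the standard `unicodedata` module. The helper name `describe` is made up for this example, and the strings are written with escapes so the stored sequence is unambiguous.

```python
import unicodedata


def describe(text: str) -> list:
    """List (hex code point, Unicode character name) for each code point."""
    return [(f"{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


examples = {
    "ka": "\u0D15",               # single consonant
    "ko": "\u0D15\u0D4A",         # consonant + vowel sign O
    "kka": "\u0D15\u0D4D\u0D15",  # conjunct: KA + virama + KA
}

for label, word in examples.items():
    print(label, word, describe(word))
```

The conjunct is stored as three basic code points. Drawing it as one shape is left to the font and the shaping engine.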
-
Unicode Malayalam
t    e    x    t     ← font →     ക  ത്തു  ക
↑    ↑    ↑    ↑                  ↑ ↑ ↑ ↑ ↑ ↑
t    e    x    t     ← Unicode →  ക ത ◌് ത ◌ു ക
↑    ↑    ↑    ↑                  ↑ ↑ ↑ ↑ ↑ ↑
74   65   78   74    ← data →     0D15 0D24 0D4D 0D24 0D41 0D15
21
-
Unicode Malayalam
Change the font, and…
t    e    x    t     ← font →     ക  ത്ത  ◌ു  ക
↑    ↑    ↑    ↑                  ↑ ↑ ↑ ↑ ↑ ↑
t    e    x    t     ← Unicode →  ക ത ◌് ത ◌ു ക
↑    ↑    ↑    ↑                  ↑ ↑ ↑ ↑ ↑ ↑
74   65   78   74    ← data →     0D15 0D24 0D4D 0D24 0D41 0D15
22
-
Data entry — Inscript
k    m    s    j     ← keyboard
↓    ↓    ↓    ↓     ← input method
ക    സ    ◌േ   ര     ← Unicode
↓    ↓    ↓    ↓
0D15 0D38 0D47 0D30  ← data
23
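A minimal sketch of what the input method does in the diagram above. The table holds only the four keys shown on the slide, whereas a real Inscript layout covers the whole keyboard. `INSCRIPT_KEYS` and `type_keys` are names invented for this example.

```python
# Only the four key assignments shown on the slide (k, m, s, j).
INSCRIPT_KEYS = {
    "k": "\u0D15",  # MALAYALAM LETTER KA
    "m": "\u0D38",  # MALAYALAM LETTER SA
    "s": "\u0D47",  # MALAYALAM VOWEL SIGN EE
    "j": "\u0D30",  # MALAYALAM LETTER RA
}


def type_keys(keys: str) -> str:
    """Translate a sequence of key presses into Unicode text."""
    return "".join(INSCRIPT_KEYS[k] for k in keys)


word = type_keys("kmsj")
print(word)
print([f"{ord(ch):04X}" for ch in word])  # ['0D15', '0D38', '0D47', '0D30']
```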
-
Data entry — Transliteration
ka   s    E    ra    ← keyboard
↓    ↓    ↓    ↓     ← input method
ക    സ    ◌േ   ര     ← Unicode
↓    ↓    ↓    ↓
0D15 0D38 0D47 0D30  ← data
24
-
Pitfalls
Some common mistakes in typing/data entry
or
The great ISM/പുതിയലിപി hangover
ASCII               Unicode               Shaping
ക ◌േ സ ര           ക സ ◌േ ര             കസേര
◌്ര ഗ ◌ീ സ ◌്       ഗ ◌് ര ◌ീ സ ◌്       ഗ്രീസ്    (I'm looking at you, Manorama!)
ക ത്ത ◌ു ക          ക ത ◌് ത ◌ു ക        കത്തുക
25
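A hedged Python sketch of why the visual-order habit matters for data. The strings are the first two rows of the table above, written with escapes. The check `starts_with_dependent_sign` is a made-up heuristic that catches only words beginning with a sign; it is not a complete validator.

```python
import unicodedata

# Row 1: the same word typed in logical order and in visual (ASCII-era) order.
logical = "\u0D15\u0D38\u0D47\u0D30"  # KA SA sign-EE RA
visual = "\u0D15\u0D47\u0D38\u0D30"   # KA sign-EE SA RA

print(logical == visual)                           # False
print(logical in "text containing " + visual)      # False: a search misses it


def starts_with_dependent_sign(word: str) -> bool:
    """Heuristic: a word should not begin with a combining mark."""
    return bool(word) and unicodedata.category(word[0]) in ("Mn", "Mc")


# Row 2: visual order puts the virama + RA pair before the consonant GA.
greece_visual = "\u0D4D\u0D30\u0D17\u0D40\u0D38\u0D4D"
greece_logical = "\u0D17\u0D4D\u0D30\u0D40\u0D38\u0D4D"
print(starts_with_dependent_sign(greece_visual))   # True
print(starts_with_dependent_sign(greece_logical))  # False
```

The first row shows the limit of the heuristic: the visual-order string is still a well-formed sequence, just a different one, so it can only be found by comparing against the intended spelling.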
-
The ISM/പുതിയലിപി hangover
26
-
Data outlive software
• Data/പാഠം must be stored for the future
• Decouple data from software/formatting
• Data must be searchable
• Corollary: Data must be archivable
• The one thing worse than ASCII documents: scanned documents
• TEI (Text Encoding Initiative)
• Sayahna Foundation
27
-
Text shaping
-
Complex text shaping
Unicode solves one part of the problem (data).
Complex scripts, unlike Latin, change the shape and order of glyphs.
G N U → GNU
f f i → ffi
ഗ ◌് ന ◌ു → ഗ്നു
ക സ ◌േ ര → കസേര
28
-
What is ‘text shaping’?
29
-
Unicode Malayalam font glyphs — Rachana
30
-
Complex text shaping
• Font (glyphs) + ‘OpenType’ shaping rules
• Operating system support is required for proper shaping
• ക + ◌് + ക → ക്ക (shaping rules)
• OpenType specification‡: followed by GNU/Linux and Windows applications; Apple uses ‘AAT’
• HarfBuzz§ shaping engine (libre software), used by GNU/Linux, Qt, GTK, Android, Scribus, XeTeX, LibreOffice...
• Adobe uses its own shaping engine, which has bugs/issues with shaping. Even Adobe is going to use HarfBuzz!
‡ docs.microsoft.com/en-us/typography/opentype/spec/
§ www.freedesktop.org/wiki/Software/HarfBuzz/
31
-
Complex text shaping engine
$ hb-shape -v Rachana-Regular.ttf "കൊ"
1: (കൊ)
1:
1: [e1|k1@1112,0|a2@2700,0]
e1 → ◌െ   k1 → ക   a2 → ◌ാ
32
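To read the shaper output above programmatically, a small Python sketch can split the bracketed line into glyph names and coordinates. It handles only the form printed on this slide; output from other hb-shape invocations may carry extra fields that this parser rejects. Treating a glyph with no `@x,y` part as sitting at 0,0 is an assumption of this sketch, and `parse_hb_shape` is a made-up name.

```python
import re


def parse_hb_shape(line: str) -> list:
    """Split '[e1|k1@1112,0|a2@2700,0]' into (glyph, x, y) tuples."""
    glyphs = []
    for item in line.strip().strip("[]").split("|"):
        match = re.fullmatch(r"([^@=+]+)(?:@(-?\d+),(-?\d+))?", item)
        if match is None:
            raise ValueError(f"unrecognised item: {item!r}")
        name, x, y = match.groups()
        # Assumption: a missing '@x,y' part means position 0,0.
        glyphs.append((name, int(x or 0), int(y or 0)))
    return glyphs


print(parse_hb_shape("[e1|k1@1112,0|a2@2700,0]"))
# [('e1', 0, 0), ('k1', 1112, 0), ('a2', 2700, 0)]
```

The result shows the point of the slide: the two stored code points became three glyphs, with the pre-base vowel part (e1) placed before the consonant (k1).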
-
The lookup rules state machine
pref   pre-base form             ◌് + ര → ◌്ര
pstf   post-base form            ◌് + വ → ◌്വ, ◌് + യ → ◌്യ
blwf   below-base form           ◌് + ല → ◌്ല
akhn   akhand conjuncts          ക + ◌് + ക → ക്ക
pres   pre-base substitution     ◌ + പ →
psts   post-base substitution    + ◌ു →
blws   below-base substitution   പ + ◌ →
33
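A toy Python sketch of the idea behind such lookup rules: a sequence of input items is replaced by a single glyph. The rule list and the glyph names (`kka`, `pre-base-ra`) are hypothetical. A real font stores its rules in the GSUB table, and a shaping engine such as HarfBuzz applies them together with reordering, which this sketch does not attempt.

```python
# Hypothetical rules: (input sequence) -> output glyph name.
RULES = [
    (("\u0D15", "\u0D4D", "\u0D15"), "kka"),  # akhn-style: KA + virama + KA
    (("\u0D4D", "\u0D30"), "pre-base-ra"),    # pref-style: virama + RA
]


def substitute(chars: list) -> list:
    """Apply the first matching rule at each position, left to right."""
    out, i = [], 0
    while i < len(chars):
        for pattern, glyph in RULES:
            if tuple(chars[i:i + len(pattern)]) == pattern:
                out.append(glyph)
                i += len(pattern)
                break
        else:
            # No rule matched here: pass the item through unchanged.
            out.append(chars[i])
            i += 1
    return out


print(substitute(list("\u0D15\u0D4D\u0D15")))  # ['kka']
print(substitute(list("\u0D17\u0D4D\u0D30")))  # GA followed by 'pre-base-ra'
```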
-
Kerning — TN Joy
34
-
Complex text shaping — OpenType lookup rules
35
-
Complex text shaping
• Font = Art + Engineering
• Design: glyphs, ascender, descender, character spacing, word spacing, etc.
• Programming: OpenType shaping rules (GSUB [glyph substitution], GPOS [kerning, mark positioning], etc.)
• Shaping engine support: Windows XP, Pango, Qt4, LibreOffice ≤ 5.x… v/s Windows Vista+ (Uniscribe), HarfBuzz
36
-
Shaping issues
• Perfect shaping of all conjuncts may not always work as expected
• Report bugs (and respect the License)
rachana.org.in
smc.org.in
37
-
Update fonts
• And it gets fixed
• Update fonts
• GNU/Linux: via package update
• Windows, macOS: uninstall the existing version, then download & install the new version
• Android: use Magisk
38
-
Questions?
നന്ദി (Thank you).
39