mathematical content in documentation (dita europe 2008)
Mathematical Content in Documentation
DITA Europe, November 2008
DITA Europe 2008: Mathematical Content in Documentation 1
Introduction
• Lois Patterson
• Creating technical documentation since 1995
• Currently working at QuIC Financial Technologies: www.quic.com
Mathematical content in documentation: the challenges
In general:
• Mixture of text and alien formats
• Not understood by many tools
• Output is always a challenge
For me:
• Legacy documentation
• Multiple input formats
• Highly complex, mathematical content
Where do we start?
• Tools galore, but no standard solution.
• Many ways to create different solutions.
• What solutions best meet everyone’s needs: SMEs, tech writers, users?
What is mathematical content?
• Sometimes just graphics
• Equations created in Word or FrameMaker
• TeX code
• SVG code
• MathML
• Any others?
Where does DITA fit?
• DITA ≠ out-of-the-box solution for math.
• DITA and the ideas flowing from DITA are still valuable for mathematical content.
• Maturity model concept can also apply to mathematical content within DITA.
Presentation and markup for math
• “Just” graphics and text
• TeX/LaTeX
• MathML
• SVG (Scalable Vector Graphics)
TeX/LaTeX: a brief introduction
• For scientific and mathematical notation: the TeX typesetting language, first released by Donald Knuth in 1978.
• LaTeX is TeX plus macros for typesetting.
• Many theses and scientific articles are written in TeX/LaTeX, e.g. http://www.livingreviews.org
• TeX is a presentation language; it carries no semantic meaning.
What is MathML?
• “[An] XML application for describing mathematical notation and capturing both its structure and content. The goal of MathML is to enable mathematics to be served, received, and processed on the World Wide Web, just as HTML has enabled this functionality for text.” http://www.w3.org/Math/
• Presentation MathML focuses on presentation and produces very long markup.
• Content MathML includes no presentation information. Rendering can be handled with stylesheets.
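To make the difference between the two flavours concrete, here is a minimal sketch that builds both encodings of the small expression x^2 with Python's standard library. The expression and the function names are illustrative assumptions, not taken from the slides.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def presentation_mathml() -> ET.Element:
    """Presentation MathML for x^2: describes layout (a base with a superscript)."""
    math = ET.Element("math", {"xmlns": MATHML_NS})
    msup = ET.SubElement(math, "msup")
    ET.SubElement(msup, "mi").text = "x"  # identifier
    ET.SubElement(msup, "mn").text = "2"  # number
    return math


def content_mathml() -> ET.Element:
    """Content MathML for x^2: describes meaning (power applied to x and 2)."""
    math = ET.Element("math", {"xmlns": MATHML_NS})
    apply_ = ET.SubElement(math, "apply")
    ET.SubElement(apply_, "power")          # the operator
    ET.SubElement(apply_, "ci").text = "x"  # content identifier
    ET.SubElement(apply_, "cn").text = "2"  # content number
    return math


def to_string(elem: ET.Element) -> str:
    """Serialize an element tree to a Unicode XML string."""
    return ET.tostring(elem, encoding="unicode")


if __name__ == "__main__":
    print(to_string(presentation_mathml()))
    print(to_string(content_mathml()))
```

Even for this tiny expression each encoding needs several elements, which hints at why the markup for a full equation runs long.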
One equation: multiple formats
Black-Scholes equation - Nobel-Prize-winning equation. Foundation of financial mathematics. We’ll see:
• PNG
• TeX
• MathML
• SVG
Black-Scholes equation example (from Wikipedia)
Highlighted equation is a .png file, with the TeX markup as ALT text in the HTML markup.
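Because the TeX source travels with the image as ALT text, it can be recovered from the HTML later. The following is a rough sketch using only Python's standard library; the sample HTML fragment and the file name equation.png are made up for illustration and are not the actual Wikipedia markup.

```python
from html.parser import HTMLParser


class AltTextCollector(HTMLParser):
    """Collects the alt attribute of every <img> tag it encounters."""

    def __init__(self):
        super().__init__()
        self.alt_texts = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; self-closing tags arrive here too.
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.alt_texts.append(alt)


def extract_tex(html: str) -> list:
    """Return the alt text of all images in the given HTML, in document order."""
    collector = AltTextCollector()
    collector.feed(html)
    collector.close()
    return collector.alt_texts


# Hypothetical fragment: an equation image carrying its TeX source as alt text.
SAMPLE_HTML = (
    "<p>The price of a call option is "
    '<img src="equation.png" '
    'alt="C(S_0,T) = e^{-rT}(F\\Phi(d_1) - K\\Phi(d_2))" /></p>'
)

if __name__ == "__main__":
    for tex in extract_tex(SAMPLE_HTML):
        print(tex)
```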
Black-Scholes equation format comparison
Graphic (.png)
TeX markup:
C(S_0,T) = e^{-rT}(F\Phi(d_1) - K\Phi(d_2))
What about MathML for this equation? Too long to show on one page.
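For readers who want to see what the TeX markup above computes, here is a small numerical sketch. The slides do not define F, d_1, or d_2; the code assumes the standard textbook definitions (F is the forward price S_0 e^{rT}, \Phi is the standard normal CDF), and the function and parameter names are illustrative.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function, Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(s0: float, k: float, r: float, sigma: float, t: float) -> float:
    """Call price C(S_0, T) = e^{-rT} (F Phi(d_1) - K Phi(d_2)).

    Assumed (not stated on the slide): F = S_0 e^{rT},
    d_1 = (ln(F/K) + sigma^2 T / 2) / (sigma sqrt(T)), d_2 = d_1 - sigma sqrt(T).
    """
    forward = s0 * math.exp(r * t)
    d1 = (math.log(forward / k) + 0.5 * sigma ** 2 * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return math.exp(-r * t) * (forward * norm_cdf(d1) - k * norm_cdf(d2))


if __name__ == "__main__":
    # Spot 100, strike 100, 5% rate, 20% volatility, one year to expiry.
    print(round(black_scholes_call(100.0, 100.0, 0.05, 0.20, 1.0), 4))
```

With these inputs the price comes out at roughly 10.45.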
Black-Scholes: MathML
LaTeX converted to MathML, displayed in browser
MathML source
Converted LaTeX to Presentation MathML with a LaTeX to MathML converter: http://www.maths.nottingham.ac.uk/personal/drw/lm.html
Black-Scholes equation example: SVG
• Using converter*, I produced SVG markup.
• SVG is XML markup, also viewable in Firefox.
• Can be rendered as raster graphics with Java or Python packages.
• Small snippet:
<symbol overflow="visible" id="glyph2-1"><path style="stroke: none;" d="M 3.59375 -2.21875 C 3.59375 -2.984375 3.5 -3.546875 3.1875 -4.03125 C 2.96875 -4.34375 2.53125 -4.625 1.984375 -4.625 C 0.359375 -4.625 0.359375 -2.71875 0.359375 -2.21875 C 0.359375 -1.71875 0.359375 0.140625 1.984375 0.140625 C 3.59375 0.140625 3.59375 -1.71875 3.59375 -2.21875 Z M 1.984375 -0.0625 C 1.65625 -0.0625 1.234375 -0.25 1.09375 -0.8125 C 1 -1.21875 1 -1.796875 1 -2.3125 C 1 -2.828125 1 -3.359375 1.09375 -3.734375 C 1.25 -4.28125 1.6875 -4.4375 1.984375 -4.4375 C 2.359375 -4.4375 2.71875 -4.203125 2.84375 -3.796875 C 2.953125 -3.421875 2.96875 -2.921875 2.96875 -2.3125 C 2.96875 -1.796875 2.96875 -1.28125 2.875 -0.84375 C 2.734375 -0.203125 2.265625 -0.0625 1.984375 -0.0625 Z M 1.984375 -0.0625 "/></symbol>
* http://www.tlhiv.org/ltxpreview/
DITA and MathML and SVG
• DITA 1.1 supports the <foreign> element.
• Can specialize to support the inclusion of MathML and SVG in your DITA topics.
• If we use a combination of standard DITA, MathML, and SVG, will we achieve utopia?
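As a rough illustration of the idea, the sketch below builds a DITA topic whose paragraph carries a MathML fragment inside <foreign>. It uses the unspecialized <foreign> element directly and only checks well-formedness, not validity against the DITA DTD; the topic id, title, sample expression e^{-rT}, and the placement of <foreign> inside <p> are assumptions to verify against your own specialization.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def build_topic_with_math(topic_id: str, title: str) -> ET.Element:
    """Build a minimal DITA topic with a MathML expression wrapped in <foreign>."""
    topic = ET.Element("topic", {"id": topic_id})
    ET.SubElement(topic, "title").text = title
    body = ET.SubElement(topic, "body")
    para = ET.SubElement(body, "p")
    para.text = "Discount factor: "

    # Non-DITA vocabulary goes inside <foreign> (assumed allowed within <p>).
    foreign = ET.SubElement(para, "foreign")
    math = ET.SubElement(foreign, "math", {"xmlns": MATHML_NS})

    # Presentation MathML for e^{-rT}.
    msup = ET.SubElement(math, "msup")
    ET.SubElement(msup, "mi").text = "e"
    mrow = ET.SubElement(msup, "mrow")
    ET.SubElement(mrow, "mo").text = "-"
    ET.SubElement(mrow, "mi").text = "r"
    ET.SubElement(mrow, "mi").text = "T"
    return topic


if __name__ == "__main__":
    topic = build_topic_with_math("discounting", "Discounting")
    print(ET.tostring(topic, encoding="unicode"))
```

A specialized element would normally replace the bare <foreign> so that the toolkit can recognize and process the math content.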
Customize DITA Open Toolkit to work with MathML and SVG
• Can use the Plus Plugins provided on the DITA-Users group: http://tech.groups.yahoo.com/group/dita-users/files/Demos/
• DITA-Users group provides a great deal of help – highly recommended!
• XMetal and MathFlow can be tweaked so that the DITA Open Toolkit works with MathML/SVG.
Input
• Microsoft Word with MathType
• Open Office with built-in math editor
• FrameMaker with Equation Editor
• XMetal (or Arbortext) + MathFlow
• TeX/LaTeX
• Scribbled on scrap paper
• Create yourself, or take what is given you?
Input – standards or not?
• SMEs provide content however I can get it.
• Enforcement of standards for input would be counterproductive.
• Anyone with success enforcing standards for input?
• Difficulty is that many input formats do not play well with DITA.
Input: Microsoft Word with MathType
• A WYSIWYG editor, MathType can output Presentation MathML or LaTeX.
• Somewhat harder to go from LaTeX or MathML back to editing in MathType.
• How to avoid cut and paste if your content is in DITA topics?
Open Office with built-in math editor
• Free tool.
• Easy to use.
• Math editor outputs MathML.
• Integration with DITA possible, but not seamless.
FrameMaker with Equation Editor
• FrameMaker has many positive qualities.
• Only supports DITA 1.0 currently - OK if your math content will just be graphics.
• Equation Editor does not “naturally” output to LaTeX or MathML.
• Can use in conjunction with Mif2Go, but still math will be graphics only.
XMetal (or Arbortext) + MathFlow
• Slick combination, fun to work with.
• MathFlow is similar to MathType (both are Design Science products).
• Can customize the DITA OT that comes with XMetal/Arbortext for math output.
• Perfect for DITA authoring.
• Unlikely your SMEs use this combination.
MathFlow Exchange
• Imports Word + MathType documents, and exports them into Arbortext as XML + MathML. You may be able to DITA-cize the content.
• Is DITA the guiding principle or an afterthought?
• Dealing with mathematical content is like a never-ending conversion project.
LaTeX input
• Convert LaTeX equations to MathML, or graphics.
• Convert LaTeX documents (like an article) to HTML + graphics.
• Use Hermes technology to convert to xHTML + graphics.
• No straight line to DITA.
Scrap paper input
• More work for me, but I can enter it into whatever program I like!
Review strategies
• Get basic input, integrate it with non-math text.
• Review is crucial, but how?
• If you use DITA, difficult to allow direct editing by SMEs.
• In practice with non-DITA documentation, SMEs rarely directly edit anyway.
Ideal review solution
• Our SMEs want to be able to review and edit equations in real time.
• How?
• Publish DITA content to a Confluence Wiki, enable editing, republish.
• Lombardi Software – DITA to Wiki
• Many details to work out.
• Keeping TeX markup accessible will be paramount.
Publishing outputs
Some choices:
• xHTML + MathML
• xHTML + graphics
• .chm with graphics
• PDF with graphics
All require modifying the DITA OT. Plus Plugins (as mentioned earlier) are very helpful.
Output: xHTML with MathML
• If everyone uses Firefox, or Internet Explorer with MathPlayer, this is a great solution.
• Falls apart if a documentation user has a different environment.
• Still need ways of introducing index, searchability, etc. (like you can do with EclipseHelp). Standard DITA issues.
Other output formats
• xHTML + graphics
• .chm with graphics
• PDF with graphics
If you have MathML and want these formats, you must render MathML as graphics. This will require modifying build scripts, installing DITA OT plugins, and using Java or other rendering tools.
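One possible shape for such a build step, sketched in Python under stated assumptions: each MathML <math> element is swapped for a DITA <image> reference, and the extracted MathML is handed back so an external renderer can produce the graphic. The function name, the eqN.png naming scheme, and the sample topic are hypothetical; the actual rendering is not done here, and this is not how the DITA OT plugins themselves work.

```python
import copy
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
MATH_TAG = "{" + MATHML_NS + "}math"
ET.register_namespace("m", MATHML_NS)  # serialize MathML with an "m:" prefix


def swap_math_for_images(root: ET.Element, prefix: str = "eq") -> dict:
    """Replace each <math> element in place with <image href="..."/>.

    Returns a dict mapping each image file name to the MathML source that an
    external renderer would need to turn into that file.
    """
    pending = {}
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag != MATH_TAG:
                continue
            href = f"{prefix}{len(pending) + 1}.png"
            source = copy.deepcopy(child)
            source.tail = None  # keep trailing paragraph text out of the export
            pending[href] = ET.tostring(source, encoding="unicode")
            image = ET.Element("image", {"href": href})
            image.tail = child.tail  # preserve text that followed the equation
            parent[index] = image
    return pending


# Hypothetical topic with one embedded equation.
SAMPLE = (
    '<topic id="t1"><title>Discounting</title><body><p>Factor: '
    '<foreign><math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<msup><mi>e</mi><mn>2</mn></msup></math></foreign> applies.</p></body></topic>"
)

if __name__ == "__main__":
    topic = ET.fromstring(SAMPLE)
    to_render = swap_math_for_images(topic)
    print(ET.tostring(topic, encoding="unicode"))
    print(sorted(to_render))
```

Whether the <image> should replace the <math> element or the enclosing <foreign> depends on how your specialization is set up.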
Rendering tools
• MimeTeX renders TeX equations as .gifs.
• Java packages jeuclid and batik render MathML and SVG, respectively.
• Antenna House renders MathML for HTML and PDF.
www.forkosh.dreamhost.com/mimetex.html
jeuclid.sourceforge.net/
xmlgraphics.apache.org/batik/
www.antennahouse.com/product/mathml.htm
Less-explored options
• Building applications with Adobe AIR.
• Adobe PDFs can manage some mathematically relevant wizardry.
• What could you do with Math and Flash?
• How would these integrate with DITA?
As we become more mature . . . .
• Make math equations searchable.
• Work with metadata, attributes.
• This is a Ph.D thesis topic, and not one we will solve here.
• Google the following:
Mowgli project
A More Canonical Form of Content MathML to Facilitate Math Search
An Investigation of Index Formats for the Search of MathML Objects
DITA Europe 2008
• Thank you for attending.
• Resources URL: http://dita.xml.org/blog/loisbc
• Email: [email protected]