Browse, edit, and add to this collaborative knowledgebase of information on using and understanding DITA.
The DITA Open Toolkit can produce output in formats such as web (XHTML), print (PDF), and Help (CHM and Eclipse). It can also be extended to produce output in arbitrary formats.
Specialization is the process by which new designs are created based on existing designs, allowing new kinds of content to be processed using existing processing rules.
It is the means by which the standard DITA language may be extended for new semantic or structural roles.
Specialization allows you to define new kinds of information (new structural types or new domains of information), while reusing as much of existing design and code as possible, and minimizing or eliminating the costs of interchange, migration, and maintenance.
Specialization may be used to introduce new map types, information types, or domains. An example of a map specialized for a specific application is the Bookmap specialization provided as part of the OASIS DITA 1.1 Standard. An example of a topic specialized for a particular role is message specialization (provided as a msgref plugin of the DITA Open Toolkit). An example of a community-prescribed domain specialization is the hazard domain proposed for DITA 1.2 by the Machine Industry Specialization Subcommittee of the OASIS DITA TC.
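One way to see how specialized content can be handled by existing processing rules is the class attribute that DITA document types supply as a default on every element. The fragment below is a minimal sketch of a task topic with those defaults written out explicitly; authors normally never type them, and the id and text are placeholders invented for illustration.

```xml
<!-- Sketch only: class values normally come from the DTD as defaults. -->
<task id="install_widget" class="- topic/topic task/task ">
  <title class="- topic/title ">Install the widget</title>
  <taskbody class="- topic/body task/taskbody ">
    <steps class="- topic/ol task/steps ">
      <step class="- topic/li task/step ">
        <!-- cmd is specialized from the generic phrase element -->
        <cmd class="- topic/ph task/cmd ">Run the installer.</cmd>
      </step>
    </steps>
  </taskbody>
</task>
```

Each value lists the ancestor element first, so a process that only knows about topic/ol and topic/li can still treat the steps as an ordered list, while a task-aware process can match on task/steps.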
Community specialization plugins.
Besides those specializations created as OASIS Standards under the auspices of the OASIS DITA Technical Committee, specializations have also been created as community plugins for the DITA Open Toolkit and as file uploads at sites such as the Yahoo! dita-users forum. In addition, many companies have developed specializations that are used internally, and sometimes shared by arrangement with business partners. Finally, some businesses have developed specializations that represent internal business processes or workflows; these are usually trade secret assets of those businesses. However, all follow the same methodology, which means that all such DITA content is interchangeable and (with the appropriate DTDs and processing overrides) interoperable in processing with content from other producers or publishers.
Other examples of popular specializations include:
Specializations may support particular subject matter areas, such as:
A comprehensive description of a specialization would include, directly or via links:
An example of a specialization with all these components that works as a plugin of the DITA Open Toolkit is the music specialization.
DITA topics are the basic units of DITA content.
Topics are the basis for high-quality information. Each topic is organized around a single subject or answers a single question.
Each topic is typically authored as a unit. It consists of a title, which captures the subject of the topic, and further content.
As stated in the paper Introduction to the DITA architecture, DITA topics are organized by DITA maps. It is also possible to nest sub-topics within a topic. Specialization enables the creation of specialized topics and other units of content that are tailored to particular structural requirements. Content referencing (conref) enables fragments of content to be reused from a single source. Conditional processing permits a single source to support the needs of multiple audiences.
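As an illustration of the structure described above, here is a minimal sketch of a generic topic with a title, further content, and one nested sub-topic. The ids and text are invented for the example.

```xml
<topic id="backup_overview">
  <!-- The title captures the single subject of the topic -->
  <title>Backing up your data</title>
  <shortdesc>Regular backups protect you from data loss.</shortdesc>
  <body>
    <p>Back up your data before you upgrade or change hardware.</p>
  </body>
  <!-- A nested sub-topic follows the body of its parent -->
  <topic id="backup_schedule">
    <title>Choosing a backup schedule</title>
    <body>
      <p>Weekly backups are enough for most users.</p>
    </body>
  </topic>
</topic>
```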
The architectural specification describes topics and information typing at
A 5-minute tutorial on DITA Topics is available as a Flash animation.
Editors for the Architecture area:
DITA maps collect and organize references to DITA topics to indicate the relationships among the topics. They can be used to identify the topics you want to include in a deliverable, and to create tables of contents and related links for the information.
Maps can organize topics into hierarchies, tables, and groups, and also have special elements for referencing other maps. You can use multiple maps to pull different deliverables out of the same set of topics, and to separate the concerns of managing deliverables and architecting information from the concerns of topic authoring.
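A small sketch of such a map follows. It shows a hierarchy, a group, and a reference to another map; all file names and the map title are hypothetical.

```xml
<map title="Widget User Guide">
  <!-- Hierarchy: nested topicrefs become parent and child topics -->
  <topicref href="overview.dita">
    <topicref href="installing.dita"/>
    <topicref href="configuring.dita"/>
  </topicref>
  <!-- Group: relates topics to each other without adding a level of hierarchy -->
  <topicgroup collection-type="family">
    <topicref href="faq.dita"/>
    <topicref href="glossary.dita"/>
  </topicgroup>
  <!-- Reference to another map -->
  <topicref href="troubleshooting.ditamap" format="ditamap"/>
</map>
```

A second map could reference the same topic files in a different order or subset to produce a different deliverable.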
The architectural specification describes DITA maps and relationship tables at
For a practical approach to setting up relationship tables, see
People often ask what to do to make extra headings and levels of hierarchy appear within a table of contents.
A common misconception is that if a map includes a submap, the title of the submap will appear as an extra heading in output. This does not occur. Also, the topics within the submap appear at the same hierarchy level as other topics that are peer to the submap itself; being in a submap does not cause extra "indenting".
To add an extra heading to a map, a good method is to use a <topicref> element pointing to a topic that contains a title and nothing else. This method is easy to manage for translation.
Another method is to use the <topichead> element. This element did not appear in PDF output using previous versions of the DITA Open Toolkit. However, version 1.4 of the DITA Open Toolkit does create headings from <topichead>.
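The two methods can be sketched side by side as follows. The file names, including the title-only topic heading_admin.dita, are assumptions made for the example.

```xml
<map title="Widget User Guide">
  <!-- Method 1: heading_admin.dita is a topic that contains a title and nothing else -->
  <topicref href="heading_admin.dita">
    <topicref href="users.dita"/>
    <topicref href="roles.dita"/>
  </topicref>
  <!-- Method 2: the heading text is stored in the map itself -->
  <topichead navtitle="Reference">
    <topicref href="user_r.dita"/>
    <topicref href="role_r.dita"/>
  </topichead>
</map>
```

With the first method the heading text lives in a topic file and is translated along with the other topics; with the second it lives in the navtitle attribute of the map.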
A team encounters quite a few challenges when transitioning from unstructured to structured writing. For my team, the struggles associated with learning DITA and following the DITA structure were minor compared with the challenges presented by corralling hundreds of individual topics into a logical hierarchy. The biggest challenge we faced was managing topic relationships.
In DITA, related topics can be managed either in the DITA topics themselves or in DITA maps (DITAMAPs), using relationship tables (<reltable>).
While topic relationships can be stored in the topics themselves, as products evolve and user interfaces change, a topic that was required for release 1.0 of a product may no longer be needed in release 2.3. If related topics are maintained at the topic level, removing a topic that is no longer part of the system may involve modifying the related topics of a dozen different DITA files.
However, if related topics are managed using a relationship table in a DITAMAP, removing a topic that is no longer needed involves changing only one file. Relationship tables can be used to easily manage the related topics that are associated with each DITA topic.
After doing a lot of research into relationship tables, my team nearly abandoned the effort because mapping a large number of topics in the standard three-column relationship table was too complex. However, once we realized that a four-column approach was much more effective in creating and maintaining the relationships between DITA topics, our information modeling and online help creation process became much easier to manage.
To illustrate the benefits of a four-column relationship table, the following example topics will be used.
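The following concept topics are designed to introduce the subject matter to the reader. These topics cover information ranging from "What is it?" to "Why should the reader care?"
- What is a user? (user_c.dita)
- What is a role? (role_c.dita)
- How are users and roles related? (user_role_c.dita)
The following task topics are designed to walk the user through a procedure that goes from step 1 to step N, with the end result being the successful completion of the task at hand.
- Create a new user account (user_create_t.dita)
- Edit an existing user account (user_edit_t.dita)
- Create a new role (role_create_t.dita)
- Edit an existing role (role_edit_t.dita)
- Associate a user account with a role (user_role_associate_t.dita)
The following reference topics are designed to support the user in completing a task. For example, if step 3 of our task instructs the user to complete field xyz, the reference topic for that window will describe field xyz and include supporting background information that may not appear in the task.
- User Account window details (user_r.dita)
- Role window details (role_r.dita)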
Using a standard three-column relationship table and a four-column relationship table, the following relationships will be applied to each topic in the example documentation set:
Each row in a relationship table defines an individual set of relationships between topics. In the relationship table diagram below, there are seven rows that are used to define the relationships between ten different topics. The difficulty in dealing with this type of relationship table is that the relationships for some topics are defined in multiple rows. So when a topic relationship changes, the writer maintaining the DITAMAP must perform a detailed analysis of the relationship between topics in several rows to enact a change.
In DITA, these relationships are expressed as follows:
<reltable>
  <relheader>
    <relcolspec type="concept"/>
    <relcolspec type="task"/>
    <relcolspec type="reference"/>
  </relheader>
  <relrow>
    <relcell>
      <topicgroup collection-type="family">
        <topicref navtitle="What is a user?" href="user_c.dita" type="concept"/>
        <topicref navtitle="What is a role?" href="role_c.dita" type="concept"/>
        <topicref navtitle="How are users and roles related?" href="user_role_c.dita" type="concept"/>
      </topicgroup>
    </relcell>
    <relcell/>
    <relcell/>
  </relrow>
  <relrow>
    <relcell>
      <topicref navtitle="What is a user?" href="user_c.dita" type="concept"/>
    </relcell>
    <relcell>
      <topicref navtitle="Create a new user account" href="user_create_t.dita" type="task"/>
      <topicref navtitle="Edit an existing user account" href="user_edit_t.dita" type="task"/>
      <topicref navtitle="Associate a user account with a role" href="user_role_associate_t.dita" type="task"/>
    </relcell>
    <relcell/>
  </relrow>
  <relrow>
    <relcell>
      <topicref navtitle="What is a role?" href="role_c.dita" type="concept"/>
    </relcell>
    <relcell>
      <topicref navtitle="Create a new role" href="role_create_t.dita" type="task"/>
      <topicref navtitle="Edit an existing role" href="role_edit_t.dita" type="task"/>
      <topicref navtitle="Associate a user account with a role" href="user_role_associate_t.dita" type="task"/>
    </relcell>
    <relcell/>
  </relrow>
  <relrow>
    <relcell>
      <topicref navtitle="How are users and roles related?" href="user_role_c.dita" type="concept"/>
    </relcell>
    <relcell>
      <topicref navtitle="Associate a user account with a role" href="user_role_associate_t.dita" type="task"/>
    </relcell>
    <relcell/>
  </relrow>
  <relrow>
    <relcell/>
    <relcell>
      <topicgroup collection-type="family">
        <topicref navtitle="Create a new user account" href="user_create_t.dita" type="task" linking="sourceonly"/>
        <topicref navtitle="Edit an existing user account" href="user_edit_t.dita" type="task" linking="sourceonly"/>
        <topicref navtitle="Create a new role" href="role_create_t.dita" type="task" linking="sourceonly"/>
        <topicref navtitle="Edit an existing role" href="role_edit_t.dita" type="task" linking="sourceonly"/>
        <topicref navtitle="Associate a user account with a role" href="user_role_associate_t.dita" type="task" linking="targetonly"/>
      </topicgroup>
    </relcell>
    <relcell/>
  </relrow>
  <relrow>
    <relcell/>
    <relcell>
      <topicref navtitle="Create a new user account" href="user_create_t.dita" type="task"/>
      <topicref navtitle="Edit an existing user account" href="user_edit_t.dita" type="task"/>
    </relcell>
    <relcell>
      <topicref navtitle="User Account window details" href="user_r.dita" type="reference"/>
    </relcell>
  </relrow>
  <relrow>
    <relcell/>
    <relcell>
      <topicref navtitle="Create a new role" href="role_create_t.dita" type="task"/>
      <topicref navtitle="Edit an existing role" href="role_edit_t.dita" type="task"/>
    </relcell>
    <relcell>
      <topicref navtitle="Role window details" href="role_r.dita" type="reference"/>
    </relcell>
  </relrow>
</reltable>
While the four-column relationship table below has three more rows than the standard three-column relationship table, the larger table makes it much easier to maintain the relationships between the topics. In the example below, when a topic relationship changes, the writer can simply locate and modify the single row that defines the links for that topic.
In DITA, these relationships are expressed as follows:
<reltable>
  <relheader>
    <relcolspec type="topic"/>
    <relcolspec type="concept"/>
    <relcolspec type="task"/>
    <relcolspec type="reference"/>
  </relheader>
  <relrow>
    <relcell>
      <topicref navtitle="What is a user?" href="user_c.dita" type="concept" linking="sourceonly"/>
    </relcell>
    <relcell>
      <topicref navtitle="What is a role?" href="role_c.dita" type="concept" linking="targetonly"/>
      <topicref navtitle="How are users and roles related?" href="user_role_c.dita" type="concept" linking="targetonly"/>
    </relcell>
    <relcell>
      <topicref navtitle="Create a new user account" href="user_create_t.dita" type="task" linking="targetonly"/>
      <topicref navtitle="Edit an existing user account" href="user_edit_t.dita" type="task" linking="targetonly"/>
      <topicref navtitle="Associate a user account with a role" href="user_role_associate_t.dita" type="task" linking="targetonly"/>
    </relcell>
    <relcell/>
  </relrow>
  <relrow>
    <relcell>
      <topicref navtitle="What is a role?" href="role_c.dita" type="concept" linking="sourceonly"/>
    </relcell>
    <relcell>
      <topicref navtitle="What is a user?" href="user_c.dita" type="concept" linking="targetonly"/>
      <topicref navtitle="How are users and roles related?" href="user_role_c.dita" type="concept" linking="targetonly"/>
    </relcell>
    <relcell>
      <topicref navtitle="Create a new role" href="role_create_t.dita" type="task" linking="targetonly"/>
      <topicref navtitle="Edit an existing role" href="role_edit_t.dita" type="task" linking="targetonly"/>
      <topicref navtitle="Associate a user account with a role" href="user_role_associate_t.dita" type="task" linking="targetonly"/>
    </relcell>
    <relcell/>
  </relrow>
  <relrow>
    <relcell>
      <topicref navtitle="How are users and roles related?" href="user_role_c.dita" type="concept" linking="sourceonly"/>
    </relcell>
    <relcell>
      <topicref navtitle="What is a user?" href="user_c.dita" type="concept" linking="targetonly"/>
      <topicref navtitle="What is a role?" href="role_c.dita" type="concept" linking="targetonly"/>
    </relcell>
    <relcell>
      <topicref navtitle="Associate a user account with a role" href="user_role_associate_t.dita" type="task" linking="targetonly"/>
    </relcell>
    <relcell/>
  </relrow>
  <relrow>
    <relcell>
      <topicref navtitle="Create a new user account" href="user_create_t.dita" type="task" linking="sourceonly"/>
    </relcell>
    <relcell>
      <topicref navtitle="What is a user?" href="user_c.dita" type="concept" linking="targetonly"/>
    </relcell>
    <relcell>
      <topicref navtitle="Associate a user account with a role" href="user_role_associate_t.dita" type="task" linking="targetonly"/>
    </relcell>
    <relcell>
      <topicref navtitle="User Account window details" href="user_r.dita" type="reference" linking="targetonly"/>
    </relcell>
  </relrow>
  <relrow>
    <relcell>
      <topicref navtitle="Edit an existing user account" href="user_edit_t.dita" type="task" linking="sourceonly"/>
    </relcell>
    <relcell>
      <topicref navtitle="What is a user?" href="user_c.dita" type="concept" linking="targetonly"/>
    </relcell>
    <relcell>
      <topicref navtitle="Associate a user account with a role" href="user_role_associate_t.dita" type="task" linking="targetonly"/>
    </relcell>
    <relcell>
      <topicref navtitle="User Account window details" href="user_r.dita" type="reference" linking="targetonly"/>
    </relcell>
  </relrow>
  <relrow>
    <relcell>
      <topicref navtitle="Create a new role" href="role_create_t.dita" type="task" linking="sourceonly"/>
    </relcell>
    <relcell>
      <topicref navtitle="What is a role?" href="role_c.dita" type="concept" linking="targetonly"/>
    </relcell>
    <relcell>
      <topicref navtitle="Associate a user account with a role" href="user_role_associate_t.dita" type="task" linking="targetonly"/>
    </relcell>
    <relcell>
      <topicref navtitle="Role window details" href="role_r.dita" type="reference" linking="targetonly"/>
    </relcell>
  </relrow>
  <relrow>
    <relcell>
      <topicref navtitle="Edit an existing role" href="role_edit_t.dita" type="task" linking="sourceonly"/>
    </relcell>
    <relcell>
      <topicref navtitle="What is a role?" href="role_c.dita" type="concept" linking="targetonly"/>
    </relcell>
    <relcell>
      <topicref navtitle="Associate a user account with a role" href="user_role_associate_t.dita" type="task" linking="targetonly"/>
    </relcell>
    <relcell>
      <topicref navtitle="Role window details" href="role_r.dita" type="reference" linking="targetonly"/>
    </relcell>
  </relrow>
  <relrow>
    <relcell>
      <topicref navtitle="Associate a user account with a role" href="user_role_associate_t.dita" type="task" linking="sourceonly"/>
    </relcell>
    <relcell>
      <topicref navtitle="What is a user?" href="user_c.dita" type="concept" linking="targetonly"/>
      <topicref navtitle="What is a role?" href="role_c.dita" type="concept" linking="targetonly"/>
      <topicref navtitle="How are users and roles related?" href="user_role_c.dita" type="concept" linking="targetonly"/>
    </relcell>
    <relcell/>
    <relcell/>
  </relrow>
  <relrow>
    <relcell>
      <topicref navtitle="User Account window details" href="user_r.dita" type="reference" linking="sourceonly"/>
    </relcell>
    <relcell/>
    <relcell>
      <topicref navtitle="Create a new user account" href="user_create_t.dita" type="task" linking="targetonly"/>
      <topicref navtitle="Edit an existing user account" href="user_edit_t.dita" type="task" linking="targetonly"/>
    </relcell>
    <relcell/>
  </relrow>
  <relrow>
    <relcell>
      <topicref navtitle="Role window details" href="role_r.dita" type="reference" linking="sourceonly"/>
    </relcell>
    <relcell/>
    <relcell>
      <topicref navtitle="Create a new role" href="role_create_t.dita" type="task" linking="targetonly"/>
      <topicref navtitle="Edit an existing role" href="role_edit_t.dita" type="task" linking="targetonly"/>
    </relcell>
    <relcell/>
  </relrow>
</reltable>
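To show what maintenance looks like in this layout, here is a sketch of the single row a writer might add for a new task topic. The topic "Delete a user account" and its file user_delete_t.dita are hypothetical and are not part of the example documentation set; the targets are chosen only for illustration.

```xml
<!-- Hypothetical new row: source topic in column 1, targets sorted by type -->
<relrow>
  <relcell>
    <topicref navtitle="Delete a user account" href="user_delete_t.dita" type="task" linking="sourceonly"/>
  </relcell>
  <relcell>
    <topicref navtitle="What is a user?" href="user_c.dita" type="concept" linking="targetonly"/>
  </relcell>
  <!-- No related tasks for this topic, so the task column stays empty -->
  <relcell/>
  <relcell>
    <topicref navtitle="User Account window details" href="user_r.dita" type="reference" linking="targetonly"/>
  </relcell>
</relrow>
```

Because every link in the row is one-way, adding it does not change the links shown in any existing topic. If other topics should link to the new one, it is added as a target in their rows.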
Using a four-column relationship table in our SunGard Higher Education DITAMAPs, we are able to easily maintain the relationships between hundreds of topics authored by a distributed writing team. Each team member can modify the specific row that applies to the topic in question without having to analyze the relationships of any other topic.
While proper up-front planning is still vital to assign topic relationships, knowing that each source topic is always located in the first column and all target topics can be found in columns based on the topic type has made the job of maintaining those relationships much easier.
-Zak Binder
Advisory Technical Writer
SunGard Higher Education
zak.binder@sungardhe.com
Content referencing (conref) is a convenient DITA mechanism for reuse of content from other topics or maps. A fragment of content in one topic or map can be pulled by reference into any other topic or map where the content is allowed. To create the reference, start by creating an empty element of the type that you want to pull in, and then use the element's conref attribute to provide the target's location.
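As a minimal sketch of that mechanism, the fragment below shows a reusable paragraph and the empty element that pulls it in. The file name shared_notes.dita and both ids are assumptions made for the example; the conref value names the file, the topic id, and the element id.

```xml
<!-- shared_notes.dita: hypothetical topic that holds the reusable paragraph -->
<topic id="shared_notes">
  <title>Shared notes</title>
  <body>
    <p id="backup_warning">Back up your data before you upgrade.</p>
  </body>
</topic>

<!-- In any other topic: an empty <p> whose content is resolved from the target -->
<p conref="shared_notes.dita#shared_notes/backup_warning"/>
```

The referencing element is the same type as the target (a p pulling in a p), which is what keeps the result valid where the reference appears.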
The architectural specification describes content referencing at:
The language specification describes the syntax of the conref attribute at:
Conditional processing, also known as profiling, is the filtering or flagging of information based on processing-time criteria. The filtering mechanism first matches against the criteria, and then takes a specified action.
DITA provides several built-in attributes to hold the values for filter criteria for an element. These are:
It is possible, for example, to specify the platform or audience that a particular paragraph applies to. The values of these attributes can then be leveraged by any number of processes, including filtering, flagging, search, and indexing.
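A sketch of how this can look in practice follows: paragraphs marked with the platform or audience they apply to, and a separate filter file in the DITAVAL format that tells a process which values to exclude. The attribute values and text are invented for the example.

```xml
<!-- In a topic: mark each paragraph with the condition it applies to -->
<p platform="linux">Run the installer from a terminal.</p>
<p platform="windows">Double-click the installer icon.</p>
<p audience="administrator">You can also install silently from the command line.</p>

<!-- In a separate DITAVAL file: criteria and actions for one build -->
<val>
  <!-- Drop Windows-only content from this deliverable -->
  <prop att="platform" val="windows" action="exclude"/>
  <!-- Keep administrator content -->
  <prop att="audience" val="administrator" action="include"/>
</val>
```

The same topic can be built again with a different filter file to produce a Windows deliverable from the single source.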
There is a proposal for DITA 1.1 that will enable specializers to define their own metadata attributes for use in conditionally processing content.
The architectural specification describes conditional processing at
The DITA Open Toolkit, or dita-ot for short, is a set of Java-based, open source tools that provide a "reference implementation" for processing DITA maps and topical content. You can download the OT and install it for free on your computer, to get started with topic-based writing and publishing.
The DITA Open Toolkit is a modest publishing system. The Toolkit transforms DITA content (maps and topics) into publishing deliverable formats such as web (XHTML), print (PDF), and Help (CHM and Eclipse). Your output files are simply generated in your file system. It is up to you to move them to your website, or into your print publishing process.
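Because the Toolkit builds are driven by Ant, a transformation can be sketched as a small Ant build file that calls the Toolkit's own build.xml. This is only an illustration under stated assumptions: the install path, map name, and output folder are hypothetical, the environment is assumed to be set up as described in the installation guide, and parameter names and required settings can differ between Toolkit versions, so check the documentation for the version you install.

```xml
<project name="build-docs" default="xhtml" basedir=".">
  <!-- Assumption: location where the DITA Open Toolkit is installed -->
  <property name="dita.dir" location="/path/to/DITA-OT"/>

  <target name="xhtml" description="Transform a map and its topics to XHTML">
    <ant antfile="${dita.dir}/build.xml">
      <!-- Hypothetical input map and output folder -->
      <property name="args.input" location="userguide.ditamap"/>
      <property name="output.dir" location="out/xhtml"/>
      <!-- Requested deliverable format -->
      <property name="transtype" value="xhtml"/>
    </ant>
  </target>
</project>
```

The generated files land in the output folder on your file system; publishing them to a website or print process is a separate step.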
The OT is integrated into many authoring tools (e.g., FrameMaker, <oXygen/>, XMetaL) and content management systems (e.g., Astoria, Bluestream, IXIASOFT, XyEnterprise).
If you find installing the OT too difficult, consider a free running version of the DITA Open Toolkit provided as an online software-as-a-service by the DITA Users organization.
There you can have an online workspace folder with DITA docsets from IBM and Comtech Services. Edit the files and build/publish them as web (XHTML), print (PDF), and Help (Eclipse). All your work can be browsed on the web by your colleagues as a portfolio of your DITA work with the OT.
The home page of the DITA Open Toolkit is http://dita-ot.sourceforge.net/. The DITA Open Toolkit Installation Guide, which is on that page, contains a list of supporting open-source software to download, together with a set of versions that are known to be compatible.
The instructions on how to install and use the OT are maintained by Anna van Raaphorst and Dick Johnson as the DITA Open Toolkit User Guide. Note that there are multiple packages available for the toolkit; if you are trying it out for the first time, you probably want to get the "full" package (now called "full easy install") that comes with other required software, and requires less manual setup.
Don Day maintains a DITA Open Toolkit Resources Page.
DITA user groups (DUGs) or DITA special interest groups (DIGs) facilitate knowledge sharing among users in specific geographic areas or among those with similar interests.
If you participate in a DITA user group not listed above, please add a link to your group's homepage. You may also use DITA XML.org to host pages for your DUG.
See comments at the bottom of this page. If you're looking for others interested in forming a new group in your local area or within your field of interest, please select the "add new comment" link below. Include your email address to encourage others in your area to contact you.
March 24, 2010: Speaker to be determined
February 24, 2010: Understanding DITA 1.2: Keys, conref extensions, and more
Presenters: Robert D Anderson (IBM) and Kristen James Eberlein (Eberlein Consulting)
June 24, 2009: Introduction to FrameMaker and DITA
Presenter: Terry Smith
May 27, 2009: IDCMS Blue: IBM's Information Development Content Management Strategy
Presenter: Mike Iantosca, IBM
April 22, 2009: The XQuery language and the DITA Open Toolkit
Presenter: Tom Ed White, Tekelec
March 25, 2009: Producing PDFs using Bookmap
Presenter: Julio Vazquez, Systems Documentation, Inc.
January 28, 2009: IBM DITA Wiki: Growing DITA Across the Enterprise
Presenter: Don Day, IBM User Technologies; Chair, OASIS DITA Technical Committee; Architect, Lightweight DITA Publishing Solutions. Jointly sponsored with the Central TX DITA User Group.
October 22: Crossing Organizational Boundaries with DITA
Presenter: Colleen Smith, Content Management Information Architect, Teradata Corporation
September 24, 2008: DITA, Metadata, and Taxonomy
Presenters: Robert Berry, Mike Harris, and Paul Arellanes (IBM). Jointly sponsored with the Central TX DITA User Group.
August 27, 2008: DITA 1.2: Understanding the upcoming release
Presenter: Robert D. Anderson, IBM. Chief Architect of the DITA Open Toolkit.
July 23, 2008: Managing Content and Producing Output with the Eclipse IDE
Presenter: Tom Ed White, Tekelec
June 25, 2008: The DITA Troubleshooting Specialization
Presenter: Carolyn Inkster, IBM
May 28, 2008: Brushing your teeth with DITA: Leveraging relationships to improve usability
Presenter: Shane Taylor, Computer Task Group
April 23, 2008: Using IBM Task Modeler to Create DITA-based Information Sets
Presenter: Kristen James Eberlein, Systems Documentation, Inc.
March 26, 2008: Round table discussion on DITA maps and relationship tables
February 27, 2008: Organizational meeting
January 23, 2008: Organizational meeting
The history of DITA is the history of its many powerful characteristics - modularity, structured writing, information typing, separation of content from presentation, single-sourcing, minimalism, topic-based, task-orientation, content reuse, conditional processing, localization-friendly, multi-channel, component publishing, usability, consistency, object-orientation, inheritance, specialization, simplified XML.
If you don't understand all these DITA characteristics, you may not have analyzed the DITA Business Case properly - for your organization, or for yourself if you are a professional writer.
You don't have to know how to do all these things to use DITA, but if there is no one in your organization who knows why you should use them, you may have a problem. If you have already been doing some of these things, you will want to know how DITA incorporates them.
The historian of technical communication R. John Brockmann researched efforts to document products going back centuries. He found that some of today's hottest new documentation ideas were present in the work of those creating, documenting, and selling manufacturing technology just after the Revolutionary War.
(From Millwrights to Shipwrights to the Twenty-First Century: Explorations in a History of Technical Communication in the United States)
Today's computers, with their spectacular graphical interfaces, allow us to present animated visual images, even 3-D models to illustrate complex machinery. But this is not the work of the everyday tech writer. Flash animations and computer-aided design (CAD) demand skills more like those found in a game design team than a lone tech writer and wordsmith.
Brockmann found that two-dimensional images were a key part of 18th century technical documents. And modern ideas like modularity were there in the form of documents which were as often a set of cards as a book. He also found that early work was very user-centered and task oriented, and that it took advantage of knowledge already available to the user.
It seems that much of the change in today's technical documentation is the direct influence of the computer, and for some obvious reasons:
At Harvard in the 1960s, computers were enlisted to become "teaching machines" by the behaviorist B. F. Skinner. His ideas of "programmed learning" still influence today's eLearning models. His work required knowledge to be broken down into chunks.
Hughes STOP (Sequential Thematic Organization of Publications) advocated a storyboard approach with two-page spreads. A large graphic on one page, with clear labels, faces the main explanatory text on the opposite page.
The U.S. Navy published the Quick Reader Comprehension (QRC) method in 1961. It explicitly called for modular documentation that could be reassembled and reused for different purposes, perhaps the first mention of Reuse.
David Ausubel first proposed Advance Organizers in 1960. They are formal versions of the teacher telling the students what will be said (then saying it, then telling them what was said - a summary, in the classic three-step teaching method). Ausubel advocated images and clear titles and subtitles that revealed the structure in a document.
In the mid-1960s Robert Horn (winner of an ACM SIGDOC Lifetime Achievement Award for Documentation) developed Information Mapping techniques and founded the company by that name. Common "Information Types" were identified in dozens of standard document types like user manuals, policy and procedure manuals, annual reports, etc. Identifying standard information types is at the heart of DITA (Darwin Information Typing Architecture).
In the late 1960s, Charles Goldfarb, Edward Mosher, and Raymond Lorie (whose surname initials Goldfarb used to form the acronym GML) created IBM's Generalized Markup Language for documents. In 1974, GML became SGML, with the help of Yuri Rubinsky and others.
In 1973, David Woolley at the University of Illinois developed PLATO Notes, a kind of message board. Posted topics were the basis of an online community supporting the PLATO timesharing system. Ray Ozzie used PLATO Notes as a student at Illinois and in the 1990s created Lotus Notes, including some features of PLATO Notes.
In 1980, the ANSI standard committee for Information Processing published the first working draft of the SGML standard. SGML was the standard for many years of structured documents in the military, aerospace, and large computer companies. It became the basis of DocBook.
In 1981 a team at IBM led by Fred Bethke called for a new "task orientation" in computer software documentation. Their report, IBM Improving usability of publications (1981), contrasted this approach with documents that reflected the software system's architecture. They found that with such documents a user had to already understand the software to find the help they needed, so inexperienced users got lost. Another approach had been role-based documentation. But the new idea for Bethke was task orientation, which deals with the tasks people commonly perform with computer programs, regardless of their job titles, and focuses on the information needed to perform the tasks.
In 1981 Interleaf introduced technical publishing software for document authoring and composition. It included word processing, graphics, charts, tables, equations, image editing, and automatic page layout. Interleaf automatically generated indexes and tables of contents for books, and featured conditional processing of content.
In 1983 IBM's Santa Teresa Laboratory published Producing Quality Technical Information (now unavailable), guidelines for technical writing, mostly within IBM and mostly for software documentation. The team of writers included Fred Bethke, whose earlier IBM Publishing Guidelines had established the importance of task orientation. They identified seven quality characteristics: task orientation, organization, entry points, clarity, visual communication, accuracy, and completeness.
In 1984, the new Apple Macintosh was a revolution in computer user interfaces and a similar revolution in computer documentation. The user interface for documents was WYSIWYG (what you see is what you get - when you print the document). Affordable Desktop Publishing was born. The first DTP program, the $99 MacPublisher, was created by Bob Doyle, in the year of the Mac. Aldus (later Adobe) PageMaker followed in 1985. These tools led technical writers to style their documents and even arrange the content layout on the page. To this day DTP thinking is the most important inhibitor of content reuse, mixing presentation with content.
The new Macintosh Documentation Guidelines called for three sections: a Learning overview with tutorials that introduce new concepts and functions, an extensive Using section that spells out how to accomplish tasks, and a program Reference section. To this day, well-written books on computers (for example those from O'Reilly) have Learning (e.g., Learning PHP), Using (e.g., Programming PHP), and Definitive Reference volumes.
Note how Learning, Using, and Reference map perfectly onto the three DITA information types specialized from the basic DITA Topic structure - Concept, Task, and Reference. And note that the Macintosh "Using" section was task-oriented, just as IBM was recommending.
In 1984 Lotus introduced their spreadsheet program 1-2-3, which was later the first software to use the F1 key to invoke context-sensitive topic-based online help.
In 1986 FrameMaker was introduced on the Sun OS. This DTP program was designed for long-form documents like books. It became very popular among professional tech writers and at $2500 was a major competitor for the much more expensive Interleaf system.
In 1987 Ralph Walden wrote Microsoft's first online Help system, QuickHelp, for MS-DOS. He would later develop WinHelp and HTML Help.
In 1986 R. John Brockmann published Writing Better Computer User Documentation. Brockmann described the changes needed to move from paper docs to online. He reported on the new task-based approach, which limits information to that needed to perform a single task, assuming that the user can find general information elsewhere, or very likely already knows it.
In 1988 SoftQuad founder Yuri Rubinsky gave his high-school friend Peter Sharpe the task of developing Author/Editor for SGML, the first specialized SGML editing application.
In 1989 Bob Horn published Mapping Hypertext, an extraordinary book with fantastic illustrations - all drawn by Horn himself - exhibiting the kind of structured writing that Information Mapping was proposing for all documentation. This is still one of the most important books in the history of documentation in general (it's not about computer docs). The book described the seven information types of a structured document - classification, concept, principle, procedure, process, structure, and fact. Horn was inspired by Harvard Professor George Miller's famous work on the Magical Number Seven (plus or minus two) as the number of things easily learned at one time.
Learning theorist Dr. Ruth Clark would trim these down to five - concept, principle, procedure, process, and fact - her information types for Training and eLearning - in her workshops and book Developing Technical Training: A Structured Approach for Developing Classroom and Computer-based Instructional Materials. But Clark says the idea for these five types came from instructional theorist M. David Merrill and his "Content-Performance Matrix," not from Information Mapping.
In 1990 MIT Press published the research results of another IBM team led by John M. Carroll. Carroll's book, The Nurnberg Funnel, introduced the idea of minimalism in technical writing. It was task orientation carried to an extreme. Minimalism meant small non-linear chunks readable in any order. It emphasized reading To Do, not reading To Know or To Learn, a phrase first introduced by Ginny Redish. It attacked the standard systems approach to learning of Gagne and Briggs, with its hierarchical decomposition of learning objectives, which remains to this day a standard in learning systems. And it emphasized handling errors when the user could not accomplish a task.
Minimalism included the earlier IBM task-based approach, and it limited instructions to the bare minimum needed to perform a single task, assuming that the user can find general information elsewhere, or very likely already knows it.
William Horton published Designing and Writing Online Documentation: Help Files to Hypertext in 1990. It contains clear references to many of the most important concepts in technical writing - task orientation and topic-based content, single sourcing and reuse, and conditional processing. Horton called for new names for tech writers - "document weavers" and "topic writers." Horton's topics had topic sentences, smooth transitions, and summaries. These are difficult to accomplish when online topics are written to be read in any order, and there is no beginning, middle, and end (p.216).
Also in 1990 Microsoft introduced WinHelp 1.0, developed by Ralph Walden, Cheryl Zubak, and others.
In 1991 Ed Weiss produced a book - How To Write Usable User Documentation - on structured and modular documentation that was itself an excellent example of structured and modular documentation, following closely the STOP methodology developed in the 1960's.
In 1991 Sun Microsystems introduced FrameBuilder, a version of FrameMaker with added support for SGML.
In 1991 Arbortext released Adept, their SGML editor, later to be known as Epic, and finally simply the Arbortext editor, when Arbortext was acquired by PTC.
In 1992 Blue Sky Software released RoboHelp, a task-oriented, topic-based, online Help Authoring Tool (HAT). Later they changed the company name to eHelp. eHelp was acquired by Macromedia, primarily to get the Flash-based tool RoboDemo (now Captivate). RoboHelp was discontinued, but after Macromedia was acquired by Adobe, RoboHelp became a strong part of Adobe's Technical Communication Suite, alongside FrameMaker.
In 1992 at Lotus a team of user assistance specialists started to design a single "Working Together" common help design for Lotus' office package SmartSuite and Lotus Notes. John Hunt (1-2-3), Janet Smith (Freelance Graphics), Bryan Steh (Word Pro), and Susanna Doyle (Notes) created a core design with six topic types: overview, context-sensitive, steps, details, examples, glossary, and reference. It used Notes for content management and WordPro templates for editing, with single-source/multiple output to a variety of delivery formats.
In 1992 the HyTime standard for Hypermedia and Time-based content (an application of the SGML architecture) identified problems with linking document types that were to inform the specialization mechanism in DITA.
In 1994 JoAnn Hackos published her landmark Managing Your Documentation Projects, revised and republished as Information Development: Managing Your Documentation Projects, Portfolio, and People by Wiley in 2006. Fully in tune with task orientation, Hackos's book described only three information types - concept, procedure, and reference (p.236). This seems to be a combination of Information Mapping's seven types, Ruth Clark's simplification to five types, and the three components of the Apple Macintosh Documentation Guidelines.
In the mid-1990s, Yuri Rubinsky's team at SoftQuad (creators of one of the first and most popular HTML editors, HoTMetaL) became involved in the development of a compromise markup language somewhere between the extraordinarily complex SGML and the popular new HTML (Hypertext Markup Language) for web pages. (HoTMetaL was the precursor to today's XMetaL from Justsystems.) HTML was a disaster from the point of view of structured reusable component documentation, not least because it combined presentation markup with structural markup. The new markup language was XML (eXtensible Markup Language), and SoftQuad developed one of the first XML editors, XMetaL.
In November 1995 John Carroll convened a workshop, sponsored by the Society for Technical Communication (STC), to evaluate Minimalism in the years since the Nurnberg Funnel. Carroll invited his major colleagues - R. John Brockmann, David Farkas, JoAnn Hackos, Hans van der Meij, Janice C. (Ginny) Redish, and others.
In 1995 Adobe acquired FrameMaker and FrameBuilder, which was to become FrameMaker + SGML, and eventually the more affordable Structured FrameMaker, now included with every copy of FrameMaker, though used by a small percentage of tech writers. Most writers continue to prefer unstructured documents.
In 1995 CNET Founders Halsey Minor and Jonathan Rosenberg built their own Web content management system and it introduced a number of today's core capabilities, like content reuse and personalization. Page templates assembled the content dynamically from a relational database. They sold the system to Vignette for a share in that new company. It was the first "content management system (CMS)."
In Toronto in 1996, an IBM documentation team including Michael Priestley, Bob Fraser, Dennis Bockus, and Jamie Roberts was developing a Help system for IBM's new line of VisualAge software. Roberts had just returned from graduate study at University of Waterloo and attended a brainstorming session to define some basic information topic types for the new Help. He was inspired by Lotus' online help in 1-2-3, which then had the reputation of being very good at user experience. Lotus Help included procedures (called "steps") and overview (concept). Roberts scribbled "concept, task, and reference" on a napkin, handed it to Bockus for implementation, and a new help document architecture was born. There is not much unusual about a Help system that is task-based and assembled from topics. What was new was that this was to become the simplified form of XML known as DITA, with very significant contributions from Lotus, which IBM had just acquired. And it was to be both web-based and delivered as a PDF.
After the release of Windows 95 and WinHelp 4, in 1996 Scott Boggan, David Farkas, and Joe Welinske wrote Developing Online Help for Windows 95. It had a strong task orientation and was topic-based. But "concept/overview" was only one of ten standard topics, which did not include "reference," but did wisely include errors and troubleshooting.
In 1996, IBM signed a long-term agreement to use the Arbortext Adept editor for internal SGML document creation.
In 1997 Microsoft released Compressed HTML Help (.CHM), based on compiled HTML, images, and Javascript.
In 1998, JoAnn Hackos and Ginny Redish published the definitive reference on task analysis, User and Task Analysis for Interface Design, and John Carroll published the edited proceedings of his 1995 workshop, Minimalism Beyond the Nurnberg Funnel, with major contributions by Hackos and Redish.
In 1998, IBM revised their 1980s PQTI tech writing guide, retitling it Developing Quality Technical Information. The team of writers was led by Gretchen Hargis and included only one member of the PQTI team, Polly Hughes. They now cited nine quality characteristics - accuracy, clarity, completeness, concreteness, organization, retrievability, style, task orientation, and visual effectiveness. Note that task orientation had slipped from first to eighth out of nine quality characteristics.
In March 2001, IBM introduced DITA as a series of developerWorks articles about a new simplified version of XML for documentation. It was intended to replace IBM's IBM ID Doc, an internal version of SGML for IBM's technical software support. While XML was enjoying great use as a data exchange method (RSS and SOAP protocols), it had little traction as a document markup language. DITA was an attempt to make a simplified XML starter set for documentation markup, one designed from the outset to encourage reuse of small content components. The key ideas were to be simpler than the complex SGML and also be usable online.
The goal of DITA was to formalize information typing practices, both print and online, and also enable an extensible typing architecture through specialization of base topics. DITA maps were a way to standardize collection publishing and information architecture/outlining models. DITA was initially known as MITA, for Mendel Information Typing Architecture, to emphasize the object orientation of the new architecture, with its "inheritance" and evolution of topic structures via specialization. Since MITA was already a somewhat proprietary acronym, IBM switched to Darwin and DITA.
In May 2002, IBM added domain specialization to topic specialization, and demonstrated these in the Open Toolkit, a reference implementation of DITA publishing, with a starter set of XSLT stylesheets. IBM encouraged authoring tool vendors to integrate the Open Toolkit as a means of publishing DITA, and most have done so.
JoAnn Hackos' Content Management for Dynamic Web Delivery was published in 2002. She described creating an information (content) model and developing information types, such as procedures, concepts, warnings, specifications, and others. Michael Priestley, the DITA specialization architect, and Dave Schell, then the principal DITA evangelist for IBM, wrote a 3-page vignette on DITA, perhaps its first mention in a book. They stressed topic orientation, information typing, specialization, inheritance, and two architectures, one for information and one for specialization. Hackos briefly mentions the AutoCAD Learning Assistance software she and learning guru Wayne Hodgins developed for Autodesk. It was cleverly dubbed CPR for its three-tabbed interface to concept, procedure, and reference.
In 2003, two books appeared on single sourcing and content reuse, Single sourcing: Building Modular Documentation, by Kurt Ament, and Managing Enterprise Content, by Ann Rockley.
In 2003, IBM published a second edition of Developing Quality Technical Information, by Gretchen Hargis and others. Now the nine quality characteristics were rearranged once more, putting task orientation first again. More importantly, they added an introductory chapter that called for content to be structured as separate information types, specifically task, concept, and reference. (Note the correct order of the three basic DITA information types.) This book is all about DITA without mentioning the name, probably because IBM was using DITA internally but not yet sharing it with the world when the book was drafted.
In April 2004, the Organization for the Advancement of Structured Information Standards (OASIS), formed a Technical Committee to explore a DITA Standard. The TC included XML tools vendors, consultants on Information Architecture and Content Management Systems (CMS), and end users of the DITA Document Type Definitions (DTD) and Schemas needed for the new DITA Standard.
In February 2005, IBM donated the Open Toolkit, a limited version of their internal Information Developers Workbench, to SourceForge. IBM continues to develop the OT, which is not a part of the OASIS DITA Standard efforts.
DITA 1.0 was approved as an OASIS Standard in June 2005.
DITA 1.1 was approved in August 2007, adding a new Bookmap specialization.
DITA 1.2 is expected sometime in 2008. It will add structured learning, creation of Learning Objects with DITA, which will be compatible with eLearning standards such as SCORM.
From Millwrights to Shipwrights to the Twenty-First Century: Explorations in a History of Technical Communication in the United States, by R. John Brockmann.
History of Outlining (and STOP).
Quick Reader Comprehension (1961).
Hughes STOP - Sequential Thematic Organization of Publications (1965).
IBM Improving usability of publications (1981). Task-orientation HTML version
Writing Better Computer User Documentation (1986)
Mapping Hypertext, Robert Horn, Lexington Institute (1989).
Designing and Writing Online Documentation: Help Files to Hypertext, by William Horton (1990).
The Nurnberg Funnel, by John M. Carroll (MIT Press, 1990).
How To Write Usable User Documentation, by Edmond Weiss, Oryx, (1991)
Managing Your Documentation Projects, by JoAnn Hackos (Wiley, 1994).
Developing Online Help for Windows 95, by Scott Boggan, David Farkas, and Joe Welinske, (Solutions, 1996).
Standards for Online Communication, by JoAnn Hackos (Wiley, 1997).
Robert Horn, Visual Language (1998).
User and Task Analysis for Interface Design, by JoAnn Hackos and Janice C. (Ginny) Redish (1998).
Minimalism Beyond the Nurnberg Funnel, by John Carroll (MIT Press, 1998).
Two approaches to modularity (1999). Robert Horn compares structured writing to Hughes STOP.
Review of the Nurnberg Funnel (1999) Robert Horn compares structured writing to Minimalism.
The Impact of Single Sourcing and Technology, Ann Rockley, 2001.
Cisco/Clark Reusable Learning Objects.
Content Management for Dynamic Web Delivery, by JoAnn Hackos (Wiley 2002).
Managing Enterprise Content, by Ann Rockley, New Riders, 2003.
Single sourcing: Building Modular Documentation, by Kurt Ament, Andrew Publishing, 2003.
Robert Horn, PowerPoint on Visual Language (2003).
Developing Quality Technical Information: A Handbook for Writers and Editors (2nd Edition), by Gretchen Hargis, Michelle Carey, Ann Kilty Hernandez, Polly Hughes, Deirdre Longo, Shannon Rouiller, Elizabeth Wilde (IBM Press, Information Management Series, 2004).
Information Development: Managing Your Documentation Projects, Portfolio, and People, by JoAnn Hackos (Wiley, 2006).
The purpose of the OASIS DITA Technical Committee (DITA TC) is to define and maintain the Darwin Information Typing Architecture (DITA) and to promote the use of the architecture for creating standard information types and domain-specific markup vocabularies.
Review the list of organizations that participate in the DITA TC.
Everyone is welcome to join the DITA TC. If you are employed by an existing OASIS member, you can go directly to the DITA TC web site and click on Join This TC. If your employer is not currently an OASIS member, you can find out how to become involved at Join OASIS.
The current work of the DITA TC is made visible through e-mail archives and a TC wiki. A consolidated zip file with all specifications, DTDs, and Schemas for DITA 1.1 is publicly available at dita1.1.zip.
The DITA TC home page also includes links to the public sites for the current subcommittees. All subcommittees operating under the DITA TC are listed on the OASIS DITA TC home page. More information is available about OASIS DITA TC specialization subcommittees.
Users may be interested in the mailing lists and additional information listed on the DITA TC web site.
This page contains information about the work of the OASIS DITA Learning and Training Content Specialization Subcommittee (at http://www.oasis-open.org).
See also:
Creating DITA specializations for better information design and interoperability throughout the semiconductor industry.
This subcommittee represents a community of interest within various semiconductor companies who believe that there is value in creating a DITA specialization for the industry. Not only will this enable better integration with the development of the OASIS DITA Standard, but will provide an opportunity for
NOTE: Version 1.3 was released in 2006. This information should no longer be considered up to date. Up to date information on the toolkit can be found here: The DITA Open Toolkit
The next major release (1.3) of the DITA Open Toolkit is being evaluated for scope and schedule. The project team has been tracking requirements coming from the OASIS DITA TC, the dita-users support forum, and among the dita-ot developer community discussions. The open requirements are listed below. The purpose of this document is to host separate design discussions for each of these items. Out of these discussions, the project team will assess the relative priorities and available resource (contributors and code, for example) that can be applied to the proposed schedule.
Tentative schedule:
Task name, Duration, Start, Finish
Coding, 1 month, start of July, around 06-8-10
Test execution, ~1 month, mid-July, around 06-8-22
Release prep, ~1.5 weeks, around 06-8-23, end of August
Release, , end of August, end of August
Don Day, DITA OT Team Lead
Design discussions for the proposed items (tentative list) are here:
1. Support for DITA 1.1/bookmap (stakeholders: OASIS DITA TC and popular community request)
2. Eclipse integration (stakeholders: community and dev team)
3. Incremental builds (stakeholders: advanced users, dita-ot developers)--Deferred to 1.4 to gather more case studies, understand requirements better.
4. GUI/usability (stakeholders: dita-user community, loudly heard at recent DITA conferences)--Revised to focus first on installability, defer GUI to related projects (much as editor vendors provide their own GUIs, in effect).
5. Fix topicmerge (stakeholders: dev team, judged to be strategic and a necessary base for the new bookmap support)
6. Fix indexing (stakeholders: known lack of full support for all users, languages)
7. Fully enable the XML catalog resolver (stakeholders: )
8. Ant refactoring (stakeholders: dev team, judged to be strategic for toolkit maintenance and future plugin development)
9. Document refactoring (stakeholders: end-users, other document authors)
If you have requests for other issues that you feel are critical for 1.3, please send them to Don Day (dond at us.ibm.com) so that I can compare them with other reports. The agenda for 1.3 is already very large for an aggressive schedule, therefore I would need to understand the value and justification of any new request for the larger community in order to assess it against these that have already been identified. Thanks!
Please note that actual work on creating or revising the DITA OASIS Standard or specification must take place within the OASIS DITA Technical Committee, which operates under the open OASIS Technical Process. This process, which governs issues such as transparency, contributions, licensing, participation, and disclosure, assures that OASIS Standards remain widely available and safe to implement, produced in an open, democratic, and accountable method.
While this page is a great place to get an enhancement idea started and to encourage informal collaboration, once the idea is ready to be advanced, the discussion should move to the formal OASIS Technical Committee Process. All those who wish to participate in standards development or more closely observe this work are strongly encouraged to join OASIS. A variety of membership levels are offered to assure that everyone affected by standards may contribute to their creation.
You can also provide feedback, such as feature requests for additions or changes to the DITA specification, through the OASIS DITA Comment Form. All comments received via that form are documented and reviewed by the OASIS DITA Committee members and publicly archived.
There are two main approaches to subject classification currently being explored for future use in DITA, which this page will call the map-based approach and the metadata approach.
In the map-based approach, a taxonomy is represented using a hierarchy in a DITA map. Each member of the hierarchy is a specialization of the <topicref> element. Each <topicref> element points to a topic that describes the subject of that node of the taxonomy.
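The map-based approach can be illustrated with a minimal Python sketch. It is not taken from any DITA specification or toolkit: the element name subjectref, the taxonomy/subjectref class token, the navtitle values, and the file names are all hypothetical. The sketch only assumes that a specialized element lists map/topicref in its class attribute, so generic code can walk the hierarchy without knowing the specialized name.

```python
import xml.etree.ElementTree as ET

# Hypothetical taxonomy map. <subjectref> is an assumed element name; its class
# attribute marks it as a specialization of <topicref>.
MAP_XML = """
<map>
  <subjectref class="- map/topicref taxonomy/subjectref " href="vehicles.dita" navtitle="Vehicles">
    <subjectref class="- map/topicref taxonomy/subjectref " href="cars.dita" navtitle="Cars"/>
    <subjectref class="- map/topicref taxonomy/subjectref " href="trucks.dita" navtitle="Trucks"/>
  </subjectref>
</map>
"""


def is_topicref(elem):
    """Return True for <topicref> or any element specialized from it."""
    return "map/topicref" in elem.get("class", "").split()


def walk_taxonomy(elem, depth=0):
    """Yield (depth, subject name, href of the describing topic) per node."""
    for child in elem:
        if is_topicref(child):
            yield depth, child.get("navtitle"), child.get("href")
            yield from walk_taxonomy(child, depth + 1)


taxonomy = list(walk_taxonomy(ET.fromstring(MAP_XML)))
for depth, name, href in taxonomy:
    # Indent each subject by its depth in the hierarchy.
    print("  " * depth + f"{name} -> {href}")
```

Each node carries both its position in the hierarchy and a pointer to the topic that describes the subject, which is the essence of the map-based approach.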
In the metadata approach, the <data> element (which is being introduced in DITA 1.1) is used to record properties. The property stated in a <data> element is considered to apply to:
When the <data> element is used within content, the property that it states is considered to apply to the directly enclosing content element. When the <data> element is used in metadata contexts, the property that it states is considered to apply to the nearest enclosing content element (such as <topic>).
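The scoping rule for the metadata approach might be sketched as follows in Python. This is an illustration under stated assumptions, not a description of how any DITA processor works: the set of metadata container names (prolog, metadata, topicmeta) is a simplifying assumption, and the sample topic, ids, and property names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical topic with one <data> in a metadata context and one in content.
TOPIC_XML = """
<topic id="install">
  <title>Installing</title>
  <prolog>
    <metadata>
      <data name="audience" value="administrator"/>
    </metadata>
  </prolog>
  <body>
    <p id="p1">Run the installer.<data name="platform" value="linux"/></p>
  </body>
</topic>
"""

# Assumption: these element names count as metadata contexts. A real processor
# would more likely inspect class attributes than element names.
METADATA_CONTAINERS = {"prolog", "metadata", "topicmeta"}


def data_targets(root):
    """Return (name, value, target tag, target id) for every <data> element."""
    # ElementTree has no parent pointers, so build a child-to-parent lookup.
    parents = {child: parent for parent in root.iter() for child in parent}
    results = []
    for data in root.iter("data"):
        target = parents[data]
        # In metadata contexts, climb to the nearest enclosing content element.
        while target.tag in METADATA_CONTAINERS:
            target = parents[target]
        results.append((data.get("name"), data.get("value"), target.tag, target.get("id")))
    return results


targets = data_targets(ET.fromstring(TOPIC_XML))
for name, value, tag, elem_id in targets:
    print(f"{name}={value} applies to <{tag} id='{elem_id}'>")
```

In this sketch the audience property resolves to the whole topic, while the platform property resolves only to the paragraph that directly encloses it.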
Broader topics:
Related topic: Taxonomy specialization plug-in, Introduction to Specialization
The DITA OASIS Standard builds content reuse into the authoring process, defining an XML architecture for designing, writing, managing, and publishing many kinds of information in print and on the Web.
The standard is advanced through an open process by the OASIS DITA Technical Committee, a group that encourages new participation from developers and users.
See Portuguese translation of this page.
The DITA OASIS Standard defines an XML architecture for designing, writing, managing, and publishing technical documentation in print and on the Web. DITA (commonly pronounced dit'-uh) builds content reuse into the authoring process for document creation and management.
Topic-Based Authoring
Focusing on a common topic model as a conceptual unit of authoring, DITA provides a core set of topic types derived from concept, task, and reference. DITA defines a specialization mechanism for extending markup to represent either new topic types or new domains of markup common across sets of topic types. DITA maps can combine topics into various kinds of deliverables. Content can be shared among maps or topics. Class-based processing ensures new specializations can be supported with existing tools, speeding the testing and adoption of new designs.
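A rough Python sketch of class-based processing follows. It assumes that every element carries a class attribute listing its ancestry from most general to most specific (in real DITA documents the DTD or schema normally supplies these values as defaults; here they are written out by hand). The faq element and its faq/faq class token are hypothetical, standing in for a new specialization that the processing code has never seen.

```python
import xml.etree.ElementTree as ET

# <faq> is a hypothetical specialization of <concept>, itself specialized from <topic>.
DOC = """
<dita>
  <concept id="c1" class="- topic/topic concept/concept "><title class="- topic/title ">What is a widget?</title></concept>
  <task id="t1" class="- topic/topic task/task "><title class="- topic/title ">Install the widget</title></task>
  <faq id="f1" class="- topic/topic concept/concept faq/faq "><title class="- topic/title ">Widget FAQ</title></faq>
</dita>
"""


def matches(elem, base):
    """True if elem is the base type or any specialization of it."""
    return base in elem.get("class", "").split()


def render_titles(root):
    """Apply one generic topic rule to every topic, specialized or not."""
    out = []
    for elem in root.iter():
        if matches(elem, "topic/topic"):
            title = next(child for child in elem if matches(child, "topic/title"))
            out.append(f"<h1>{title.text}</h1>")
    return out


root = ET.fromstring(DOC)
headings = render_titles(root)
print(headings)

# A rule written for concepts also picks up the unknown <faq> element.
concept_ids = [e.get("id") for e in root.iter() if matches(e, "concept/concept")]
print(concept_ids)
```

Because the rules test class tokens rather than element names, the hypothetical faq topic is rendered by the existing topic and concept rules with no new code, which is the point of class-based processing.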
With DITA, all content is inherently reusable. That's because DITA's strength lies in a unified content reuse mechanism that enables an element to replace itself with the content of a like element, either in the current topic or in a separate topic that shares the same content models.
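The reuse mechanism described here is the conref attribute. The following Python sketch shows the idea for references within a single file, using the "#topicid/elementid" form. It is deliberately simplified and is not how the DITA Open Toolkit implements conref: cross-file references are skipped, and "like element" is approximated by comparing element names, whereas a real processor compares content models. The sample topic and its ids are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical topic: the second <note> pulls its content from the first.
TOPIC_XML = """
<topic id="safety">
  <title>Safety</title>
  <body>
    <note id="unplug">Unplug the device before servicing.</note>
    <p>Before you begin:</p>
    <note conref="#safety/unplug"/>
  </body>
</topic>
"""


def resolve_local_conrefs(root):
    """Replace each same-file conref element's content with a copy of its target's."""
    for elem in list(root.iter()):
        ref = elem.get("conref")
        if not ref or not ref.startswith("#"):
            continue  # Simplification: references into other files are ignored.
        topic_id, _, element_id = ref[1:].partition("/")
        topics = [t for t in root.iter() if t.get("id") == topic_id]
        target = next(
            (e for t in topics for e in t.iter() if e.get("id") == element_id),
            None,
        )
        # Assumption: "like element" is approximated as "same element name".
        if target is None or target.tag != elem.tag:
            continue
        elem.text = target.text
        elem[:] = [copy.deepcopy(child) for child in target]
        del elem.attrib["conref"]
    return root


resolved = resolve_local_conrefs(ET.fromstring(TOPIC_XML))
notes = resolved.findall("./body/note")
print(notes[1].text)
```

The referencing element keeps its own position in the topic and receives only the content of the target, so a single source note can appear in many places.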
Supporting Multiple Deliverables and Publishing Channels
DITA enables organizations to deliver content as closely as possible to the point-of-use, making it ideal for applications such as integrated help systems, web sites, and how-to instruction pages. DITA's topic-oriented content can be used to exploit new features or delivery channels as they become available.
DITA enables highly automatable processes, consistent authoring, and enhanced applicability to specific industries. Content owners benefit from industry support, interoperability, and reuse of community contributions. At the same time, content owners address the specific needs of their business or industry.
Benefits
DITA can be used to...
Get Involved
DITA is advanced by the OASIS DITA Technical Committee. Its members include representatives of:
...and other XML tools vendors, consultants on Information Architecture and Content Management Systems (CMS), and users.
Participation remains open to all organizations and individuals. A wide variety of membership levels and rates are offered to ensure all those who are affected by DITA have a voice in its development. See Join OASIS or contact member.services@oasis-open.org for details.
For some, perhaps the real question is Why XML? (or What is XML?), but assuming you have answered those questions (and are using XML), then the next step is to locate an appropriate data model for your content. This is an important step because you will spend a lot of time and money developing processes and selecting tools to support your chosen data model. XML, by definition, is extensible and allows you to create any valid structure that suits your needs, but before you decide to develop your own, consider the pre-existing options (see Don't Invent XML Languages for a discussion on why not to develop your own). If you can leverage and build on top of someone else's work, why not?
DITA is a data model for authoring and publishing topic-based content. It was developed by IBM for internal use and has since been released to the open-source community (now under the guidance of OASIS). This architecture and data model were designed by a cross-company workgroup representing user assistance teams working throughout IBM. After an initial investigation in late 1999, the workgroup developed the architecture collaboratively during 2000 through postings to a database and weekly teleconferences. Since that time IBM has migrated thousands of pages of content to DITA.
But, why DITA?
Well, assuming your content fits into the topic-based data model, DITA's increasing popularity means that more and more authoring and publishing tools will be developed to support that model. The DITA Open Toolkit allows you to generate many popular output formats (HTML, HTML Help, PDF, Java Help, etc.) from DITA-based content. If you develop your own data model, you'll have to pay to develop those transformations. DITA's modular architecture supports efficient reuse of content at the word, phrase, or topic level. DITA also has the concept of "specialization," which allows you to develop elements of your own that are based on core DITA elements. This helps you to customize DITA to support your particular types of content while continuing to take advantage of the base DITA tools and transformations.
Learn more
The following articles provide additional information:
Topic-based authoring has been a mainstay of technical information development since we first began developing help systems. We learned quickly enough that we couldn't split our existing books into help topics by making every heading level a new help page. Information originally designed with a narrative flow no longer made sense nor assisted users in finding exactly the content they needed. We had to rethink the type of information that our help systems should include and create a new set of standards for its development. The result is topic-based authoring.
Authoring in topics provides information developers with a way to create distinct modules of information that can stand alone for users. Each topic answers one question: "How do I ...?" "What is ...?" "What went wrong?" Each topic has a title to name its purpose and contains enough content for someone to begin and complete a task, grasp a basic concept, or look up critical reference information. Each topic has a carefully defined set of the basic content units that are required and accommodates other optional content. As information developers learn to author in topics and follow sound authoring guidelines consistently, we gain the ability to offer information written by many different experts that looks and feels the same to the users.
Not only has topic-based authoring become the norm for well-designed help systems, information architects have learned that formulating consistently structured topics facilitates readability and information access in traditional, more linear book structures. Readers are able to identify task-based topics within sections and chapters because the tasks look the same and contain the same essential content units. Readers learn that conceptual and background information is always located in the same position in the table of contents with respect to the tasks. Readers come to depend upon standard reference sections that contain similarly structured details for ease of lookup.
The core information types in DITA support the structures that underlie most well-designed technical information. Any organization that follows best practices in information architecture will find the core DITA structure a good fit. But they also challenge us to become even more disciplined in structuring information according to a set of carefully defined business rules. The benefit of such disciplined information structuring is the consistent presentation of information that helps you build reader confidence and simplify the reader's task of knowing how to navigate and use your information.
Benefits of topic-based authoring
Authoring in structured topics provides you with a sophisticated and powerful way to deliver information to your user community. You will find benefits that decrease your development costs and time to market, as well as provide increased value to your customers:
Structured topics contain only the information needed to understand one concept, perform one procedure, or look up one set of reference information.
Structured, topic-based authoring promotes consistency in the presentation of similar information.
Topics can be reviewed by subject-matter experts as soon as they are ready. They need be reviewed only once, even if they appear in multiple deliverables, reducing the burden on reviewers.
Topics can be translated before entire volumes are complete, reducing the time to market for global customers. Topics in multiple languages can be combined into language-specific deliverables without extra desktop publishing time and expense.
Assembling topics into multiple deliverables can be automated, reducing production time and costs.
Consistently structured topics are easier to reuse in multiple deliverables.
Structured topics may be combined in new ways to meet changes in product solutions, work structures, geographies, industries, or other customer configurations.
Topics are easier to update immediately instead of waiting for the next release of an entire library of documents.
Consistently structured topics help users build a firm mental model of the types of information you are presenting.
Consistently structured topics help users navigate more quickly to the information they need.
If one of your business goals is to use information topics in multiple deliverables, you need to build a repository of topics that are clearly defined according to your standard set of information types. Your repository is also characterized by the metadata attributes you associate with your topics.
DITA provides you with such a standard as a starting point. DITA gives you the capability to expand upon its core information types when you need to accommodate the special needs of your customers and your information.
Defining information types for your topics
If your information is like most in the technical information industry, you have a great diversity of structures in your information, especially if those topics are embedded in the threaded, narrative sections and chapters of books. Your first job is to inventory your content to identify its range and diversity.
In most cases, you will find lots of tasks, containing step-by-step instructions for reaching a specific goal. The dominance of the task in technical information is why DITA includes the task as one of the three core information types.
Accompanying tasks, you are likely to find background, description, and conceptual information that explains what something is. DITA labels such supporting information "concepts". You will also find tables, lists, diagrams, process flows, and other information that can be labeled as "reference," the information that no one wants to memorize but must be easy to look up.
Once you have completed your content inventory, you need to carefully analyze the three core information types provided with DITA. The standard structure for task, concept, and reference is presented in the DITA specification. Experiment with accommodating your content to the standard structure. In most cases, your content easily fits into a standard DITA structure.
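As a rough illustration of what the three core information types look like in practice, the following sketch parses one minimal, hand-written sample of each with Python's standard library. The ids, titles, and text are invented for illustration, and the samples show only a small subset of the elements that the DITA specification allows in each type.

```python
import xml.etree.ElementTree as ET

# Minimal, hand-written samples of the three core topic types.
# All ids, titles, and text are invented for illustration only.
TOPICS = {
    "task": """
<task id="replace_filter">
  <title>Replacing the filter</title>
  <taskbody>
    <steps>
      <step><cmd>Open the cover.</cmd></step>
      <step><cmd>Swap the filter.</cmd></step>
    </steps>
  </taskbody>
</task>""",
    "concept": """
<concept id="about_filters">
  <title>About filters</title>
  <conbody><p>A filter traps dust before it reaches the motor.</p></conbody>
</concept>""",
    "reference": """
<reference id="filter_specs">
  <title>Filter specifications</title>
  <refbody>
    <properties>
      <property><proptype>Size</proptype><propvalue>20 cm</propvalue></property>
    </properties>
  </refbody>
</reference>""",
}


def summarize(xml_text):
    """Return (root element name, body element name, title) for one sample."""
    root = ET.fromstring(xml_text.strip())
    body = root[1]  # in these samples the body is the second child, after <title>
    return root.tag, body.tag, root.findtext("title")


for name, text in TOPICS.items():
    print(name, summarize(text))
```

Each type pairs its own root element with its own body element (taskbody, conbody, refbody), which is what lets authors and tools tell the information types apart.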
Where you may encounter difficulties is with the diversity of your own content rather than with the DITA information types. Some of the content in your inventory will not even meet your own guidelines. Often, that content was written by people long gone from your organization or was influenced by subject-matter experts who wanted it their way rather than following your authoring guidelines.
Our recommendation is to focus on the essential underlying structure of your content rather than the idiosyncrasies and accidents of individual writers over the years. If you find an odd structure in a task, for example, ask if that structure is the best way of conveying the information to the user or if the task can be rewritten following the structure of a standard DITA information type. Most of the time, you will find that the standard is the best solution.
One of the more common problems you will find with some of the content you examine is mixed structure. Tasks start out with long discussions of background information. Concepts end up including step-by-step procedures. Tables of reference material end up with concepts in the footnotes or tasks incorporated into table cells.
Although mixed information types are possible in DITA, we don't recommend using them. Consider that by separating information carefully and rigorously into the neat, consistent information-type buckets provided, you will have information that you can present much more dynamically and flexibly to users. If a user wants to know the steps of a task, they won't have to skip over background that they don't want to think about yet. You can refer them to that conceptual and background information through a related-topic reference or a hypertext link rather than embed lengthy conceptual information in the task.
By chunking your information according to well-defined information types rather than combining types randomly, you gain flexibility in distributing your information to the people who need it most. You also make the relationships between chunks of information more obvious. If you believe that users will profit from reading background information before performing a task, you can use related-topic links to make sure they know about the relationship and why reviewing the concept or background is advantageous.
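The related-topic approach described above can be sketched as a task that keeps only its steps and points to the background concept through a related link. The file name about_filters.dita and all ids and text are hypothetical; the Python helper simply lists the links it finds.

```python
import xml.etree.ElementTree as ET

# A task that links to a separate concept instead of embedding background text.
# The href, ids, and wording are invented for illustration.
TASK = """
<task id="replace_filter">
  <title>Replacing the filter</title>
  <taskbody>
    <steps><step><cmd>Open the cover.</cmd></step></steps>
  </taskbody>
  <related-links>
    <link href="about_filters.dita" type="concept">
      <linktext>About filters</linktext>
    </link>
  </related-links>
</task>"""


def related_links(xml_text):
    """Return (href, type, link text) for every related link in the topic."""
    root = ET.fromstring(xml_text.strip())
    return [
        (link.get("href"), link.get("type"), link.findtext("linktext"))
        for link in root.findall("related-links/link")
    ]


print(related_links(TASK))
```

Because the concept lives in its own topic, it can be linked from any number of tasks without being copied into them.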
Adding new information types
Although we find that most technical information fits neatly into the standard DITA information types, we recognize that you may discover that you have special information types that cannot be accommodated by the standard content units or that you want to label those content units with more descriptive XML tag names. At that point, you need to pursue specialization.
Consider an example in the semiconductor industry. A great deal of detailed information about a chip design is contained in an information type called a register description. Although a register description falls into the class of reference information types, it has some very specific and detailed content. By specializing the standard reference information type, you can build a register description specialization that standardizes the content with appropriate XML element names, assisting the writers and providing additional metadata to facilitate searches. Many similar opportunities for specialization may present themselves in your content. But be careful to exhaust the possibilities of the standard information types before pursuing the differences.
The more differences you present to writers and readers, the more opportunities there are for confusion. With too many choices of information types, an information developer is more likely to choose incorrectly. With too many subtle differences in the presentation of information, your users are more likely to become confused when they are unable to find the standard set of content that they have come to expect.
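To make the register description example a little more concrete: a specialized DITA element records its ancestry in its class attribute, so a processor that does not know the new type can fall back to the nearest ancestor it does know. The sketch below assumes a hypothetical specialization named regdesc derived from reference; the element name and the helper functions are illustrative and are not part of any published specialization or toolkit.

```python
# Class attribute value as it might appear on the root element of a
# hypothetical "regdesc" topic specialized from reference, which is itself
# specialized from topic. Ancestors are listed from most general to most specific.
REGDESC_CLASS = "- topic/topic reference/reference regdesc/regdesc "


def ancestry(class_attr):
    """Split a class attribute into its module/element tokens, general to specific."""
    tokens = class_attr.split()
    return tokens[1:]  # drop the leading "-" (structural) or "+" (domain) marker


def fallback(class_attr, known):
    """Return the most specific ancestor that a processor knows how to handle."""
    for token in reversed(ancestry(class_attr)):
        if token in known:
            return token
    return None


# A processor that only knows the standard types treats regdesc as a reference.
print(fallback(REGDESC_CLASS, {"topic/topic", "reference/reference"}))
```

This fallback is why specialized content can still be processed with existing rules for the standard information types, and also why each new type should be justified by a real difference in content.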
Introduction to DITA: A Basic User Guide to the Darwin Information Typing Architecture, by Comtech Services, Inc.
"Structuring your Documents for Maximum Reuse," Janice (Ginny) Redish, Best Practices, June 2000. [Best Practices is the bimonthly newsletter of the Center for Information-Development Management (CIDM)]
Ginny Redish outlines a step-by-step procedure for creating structured documents. Even if you aren't yet considering single sourcing, you'll find that structuring documents is an extremely useful, time-saving technique. It works in traditional publishing and is useful for individual writers in any situation where they have to create the same type of document many times. It is essential for teams of writers who are contributing parts to a large document or to a set of documents.
The books listed here contain information relevant to topic-based authoring:
Sissi Closs: Single Source Publishing. Topicorientierte Strukturierung und DITA, Entwickler-Press, 2006
This book describes the history of single source publishing and explains the relevant concepts, with a focus on topic-oriented structures. Sissi Closs has developed the class concept method, with which adequate topic and link types can be developed systematically for any kind of content. The book describes the class concept method in detail and also contains a DITA short reference.
Jonathan and Lisa Price, Hot Text, New Riders Press, 2002
Hot Text focuses on good writing practices, including topic-based authoring, and applies these to web-based deliverables. It includes XML authoring that is directly applicable to implementing the DITA model.
Robert E. Horn, Mapping Hypertext: The Analysis, Organization, and Display of Knowledge for the Next Generation of On-Line Text and Graphics, Lexington Institute, 1990
Robert Horn is the developer of Information Mapping(tm). Although this book focuses on online information, it is one of the few publicly available discussions of the topic-based principles of information mapping. In the book, Horn explains how to chunk, organize, and sequence content.
JoAnn Hackos and Dawn Stevens, Standards for Online Communication, Wiley, 1997
Hackos and Stevens focus on topic-based authoring in the context of online information systems. They include both help and web design in the examples. However, the topic-based authoring principles are central to the writing methods detailed in the book. The authors demonstrate how topic-based authoring differs significantly from book-based authoring.
Kurt Ament, Single Sourcing: Building Modular Documentation, Noyes Publications, 2002
Ament explains in plain language and by example how to develop single source documents. He shows technical writers how to develop standalone information modules, then map these modules to a variety of audiences and formats using proven information mapping techniques.
Gretchen Hargis et al., Developing Quality Technical Information: A Handbook for Writers and Editors, IBM Press, 2nd edition, 2004
Many books about technical writing tell you how to develop different parts of technical information, such as headings, lists, tables, and indexes. Instead, we organized this book to tell you how to apply quality characteristics that, in our experience, make technical information easy to use, easy to understand, and easy to find.
The DITA Architecture area contains references to specifications and articles that explain the architectural underpinnings of the DITA language.
This introduction to the DITA Architecture contains overall technical information about the DITA language and its architecture.
In the DITA architecture, DITA topics are organized by DITA maps. It is also possible to nest sub-topics within a topic. Specialization enables the creation of specialized topics and other units of content that are tailored to particular structural requirements. Content referencing (conref) enables fragments of content to be reused from a single source. Conditional processing permits a single source to support the needs of multiple audiences.
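As a simplified sketch of conditional processing, the example below removes paragraphs whose audience attribute lists only excluded values, so one source topic can serve two audiences. The topic content and the filter_audience helper are invented for illustration; real DITA processors apply filtering through their own configuration rather than through a function like this.

```python
import xml.etree.ElementTree as ET

# One source topic with audience-specific paragraphs (invented content).
TOPIC = """
<concept id="install_overview">
  <title>Installation overview</title>
  <conbody>
    <p>Run the installer.</p>
    <p audience="administrator">Configure the license server first.</p>
    <p audience="novice">The defaults are fine for most users.</p>
  </conbody>
</concept>"""


def filter_audience(xml_text, exclude):
    """Drop elements whose audience values are all in the excluded set."""
    root = ET.fromstring(xml_text.strip())
    # Snapshot the elements first so removals do not disturb the traversal.
    for parent in list(root.iter()):
        for child in list(parent):
            values = child.get("audience", "").split()
            if values and all(value in exclude for value in values):
                parent.remove(child)
    return root


admin_view = filter_audience(TOPIC, exclude={"novice"})
print([p.text for p in admin_view.iter("p")])
```

Elements without an audience attribute are kept for every audience, which is what allows the shared content to be written only once.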
The current version of DITA is specified in the OASIS Darwin Information Typing Architecture (DITA) Standard v1.0. The OASIS DITA Technical Committee is currently working on version 1.1.
The OASIS site http://www.oasis-open.org/specs/index.php#ditav1.0 contains the official versions of the following documents:
The following article provides additional information:
Introduction to the Darwin Information Typing Architecture - IBM developerWorks, 01 Mar 2001, Updated 28 Sep 2005
This document is a roadmap for DITA: what it is and how it applies to technical documentation.
Editors for the Architecture area:
Technical work produced by the OASIS DITA Technical Committee includes:
The compressed file dita-document-definitions-1.0.1.zip contains version 1.0.1 of the DITA DTDs and Schemas. Version 1.0.1 contains all of the fixes described in this document: Fixes for OASIS DITA 1.0. It also contains a change document 101changes.txt (one for the DTDs, and another for the Schemas). These files list all actual bug changes, including the changed catalog for DTDs. They also contain a pointer to the web page above for a full description of the changes.
A consolidated zip file with all specifications, DTDs, and Schemas is publicly available in the documents section as dita10.zip.
This document is revision #4 of cd1.zip. The document details page shows the complete revision history.
The first committee draft (before revision), cd1.zip is still available in the Documents section.
What's new in DITA 1.1:
Version 1.1 of DITA provides enhanced print publishing capabilities with the new DITA Bookmap specialization, including extended book metadata. The standard offers more indexing capabilities with new elements for "see" and "see-also" references. It features new elements for defining structured metadata as well as the ability to add new metadata attributes through specialization.
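The "see" and "see-also" index references mentioned above are expressed with the index-see and index-see-also elements nested inside an indexterm. The following sketch uses an invented topic to show the markup and a small, hypothetical helper that lists the resulting index entries.

```python
import xml.etree.ElementTree as ET

# Invented topic showing DITA 1.1 "see" and "see-also" index references.
TOPIC = """
<concept id="filters">
  <title>Filters</title>
  <prolog>
    <metadata>
      <keywords>
        <indexterm>strainer<index-see>filter</index-see></indexterm>
        <indexterm>filter<index-see-also>maintenance</index-see-also></indexterm>
      </keywords>
    </metadata>
  </prolog>
  <conbody><p>A filter traps dust before it reaches the motor.</p></conbody>
</concept>"""


def index_entries(xml_text):
    """Return each index term with its see and see-also targets."""
    root = ET.fromstring(xml_text.strip())
    entries = []
    for term in root.iter("indexterm"):
        entries.append({
            "term": (term.text or "").strip(),
            "see": [e.text for e in term.findall("index-see")],
            "see_also": [e.text for e in term.findall("index-see-also")],
        })
    return entries


for entry in index_entries(TOPIC):
    print(entry)
```

A "see" reference redirects the reader from a term that is not used to the preferred term, while a "see-also" reference points from a used term to related entries.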
The specification files (HTML and PDF Spec documents, DITA source, DTDs and Schemas) are available here:
http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=dita#technical
The Overview is a great entry point into the various parts of the DITA 1.1 standard:
http://docs.oasis-open.org/dita/v1.1/OS/overview/overview.html
Previous versions of the DITA specification