
Last updated: 09/09/2010
Document Transcription and Markup
4.1 The Papers of Abraham Lincoln (PAL) will run every Lincoln document through three stages of transcription and markup.
STAGE 1 TRANSCRIPTION
4.2 Stage 1 transcription encompasses basic structural transcription and markup. Transcribers will use the document image to transcribe the source text and use markup to best replicate the appearance of the source text.
Stage 1 Elements/Tags
4.3 Stage 1 elements/tags are derived from the PAL DTD and represent only a selection of elements, not the entire range of elements available in the DTD. There are three groups of Stage 1 elements: header elements, structural elements, and editorial elements.
Header Elements/Tags
4.4 Header elements/tags allow the transcriber to enter metadata regarding the document. By Stage 1 transcription, many of these elements will already be populated with data. (To view this structure, open X-Metal and select "New Document," tags on view.)
<PAL></PAL>: This element encloses all <doc>s.
<doc></doc>: This element is the main container of every document. It has two main sections, <Header></Header> and <div1></div1>. Make sure the value in the attribute "docDesc" matches the wording of the document name in <docTitle>. Also check the attribute "docClass" to see if it has the proper value (e.g. "letter").
attribute "docDesc"= e.g. "Abraham Lincoln to Mary Lincoln"
attribute "docClass"= e.g. "letter"
<Header></Header>: This element is the first of two main containers appearing within <doc> (<div1></div1> being the other one). <Header></Header> contains all editorial data, including: provenance, source links, image links, workflow data, and all other non-source text data and markup. See: <AccessionInfo></AccessionInfo>, <PublicationInfo></PublicationInfo>, <Workflow></Workflow>, and <docDateInfo></docDateInfo>.
<AccessionInfo></AccessionInfo>: In <Header></Header>, this element encloses <bibl></bibl>. It is used to tag information regarding the location of original documents.
<bibl></bibl>: A child of <Header></Header>, in <AccessionInfo></AccessionInfo> it encloses specific bibliographic information about the document. See <biblScope></biblScope>, <box></box>, <Extent></Extent>, <MSType></MSType>, <link></link>, <pages></pages>, <title></title>, and <vol></vol>.
<MSType></MSType>: Contained in <AccessionInfo></AccessionInfo> <bibl></bibl>, this element encloses manuscript type, e.g. "Autograph Letter Signed." See MANUSCRIPT SYMBOLS.
<Extent></Extent>: Contained in <AccessionInfo></AccessionInfo> <bibl></bibl>, this element encloses a number which indicates the number of pages in the source text. It should always match the number of images for the source text.
<box></box>: In <AccessionInfo></AccessionInfo> <bibl></bibl>, this element encloses a box number or identification that is part of the location of an original document.
<vol></vol>: In <AccessionInfo></AccessionInfo> <bibl></bibl>, this element encloses the volume number or identifier of the document source.
<pages></pages>: In <AccessionInfo></AccessionInfo> <bibl></bibl>, this element encloses the page number(s) of the source.
<Link></Link>: This element allows the editor to link his or her document to a number of other records, both within and outside the document title. For instance, <link> to document images and sources by clicking the "Copy a link" hyperlink in the desired image or source record and pasting the pre-formed <link></link> element in the host document's <bibl></bibl>. Likewise, <link> two documents by using <link></link> within the <RelatedDocuments></RelatedDocuments> element. (See Related Documents, 4.13.)
</bibl>: This tag closes <bibl>
</AccessionInfo>: This tag closes <AccessionInfo>
<PublicationInfo></PublicationInfo>: In <Header></Header>, this element encloses <bibl></bibl>. The bibliographic information contained here describes where the document had previously been published (e.g. <bibl>The Collected Works of Abraham Lincoln <biblScope>1:26</biblScope> </bibl>).
<bibl>
<biblScope></biblScope>: Typically appearing inside <PublicationInfo></PublicationInfo> <bibl></bibl>, this element encloses the volume and page numbers of a published source.
<title></title>: When used in <Header></Header> (usually in <PublicationInfo></PublicationInfo> <bibl></bibl>), this element encloses the title of a publication in a citation. The attribute "level" assigns a specific kind of publication and will display the title in the appropriate citation style.
</bibl>
</PublicationInfo>: This closes <PublicationInfo>.
<Workflow></Workflow>: In <Header></Header>, this element encloses one or more <WorkEntry></WorkEntry> tags. <Workflow></Workflow> includes details describing the progress made on the document as it goes through the editing process.
<WorkEntry></WorkEntry>: In <Header></Header>, this child element of <Workflow></Workflow> is set with attributes that contain specific information regarding the progress of work on the document. The attribute "resp" contains the capitalized initials of the person responsible for this stage. The attribute "contribRole" allows for the choice of a number of work roles for the person responsible for this stage (e.g. "transcriber"). The attribute "stage" allows for the choice of a number of work stages (e.g. "Stage 1 Transcription"). <WorkEntry></WorkEntry> can contain a <comment> for inclusion of details regarding any special circumstances that arose during that stage of work.
attribute "resp"= "ABC" (initials of transcriber)
attribute "contribRole"= "transcriber"
attribute "stage" = "Stage1 Transcription"
<comment>
</WorkEntry>: This closes <WorkEntry>
</Workflow>: This closes <Workflow>
<docDateInfo/>: This element is used to denote the date associated with a source text. This element always appears in <Header></Header>, but will also appear in substantive <endorsement>s (that have their own date) in <div1></div1>.
<docTitle></docTitle>: This element always appears in <Header></Header> and encloses the editorial text indicating the document name (e.g. "Abraham Lincoln to Mary Todd"). The wording should match that appearing in the docDesc attribute of <doc></doc>. See DOCUMENT NAMING GUIDELINES.
</docTitle>: This closes <docTitle>
<RelatedDocuments></RelatedDocuments>: In <Header></Header>, this element encloses <link>s to other documents that may have some connection or relevance to the document. (See Related Documents, 4.13.)
</RelatedDocuments>: This closes <RelatedDocuments>.
</Header>: This closes <Header>
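The header elements described above nest roughly as shown below. This is a minimal sketch assembled from the descriptions in 4.4, not a record copied from the PAL system: the order of the elements, the date, and the <Extent> and <box> values are illustrative assumptions, and the capitalization of element names should be confirmed against the PAL DTD.

```xml
<doc docDesc="Abraham Lincoln to Mary Lincoln" docClass="letter">
  <Header>
    <AccessionInfo>
      <bibl>
        <MSType>Autograph Letter Signed</MSType>
        <Extent>2</Extent>   <!-- assumed: two pages, so two images -->
        <box>1</box>         <!-- assumed box number -->
      </bibl>
    </AccessionInfo>
    <PublicationInfo>
      <bibl>The Collected Works of Abraham Lincoln <biblScope>1:26</biblScope></bibl>
    </PublicationInfo>
    <Workflow>
      <WorkEntry resp="ABC" contribRole="transcriber" stage="Stage1 Transcription"></WorkEntry>
    </Workflow>
    <docDateInfo value="1848-04-16"/>   <!-- placeholder date -->
    <docTitle>Abraham Lincoln to Mary Lincoln</docTitle>   <!-- matches docDesc -->
    <RelatedDocuments></RelatedDocuments>
  </Header>
  <div1>
    <div2></div2>
  </div1>
</doc>
```

Note that the wording of <docTitle> and the "docDesc" attribute is identical, and that <Extent> agrees with the number of images.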
Structural Elements/Tags
4.5 Structural elements/tags allow the transcriber to replicate the basic layout of the source text page.
<div1></div1>: This element appears after <Header></Header> and it encloses all <div2></div2> elements in a document.
<div2></div2>: This element appears within <div1></div1> and encloses all textual material in a document. Endorsements and other additional passages of source texts following the main body of the document should be placed in distinct <div2></div2> elements.
<LetterHead></LetterHead>: This element encloses all printed opening information at the top of a letter exclusive of dateline. Formatting icons on the toolbar or the "rend" attribute allow the positioning of <LetterHead></LetterHead>.
<dateLine></dateLine>: This element encloses <place></place> and <date></date>, and the corresponding source text indicating the location at which, and the date on which, the letter was written. Although it can appear in other places, this tag typically inhabits the upper right-hand corner of a letter. Formatting icons on the toolbar or the attribute "rend" allow the placement of dateline in various locations.
attribute "rend" =(default right) allows placement of dateline in various locations.
attribute "value" = yyyy-mm-dd
<row>: Encloses one horizontal row in a table and is populated by <cell>s.
<space/>: This singleton element is used to fix a set amount of blank space in a document.
<list></list>: This element encloses a list of items in source text, such as a single-column list of signers. Individual items on the list are enclosed in <item></item>. Formatting icons on the toolbar or the attribute "rend" allow positioning of <list></list>.
attribute "rend"= allows positioning of list
attribute "rend"= allows positioning of individual item
attribute "rend"= allows positioning of closing
<ps></ps>: This element encloses the author's postscript or N. B.
attribute "n"= number of document page following page break
<br/>: This singleton element indicates the location of a line break.
attribute "rend" = allows placement of postmark
<address> encloses the addressee's name and/or title, and location
attribute "rend" = allows placement of address
<addressLine> encloses a single line of a multi-line address
</div2>: This closes <div2>
</div1>: This closes <div1>
</doc>: This closes <doc>
</PAL>: This closes all <doc>s
Editorial Elements/Tags
4.6 Editorial elements/tags allow the transcriber to replicate the appearance of individual words from the source text. (Unless otherwise indicated, all of these elements apply to source text rendered in the elements listed above.)
<print></print>: This element encloses printed text in a partially printed document. Not used in <LetterHead></LetterHead> or in all-print documents.
<r></r>: The rend element allows enclosed text to be rendered or displayed in a variety of ways.
Render text in one of the following values:
attribute "as"= superscript, italic, bold, bold italic, subscript, underscore, double underscore, smallcaps, allcaps, superscript-underscore.
Note: All variants of underscored superscripts will be standardized to the single superscript-underscore. (7/6/2009)
Note: With the exception of superscript-underscore, all the functions of <r></r> can be performed using the formatting icons on the toolbar.
<add></add>: This element encloses authorial or editorial additions (insertions) to source text.
<del></del>: This element encloses deletions or strikeouts in source text. Used with <unclear></unclear> or <gap/> when deleted source text is unreadable.
<endnote></endnote>: In <doc></doc>, this element encloses the text of an editorial note (e.g. "Mary Todd" written over "Ann Rutledge"). Used with <ref></ref>.
<ref></ref>: Used in <div1></div1> to indicate the placement of an <endnote></endnote> reference point in the text of the document.
<unclear></unclear>: This element encloses source text that is hard to read or obscured. Used in combination with <gap/> for illegible source text as in: <unclear><gap/></unclear>. Used with <del></del> when deleted source text is unreadable.
<gap/>: When used alone, this singleton element indicates a substantial amount of missing source text, such as when an entire page is missing from a multi-page document. More often, however, <gap/> will either be used in combination with <unclear></unclear> or in conjunction with both <del></del> and <unclear></unclear>.
<frac></frac>: This element encloses a numerical fraction in source text. It automatically comes with the child elements <num></num>, to enclose the numerator, and <denom></denom> to enclose the denominator.
<num></num>: A child element of <frac></frac>, this element encloses the numerator in a numerical fraction in source text.
<denom></denom>: A child element of <frac></frac>, this element encloses the denominator in a numerical fraction in source text.
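As an illustration of how several editorial elements might combine in running text, consider the sketch below. The sentence is invented for the example, and the enclosing <p></p> is assumed to sit inside a <div2></div2>.

```xml
<p>
  M<r as="superscript-underscore">r</r> Stanton will pay
  <del>two</del> <add>three</add> dollars and
  <frac><num>1</num><denom>2</denom></frac>
  for the <unclear>horse</unclear>.
</p>
```

Here <del></del> and <add></add> record a correction made in the source, <frac></frac> carries its numerator and denominator as child elements, and <unclear></unclear> encloses a whole word rather than individual letters.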
Stage 1 Transcription Conventions
Editorial Policy Manual: Selective Book Edition
4.7 Except where specified, editors will continue to follow the transcription conventions developed for the documents in the Selected Edition. (10/24/05)
Abbreviations (11/05/07)
4.8 The editorial staff will standardize spaces between abbreviated elements:
Do not use spaces between abbreviated elements of geographic terms:
U.S., N.Y.
Do not use a space within abbreviations of time of day:
a.m., p.m.
Do not use a space with academic degrees:
M.D., LL.D.
Use a space following initials standing for personal names, whether there is a period or not in the original document:
E. M. Stanton, A. Lincoln
G Welles, S P Chase
Use a space following abbreviated elements of a title or rank:
Lt. Col.
Adj. Gen.
Asst. Postmaster
P. M.
Punctuation
4.9 Ending punctuation is expressed in a variety of ways. Some of them are:
end ###
end ___
end –
end. __
end (smudged mark where a period should be located)
Use supplied commas to break up lists. (06/19/08)
All variants of underscored superscripts will be standardized to the single superscript-underscore. The attribute "rend" allows words or letters to be underscored superscripted.
[Illustration omitted: variant renderings of underscored superscripts, as in "Mr"]
(7/6/09)
Spelling and Capitalization
4.10 In Stage 1, transcribers follow a general policy of diplomatic transcription. Do not correct spelling or capitalization; where capitalization of words is questionable, employ modern usage. Encoding of misspelled words to allow a search on the correct spelling will occur in Stage 2 Markup.
Header Elements/Tags
<AccessionInfo>
<bibl>
<MSType></MSType>
<Extent></Extent>
Collection (optional)
Folder (optional)
<box></box> (where applicable)
<vol></vol> (where applicable)
<pages></pages> (where applicable)
<Link> source </Link>
<Link> document image </Link>
</bibl>
</AccessionInfo>
Add all the legal document numbers for any document that is already assigned a number. (12/8/05)
Record oddities of a document, such as damage, in the <AccessionInfo></AccessionInfo> element of the header. (12/13/05)
If a document is located, but the owner/repository will not allow editors to make any image, create a PubMan record for the document with as much information as known, and in the <AccessionInfo></AccessionInfo> element, include "Note: Unable to image, 2006". (2/23/06)
If a previously reported document is not found in the proper repository, include "Note: Missing, 2006" in the <AccessionInfo></AccessionInfo> element. (2/23/06)
If we encounter a pressbook copy after having located the original letter, add a second <bibl></bibl> to <AccessionInfo></AccessionInfo> to indicate the presence of a pressbook copy at a different repository. Include the <MSType></MSType>, <Extent></Extent>, and other bibliographic information, including a link to the repository. (9/4/08) See Pressbook Copies, 1.9.
If we encounter an original, sent letter after having located the pressbook copy, add a new <bibl></bibl> to <AccessionInfo></AccessionInfo> before the pressbook copy <bibl></bibl>. Include the <MSType></MSType>, <Extent></Extent>, bibliographic information, and link to the repository in the new <bibl></bibl>. (9/4/08) See Pressbook Copies, 1.9.
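In either case the result is an <AccessionInfo></AccessionInfo> with two <bibl></bibl> elements, the original letter first and the pressbook copy second. The sketch below shows that ordering only; the "___" values are placeholders to be filled from the actual records, and the comments mark where the pre-formed link elements would be pasted.

```xml
<AccessionInfo>
  <!-- Original, sent letter: always the first bibl -->
  <bibl>
    <MSType>Autograph Letter Signed</MSType>
    <Extent>1</Extent>
    <!-- paste the pre-formed link to this repository here -->
  </bibl>
  <!-- Pressbook copy at a different repository: the second bibl -->
  <bibl>
    <MSType>___</MSType>
    <Extent>1</Extent>
    <vol>___</vol>
    <pages>___</pages>
    <!-- paste the pre-formed link to the second repository here -->
  </bibl>
</AccessionInfo>
```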
Attributes
docClass
docDesc
<Workflow>
<WorkEntry>
attribute "resp"= "ABC" (initials of transcriber)
attribute "contribRole"= "transcriber"
attribute "stage" = "Stage1 Transcription"
<comment>
</WorkEntry>
</Workflow>
<docDateInfo/>
attribute "value" = yyyy-mm-dd
attribute "YearCert" = certain (default) or inferred or uncertain
attribute "MonthCert" = certain (default) or inferred or uncertain
attribute "DayCert" = certain (default) or inferred or uncertain
attribute "EarliestSpecDate" = e.g. "1861-03-04"
attribute "LatestSpecDate" = e.g. "1865-04-15"
Related Documents
4.13 To link related documents, click the hyperlink "Copy link to this document as a related document" in the <Header></Header> of the document you wish to link to, then paste the pre-formed <link></link> element in the host document's <RelatedDocuments></RelatedDocuments> element. (See Enclosures, 4.21.)
Workflow and Workentry
4.14 When transcribing, editors will enter workflow information between the accession section of the document header and the text information of the document. (10/24/05)
Place new <WorkEntry></WorkEntry> elements at the beginning of the entries within <Workflow></Workflow>. (06/19/08)
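A <Workflow></Workflow> following this rule might look like the sketch below, with the most recent entry listed first. The initials "DEF" and the "___" values are placeholders, not values taken from the DTD.

```xml
<Workflow>
  <!-- Most recent stage of work: placed at the beginning -->
  <WorkEntry resp="DEF" contribRole="___" stage="___">
    <comment>___</comment>
  </WorkEntry>
  <!-- Earlier entry from Stage 1 transcription -->
  <WorkEntry resp="ABC" contribRole="transcriber" stage="Stage1 Transcription"></WorkEntry>
</Workflow>
```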
Structural Elements/Tags
Attributes
<date></date>
attribute "value" = yyyy-mm-dd
<space/>
attribute "dim"= horizontal or vertical
attribute "extent" = sets the size of the space
attribute "string"= sets space size to match word string
attribute "units"= use the default em unit size
<pb/>
attribute "n"= number of document page following page break
Address and Address Line
Document Dates
attribute "value" = yyyy-mm-dd
attribute "YearCert" = certain (default) or inferred or uncertain
attribute "MonthCert" = certain (default) or inferred or uncertain
attribute "DayCert" = certain (default) or inferred or uncertain
attribute "EarliestSpecDate" = e.g. "1861-03-04"
attribute "LatestSpecDate" = e.g. "1865-04-15"
Use "inferred" when the date can be approximated from other documents or context.
Use "uncertain" with XXs in month, day, or year.
Do not use "inferred" with XXs or "uncertain" with numbers.
1862-XX-XX
18XX-05-XX
18XX-09-29
Use the following ranges in <docDateInfo/> for documents that fall into these categories:
President: 1861-03-04 to 1865-04-15
President-Elect: 1860-11-06 to 1861-03-03
Example: 1861-1864-11-13
When no date information is known and the date range is indeterminable, use XXs for missing parts of date in both the "value" attribute of <date></date> and <docDateInfo/> and in the "sortkey" attribute of <doc></doc>. (11/13/06)
Example: 18XX-XX-XX
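The pairing of date values with certainty attributes can be sketched as follows. These are illustrative fragments only: the dates are placeholders, and each line stands for the <docDateInfo/> of a different document.

```xml
<!-- Fully known date: the certainty attributes keep their default, "certain" -->
<docDateInfo value="1862-05-17"/>

<!-- Day approximated from context: a number paired with "inferred" -->
<docDateInfo value="1862-05-17" DayCert="inferred"/>

<!-- Year known, month and day unknown: XXs paired with "uncertain" -->
<docDateInfo value="1862-XX-XX" MonthCert="uncertain" DayCert="uncertain"/>

<!-- No date information and no determinable range -->
<docDateInfo value="18XX-XX-XX" YearCert="uncertain" MonthCert="uncertain" DayCert="uncertain"/>
```

In every case a numeric part is either "certain" or "inferred", and an XX part is "uncertain"; the two are never mixed.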
Dateline
Endorsements
4.19 Use the <div2 type="endorsement"></div2> element to enclose one or more additional passages of source text following the main body of the document. Each endorsement should have a separate <div2></div2> element. The attribute "type" indicates Certification, Endorsement, or Docketing:
<div2 type="certification"></div2>: Use this element for an attestation or certification on an official document. An attestation or certification typically testifies to the truth or authenticity of the main document. It is often written by a clerk or other court official, but can also be written by a private individual. Examples include attestations by witnesses; certifications by clerks; "a true copy" or "copy" by secretaries or others, signed or not; witnessing by justices of the peace or notaries public; or certification by clerks of an office holder’s position.
Note: Former value of <attestation> collapsed into <certification> (9/25/09)
<div2 type="endorsement"></div2>: Use for endorsements that express orders or forward to another office.
<div2 type="docketing"></div2>: Use for endorsements that are part of a filing system, summarize the contents of the document, or describe action taken by others.
Transcribe endorsements from the top to the bottom of the page in the order they appear on the page. The attribute "order" assigns the chronological order of the endorsement in relation to other endorsements in the same document.
attribute "order"= the chronological placement of this endorsement in relation to other endorsements in the same document.
If a date is present in substantive endorsements, include the date information in a <docDateInfo/> element immediately inside the <div2 type="endorsement"></div2> element. Do not use <docDateInfo/> with purely docketing endorsements.
<docDateInfo/> commonly used in <endorsement></endorsement> with type "Endorsement"
attribute "value" = yyyy-mm-dd
attribute "YearCert" = certain (default) or inferred or uncertain
attribute "MonthCert" = certain (default) or inferred or uncertain
attribute "DayCert" = certain (default) or inferred or uncertain
attribute "EarliestSpecDate" = e.g. "1861-03-04"
attribute "LatestSpecDate" = e.g. "1865-04-15"
Use "inferred" when the date can be approximated from other documents or context.
Use "uncertain" with XXs in month, day, or year.
Do not use "inferred" with XXs or "uncertain" with numbers.
For the red letters and numbers that often appear at the top of the tri-folded documents from the War Department as part of their filing system, use the <table></table> element:
<table><row><cell>74C.</cell><cell width="50"></cell><cell>1862</cell></row></table>
This coding creates a one-row, three-cell table, where the middle cell is empty and its width is 50, which means that the middle cell is 50 percent of the width of the table. The effect is to separate the two chunks of text sufficiently, as they are in the document. (1/10/06) See the conventions for the <table></table> element in Policy 4.31.
In a <div2 type="endorsement"></div2> element, use a <p></p> element for a block of text or paragraphs. Use the <br/> element for short broken entries where line breaks are to be retained. (12/13/05)
In endorsements, do not use <p></p> unless the endorsement has a paragraph indent or contains multiple paragraphs. (3/30/06)
Telegrapher's markings on both the sender's and the recipient's copies of telegrams are treated as endorsements with the docketing attribute. (5/17/06)
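Taken together, the endorsement conventions might produce markup along the lines of the sketch below. The endorsement wording, the date, and the numeric form of the "order" values are assumptions made for illustration; the lowercase <dateline> follows the partially printed dateline example in 4.42, and its capitalization should be checked against the DTD.

```xml
<div1>
  <div2>
    <p>...</p>   <!-- main body of the document -->
  </div2>
  <!-- Substantive endorsement with its own date: docDateInfo comes first -->
  <div2 type="endorsement" order="1">
    <docDateInfo value="1862-05-17"/>
    Respectfully referred to the Secretary of War.<br/>
    A. Lincoln
    <dateline><date value="1862-05-17">May 17, 1862</date></dateline>
  </div2>
  <!-- Docketing: short broken entries kept with br, and no docDateInfo -->
  <div2 type="docketing" order="2">
    Filed<br/>
    ___
  </div2>
</div1>
```

Each passage has its own <div2></div2>, and neither uses <p></p>, because neither has a paragraph indent or more than one paragraph.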
Envelopes
4.20 Use the <div2 type="envelope"></div2> element to enclose a separate envelope. Do not use it for a back-of-letter address. When the envelope encloses a single document, place the text in a <div2 type="envelope"></div2> after the letter text. When the envelope encloses several documents, such as a letter with several enclosures, place the <div2 type="envelope"></div2> after the last <div2 type="enclosure"></div2> element. For addressed envelopes without an associated letter, use the <div2 type="envelope"></div2> element around the entire document. Use the <postmark></postmark>, <address></address>, and <addrLine></addrLine> elements to format the envelope. (4/3/2006; revised, 6/19/08)
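For a letter with a single envelope, the arrangement might be sketched as follows. The address lines are invented for the example and the postmark text is left as a placeholder.

```xml
<div1>
  <div2>
    <p>...</p>   <!-- letter text -->
  </div2>
  <!-- The envelope follows the letter text in its own div2 -->
  <div2 type="envelope">
    <postmark>___</postmark>
    <address>
      <addrLine>Hon. A. Lincoln</addrLine>
      <addrLine>Springfield</addrLine>
      <addrLine>Illinois</addrLine>
    </address>
  </div2>
</div1>
```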
Enclosures
Letterhead
Example of tagging for a letter on "Executive Mansion" letterhead:
Do not include pre-printed <dateline></dateline> elements in the <LetterHead></LetterHead> element.
Ignore graphics in letterheads, but include stylized text where possible. (12/15/05)
Page breaks
Blank pages are tagged with successive <pb/> elements. (12/13/05)
Word breaks Between Pages
Places
Do not tag references to general geographic areas such as "the South", "slave states", "the West", etc. (12/12/05)
Do not use a <place></place> element for references to the United States, or the U.S. (12/12/05)
Postmarks
Postscripts
4.26 Use the <postscript></postscript> element to enclose an author’s P.S. or N.B.
Seals
Signed
Space
attribute "dim"= horizontal or vertical
The attribute "extent" sets the number of units the space will occupy (see attribute "units").
attribute "extent" = sets the size of the space
attribute "string"= sets space size to match word string
attribute "emendedperiod"= for spaces at end of sentence use yes, if not use no
The attribute "units" will default to em, the recommended unit size for setting a space.
4.31 For tabular formations, use the <table></table> element to enclose <row>s and <cell>s to create a table. In addition to its use with tabular formations, <table></table> is useful in formatting many different arrangements of source text, such as a multi-line long bracket or a multi-column list of petition signers.
The attribute "rules" allows the choice of dividers for rows, columns, all, or none (default). The attribute "border" allows for the choice of a border fixed on the left or right, with a default of none. The attribute "width" allows the option to expand the table horizontally in proportion to the display document width. The value of this attribute should be expressed as a percent.
attribute "rules" = (default none) provides dividers for rows, columns, or all
attribute "border" = (default none) provides borders for left or right
attribute "width" = provides the option to expand or contract table horizontally
attribute "rows"= sets the size of cell based on adjacent rows
attribute "cols"= sets the size of cell based on adjacent columns
attribute "align"= sets the horizontal position of text in cell
attribute "valign"= sets the vertical position of text in cell
attribute "border"= various border options for individual cell
attribute "width"= adjusts the percent of space occupied by cell in relation to other cells in that row
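A two-column list of petition signers could be sketched with these attributes as shown below. The names are placeholders, and writing the widths as bare numbers (as in the War Department example in 4.19) is an assumption about how the percent value is entered.

```xml
<table rules="none" border="none" width="100">
  <row>
    <cell width="50">John Doe</cell>
    <cell width="50">Richard Roe</cell>
  </row>
  <row>
    <cell width="50">___</cell>
    <cell width="50">___</cell>
  </row>
</table>
```

With "rules" and "border" left at none, the table serves only to hold the two columns in position and draws no lines of its own.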
Editorial Elements/Tags
Attributes
<r></r>
<add></add>
<del></del>
attribute "type"= indicate whether strikethrough, diagstrikethrough, or erasure
<gap/>
attribute "reason"= illegible, damaged, or missing
<unclear></unclear>
Authorial or Editorial Additions (Insertions) to Source Text
Additions to the body of the text in another hand should be treated as insertions, not endorsements. The "hand" attribute of <add></add> will identify the difference in handwriting. (9/25/09)
Deletions or Strikeouts in Source Text
4.34 Editors will render stricken material using the <del></del> element, not by rendering the text as strikeout. (11/18/05)
Use the <del></del> element to enclose deletions or strikeouts in source text. The attribute "type" indicates whether the deletion is a strikethrough (default), diagstrikethrough, or erasure.
attribute "type"= indicate whether strikethrough, diagstrikethrough, or erasure
When a deletion of source text is unreadable use <del></del> in combination with the <unclear></unclear> and <gap/> elements, as in, <unclear><del><gap/></del></unclear>.
To tag an illegible stricken passage in source text, use the combination <unclear><del><gap/></del></unclear>. In this scenario, be sure to set the <gap/> attribute "reason" to "illegible."
As a general rule, do not use <place></place>, <person></person>, and <org></org> elements in stricken material. In some circumstances, such as when Lincoln's name is stricken, use the <person></person> element. Do provide textual render elements such as superscripts in stricken text.
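The two cases, a legible deletion and an illegible one, might be tagged as in this sketch. The sentence is invented for the example, and the choice of "erasure" for the second deletion is arbitrary.

```xml
<p>
  I have <del type="strikethrough">not</del> received your letter of
  <unclear><del type="erasure"><gap reason="illegible"/></del></unclear>
  and will reply.
</p>
```

The legible deletion keeps its text inside <del></del>; the illegible one replaces the text with <gap/> and wraps the whole in <unclear></unclear>.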
Editorial notes
4.35 Use the <endnote></endnote> element to enclose the text of an editorial note. Use in tandem with the <ref></ref> element.
Copy the value of the attribute "url" to paste in the "refurl" attribute of <ref></ref>. See <ref></ref>.
attribute "url"= copy the value that appears in this attribute.
For use of <endnote></endnote> in tandem with <ref></ref>, see 4.39.
Forged Material
4.36 Annotation of forged material should be done in Stage 1 transcription. Use the <endnote></endnote> element to enclose these annotations. (5/18/2006)
For documents that are in scope, onto which Lincoln forgeries have been added, transcribe only the legitimate document. (5/17/06)
To indicate the omission of forged text, use the appropriate elements, such as <div2 type="endorsement"></div2>, and within the element, note using the <endnote></endnote> element that "this endorsement is a forgery." (5/18/06)
Write-overs
4.37 Annotation of write-overs should be done in Stage 1 transcription. Use the <endnote></endnote> element to enclose these annotations. (5/18/2006)
Use the endnote element to describe situations when an author writes letters or punctuation over existing letters or punctuation or replaces them. (12/8/05)
Example: "has" with "ve" written over the "s"
Transcribe as: "have". Then go to the end of the document text, choose the <endnote></endnote> element, and copy the value provided in its "url" attribute. Next, put the cursor after "have" in the <div2></div2>, choose <ref></ref>, and paste the copied value into the "refurl" attribute.
Letters in one word written over letters to make a new word will be annotated in an endnote as: "word 1" changed to "word 2"; numbers written over numbers, in whole or in part, will be treated as complete words for this purpose. (12/20/02)
"county" changed to "country"
"vaggeness" changed to "vagueness"
"1862" changed to "1863"
"7" changed to "9"
Words written over partial words will be annotated in an endnote as: "word 2" written over "partial word 1" (12/20/02)
"Mr." written over "L"
The key distinction is whether "word 1" is a complete word or a partial word. In each case, transcribe "word 2" in the document. (06/19/08)
<endnote><p>"has" changed to "have".</p></endnote>
<endnote><p>"Mr." written over "L".</p></endnote>
Missing Source Text
4.38 For illegible, unrecoverable passages in a source text use the <gap/> element in conjunction with the <unclear></unclear> element. (11/16/05)
The <gap/> attribute "reason" allows for the choice of the following values: illegible, damage, missing, poor handwriting, poor copy, stricken, ink bleed through, or damaged.
attribute "reason"=illegible, damage, missing
When a substantial amount of source text is missing, such as when an entire page is missing from a multi-page document, use the <gap/> element on its own. In this case, the attribute "reason" should be set to the value "missing."
The <gap/> element can be used alone only to indicate a missing portion of text, such as missing page(s). In that case the "reason" attribute is "missing." (11/16/05)
Reference Points in Source Text
4.39 To indicate the placement of an <endnote></endnote> reference point in the text of the document, use the <ref></ref> element. After copying the value of "url" in the <endnote></endnote> element, paste the value in the <ref></ref> attribute "refurl." The <ref></ref> will display as an endnote number at the chosen reference point and clicking on the link will take the reader to the <endnote></endnote> in <doc></doc>. See conventions for <endnote></endnote>.
attribute "refurl"= paste the copied value from <endnote></endnote>
For more on the use of <ref></ref> in tandem with <endnote></endnote>, see 4.35.
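The pairing of reference point and note can be sketched with the write-over example from 4.37. The "___" stands for the system-supplied value, which must be identical in "url" and "refurl"; the position of <endnote></endnote> relative to the <div2></div2> is shown only schematically.

```xml
<div2>
  <p>... we have<ref refurl="___"></ref> ...</p>
</div2>
<!-- The note itself; its "url" value is what gets pasted into "refurl" above -->
<endnote url="___"><p>"has" changed to "have".</p></endnote>
```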
Illegible or Obscured Source Text
4.40 For text with words or parts of words that are obscured (as a result of damage) or that are illegible, use the <unclear></unclear> element. The attribute "reason" allows for the choice of the following values: poor handwriting, poor copy, stricken, ink bleed through, or damaged. (11/16/05).
attribute "reason"= poor handwriting, poor copy, stricken, ink bleed through, or damaged
Ex: <unclear>Koerner</unclear>. It will be rendered as: [Koerner] and have the reason values: poor handwriting, poor copy, stricken, bleedthrough, damaged.
Apply the <unclear></unclear> element to full words or multiple words (not to individual letters in words). (11/16/05)
For illegible, unrecoverable passages in a source text use the <gap/> element in conjunction with the <unclear></unclear> element. (11/16/05)
Example: <unclear><gap/></unclear>. It will be rendered as [...]. In this scenario, set the <gap/> attribute "reason" to "illegible," "damage," or "missing." (11/16/05)
For illegible strikeouts, use <gap/> inside <unclear></unclear> and <del></del> as in <unclear><del><gap/></del></unclear>. (11/16/05)
To tag an illegible stricken passage in source text, use the combination <unclear><del><gap/></del></unclear>. In this scenario, be sure to set the <gap/> attribute "reason" to "illegible." It will be rendered as: [...]. (11/16/05)
See the conventions for <unclear></unclear>, <del></del>, and <gap/>.
Dates in Source Text
4.41 If a date is mentioned in the body of a letter, do not tag it. Use the <date></date> element only for the date of the document or the date of the endorsement and only in a <dateline></dateline> element. (12/8/05; revised 11/15/06)
Dates referring to other correspondence or events within the text may be tagged as a link to another document. (12/13/05)
Partially Printed Source Text
4.42 The <print></print> element is used only in partially printed documents. (11/00/05; revised 11/06/06).
Use <space/> both before and after filled-in, handwritten portions in a partially printed document (such as in a partially printed dateline). The attribute "dim" should be "horizontal", attribute "extent" should have a value of "2", and the attribute "units" should already be preset to the value "em". Indicate a blank space in a partially printed document by setting the value of extent at "5".
Exception: do not use <space/> before or after a fill-in immediately adjacent to printed text, as in "<print>186</print>2<space/>", "<print>Date</print>s<space/>", or "<space/>s<print>he</print>". (10/30/07; revised, 6/19/08)
Example of partially printed dateline:
<LetterHead>Executive Mansion,</LetterHead>
<dateline><print><place refurl="___">Washington</place>,</print><space/><date value="1862-05-17">May 17,<space/><print>186</print>2</date><space/><print>.</print></dateline>
Events in Source Text
4.43 Transcribers should not tag events. (1/00/06)
Page Numbers in Source Text
4.44 Do not transcribe page numbers of letterbooks or pressbooks. (11/14/06)
Blank Space in Source Text
4.45 The element <space/> is used when a blank is left open or is filled-in with text. (5/18/06)
Use the em dash unit as follows:
5 em dashes for an empty "fill in the blank" or missing "186" in <date> in pressbook copy
2 em dashes for either side of a "fill in the blank" with text added
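The extent rule above can be sketched in a few lines of Python. This is only an illustration, not part of the PAL toolchain: the function name make_space is hypothetical, and the attribute names and values ("dim", "extent", "units") are taken from the conventions for partially printed source text.

```python
import xml.etree.ElementTree as ET


def make_space(filled_in: bool) -> ET.Element:
    """Build a <space/> element following the extent rule above.

    filled_in=True  -> blank that has text added (extent "2" on either side)
    filled_in=False -> empty "fill in the blank" (extent "5")
    The function name is hypothetical; only the attribute values come
    from the conventions.
    """
    extent = "2" if filled_in else "5"
    return ET.Element(
        "space",
        {"dim": "horizontal", "extent": extent, "units": "em"},
    )


if __name__ == "__main__":
    # Serialize both variants to see the resulting markup.
    for filled in (True, False):
        print(ET.tostring(make_space(filled), encoding="unicode"))
```

In practice X-Metal presets most of these attributes, so a helper like this would only be useful for checking or batch-generating markup outside the editor.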
Editorial supply of text
4.46 For the editorial supply of text not based on words in the original text, use the <supplied></supplied> element. (11/16/05)
Lists in Source Text
4.47 Use supplied commas to break up lists. (06/19/08)
Print Variations in Source Text
4.48 Reserve italics for documents that are partially printed and combine italic and roman print. Do not use italics for the Executive Mansion letterhead (that is considered script).
The print on Commissions is also generally script and not italic. (4/3/06)
STAGE 2 MARKUP
Stage 2 Elements/Tags
4.49 Stage 2 elements/tags are derived from the PAL DTD and represent only a selection of tags, not the entire range of tags available in the DTD. By Stage 2, most of the header and structural elements should have been filled. Stage 2 tags allow the transcriber to further encode individual words from the source text to make searching easier and the text clearer.
<sic></sic> attribute "corr" = : This element encloses misspelled words in source text. The attribute "corr" allows the transcriber to insert the correct spelling of the words.
attribute "corr" = correct spelling
<sic corr= "Embarras">Ambraw</sic>
<orig></orig> attribute "reg" = :
<abbr></abbr> attribute "expan" = :
<supplied></supplied>:
<foreign></foreign>: This element encloses source text (words or phrases) of a non-English language. Not used if whole document is non-English.
<docAuthor></docAuthor>:
<docContributor></docContributor>:
<addressee></addressee>:
<person></person>:
<personRef></personRef>:
<place></place>:
<placeRef></placeRef>:
<orig></orig>:
<org></org>:
<orgRef></orgRef>:
<name></name>:
<WorkEntry></WorkEntry>:
Stage 2 Markup Conventions
Document Author
4.50 The <docAuthor></docAuthor> element should reflect by whose authority the document was prepared. The "hand" attribute should reflect who actually penned the document. If attributed to Lincoln, then the editor should enter Lincoln's refurl in the "attribute" attribute. (11/18/05)
Printed documents such as newspapers, which contain documents that are addressed to Lincoln, such as open letters or petitions, will have no document author. (11/18/05)
Work Entry
4.51 Transcribers should enter Stage 2 workflow information between the accession section of the document header and the text information of the document. (10/24/05)
Place new <WorkEntry></WorkEntry> elements at the beginning of the entries within <Workflow></Workflow>. (06/19/08)
Use all 3 initials in all caps to identify yourself as the responsible person in the <Workflow>. Example: SGK (12/7/05)
Endorsements
4.52 In Stage 2, transcribers will make a distinction within the endorsement elements between "endorsement" and "docketing."
Abbreviations
4.53 When necessary to understand the word or document, abbreviations are to be expanded for clarity.
Use the <abbr></abbr> element to enclose abbreviations to be expanded. The attribute "expan" allows the transcriber to insert the expanded abbreviation.
<abbr></abbr> attribute "expan" = expanded abbreviation
In the <abbr></abbr> element, the default value of the "clarify" attribute is "no", meaning the expanded abbreviation will not appear in the diplomatic view of the transcription. When the value is "yes", the expanded version of the abbreviation will appear in the diplomatic view of the transcription as a bracketed expansion following the abbreviation. Expansions of all abbreviations will appear in the emended view. The "clarify" attribute works the same way with the <sic></sic> tag. (12/08/05)
Expand all abbreviations except the following: (12/8/05)
| Mr. | Gen. | No. |
| Mrs. | Hon. | Dr. |
| v. | Rev. | et al. |
| etc. | M.D. | U.S. or U. States |
| first or middle initials of people | ||
For spacing between abbreviated letters, see the Stage 1 conventions for abbreviations.
Regularize the use of "&c" to "etc." Use the <orig></orig> element with the "reg" attribute. (12/8/05)
Example: <orig reg="etc.">&c.</orig>
When expanding abbreviations, use the same case for initial letters as the abbreviation.
Example: <abbr expan="Madame">Mme.</abbr>. (12/7/05)
Searches in dtSearch search the raw XML data, so some abbreviated parts of a title or name need to be expanded as a whole name or title.
Example: "Sec. of War" would be tagged <abbr expan="Secretary of War">Sec. of War</abbr>. (12/7/05)
Contractions are not abbreviations and they are not expanded. (12/7/05)
Multiple abbreviated words are placed within one abbreviation expansion element. (12/8/05)
Example: <abbr expan="Secretary of Interior">Sec. of Int.</abbr>
Expand "do" to "ditto" when appropriate. (12/13/05)
Example: <abbr expan="ditto">do</abbr>
For an abbreviation of a city or state in combination with a spelled out city or state, use the <abbr></abbr> element in conjunction with the <place></place> element and expand the <abbr></abbr> item only. (12/15/05)
Example: <place>Columbus,<abbr expan="Ohio">O.</abbr></place>
For an abbreviation of a name, use the <abbr></abbr> element in conjunction with the <person></person> element, and expand the last name only, unless a first or middle name is abbreviated beyond the initial. (12/15/05)
Example: for Mr. B, transcribe: Mr. <person><abbr expan="Billings">B</abbr></person>
Example: for A. L., transcribe: <person>A.<abbr expan="Lincoln">L.</abbr></person>
Example: for Wm Jones, transcribe: <person><abbr expan="William">Wm</abbr> Jones</person>
For initials of military units with an uncertain expansion, use an <abbr></abbr> element without an expansion, pending further research. (12/15/05)
Example: <abbr>I.V.M.</abbr>
Punctuation as part of an abbreviation is included within the element. When the punctuation also provides closing punctuation for a sentence, include the closing punctuation in the expansion. (12/08/05)
Example: <p>send this to the <abbr expan="House of Representatives.">H.R.</abbr></p>
Misspelled Words in Source Text
4.54 Use the <sic></sic> element to encode correct spelling of every misspelled word to allow a search on the correct spelling, and to provide correct spelling for the emended view. (10/24/05)
The attribute "corr" allows the transcriber to insert the correct spelling of the words.
attribute "corr" = correct spelling
Example: <sic corr= "Embarras">Ambraw</sic>
There is an exception for "special documents" which are vernacular in nature--in that they are wholly or largely written in a phonetic or native speaking or language style. Because the intent is to retain the flavor of the document, the emended view will be the same as the diplomatic view with regard to spelling. (11/16/05)
| TAGS | DIPLOMATIC VIEW | EMENDED VIEW |
| <sic corr="Embarras">Ambraw</sic> | Ambraw [Embarras]* | Embarras |
| <sic corr="balance">ballance</sic> | ballance | balance |
| <sic corr="cough">coff</sic> | coff [cough] | cough |
| <sic corr="President">Prezdint</sic> Lincoln | Prezdint Lincoln | President Lincoln |
*XMetal will have a clarify attribute switch in <sic></sic> so that in diplomatic view a really strangely spelled word, such as "Ambraw," is followed by a correct spelling in brackets [Embarras]. (11/16/05)
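The relationship between the tags and the two views can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the project's actual rendering code: the function name render is hypothetical, the "clarify" attribute is assumed to sit on the <sic> or <abbr> element with the values "yes" and "no" described in the abbreviation conventions, and nested elements are not handled.

```python
import xml.etree.ElementTree as ET


def render(fragment: str, view: str) -> str:
    """Render a tagged fragment as 'diplomatic' or 'emended' text.

    Hypothetical helper: handles only <sic corr="..."> and
    <abbr expan="..."> directly inside the fragment (no nesting).
    """
    root = ET.fromstring(f"<p>{fragment}</p>")
    parts = [root.text or ""]
    for el in root:
        original = el.text or ""
        # <sic> keeps its correction in "corr"; <abbr> keeps its expansion in "expan".
        replacement = el.get("corr") if el.tag == "sic" else el.get("expan")
        if el.tag not in ("sic", "abbr") or replacement is None:
            parts.append(original)  # nothing to substitute
        elif view == "emended":
            parts.append(replacement)  # emended view always shows the correction
        elif el.get("clarify", "no") == "yes":
            parts.append(f"{original} [{replacement}]")  # bracketed clarification
        else:
            parts.append(original)  # diplomatic view keeps the source reading
        parts.append(el.tail or "")
    return "".join(parts)


if __name__ == "__main__":
    tagged = '<sic corr="Embarras" clarify="yes">Ambraw</sic>'
    print(render(tagged, "diplomatic"))  # Ambraw [Embarras]
    print(render(tagged, "emended"))     # Embarras
```

With "clarify" left at its default of "no", the diplomatic view would show only the source spelling, as in the "ballance" row of the table.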
Foreign (Non-English) Words in Source Text
4.55 Use the <foreign></foreign> element to enclose source text (words or phrases) of a non-English language. Do not use if the whole document is non-English. The attribute "lang" indicates the appropriate language.
attribute "lang"= choose the appropriate language
Editors will use a <foreign></foreign> element for a word or phrase in a foreign language and provide the language and translation as attributes. Editors will use a global "language" attribute for all elements of a document to indicate whole documents in a foreign language or translations of parts of foreign language documents. (11/18/05)
American Spellings of British Words
4.56 To regularize British spellings of words to their American spellings, use the <orig></orig> element with the "reg" attribute. Do not apply this tag to proper names such as Ford's Theatre. (12/13/05)
Example: <orig reg="flavor">flavour</orig>
Persons and References to Persons in Source Text
4.57 Use either the <person></person> or <personRef></personRef> element to encode persons referenced in source text.
If any part of a person's name is present, then use a <person></person> element. Do not include any title inside the <person></person> element. (12/8/05)
Example: "Judge Davis" would be tagged Judge <person refurl="DA0000">Davis</person>
Example: "A.L." would be tagged <person refurl="LI00006">A.L.</person>
Use <personRef></personRef> when no part of the name is present to refer to a specific person. (12/8/05)
Example: "the judge," meaning David Davis, would be tagged the <personRef refurl="DA0000">judge</personRef>.
With federal and state official titles alone as a reference to a specific person, use a <personRef></personRef> element around the whole title. (12/8/05)
Example: <personRef refurl="LI00006">President of the United States</personRef>.
For titles of officials of private corporations used to reference a specific person, include an <org></org> element. (12/8/05)
Example: <personRef refurl="JO00000">President</personRef> of the <org refurl="IL00000">Illinois Central Railroad</org>
For titles of officials of municipalities and foreign countries referring to a specific person, include a <place></place> element. (12/8/05)
Example: <personRef refurl="PE0000">President</personRef> of <place refurl="ME0000">Mexico</place>.
For military ranks referring to a specific person, use a <personRef></personRef> element with the rank, and omit the specifics of the military unit if provided. (12/8/05)
Example: <personRef refurl="XX0000">Colonel</personRef> of the 17th Illinois Volunteers.
No matter how many times a person is mentioned in a document, use the <person></person> element every time. (12/08/05)
Do not tag pronouns with <personRef></personRef> elements. Do not use the <personRef></personRef> element for titles with a name. (12/08/05)
Example: Captain <person>Kangaroo</person>
An attribute of <personRef></personRef> is "regularize". However, if a word such as "brother" is in the text, do not regularize it to the person's name. Editors will need to provide the refurl in the attribute, which will supply the person's name. (DWS this still seems goofy to me.) (12/13/05)
Tag all names with a <person></person> element including historical figures such as Oliver Cromwell. (1/00/06)
For the <person></person> and <personRef></personRef> elements, place the punctuation outside the element. (12/08/05)
Places and References to Places in Source Text
4.58 Use either the <place></place> or <placeRef></placeRef> element to encode places referenced in source text.
Do not tag references to general geographic areas such as "the South", "slave states", "the West", etc. Do use a <placeRef></placeRef> for specific references such as "seceded states", and "Confederate states". (12/12/05)
Do not use a <place></place> element for references to the United States, or the U.S. (12/12/05)
Do use a <place></place> element for the "Confederate States of America," the "C.S.A," or the "Confederacy." References such as rebel states, and seceded states require a <placeRef></placeRef> element.
When referring to places smaller than a city, use the city's "refurl." For "Executive Mansion," and "U.S. Capitol," use "Washington, DC" as the place refurl. The actual words will appear in a <place></place> element, but the refurl will be Washington, DC's coordinates. (11/00/05)
References to large geographic locations or features such as states and rivers will be tagged <place></place> and linked to a refurl with coordinates in the attributes. (11/17/05)
For the <place></place> and <placeRef></placeRef> elements, place the punctuation outside the element. (12/08/05)
Add new places using the "New Place" template in XMetal. Use the Getty Thesaurus of Geographic Names (http://www.getty.edu/research/conducting_research/vocabularies/tgn/) to look up coordinates for places. Use the "decimal degrees" and "decimal minutes." Copy the entire number (including the minus sign preceding the number, if there is one) and paste it into the latitude and longitude attributes of <coordinates>. (12/8/05)
SortKey for Places is: "OlneyIllinois", "RichlandCountyIllinois", or "MississippiRiver", etc.
Document Name is: "Olney, Illinois", "Richland County, Illinois", etc.
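Judging only from the examples above, a place SortKey looks like the Document Name with spaces and punctuation removed. The Python sketch below encodes that inferred rule; the function name place_sort_key is hypothetical, and the rule itself is an assumption drawn from the three examples rather than a stated PAL convention.

```python
def place_sort_key(document_name: str) -> str:
    """Derive a SortKey by dropping spaces and punctuation.

    Assumption: the key keeps only letters and digits, in their
    original order and case, as the examples above suggest.
    """
    return "".join(ch for ch in document_name if ch.isalnum())


if __name__ == "__main__":
    for name in ("Olney, Illinois", "Richland County, Illinois", "Mississippi River"):
        print(name, "->", place_sort_key(name))
```

A helper like this could be used to spot-check keys entered by hand, for instance by comparing place_sort_key("Olney, Illinois") against the SortKey recorded in the place file.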
Organizations and References to Organizations in Source Text
4.59 Use the <org></org> element to tag collections of individuals such as a company, Congress, or a political party. (12/08/05)
Example: <org>Congress</org>
In the <LetterHead></LetterHead> element, tag "Senate," "House of Representatives," etc. with the <org></org> element when the reference is to the body itself. (12/15/05; revised 11/06/06)
For the <org></org> and <orgRef></orgRef> elements, place the punctuation outside the element. (12/08/05)
Literary References in Source Text
4.60 Use the <litRef></litRef> element to mark up characters from literature. Multiple characters from the same piece of literature get one <litRef></litRef>. (1/00/06)
References to Titles of Publications in Source Text
4.61 For references to titles of books, articles, newspapers and journals, use the <title></title> element. Use the whole name of the newspaper as written, even if the official name is somewhat different. Do not use a <place></place> element for newspaper names. Include the volume number in the <title></title> element for reports of legal cases, but not for treatises. (12/12/05) (12/15/05)
Examples: <title>4 Dana</title> 389.
2 <title>Chitty's Pleading</title> 456
In the <title></title> element, select a value for the "type" attribute such as book, article, etc., which will only display the appropriate citation format in source notes, not in the <div1></div1>. (12/12/05)
Documents and Words Attributed to Abraham Lincoln in Source Text
4.62 For documents and words attributed to Abraham Lincoln but written by others, or printed in documents such as newspapers, or transcriptions of non-extant documents, or of oral words such as speeches or recollections, set the "attribToAL" attribute to "True." This will only be used to make attribution of documents to Lincoln which are not written by Lincoln. (11/18/05)
Use <q></q> element when Lincoln is quoted in newspapers. (12/12/05)
The <docAuthor></docAuthor> element should reflect by whose authority the document was prepared. The "hand" attribute should reflect who actually penned the document. If attributed to Lincoln, then the editor should enter Lincoln's refurl in the "attribute" attribute. (11/18/05) See conventions for <docAuthor></docAuthor>.
STAGE 3 MARKUP
Stage 3 Elements/Tags
4.63 <handList></handList>
<hand></hand>
addition of "hand" attributes to text "chunks"
<WorkEntry></WorkEntry>
Stage 3 Transcription Conventions
Handwriting
4.64 "print" as a value of the <hand></hand> attribute applies to anything not handwritten, except letterhead and partially printed documents. (11/00/05)
Additions to the body of the text in another hand should be treated as insertions, not endorsements. The "hand" attribute of the <add></add> element will identify the difference in handwriting. (9/25/09)
For signers of a petition, the signers are all included as "authors" in the heading. They are also captured on the <handList></handList> element in the <Header></Header>. The document name will be the first signer, "and others". (12/13/05)
"Contributors" to a document are those people who write on a document, but who are not otherwise captured in other headers such as "author," "recipient," "addressee," "endorsement author" or "endorsement recipient." Document contributors exclude a writer such as a secretary or telegrapher who wrote only as a scribe. These contributors are captured in a <handList></handList> element. (12/13/05)