Module containing the functions used to analyse aphorism and commentary lines.
Two functions handle the references [W1 W2] and the footnotes XXX.
The references function has to be used before the footnotes function.
Authors: | Jonathan Boyle, Nicolas Gruel <nicolas.gruel@manchester.ac.uk> |
---|---|
Copyright: | IT Services, The University of Manchester |
exegis.analysis.footnotes(string_to_process, next_footnote)
This helper function takes a single string containing text and processes any embedded footnote symbols (describing additions, omissions, correxi, conieci and standard textual variations) to generate XML. It also deals with any XML generated using function _references().
The output is two lists of XML, one for the main text, the other for the apparatus.
Parameters: |
|
---|---|
Returns: |
|
Raises: |
|
exegis.analysis.references(line)
This helper function searches a line of text for witness references with the form [WW LL] and returns a string containing the original text with each witness reference replaced with XML with the form <locus target="WW">LL</locus>. \n characters are added at the start and end of each XML insertion so each instance of XML is on its own line.
It is intended that this function is called by function main() for each line of text from the main body of the text document before processing footnote references using the _footnotes() function.
Parameters: | line (str) – contains the line with the aphorism or the commentary to analyse. |
---|---|
Raises: |
|
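As a rough illustration of the substitution described above, the sketch below rewrites [WW LL] references with a regular expression. The name replace_references and the pattern are assumptions made for this example; the actual references function may validate its input differently.

```python
import re

# Assumed pattern: '[', a witness code, whitespace, a locus, ']'.
_REFERENCE = re.compile(r"\[(\S+)\s+([^\]\s]+)\]")


def replace_references(line):
    """Replace each [WW LL] reference with a <locus> element on its own line."""
    # The replacement starts and ends with a newline so that every
    # inserted element sits on a line of its own.
    return _REFERENCE.sub(r'\n<locus target="\1">\2</locus>\n', line)


print(replace_references("some text [W1 5a] more text"))
```

Lines without a reference are returned unchanged, so the function can be applied to every line of the main text.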
This module has been written to convert transcribed commentaries from text files to TEI compatible XML.
Funding is provided by an ERC funded project studying Arabic commentaries on the Hippocratic Aphorisms. The Principal Investigator is Peter E. Pormann, The University of Manchester.
It is anticipated the module will be used via the main.py module, which attempts to process any input file or directory containing files with a .txt extension.
Each text file base name should end in an underscore followed by a numerical value, e.g. file_1.txt, file_2.txt, etc. The numerical value is subsequently used when creating the title section <div> element, e.g. <div n="1" type="Title_section"> for file_1.txt.
Note
The numerical value is optional; by default the version is set to 1.
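The naming convention can be sketched as follows. The helper title_div is hypothetical and is not part of the module; falling back to 1 when no numeric suffix is found is an assumption based on the note above.

```python
import re
from pathlib import Path


def title_div(fname, default=1):
    """Build the opening title <div> from the numeric suffix of fname."""
    match = re.search(r"_(\d+)$", Path(fname).stem)
    # Assumption: use the default version when there is no numeric suffix.
    number = int(match.group(1)) if match else default
    return '<div n="{}" type="Title_section">'.format(number)


print(title_div("file_1.txt"))
```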
If processing succeeds, two XML files will be created in a folder called XML. The XML file names start with the text file base name and end in _main.xml (for the main XML) and _app.xml (for the apparatus XML). For example, for file_1.txt the XML files will be file_1_main.xml and file_1_app.xml.
If processing fails, error messages will be saved in the exegis.log file.
The commentaries should be utf-8 text files with the format as documented in the associated documentation (docs/_build/index.html).
Authors: | Jonathan Boyle, Nicolas Gruel <nicolas.gruel@manchester.ac.uk> |
---|---|
Copyright: | IT Services, The University of Manchester |
exegis.aphorisms_to_xml.AphorismsToXMLException
Bases: Exception
Exception class raised by this module.
exegis.aphorisms_to_xml.Process(fname=None, folder=None, doc_num=1)
Bases: exegis.baseclass.Exegis
Class to process the main Hippocratic aphorism text and produce a TEI XML file.
fname
str – Name of the file to convert. The text file base name is expected to end with an underscore followed by a numerical value, e.g. file_1.txt, file_2.txt, etc. This numerical value is used when creating the title section <div> element, e.g. <div n="1" type="Title_section"> for file_1.txt.
folder
str, optional – Name of the folder containing the files to convert.
doc_num
int, optional – Version of the document being processed. Default value: 1.
aphorisms_dict()
Create an ordered dictionary (OrderedDict object) with the aphorisms and commentaries.
_aph_com
dict – dictionary which contains the aphorisms and their associated commentaries.
Raises: | AphorismsToXMLException – if it is not possible to create the dictionary. |
---|
divide_document()
Method to divide the document into its main parts.
An exegis document is composed of three or four main parts: an optional introduction, the title, the text (aphorisms and commentaries) and the footnotes. This method divides the document into these parts.
_introduction
str – A string which contains the introduction of the document, if present.
_title
str – A string which contains the title of the document.
_text
str – A string which contains the aphorisms and commentaries of the document.
_footnotes
str – A string which contains the footnotes of the document.
Raises: | AphorismsToXMLException – if it is not possible to divide the document. |
---|
main()
A function to process a text file containing symbols representing references to witnesses, and symbols and footnotes defining textual variations, omissions, additions, correxi or conieci. This function uses these symbols to produce files containing EpiDoc compatible XML.
If processing succeeds, two XML files will be created in folder ./XML with file names that start with the text file base name and end in _main.xml (for the main XML) and _app.xml (for the apparatus XML). For example, for file_1.txt the XML files will be file_1_main.xml and file_1_app.xml.
Modifies the attribute xml to add the title section to the main XML.
Raises: | AphorismsToXMLException – if the processing of the file does not work as expected. |
---|
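A minimal sketch of how the two output paths could be derived from the input file name. The helper output_paths is hypothetical, and the _app.xml suffix follows the file_1.txt example given above.

```python
from pathlib import Path


def output_paths(fname, out_dir="XML"):
    """Return the (main, apparatus) XML paths for a text file name."""
    base = Path(fname).stem  # 'file_1' for 'file_1.txt'
    folder = Path(out_dir)
    return folder / (base + "_main.xml"), folder / (base + "_app.xml")


main_xml, app_xml = output_paths("file_1.txt")
print(main_xml, app_xml)
```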
open_document(fname=None)
Method to open and read the exegis document.
Parameters: | fname (str, optional) – name of the file to analyse. |
---|
folder
str, optional – Name of the folder containing the files to convert.
fname
str – Name of the file to convert. The text file base name is expected to end with an underscore followed by a numerical value, e.g. file_1.txt, file_2.txt, etc. This numerical value is used when creating the title section <div> element, e.g. <div n="1" type="Title_section"> for file_1.txt.
text
str – string which contains the whole file in utf-8 format.
Raises: | AphorismsToXMLException – if the document cannot be opened, for instance when:
- a subfolder is present in the folder
- the file cannot be processed by the software (e.g. .DS_Store)
- the file does not exist |
---|
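The failure cases listed above could be checked along the following lines. This is only a sketch: DocumentError stands in for AphorismsToXMLException, read_document is a hypothetical helper, and the .txt suffix test is an assumption about how unsupported files such as .DS_Store are rejected.

```python
from pathlib import Path


class DocumentError(Exception):
    """Hypothetical stand-in for AphorismsToXMLException."""


def read_document(fname, folder="."):
    """Return the content of a utf-8 text file or raise DocumentError."""
    path = Path(folder) / fname
    if path.is_dir():
        raise DocumentError("{} is a folder, not a text file".format(path))
    if not path.is_file():
        raise DocumentError("{} does not exist".format(path))
    if path.suffix != ".txt":
        # Assumption: only .txt files are accepted (rejects .DS_Store).
        raise DocumentError("{} is not a .txt file".format(path))
    try:
        return path.read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError) as err:
        raise DocumentError("cannot read {}: {}".format(path, err)) from err
```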
read_template()
Method to read the XML template used for the transformation.
template
str – Contains the text of the XML template provided.
Raises: | AphorismsToXMLException – if template cannot be found or read. |
---|
Module which contains the basic classes used in the code.
Authors: | Nicolas Gruel <nicolas.gruel@manchester.ac.uk> |
---|---|
Copyright: | IT Services, The University of Manchester |
exegis.baseclass.Exegis
Bases: object
Basic class used for the software.
xml
list, optional – list of strings containing the XML related to the introduction, to be included in the main XML part of the document.
xml_n_offset
int, optional – number of times the xml_oss string is repeated. Default: 3.
xml_offset_size
int, optional – number of times the same string is used to indent. Default: 4.
xml_oss
str, optional – string used to indent XML statements. Default: ' ' * XML_OFFSET_SIZE.
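The three indentation attributes can be combined as in the sketch below. The constant names mirror the documented defaults, but the indent helper and its extra_levels parameter are hypothetical; the class may apply the offset differently.

```python
XML_OFFSET_SIZE = 4               # characters per indentation unit
XML_OSS = " " * XML_OFFSET_SIZE   # the indentation unit itself
XML_N_OFFSET = 3                  # units applied to a statement by default


def indent(statement, extra_levels=0):
    """Prefix an XML statement with the base offset plus extra levels."""
    return XML_OSS * (XML_N_OFFSET + extra_levels) + statement


print(repr(indent("<p>text</p>")))
```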
Module containing the configuration of the exegis software.
Authors: | Nicolas Gruel <nicolas.gruel@manchester.ac.uk> |
---|---|
Copyright: | IT Services, The University of Manchester |
Module used to process the footnotes from the Hippocratic project.
Authors: | Jonathan Boyle, Nicolas Gruel <nicolas.gruel@manchester.ac.uk> |
---|---|
Copyright: | IT Services, The University of Manchester |
exegis.footnotes.Footnote(footnote=None, n_footnote=None, xml=None)
Bases: exegis.baseclass.Exegis
Class Footnote, which processes an individual footnote.
footnote
str – String which contains the footnote to process.
n_footnote
int – Integer which gives the reference number of the footnote being processed.
xml
list – list which contains the app XML file.
check_endnote()
Method to check whether there is a note at the end of a footnote.
If the symbol ; is present in the footnote, everything after it is considered a note and will be added as such in the <app>.
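The splitting rule can be sketched with str.partition. The helper split_endnote is hypothetical, and splitting at the first ; (rather than the last) is an assumption.

```python
def split_endnote(footnote):
    """Split a footnote into (body, note); note is None when there is no ';'."""
    body, separator, note = footnote.partition(";")
    return body.strip(), (note.strip() if separator else None)


print(split_endnote("text W1: variant W2; see margin"))
```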
correction(reason)
This helper function processes a footnote line describing correxi, i.e. corrections by the editor; these contain the string 'correxi'.
The first input argument must be the footnote line with the following stripped from the start and end of the string:
- *n* (where n is the footnote number) from the start of the string
- character from the end of the string
The footnote is expected to contain at least one : character and have the following format:
- The footnote line before the first : character contains a string of witness text, followed by a ] character.
- The footnote line after the : character has one of two formats:
  - multiple pairs of witness text + witness code, each pair separated by a : character
  - a single witness text followed by a space and a list of comma separated witness codes
The second input argument should be a list containing the apparatus XML; this function will add XML to this list.
The third input argument is a string defining the unit of offset for the XML; this defaults to four space characters.
It is intended that this function is called by _footnotes() for correxi footnotes.
omission()
This helper function processes a footnote line describing an omission, i.e. a footnote which contains the string om.
The textual variation must involve exactly two witnesses, so an omission attributed to both witnesses is not allowed, since it would make no sense for both witnesses to omit the same text. Therefore the following should be true:
The first input argument must be the footnote line with the following stripped from the start and end of the string:
- *n* (where n is the footnote number) from the start of the string
- character from the end of the string
The footnote is expected to contain a single : character and have the following format:
The second input argument should be a list containing the apparatus XML; this function will add XML to this list.
The third input argument is the string defining a unit of offset in the XML; this defaults to four space characters.
It is intended that this function is called by _footnotes() for omission footnotes.
exegis.footnotes.Footnotes(footnotes=None)
Bases: object
Class to analyse and create the XML app file for the entire set of footnotes.
footnotes
list, str, OrderedDict, dict – The whole set of footnotes from the exegis file.
Module containing the class which creates the XML part related to the introduction (if present) in the Hippocratic aphorism document.
Authors: | Jonathan Boyle, Nicolas Gruel <nicolas.gruel@manchester.ac.uk> |
---|---|
Copyright: | IT Services, The University of Manchester |
exegis.introduction.Introduction(introduction, next_footnote)
Bases: exegis.baseclass.Exegis
Class Introduction, which creates the introduction XML part.
introduction
str – string which contains the introduction of the exegis aphorisms document.
next_footnote
int – integer which contains the reference number of the next footnote that may be present.
Raises: | IntroductionException – if cannot create the xml code for the introduction. |
---|
Main module to process aphorisms and convert them into XML files.
Authors: | Jonathan Boyle, Nicolas Gruel <nicolas.gruel@manchester.ac.uk> |
---|---|
Copyright: | IT Services, The University of Manchester |
exegis.main.main(args=None)
Run eXegis scripts to produce the TEI XML.
Command line:
Usage:
exegis <files> [--relaxng=<relax>]
exegis -h | --help
exegis --version
Options:
-h --help               Show this screen.
--version               Show version.
--xml-template=<name>   Name of the XML template
--relaxng=<name>        Name of the Relaxng file used to validate the resulting XML
Examples:
exegis TextFiles
exegis Textfiles --xml-template=template.xml
exegis Textfiles --relaxng=tei.rng
exegis Textfiles --xml-template=template.xml --relaxng=tei.rng
Raises: | SystemExit – if the file or the folder to treat is not available. |
---|
Module containing the class which creates the XML part related to the title in the Hippocratic aphorism document.
Authors: | Jonathan Boyle, Nicolas Gruel <nicolas.gruel@manchester.ac.uk> |
---|---|
Copyright: | IT Services, The University of Manchester |
exegis.title.Title(title, next_footnote=1, doc_num=1)
Bases: exegis.baseclass.Exegis
Class Title, which creates the title XML part.
title
str – string which contains the title of the exegis aphorisms document.
doc_num
int – integer which contains the version of the document.
next_footnote
int – integer which contains the reference number of the next footnote that may be present.