exegis package

Submodules

exegis.analysis module

Module containing the functions that analyse aphorism and commentary lines.

Two functions handle the references [W1 W2] and the footnotes XXX.

The references function must be called before the footnotes function.

Authors:Jonathan Boyle, Nicolas Gruel <nicolas.gruel@manchester.ac.uk>
Copyright:IT Services, The University of Manchester
exception exegis.analysis.AnalysisException[source]

Bases: Exception

Class for exception

exegis.analysis.footnotes(string_to_process, next_footnote)[source]

This helper function takes a single string containing text and processes any embedded footnote symbols (describing additions, omissions, correxi, conieci and standard textual variations) to generate XML. It also deals with any XML generated using function _references().

The output is two lists of XML, one for the main text, the other for the apparatus.

Parameters:
  • string_to_process (str) – This string contains the text to be processed. This should contain a single line from the text file being processed, e.g. a title, aphorism or commentary. This string may already contain XML generated using the _references() function i.e. XML identifying witnesses with each <locus> XML on a new line.
  • next_footnote (int) – reference number of the next footnote to find.
Returns:

  • 1. A Python list containing XML for the main text.
  • 2. A Python list containing XML for the critical apparatus.
  • 3. The number of the next footnote to be processed when this function completes.

It is intended that this function is called by main() on each line of text from the main document body.

Raises:

AnalysisException – if a footnote in the commentary cannot be defined.

exegis.analysis.references(line)[source]

This helper function searches a line of text for witness references with the form [WW LL] and returns a string containing the original text with each witness reference replaced with XML with the form <locus target="WW">LL</locus>.

\n characters are added at the start and end of each XML insertion so each instance of XML is on its own line.

It is intended this function is called by function main() for each line of text from the main body of the text document before processing footnote references using the _footnotes() function.

Parameters:

line (str) – contains the line with the aphorism or the commentary to analyse.

Raises:

AnalysisException – if a reference does not follow the convention [W1 W2]. For example, an exception is raised for:

  • [W1W2] : missing space between the two witnesses
  • [W1 W2 : missing ]
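
The replacement and the error cases described above can be sketched with a regular expression. This is a minimal illustration, not the package's implementation: the helper name replace_references, the pattern, and the use of ValueError in place of AnalysisException are all assumptions.

```python
import re

# Assumed pattern: "[", a witness code, one space, a locus, "]".
# Neither token may contain whitespace or square brackets.
WITNESS_REF = re.compile(r"\[([^\s\[\]]+) ([^\s\[\]]+)\]")


def replace_references(line):
    """Return line with each [WW LL] replaced by <locus target="WW">LL</locus>."""
    # Each insertion is wrapped in newlines so the XML sits on its own line.
    result = WITNESS_REF.sub(r'\n<locus target="\1">\2</locus>\n', line)
    # Any bracket left over means a reference did not match the convention
    # (hypothetical check; the real function raises AnalysisException).
    if "[" in result or "]" in result:
        raise ValueError("malformed witness reference: " + line)
    return result


print(replace_references("Life is short [W1 25a] the art is long"))
```

In this sketch both [W1W2] and an unclosed [W1 W2 fail, because neither matches the pattern and a bracket is left behind.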

exegis.aphorisms_to_xml module

This module has been written to convert transcribed commentaries from text files to TEI compatible XML.

Funding is provided by an ERC funded project studying Arabic commentaries on the Hippocratic Aphorisms. The Principal Investigator is Peter E. Pormann, The University of Manchester.

It is anticipated the module will be used via the main.py module, which attempts to process any input file or directory containing files with a .txt extension.

Each text file base name should end in an underscore followed by a numerical value, e.g. file_1.txt, file_2.txt, etc. The numerical value is subsequently used when creating the title section <div> element, e.g. <div n="1" type="Title_section"> for file_1.txt.

Note

The numerical suffix is optional; by default the version is set to 1.

If processing succeeds, two XML files will be created in a folder called XML. The XML file names start with the text file base name and end in _main.xml (for the main XML) and _app.xml (for the apparatus XML). For example, for file_1.txt the XML files will be file_1_main.xml and file_1_app.xml.
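
The naming convention can be illustrated with a short sketch. The helper output_names is hypothetical and not part of the package; it only mirrors the rules stated here (a numerical suffix after an underscore, a default of 1, and output in a folder called XML).

```python
import os
import re


def output_names(fname, folder="XML"):
    """Return (doc_num, main_xml_path, app_xml_path) for a text file name."""
    base = os.path.splitext(os.path.basename(fname))[0]
    # The document number is the digits after the last underscore, if any.
    match = re.search(r"_(\d+)$", base)
    doc_num = int(match.group(1)) if match else 1  # documented default is 1
    main_xml = os.path.join(folder, base + "_main.xml")
    app_xml = os.path.join(folder, base + "_app.xml")
    return doc_num, main_xml, app_xml


print(output_names("TextFiles/file_2.txt"))
```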

If processing fails error messages will be saved in the exegis.log file.

The commentaries should be utf-8 text files with the format as documented in the associated documentation (docs/_build/index.html).

Authors:Jonathan Boyle, Nicolas Gruel <nicolas.gruel@manchester.ac.uk>
Copyright:IT Services, The University of Manchester
exception exegis.aphorisms_to_xml.AphorismsToXMLException[source]

Bases: Exception

Class for exception

class exegis.aphorisms_to_xml.Process(fname=None, folder=None, doc_num=1)[source]

Bases: exegis.baseclass.Exegis

Class to process a Hippocratic aphorism text and produce a TEI XML file.

fname

str – Name of the file to convert. The text file base name is expected to end with an underscore followed by a numerical value, e.g. file_1.txt, file_2.txt, etc. This numerical value is used when creating the title section <div> element, e.g. <div n="1" type="Title_section"> for file_1.txt.

folder

str, optional – Name of the folder containing the files to convert

doc_num

int, optional – version of the document treated. Default value: 1

aphorisms_dict()[source]

Create an ordered dictionary (OrderedDict object) with the aphorisms and commentaries.

_aph_com

dict – dictionary which contains the aphorisms and their associated commentaries.

Raises:AphorismsToXMLException – if it is not possible to create the dictionary.
divide_document()[source]

Method to divide the document into its main parts.

An exegis document is composed of three or four main parts:

  • The introduction (optional)
  • The title
  • The aphorisms
  • The footnotes

This method will divide the document into these three or four parts.

_introduction

str – A string which contains the introduction of the document if present

_title

str – A string which contains the title of the document

_text

str – A string which contains the aphorisms and commentaries of the document

_footnotes

str – A string which contains the footnotes of the document

Raises:AphorismsToXMLException – if it is not possible to divide the document.
main()[source]

A function to process a text file containing symbols that represent references to witnesses, and symbols and footnotes that define textual variations, omissions, additions, correxi or conieci. This function uses these symbols to produce files containing EpiDoc compatible XML.

If processing succeeds, two XML files will be created in folder ./XML with file names that start with the text file base name and end in _main.xml (for the main XML) and _app.xml (for the apparatus XML). For example, for file_1.txt the XML files will be file_1_main.xml and file_1_app.xml.

Modify the attribute xml to add the title section in the main XML

Raises:AphorismsToXMLException – if the processing of the file does not work as expected.
open_document(fname=None)[source]

Method to open and read the exegis document.

Parameters:fname (str, optional) – name of the file to analyse.
folder

str, optional – Name of the folder containing the files to convert

fname

str – Name of the file to convert. The text file base name is expected to end with an underscore followed by a numerical value, e.g. file_1.txt, file_2.txt, etc. This numerical value is used when creating the title section <div> element, e.g. <div n="1" type="Title_section"> for file_1.txt.

text

str – string which contains the whole file in utf-8 format.

Raises:AphorismsToXMLException – if the document cannot be read, i.e. if:

  • the file cannot be opened
  • a subfolder is present in the folder
  • the file cannot be treated by the software (e.g. .DS_Store)
  • the file does not exist
read_template()[source]

Method to read the XML template used for the transformation

template

str – Contain the text of the XML template provided.

Raises:AphorismsToXMLException – if template cannot be found or read.
set_basename()[source]

Method to set the basename attribute if fname is not None

treat_footnotes()[source]

Method to treat the footnotes.

Works even if the division of the document did not succeed for the other parts, as long as the footnotes part was extracted.

exegis.baseclass module

Module containing the base classes used in the code

Authors:Nicolas Gruel <nicolas.gruel@manchester.ac.uk>
Copyright:IT Services, The University of Manchester
class exegis.baseclass.Exegis[source]

Bases: object

Basic class used for the software.

xml

list, optional – list of strings containing the XML related to the introduction, to be included in the main XML part of the document.

xml_n_offset

int, optional – defines the number of times the xml_oss string is repeated (see xml_oss). Default: 3

xml_offset_size

int, optional – defines the number of times the space character is repeated to build the indentation string xml_oss. Default: 4

xml_oss

str, optional – defines the string used to indent XML statements. Default: ' ' * XML_OFFSET_SIZE.
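
How the three indentation attributes combine can be sketched as follows. This is an illustration based only on the attribute descriptions and their defaults; the helper indent and its extra_levels parameter are hypothetical and not part of the class.

```python
# Documented defaults for the three attributes.
xml_offset_size = 4
xml_n_offset = 3
xml_oss = " " * xml_offset_size  # one unit of indentation


def indent(statement, extra_levels=0):
    """Prefix an XML statement with the base offset plus optional extra levels."""
    return xml_oss * (xml_n_offset + extra_levels) + statement


print(repr(indent("<p>text</p>")))
```

With the defaults, a statement at the base level is prefixed by 3 units of 4 spaces each.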

note_xml(note)[source]

Method to create the apparatus note XML

Parameters:note (str) – contains the string to consider as a note in the XML
save_xml(fname=None, xml=None)[source]

Method to save the XML in the working directory

xml_main()[source]

Method which will create the XML file.

exception exegis.baseclass.ExegisException[source]

Bases: Exception

Class for exception

exegis.conf module

Module containing the configuration of the exegis software

Authors:Nicolas Gruel <nicolas.gruel@manchester.ac.uk>
Copyright:IT Services, The University of Manchester

exegis.footnotes module

Module used to treat the footnotes from the Hippocratic project.

Authors:Jonathan Boyle, Nicolas Gruel <nicolas.gruel@manchester.ac.uk>
Copyright:IT Services, The University of Manchester
class exegis.footnotes.Footnote(footnote=None, n_footnote=None, xml=None)[source]

Bases: exegis.baseclass.Exegis

Class Footnote which treats an individual footnote

footnote

str – String which contains the footnote to treat.

n_footnote

int – Integer which gives the reference number of the footnote treated.

xml

list – list which contains the app XML file.

check_endnote()[source]

Method to check if there is a note at the end of a footnote

If the symbol ; is present in the footnote, everything after it is considered a note and will be added as such in the <app>
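
A minimal sketch of this split is shown below. The function name split_endnote is hypothetical, and splitting on the first ; (rather than the last) is an assumption, since the description does not say which one is used.

```python
def split_endnote(footnote):
    """Return (footnote_text, note); note is None when no ';' is present."""
    # str.partition splits on the first ';' only (assumed behaviour).
    text, separator, note = footnote.partition(";")
    if not separator:
        return footnote.strip(), None
    return text.strip(), note.strip()


print(split_endnote("witness text] W1: om. W2; damaged folio"))
```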

correction(reason)[source]

This helper function processes a footnote line describing correxi, i.e. corrections by the editor. These footnotes contain the string 'correxi'.

The first input argument must be the footnote line with the following stripped from the start and end of the string:

  1. All whitespace
  2. *n* (where n is the footnote number) from the start of the string
  3. . character from the end of the string

The footnote is expected to contain at least one : character and have the following format:

  1. The footnote line before the first : character contains a string of witness text, followed by a ] character.

  2. The footnote line after the ‘:’ character has one of two formats:

    1. multiple pairs of witness text + witness code, each pair separated by a : character
    2. a single witness text followed by a space and a list of comma separated witness codes

The second input argument should be a list containing the apparatus XML, this function will add XML to this list.

The third input argument is a string defining the unit of offset for the XML, this defaults to four space characters.

It is intended this function is called by _footnotes() for correxi footnotes.
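
The three stripping steps expected of the footnote line (whitespace, the leading *n* marker, the trailing . character) can be sketched as below. The helper strip_footnote is hypothetical, as is raising ValueError when the marker is missing.

```python
def strip_footnote(line, n):
    """Strip whitespace, the leading *n* marker and the final '.' from a footnote."""
    stripped = line.strip()
    marker = "*{}*".format(n)
    if not stripped.startswith(marker):
        # Assumed error handling; the package may react differently.
        raise ValueError("footnote does not start with " + marker)
    stripped = stripped[len(marker):]
    if stripped.endswith("."):
        stripped = stripped[:-1]
    return stripped.strip()


print(strip_footnote("  *3* witness text] W1: om. W2.  ", 3))
```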

omission()[source]

Helper function processes a footnote line describing an omission

This helper function processes a footnote line describing an omission, i.e. footnotes which contain the string om..

The textual variation MUST involve exactly two witnesses: one that carries the text and one that omits it. An omission attributed to two witnesses is not allowed, since it would make no sense for both witnesses to omit the same text. Therefore the following should be true:

  1. The footnote line contains one colon character.
  2. The footnote line doesn’t contain commas.

The first input argument must be the footnote line with the following stripped from the start and end of the string:

  1. All whitespace
  2. *n* (where n is the footnote number) from the start of the string
  3. . character from the end of the string

The footnote is expected to contain a single ‘:’ character and have the following format:

  1. The footnote line before the ‘:’ character is a string of witness text, followed by the ‘]’ character, followed by a single witness code.
  2. The footnote line after the ‘:’ character contains an ‘om.’ followed by a single witness code.

The second input argument should be a list containing the apparatus XML, this function will add XML to this list.

The third input argument is the string defining a unit of offset in the XML, this defaults to four space characters.

It is intended this function is called by _footnotes() for omission footnotes.
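
The expected omission format can be checked with a small parser. This is a sketch under stated assumptions, not the method's implementation: the name parse_omission, the returned dictionary keys, and the use of ValueError are all hypothetical, and the input is assumed to be already stripped as described above.

```python
def parse_omission(footnote):
    """Parse 'witness text] W1: om. W2' into its three components."""
    # Exactly one colon and no commas, as required by the format.
    if footnote.count(":") != 1 or "," in footnote:
        raise ValueError("omission needs one ':' and no ',': " + footnote)
    before, after = footnote.split(":")
    text, bracket, witness = before.partition("]")
    if not bracket or len(witness.split()) != 1:
        raise ValueError("expected 'text] WITNESS' before ':': " + footnote)
    after = after.strip()
    if not after.startswith("om."):
        raise ValueError("expected 'om.' after ':': " + footnote)
    omitting = after[len("om."):].strip()
    if len(omitting.split()) != 1:
        raise ValueError("expected a single witness after 'om.': " + footnote)
    return {"text": text.strip(), "witness": witness.strip(), "omitted_by": omitting}


print(parse_omission("witness text] W1: om. W2"))
```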

class exegis.footnotes.Footnotes(footnotes=None)[source]

Bases: object

Class to analyse and create the XML app file for the entire set of footnotes.

footnotes

list, str, OrderedDict, dict – Object which contains the whole set of footnotes from the exegis file.

save_xml(fname='xml_app.xml')[source]

Method to save the XML app string in a file

Parameters:fname (str, optional) – name of the file where the XML app will be saved.
xml_app()[source]

Method to create the XML app for the footnotes

Returns:xml_app – list which contains the lines with the XML related to the footnotes
Return type:list
exception exegis.footnotes.FootnotesException[source]

Bases: Exception

Class for exception

exegis.introduction module

Module containing the class which creates the XML part related to the introduction (if present) in the Hippocratic aphorism document.

Authors:Jonathan Boyle, Nicolas Gruel <nicolas.gruel@manchester.ac.uk>
Copyright:IT Services, The University of Manchester
class exegis.introduction.Introduction(introduction, next_footnote)[source]

Bases: exegis.baseclass.Exegis

Class Introduction which will create the introduction XML part

introduction

str – string which contains the introduction of the exegis aphorisms document.

next_footnote

int – integer giving the reference number of the next footnote that may be present.

Raises:IntroductionException – if the XML code for the introduction cannot be created.
xml_main()[source]

Method to treat the optional part of the introduction.

Modify the attribute xml to add the title section in the main XML

exception exegis.introduction.IntroductionException[source]

Bases: Exception

Class for exception

exegis.main module

Main module to treat aphorisms and convert them into XML files.

Authors:Jonathan Boyle, Nicolas Gruel <nicolas.gruel@manchester.ac.uk>
Copyright:IT Services, The University of Manchester
exegis.main.main(args=None)[source]

Run eXegis scripts to produce the TEI XML

Command line:

Usage:
    exegis <files> [--xml-template=<name>] [--relaxng=<name>]
    exegis -h | --help
    exegis --version

Options:
    -h --help                   Show this screen.
    --version                   Show version.
    --xml-template=<name>       Name of the XML template
    --relaxng=<name>            Name of the RELAX NG file used to validate the resulting XML

Examples:
    exegis TextFiles
    exegis Textfiles --xml-template=template.xml
    exegis Textfiles --relaxng=tei.rng
    exegis Textfiles --xml-template=template.xml --relaxng=tei.rng
Raises:SystemExit – if the file or the folder to treat is not available.
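
The options listed above can be mirrored with the standard library for illustration. This sketch uses argparse, which may differ from whatever parser the package itself uses; build_parser is a hypothetical name, and only the options shown in the usage text are declared.

```python
import argparse


def build_parser():
    """Build a parser accepting the documented positional and two options."""
    parser = argparse.ArgumentParser(prog="exegis")
    parser.add_argument("files", help="text file, or folder of .txt files, to convert")
    parser.add_argument("--xml-template", metavar="<name>",
                        help="name of the XML template")
    parser.add_argument("--relaxng", metavar="<name>",
                        help="name of the RELAX NG file used to validate the XML")
    return parser


# Equivalent of: exegis TextFiles --relaxng=tei.rng
options = build_parser().parse_args(["TextFiles", "--relaxng=tei.rng"])
print(options.files, options.xml_template, options.relaxng)
```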

exegis.title module

Module containing the class which creates the XML part related to the title in the Hippocratic aphorism document.

Authors:Jonathan Boyle, Nicolas Gruel <nicolas.gruel@manchester.ac.uk>
Copyright:IT Services, The University of Manchester
class exegis.title.Title(title, next_footnote=1, doc_num=1)[source]

Bases: exegis.baseclass.Exegis

Class Title which will create the title XML part

title

str – string which contains the title of the exegis aphorisms document.

doc_num

int – integer which contains the version of the document.

next_footnote

int – integer which contains the footnote reference number which can be present.

xml_main()[source]

Method to treat the title.

Modify the attribute xml to add the title section in the main XML

exception exegis.title.TitleException[source]

Bases: Exception

Class for exception

Module contents