..    #!/usr/bin/env python3

..  _`cobol_defs`: 

#######################################################
COBOL Definitions Module -- Handle COBOL DDE's
#######################################################

..  py:module:: cobol.defs

This is a small set of class definitions and functions
that are used by :py:mod:`cobol.loader` as well as 
:py:mod:`cobol`.

The intent of this module is to avoid a few circular import dependencies.

The Architecture Problem
========================

We have an issue of separation of three concerns:

-   The underlying workbook and the parsing of CSV or XML or EBCDIC.
    This is the Physical Format.

-   The logical layout or schema we're imposing on the workbook's data.

-   The process of loading a schema, possibly using a meta-workbook.
    This includes the translation of COBOL notation into a useful schema.

Except for COBOL, a schema depends on a meta-workbook via a schema loader. 
But this is the limit of the relationship. We could say 

..    math::

    S = L(w)
    
Or ``schema= loader(workbook)``. This may involve a separate workbook file,
a separate sheet within a file or even just columns within 
the current sheet.

For COBOL, we'd like to keep schema, schema loader and workbook separate, also,
even though COBOL code doesn't depend on COBOL data files.
We'd still like to say ``schema= loader(cobol source)``. 

..    math::

    S = L(c)

We can imagine that an application will import a workbook class and a schema loader class.
It will load the schema, then open the workbook using the schema.

**However**.

A COBOL schema with an occurs depending on (i.e. a DDE with ``variably_located == True``)
will have the schema depending on each row in addition to the overall loading.

We're really talking about a Baseline Schema, :math:`S_b`, and a Row-Level Schema, :math:`S_r`,
that is built by resolving any Occurs Depending On.

..    math::

    S_b = L(c)

..    math::

    S_r = R( d, S_b )
    
We've changed to ``schema_baseline= loader(cobol source)`` and then,
for each row, ``schema_row= setSizeAndOffset(data, schema_baseline)``.
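
The two-step relationship can be sketched in a few lines. This is only an
illustration: the tuple layout and the names ``baseline`` and ``resolve_row``
are assumptions made for this example and are not part of the module.

..  code-block:: python

    # Hypothetical baseline schema: (name, size_in_bytes, occurs).
    # ``occurs`` is an int, or the name of the field that holds the count (ODO).
    baseline = [
        ("COUNT", 2, 1),
        ("ITEM", 3, "COUNT"),
        ("TRAILER", 4, 1),
    ]

    def resolve_row(data, baseline):
        """Build a row-level schema: a dict of name -> (offset, total_size)."""
        row_schema = {}
        offset = 0
        for name, size, occurs in baseline:
            if isinstance(occurs, str):
                # Occurs Depending On: read the count from this row's data.
                start, length = row_schema[occurs]
                occurs = int(data[start:start + length])
            row_schema[name] = (offset, size * occurs)
            offset += size * occurs
        return row_schema

    row_1 = resolve_row("02AAABBBtail", baseline)
    row_2 = resolve_row("01AAAtail", baseline)

The same baseline yields a different ``TRAILER`` offset for each row, which is
why the row-level schema cannot be computed once at load time.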

Where To Recompute
---------------------

The fundamental issue is this: when can we recompute the offsets? 

The choices for computing the offsets are these:

-   At :py:meth:`COBOL_File.rows_of` time -- eagerly, but in the wrong module.
    See below.

-   At :py:meth:`COBOL_File.row_get` time -- a bit more lazy, but still in the wrong
    module, since it's here, not in :py:mod:`cobol.loader`.
    
-   In the application before doing any schema processing on a given row. Very lazy.
    But now the application must be more deeply involved in ODO processing. The application
    would do something like the following. Sadly, it has a line that's easy to overlook.
    
..  parsed-literal::
    
    with open("xyzzy.cob") as source:
        dde_list, schema = COBOL_schema( source )
    with stingray.cobol.Character_File( filename, schema=schema ) as wb:
        sheet= wb.sheet( filename )
        for row in sheet.rows():
            **cobol.loader.setSizeAndOffset( dde_list[0] )**
            dump( schema, row )
            
The Module Dependency Problem
-------------------------------

The :py:class:`Usage` class properly depends on :py:mod:`cobol`. 
The :py:func:`cobol.loader.make_attr` function, also, properly depends on :py:mod:`cobol`.

The idea is that workbooks are more fundamental than schema. We might need to use
one workbook to build a schema to read another workbook. Schema are higher-level constructs.

We want to avoid any circular dependency between :py:mod:`cobol.loader` referring
back to :py:mod:`cobol`. 
The :py:class:`schema.RepeatingAttribute` definition has a weak version of this undesirable
dependency.  We finesse it
by defining a bunch of properties that exploit the underlying DDE details without
an explicit ``import`` of the DDE class.

To assure that ``cobol`` does not depend on ``cobol.loader``,
we'd have the class  :py:class:`schema.RepeatingAttribute` entirely built without
reference to the base DDE. 
This, however, means that we would effectively clone
the hierarchical relationships into the :py:class:`schema.RepeatingAttribute` objects.
Why bother?

If we extend :py:meth:`COBOL_File.rows_of` or :py:meth:`COBOL_File.row_get`, we
exacerbate the problem because this would introduce a circular ``import``. It
would make ``cobol`` depend on ``cobol.loader`` explicitly.

Resolution
--------------

The ``setSizeAndOffset()`` function as well as a few other
post-processing functions belong in an intermediate module that both ``cobol``
and ``cobol.loader`` depend on. 

Specifically, this module holds the Cell definitions, the DDE definitions, and the
related functions required to build schema attributes from DDE's.

That way, ``cobol`` can import ``cobol.defs.setSizeAndOffset``.

Also, ``cobol.loader`` can import ``cobol.defs.DDE``.

And ``cobol.RepeatingAttribute`` can depend on ``cobol.defs.DDE``.

Overheads
=================

::

    """stingray.cobol.defs -- COBOL DDE and Tools."""
    import logging
    import weakref
    import warnings
    
    import stingray.cell

    
A module-level logger.

::

    logger= logging.getLogger( __name__ )

Exception
===========

..  py:class:: UnsupportedError

    A syntax which expresses an unsupported feature
    of the COBOL language.

::

    class UnsupportedError( Exception ):
        """A COBOL DDE has features not supported by this module."""
        pass
        
The most important unsupported feature may be "separate signs."  These may be
required for decoding bytes in some files.


Cell Subclasses and Conversions
=================================

Rather than tinker too much with the :py:mod:`cell` module, 
it seems better to introduce new :py:class:`cell.Cell` subclasses unique to COBOL, EBCDIC
and COMP-3 data.  

There are three relevant features.

-   Proper conversion from source characters or bytes.

-   Preservation of the source characters (or bytes) for creating
    character-level (or byte-level) structured dumps of a record.
    
-   Preservation of the original DDE attributes, because there is so much
    information required to interpret the bytes.
    
Consequently, even the :py:class:`cell.TextCell` must be extended to include
preservation of raw data.
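
A minimal sketch of the first two features follows. It uses Python's standard
``cp037`` codec as a stand-in for the workbook's EBCDIC conversion. The class
name ``RawTextCell`` is hypothetical; the classes in this module delegate the
conversion to the workbook instead.

..  code-block:: python

    class RawTextCell:
        """Hypothetical cell that keeps the source bytes next to the value."""
        def __init__(self, raw, encoding="cp037"):
            self.raw = raw                      # preserved for byte-level dumps
            self.value = raw.decode(encoding)   # converted for the application

    cell = RawTextCell(b"\xc8\xc5\xd3\xd3\xd6")
    print(cell.value, cell.raw.hex())

Keeping ``raw`` on the cell means a structured dump can show the original bytes
even when the converted value looks plausible.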

Further, we have a distinction between text and numbers which are
"USAGE DISPLAY".  

..  code-block:: none

    http://yuml.me/diagram/scruffy;/class/
    #cobol.cell,
    [TextCell]^[NumberCell],
    [NumberCell]^[NumberDisplayCell],
    [NumberCell]^[NumberCompCell],
    [NumberCell]^[NumberComp3Cell],
    [TextCell]^[ErrorCell],


..  image:: cobol_cell.png
    :width: 6in

..  warning:: Non-Polymorphic.

    These classes are profound extensions to the base definitions of :py:mod:`cell`.
    They are not polymorphic with the base classes.
    COBOL processing is not transparently identical to other workbook processing.
    
These cells are conventionally built by the :py:class:`cobol.COBOL_File` version
of Workbook as a factory. These are rarely built any other way.

..  py:class:: TextCell

    A cell which contains COBOL Alphanumeric data.

::

    class TextCell( stingray.cell.TextCell ):
        """A COBOL TextCell, usually Usage Display."""
        def __init__( self, raw, workbook, attr ):
            self.raw, self.workbook= raw, workbook
            self._value= workbook.text( self.raw, attr )

..  py:class:: NumberCell

    This is an abstraction that simply holds the standard conversions.

::

    class NumberCell( stingray.cell.NumberCell ):
        """A COBOL number."""
        def to_int( self ): return int(self.value)
        def to_float( self ): return float(self.value)
        def to_decimal( self, digits=None ): return self.value
        def to_str( self ): return str(self.value)
    
..  py:class:: NumberDisplayCell

    A COBOL numeric item with USAGE DISPLAY.

::

    class NumberDisplayCell( NumberCell ):
        """A COBOL Usage Display Numeric Cell."""
        def __init__( self, raw, workbook, attr ):
            self.raw, self.workbook= raw, workbook
            self._value= workbook.number_display( self.raw, attr )
            
..  py:class:: NumberCompCell

    A COBOL numeric item with USAGE COMPUTATIONAL.


::

    class NumberCompCell( NumberCell ):
        """A COBOL Usage COMP Numeric Cell.
        Three formats.  Half-word, whole-word and double-word.
        """
        def __init__( self, raw, workbook, attr ):
            self.raw, self.workbook= raw, workbook
            self._value= workbook.number_comp( self.raw, attr )
            
..  py:class:: NumberComp3Cell

    A COBOL numeric item with USAGE COMPUTATIONAL-3.


::

    class NumberComp3Cell( NumberCell ):
        """A COBOL Usage COMP-3 Numeric Cell."""
        def __init__( self, raw, workbook, attr ):
            self.raw, self.workbook= raw, workbook
            self._value= workbook.number_comp3( self.raw, attr )


..  py:class:: ErrorCell

    A COBOL numeric item with invalid data.


::

    class ErrorCell( stingray.cell.ErrorCell ):
        """A COBOL ErrorCell, bad data bytes with no relevant value."""
        def __init__( self, raw, workbook, attr, exception=None ):
            self.raw, self.workbook= raw, workbook
            self._value= None
            self.exception= exception
        def __repr__( self ):
            return "{0}({1!r}, {2!r})".format(
                self.__class__.__name__, self.exception, self.raw )


Essential Class Definitions
============================
    
The essential class definitions define the DDE we're attempting to build.
We can separate this structure into a few high-level subject areas:

-   `Usage Strategy Hierarchy`_ defines the various 
    kinds of USAGE options.

-   `Allocation Strategy Hierarchy`_ defines the relationships among DDE's:
    Predecessor/Successor, Group/Elementary or Redefines.
    
-   `Occurs Strategy Hierarchy`_ defines the Occurs options of
    Default (no Occurs), simple Occurs, and more complex Occurs Depending On.
    
-   The `DDE Class`_ itself.

Usage Strategy Hierarchy
--------------------------

The :py:class:`Usage` class combines information in the picture, usage, sign and synchronized clauses.

The **Strategy** design pattern allows a DDE element to delegate 
the :py:meth:`Usage.size` and :py:meth:`Usage.create_func()` operations to this class.

The :py:meth:`Usage.size` method returns the number
of bytes used by the data element. 

-   For usage ``DISPLAY``, the size is computed directly from the picture clause.  

-   For usage ``COMP``, the size is 2, 4 or 8 bytes based on the picture clause.

-   For usage ``COMP-3``, the picture clause digits are packed two per byte  
    with an extra half-byte for sign information. This must be rounded up.
    COMP-3 fields often have an odd number of digits to reflect this.

The :py:meth:`Usage.create_func()` method returns a :py:class:`cell.Cell` type 
that should be built from the raw bytes.
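
The size rules listed above can be checked with a small sketch. The function
names are hypothetical, and ``digits`` is assumed to be the number of digit
positions in the expanded picture, excluding any sign or implied decimal point.

..  code-block:: python

    def display_size(digits):
        """USAGE DISPLAY: one byte per character position."""
        return digits

    def comp_size(digits):
        """USAGE COMP: half word, whole word or double word."""
        if digits <= 4:
            return 2
        elif digits <= 9:
            return 4
        return 8

    def comp3_size(digits):
        """USAGE COMP-3: two digits per byte plus a sign half-byte, rounded up."""
        return (digits + 2) // 2

    for digits in (4, 5, 9, 10):
        print(digits, display_size(digits), comp_size(digits), comp3_size(digits))

Note that 4 digits and 5 digits both need 3 bytes in COMP-3, which is why an odd
number of digits wastes no space.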

..  code-block:: none

    http://yuml.me/diagram/scruffy;/class/
    #cobol_loader_usage,
    [RecordFactory]<>-[DDE],
    [DDE]<>-[DDE],
    [DDE]-[Usage],
    [Usage]^[UsageDisplay],
    [Usage]^[UsageComp],
    [Usage]^[UsageComp3]
    
..  image:: cobol_usage.png

..  py:class:: Usage

    The Usage class provides detailed representation and conversion support
    for a given DDE. A :py:class:`schema.Attribute` will refer to a 
    :py:class:`cobol.defs.DDE`. This DDE will have a :py:class:`Usage` object that shows
    how to create the underlying ``Cell`` instance from the raw data 
    in the :py:class:`cobol.COBOL_File` subclass of ``Workbook``.

    For numeric types, this may mean a fallback from creating a :py:class:`NumberCell`
    to creating a :py:class:`ErrorCell`. If the number is invalid in some way, then 
    an error is required.

    The superclass of ``Usage`` is abstract and doesn't compute a proper size.
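
The fallback from a number to an error can be illustrated without any of the
module's classes. This is a sketch only: ``build_number`` and the tagged tuples
it returns are assumptions for this example, standing in for the
:py:class:`NumberCell` and :py:class:`ErrorCell` objects.

..  code-block:: python

    from decimal import Decimal, InvalidOperation

    def build_number(raw_text):
        """Return ("number", Decimal) or ("error", exception) for invalid digits."""
        try:
            return ("number", Decimal(raw_text))
        except InvalidOperation as e:
            # Keep the exception so the caller can report what went wrong.
            return ("error", e)

    good = build_number("0123")
    bad = build_number("12A")

The caller always gets an object back, so one bad field does not stop the
processing of the rest of the record.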

::

    class Usage:
        """Convert numeric data based on Usage clause."""
        def __init__( self, source ):
            self.source_= source
            self.final= source
            self.numeric= None # is the picture all digits?
            self.length= None
            self.scale= None
            self.precision= None
            self.signed= None
            self.decimal= None
        def setTypeInfo( self, picture ):
            """Details from parsing a PICTURE clause."""
            self.final= picture.final
            self.numeric = not picture.alpha
            self.length = picture.length
            self.scale = picture.scale
            self.precision = picture.precision
            self.signed = picture.signed
            self.decimal = picture.decimal
        def source( self ):
            return self.source_

..  py:method:: Usage.create_func()

    Create a Cell object. Use the raw bytes to build a Cell described
    by the given Attribute.

::

        def create_func( self, raw, workbook, attr ):
            """Converts bytes to a proper Cell object.
            NOTE: EBCDIC->ASCII conversion handled by the Workbook object.
            """
            return TextCell( raw, workbook, attr )

..  py:method:: Usage.size()

    The count is in bytes.  Not characters.

::

        def size( self ):
            """Default for group-level items."""
            return 0

..  py:class:: UsageDisplay
            
    Usage "DISPLAY" is the COBOL language default.  It's also assumed for group-level items.
         
::

    class UsageDisplay( Usage ):
        """Ordinary character data, which may be numeric."""
        def __init__( self, source ):
            super().__init__( source )
        def create_func( self, raw, workbook, attr ):
            if self.numeric:
                try:
                    return NumberDisplayCell( raw, workbook, attr )
                except Exception as e:
                    error= ErrorCell( raw, workbook, attr, exception=e )
                    return error
            return TextCell( raw, workbook, attr )
        def size( self ):
            """Return the actual size of this data, based on PICTURE and SIGN."""
            return len(self.final)

..  py:class:: UsageComp

    Usage "COMPUTATIONAL" is binary-encoded data.

::
  
    class UsageComp( Usage ):
        """Binary-encoded COMP data which is numeric."""
        def __init__( self, source ):
            super().__init__( source )
        def create_func( self, raw, workbook, attr ):
            try:
                return NumberCompCell( raw, workbook, attr )
            except Exception as e:
                error= ErrorCell( raw, workbook, attr, exception=e )
                return error
        def size( self ):
            """COMP is binary half word, whole word or double word."""
            if len(self.final) <= 4:
                return 2
            elif len(self.final) <= 9:
                return 4
            else:
                return 8

..  py:class:: UsageComp3

    Usage "COMP-3" is packed-decimal encoded data.
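
To make the packed-decimal layout concrete, here is a hedged sketch of a
decoder. The function ``unpack_comp3`` and its ``scale`` parameter are
hypothetical; in this module the real conversion is delegated to the workbook.
The sketch treats a sign half-byte of ``0xD`` as negative and every other sign
value as positive, which is a simplification.

..  code-block:: python

    from decimal import Decimal

    def unpack_comp3(raw, scale=0):
        """Decode packed decimal bytes; ``scale`` is the implied decimal places."""
        digits = []
        for byte in raw:
            digits.append(byte >> 4)      # high half-byte
            digits.append(byte & 0x0F)    # low half-byte
        sign_nibble = digits.pop()        # the last half-byte carries the sign
        sign = 1 if sign_nibble == 0x0D else 0
        # Decimal rejects digit values above 9, so bad data raises ValueError.
        return Decimal((sign, tuple(digits), -scale))

    positive = unpack_comp3(b"\x12\x34\x5c")
    negative = unpack_comp3(b"\x12\x3d", scale=2)

Three bytes hold five digits plus the sign, which matches the size rule of
two digits per byte with an extra half-byte.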

::
 
    class UsageComp3( Usage ):
        """Binary-Decimal packed COMP-3 data which is numeric."""
        def __init__( self, source ):
            super().__init__( source )
        def create_func( self, raw, workbook, attr ):
            try:
                return NumberComp3Cell(raw, workbook, attr)
            except Exception as e:
                error= ErrorCell( raw, workbook, attr, exception=e )
                return error
        def size( self ):
            """COMP-3 is packed decimal."""
            return (len(self.final)+2)//2
            
Allocation Strategy Hierarchy
------------------------------

We actually have three kinds of allocation relationships among DDE items.

-   Predecessor/Successor

-   Group/Elementary

-   Redefines

[*Formerly, we had only two subclasses.*]

This leads to a **Strategy** class hierarchy to handle the various algorithmic
choices.

The Pred/Succ strategy computes the offset to a specific item based on the predecessor.
This is the default for non-head items in a group.

The Group/Elem strategy computes the offset based on the offset to the parent group.
This is the default for the head item in a group.

The Redefines strategy depends on another element: not its immediate predecessor. 
This element will be assigned the same offset as the element on which it depends. 
    
The **Strategy** design pattern allows an element to delegate the 
:py:meth:`Redefines.offset`,
and :py:meth:`Redefines.totalSize` methods.
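
The three offset rules can be sketched for a single flat group. The tuple
layout and the function ``assign_offsets`` are assumptions for illustration; the
sketch also assumes a redefining item is no larger than the item it redefines.

..  code-block:: python

    # Each entry: (name, size, redefines), in source order within one group.
    fields = [
        ("KEY", 4, None),
        ("AMOUNT", 6, None),
        ("AMOUNT-TEXT", 6, "AMOUNT"),   # REDEFINES AMOUNT
        ("FLAG", 1, None),
    ]

    def assign_offsets(fields, group_offset=0):
        """Return a dict of name -> offset for the items of one group."""
        offsets = {}
        next_offset = group_offset      # the head item starts where the group starts
        for name, size, redefines in fields:
            if redefines is not None:
                # Redefines: share the offset of the named element.
                offsets[name] = offsets[redefines]
            else:
                # Head of group or successor: take the running offset.
                offsets[name] = next_offset
                next_offset += size
        return offsets

    at_zero = assign_offsets(fields)
    nested = assign_offsets(fields, group_offset=20)

A redefining item does not advance the running offset, so ``FLAG`` follows
``AMOUNT`` directly.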

..  code-block:: none

    http://yuml.me/diagram/scruffy;/class/
    #cobol_loader_redefines,
    [RecordFactory]<>-[DDE],
    [DDE]<>-[DDE],
    [DDE]-[Allocation],
    [Allocation]^[Redefines],
    [Allocation]^[Pred-Succ],
    [Allocation]^[Group-Elem]    

..  image:: cobol_redefines.png

..  py:class:: Allocation

    The :py:class:`Allocation` superclass defines an abstract base
    class for the various allocation strategies.

::

    class Allocation:
        def __init__( self ):
            self.dde= None 
        def resolve( self, aDDE ):
            """Associate back to the owning DDE."""
            self.dde= weakref.ref(aDDE) 

..  py:class:: Redefines

    The :py:class:`Redefines` subclass depends on another element. It uses
    the referenced name to look up the offset and total size information.  

    For this to work, the name must be resolved via the :py:meth:`Redefines.resolve` method.
    The :py:func:`resolver` function applies the :py:meth:`Redefines.resolve` method throughout the structure.

::

    class Redefines(Allocation):
        """Lookup size and offset from another field we refer to."""
        def __init__( self, name, refers_to=None ):
            super().__init__()
            self.name= name
            self.refers_to= refers_to # Used for unit testing
        def source( self ):
            return "REDEFINES {0}".format( self.refers_to.name )

..  py:method:: Redefines.resolve( aDDE )

    Resolve a DDE name. Set our ``self.refers_to`` to refer to a DDE within
    the given structure.
    
::

        def resolve( self, aDDE ):
            """Search the structure for the referenced name.
            Must be done before sizing can be done.
            """
            super().resolve( aDDE )
            self.refers_to= aDDE.top().get( self.name )

..  py:method:: Redefines.offset( offset )

    For a redefines, this uses the resolved ``refers_to`` name and fetches 
    the offset.

::

        def offset( self, offset ):
            """:param offset: computed offset for this relative position.
            :return: named DDE element offset instead.
            """
            return self.refers_to.offset

..  py:method:: Redefines.totalSize()

    Returns the total size.

::

        def totalSize( self ):
            """:return: total size of this DDE including all children and occurs.
            """
            warnings.warn("totalSize method is deprecated", DeprecationWarning )
            return 0 

Note that ``01`` level items may have a REDEFINES. 
However, this can never meaningfully redefine anything. 
All  ``01`` level definitions start at an offset of 0 by definition. 
A copybook may include multiple ``01`` levels with REDEFINES clauses; 
an 01-level REDEFINES is irrelevant with respect to offset and size calculations.

..  py:class:: Successor

    The :py:class:`Successor`
    subclass does not depend on a named element, it depends on the immediate
    predecessor. It uses the contextual offset and size information provided by 
    the :py:func:`setSizeAndOffset` function.
            
::

    class Successor(Allocation):
        """More typical case is that the DDE follows its predecessor.
        It's not first in a group, nor is it a redefines.
        """
        def __init__( self, pred ):
            super().__init__()
            self.refers_to= pred
        def source( self ):
            return ""

..  py:method:: Successor.offset( offset )

    For a successor, we use the predecessor in the ``refers_to`` field
    to track down the offset of the predecessor. 

    This field's offset is predecessor offset + predecessor total size.

    The predecessor may have to do some thinking to get its total size or
    offset because of an Occurs Depending On situation.

::


        def offset( self, offset ):
            """:param offset: computed offset to this point.
            :return: computed offset
            """
            return offset
            
..  py:method:: Successor.totalSize()

    The total size of a field with occurs depending on requires a record with live data.
    Otherwise, the total size is trivially computed from the DDE definition.
    
::

        def totalSize( self ):
            """:return: total size of this DDE including all children and occurs.
            """
            warnings.warn("totalSize method is deprecated", DeprecationWarning )
            return self.dde().totalSize

..  py:class:: Group

    This subclass does not depend on a named element, it depends on the immediate
    parent group. It uses that contextual offset and size information provided by 
    the :py:func:`setSizeAndOffset` function.
            
::

    class Group(Allocation):
        """More typical case is that the DDE is first under a parent."""
        def __init__( self ):
            super().__init__()
            
        def source( self ):
            return ""
            
..  py:method:: Group.offset( offset )

    For the first item in a group, we use the group parent in the ``dde`` field
    to track down the offset of the group we're a member of. 

    This field's offset is the group offset, since this field is first in the group.

    The group may have to do some recursive processing to get its predecessor's total size or
    offset because of an Occurs Depending On situation.

::


        def offset( self, offset ):
            """:param offset: computed offset
            :return: computed offset
            """
            return offset
            
..  py:method:: Group.totalSize()

    This is essentially the same as the successor -- it's merely an item within a DDE.
    We just track the first items separately with this subclass so that they
    can refer to the parent to walk up the tree.

::

        def totalSize( self ):
            """:return: total size of this DDE including all children and occurs.
            """
            warnings.warn("totalSize method is deprecated", DeprecationWarning )
            return self.dde().totalSize

Occurs Strategy Hierarchy
--------------------------

There are three species of Occurs clauses.

-   Format 1. Fixed OCCURS with a number.

-   Format 2. Variable OCCURS DEPENDING ON with a number and a name.
    The :py:func:`resolver` function sorts out the reference.
    Similar to the way REDEFINES is handled.

-   Format 0. No Occurs, effectively OCCURS == 1 with no dimensionality issues.

This means that the ``number`` attribute must be derived EITHER from the definition
or a data record. For ODO, we need to bind the definition to a record.
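
The three formats can be sketched as one function. The name ``occurrences`` and
the use of a plain ``dict`` for the row are assumptions for this example; the
classes below use a separate strategy object per format.

..  code-block:: python

    def occurrences(occurs, row):
        """``occurs`` is None, an int, or the name of the field holding the count."""
        if occurs is None:
            return 1                    # Format 0: no OCCURS clause
        if isinstance(occurs, int):
            return occurs               # Format 1: fixed OCCURS, row is irrelevant
        return int(row[occurs])         # Format 2: OCCURS DEPENDING ON, row required

    row = {"ITEM-COUNT": "03"}
    counts = [
        occurrences(None, row),
        occurrences(5, row),
        occurrences("ITEM-COUNT", row),
    ]

Only the third case looks at the row, which is the binding of definition to
record that ODO requires.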
    
..  warning:: Dependencies between DDE and Attribute

    An OCCURS is a feature of the DDE.
    Data access, however, requires the :py:class:`schema.Attribute`.
    There's no **direct** linkage from :py:class:`cobol.defs.DDE` to :py:class:`schema.Attribute`. 
    There is linkage from :py:class:`schema.Attribute` to :py:class:`cobol.defs.DDE`. 
    
    It seems best to have Attribute independent of any particular 
    source. A schema may not necessarily come from COBOL.
    
    However.
    
    To facilitate a DDE-oriented dump of raw data or specific fields of a COBOL
    file, we include a :py:mod:`weakref` from the ``DDE`` to the ``Attribute`` created
    from that DDE. 
    
The overall top-most parent DDE associated with this object is ``self.dde().top()``.
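
The role of the weak references can be shown with a small stand-in class. The
``Node`` class is hypothetical. Note that in this module ``top`` is a stored
weak reference; the walk up the parent links below is an assumption made to keep
the sketch short.

..  code-block:: python

    import weakref

    class Node:
        """Hypothetical stand-in for a DDE with a weak link to its parent."""
        def __init__(self, name):
            self.name = name
            self.children = []
            self.parent = None              # will hold a weakref.ref()
        def add(self, child):
            child.parent = weakref.ref(self)
            self.children.append(child)
        def top(self):
            """Walk up the weak parent links to the top-most node."""
            node = self
            while node.parent is not None:
                node = node.parent()        # call the weakref to get the object
            return node

    record = Node("RECORD")
    group = Node("GROUP")
    field = Node("FIELD")
    record.add(group)
    group.add(field)

The parent holds strong references to its children, while each child only holds
a weak reference back, so the structure contains no strong reference cycle.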
    
..  py:class:: Occurs

    Abstract superclass for an Occurs clause.

::

    class Occurs:
        """No OCCURS clause present. Data from a row is irrelevant."""
        default= True
        static= True
        def __str__( self ):
            return ""
        def resolve( self, aDDE ):
            self.dde= weakref.ref(aDDE)
        def number( self, aRow ):
            return 1
            
..  py:class:: OccursFixed

    Occurs clause with a simple fixed number of occurrences.


::

    class OccursFixed( Occurs ):
        """OCCURS n TIMES. Data from a row is irrelevant."""
        default= False
        static= True
        def __init__( self, number ):
            self._number= int(number)
        def __str__( self ):
            return "OCCURS {0}".format(self._number)
        def resolve( self, aDDE ):
            super().resolve(aDDE)
        def number( self, aRow ):
            return self._number
            
..  py:class:: OccursDependingOn

    Occurs clause with a DEPENDING ON option.


::

    class OccursDependingOn( Occurs ):
        """OCCURS TO n TIMES DEPENDING ON name. Data from a row is required."""
        default= False
        static= False
        def __init__( self, name, limit ):
            self.name= name
            self.limit= limit
            self.refers_to= None
            self.attr= None
        def __str__( self ):
            return "OCCURS TO {0} DEPENDING ON {1}".format(self.limit, self.name)
        def resolve( self, aDDE ):
            super().resolve(aDDE)
            self.refers_to= aDDE.top().get( self.name )
        def number( self, aRow ):
            """aRow.cell( schema.get(self.name) ) should have a numeric value."""
            if self.attr is None:
                schema_dict= dict( (a.name, a) for a in aRow.sheet.schema )
                self.attr= schema_dict[self.name]
            ## logger.debug( "Getting {0} from {1}".format(self.attr,aRow) )
            value= aRow.cell( self.attr ).to_int()
            return value
            
..  py:class:: OccursDependingOnLimit

    This is an extension to OccursDependingOn. It limits the ODO clause to the defined
    upper bound. 

    If we have ``05 SOMETHING OCCURS 1 TO 5 TIMES DEPENDING ON X`` and
    the value of ``X`` is greater than 5, the maximum defined value, 5, is used.

    This is entirely hypothetical as a possible fix to a problem. It's probably a 
    Very Bad Idea, and should be removed.

    See http://pic.dhe.ibm.com/infocenter/ratdevz/v8r0/index.jsp?topic=%2Fcom.ibm.ent.cbl.zos.doc%2Ftopics%2FMG%2Figymch1027.htm

        "When the maximum length is used, it is not necessary to initialize the ODO object before the table receives data."
    
        "When TABLE-GROUP-1 is a receiving item, Enterprise COBOL moves the maximum number of character positions for it (450 bytes for TABLE-1 plus two bytes for ODO-KEY-1). Therefore, you need not initialize the length of TABLE-1 before moving the SEND-ITEM-1 data into the table."

    Based on this (and bad data seen in the wild) we deduce that this upper limit
    clamping **may** be a language  feature.

::

    class OccursDependingOnLimit( OccursDependingOn ):
        """OCCURS TO n TIMES DEPENDING ON name. Data is required.
        This will clamp the result at the given upper limit.
        """
        def number( self, aRow ):
            value= super().number( aRow )
            if value > self.limit: 
                return self.limit
            return value
 
DDE Class
--------------

..  py:class:: DDE

    The :py:class:`DDE` class itself defines a single element (group or elementary) of a
    record.  There are several broad areas of functionality for a DDE:
    (1) construction, (2) reporting and decoration, (3) processing record data.

    The class definition includes the attributes determined at
    parse time, attributes added during decoration time and
    attributes used during decoration processing.

    As noted above, Group-level vs. Elementary-level *could* be separate subclasses of DDE.
    They aren't right now, since group-level items can be used in an application program
    like elementary items. 

    A group-level item contains subsidiary DDE's and has no PICTURE clause.
    An elementary-level DDE is defined by having a PICTURE clause.

    All group-level DDE's are effectively string-type data.
    An elementary-level DDE with a numeric PICTURE is numeric-type data.  It can
    be usage display or usage computational.  An elementary-level DDE with a string PICTURE is string-type data.

    Occurs and Redefines can occur at any level.  

    Each entry is defined by the following attributes:

    ..  py:attribute:: level
    
        COBOL level number 01 to 49, 66 or 88.

    ..  py:attribute:: name

        COBOL variable name

    ..  py:attribute:: occurs

        An instance of :py:class:`Occurs` that provides
        the number of occurrences. The default is 1, which we call "format 0".
        There are two defined formats: format 1 has a fixed number of occurrences;
        format 2 is the Occurs Depending On with a variable number of occurrences.

    ..  py:attribute:: picture

        the exploded picture clause, with ()'s expanded

    ..  py:attribute:: initValue

        any initial value provided

    ..  py:attribute:: allocation

        an instance of :py:class:`Allocation` used to compute the offset and total size.

    ..  py:attribute:: usage

        an instance of :py:class:`Usage` to delegate data conversion properly.
        The actual conversion is handled by the workbook.

    ..  py:attribute:: children

        the list of contained fields within a group

    ..  py:attribute:: parent

        A weakref to the immediate parent DDE

    ..  py:attribute:: top

        A weakref to the overall record definition DDE.
    
    The following decorations are applied by functions that traverse the DDE structure.

    ..  py:attribute:: sizeScalePrecision

        ``Picture`` namedtuple with details derived from parsing the PICTURE clause

    ..  py:attribute:: size
    
        the size of an individual occurrence

    The following features may have to be computed lazily if there's an Occurs
    Depending On. Otherwise they can be computed eagerly.

    ..  py:attribute:: variably_located

        Variably Located if this element or any child has Occurs Depending On.
        Otherwise (no ODO) the DDE is Statically Located.
        Actually, only elements **after** the ODO element are variably located. But it's simpler
        to treat the whole record as variable.

    ..  py:attribute:: offset

        offset to this field from start of record.

    ..  py:attribute:: totalSize

        overall size of this item, including all occurrences.

    Additionally, this item -- in a way -- breaks the dependencies between
    a :py:class:`schema.Attribute` and a DDE. It's appropriate for an Attribute
    to depend on a DDE, but the reverse isn't proper. However, we support a DDE
    referring to an attribute because it can be useful.

    ..  py:attribute:: attribute

        weakref to the :py:class:`schema.Attribute` built from this DDE.
    
::

    class DDE:
        """A Data Description Entry.
        """
        def __init__( self, level, name, usage=None, occurs=None, redefines=None, 
            initValue=None, pic=None, sizeScalePrecision=None ):
            """Build this with the results of parsing the various clauses.
            """
            self.level= level
            self.name= name
            self.occurs= occurs if occurs is not None else Occurs() 
            self.picture= pic # source, prior to parsing, below.
            self.allocation= redefines # Redefines or Successor or Group
            self.usage= usage
            self.initValue= initValue
            
            # Parsed picture information
            self.sizeScalePrecision= sizeScalePrecision

            # Relationships
            self.indent= 0 
            self.children= []
            self.top= None # must be a weakref.ref()
            self.parent= None # must be a weakref.ref()

            # Derived property from the picture clause
            self.size= self.usage.size()            
            
            # Because of ODO, these cannot always be computed statically.
            self.offset= 0 # self.allocation.refers_to.offset + self.allocation.refers_to.total_size
            self.totalSize= 0 # self.size * self.occurs.number(aRow)
            
            # Derived attribute created from this DDE.
            self.attribute= None
                        
        def __repr__( self ):
            # Build a real list: in Python 3 a bare map() displays as "<map object ...>".
            return "{!s} {!s} {!s}".format( self.level, self.name, [str(c) for c in self.children] )
        def __str__( self ):
            oc= str(self.occurs)
            pc= " PIC {0}".format(self.picture) if self.picture else ""
            uc= " USAGE {0}".format( self.usage.source() ) if self.usage.source() else ""
            rc= self.allocation.source()
            return "{:<2s} {:<20s}{!s}{!s}{!s}{!s}.".format( self.level, self.name, rc, oc, pc, uc )

Construction occurs in these general steps: 

(1) the DDE is created,

(2) source attributes are set, 

(3) the DDE is decorated with size, offset and other details.

(4) the DDE is transformed into a :py:class:`schema.Attribute`.
    
::

        def addChild( self, aDDE ):
            """Add a substructure to this DDE.
        
            This is used by RecordFactory to assemble the DDE.
            """
            if aDDE.allocation:
                # has a redefines, leave it alone
                pass
            else:
                if self.children:
                    aDDE.allocation= Successor( self.children[-1] )
                else:
                    aDDE.allocation= Group()
            aDDE.top= self.top # Already a weakref
            aDDE.parent= weakref.ref(self)
            aDDE.indent= self.indent+1
            self.children.append( aDDE )

This iterator does a pre-order depth-first
traversal of the subtree. This provides a single, flat list of all elements.

..  py:method:: DDE.__iter__( )

::

        def __iter__( self ):
            yield self
            for c in self.children:
                for dde in c:
                    yield dde
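
As a minimal, self-contained sketch of the traversal order, the following uses
a hypothetical ``Node`` class (not part of this module) whose ``__iter__`` has the
same shape as the method above. The ``yield from`` form is assumed to be an acceptable
shorthand for the nested loop.

..  code-block:: python

    class Node:
        """Hypothetical stand-in for a DDE: just a name and children."""
        def __init__(self, name, *children):
            self.name = name
            self.children = list(children)

        def __iter__(self):
            # Pre-order: this node first, then each subtree, left to right.
            yield self
            for child in self.children:
                yield from child

    record = Node(
        "REC",
        Node("GROUP-A", Node("FIELD-1"), Node("FIELD-2")),
        Node("FIELD-3"),
    )
    # Flatten the hierarchy into a single list of names.
    names = [node.name for node in record]

Here ``names`` lists the group before its members, and every member of ``GROUP-A``
before ``FIELD-3``.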

The process of scanning a record involves methods to locate a specific field,
set the occurrence index of a field, and pick bytes of a record input buffer.

We work with ``"."``\ -separated path names through the hierarchy. This doesn't work
well in the presence of OCCURS clauses and indexes. For that, we need more 
complex navigation.

::

        def pathTo( self ):
            """Return the complete "."-delimited path to this DDE."""
            if self.parent: return self.parent().pathTo() + "." + self.name
            return self.name

        def getPath( self, path ):
            """Given a "."-punctuated Path, locate the field.
            COBOL uses "of" for this.
            """
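            context= self.top() # Start at the top, in case we're not the top.
            for name in path.split('.'):
                # Narrow the search to the subtree found so far.
                found= search( context, name )
                if found is None:
                    raise AttributeError( "Field {!r} unknown in this record".format(name) )
                context= found
            return context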

        def get( self, name ):
            """Find the named field, and return the relevant substructure.
            :param: name of the DDE element
            :return: DDE Object
            :raises: AttributeError if field not found
            """
            found= search( self.top(), name )
            if found:
                return found
            raise AttributeError( "Field {!r} unknown in this record".format(name) )
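
The path-building idea can be sketched in isolation. The ``Item`` class below is
hypothetical (it is not the module's ``DDE``), and it assumes the same convention
as above: ``parent`` is either ``None`` or a ``weakref.ref()`` that must be called
to recover the parent object.

..  code-block:: python

    import weakref

    class Item:
        """Hypothetical stand-in for a DDE with a weakly referenced parent."""
        def __init__(self, name):
            self.name = name
            self.parent = None  # becomes a weakref.ref() once attached
            self.children = []

        def add(self, child):
            # The parent holds the child strongly; the child only refers back weakly.
            child.parent = weakref.ref(self)
            self.children.append(child)
            return child

        def path_to(self):
            # Call the weakref to recover the parent before recursing upward.
            if self.parent is not None:
                return self.parent().path_to() + "." + self.name
            return self.name

    top = Item("CUSTOMER-REC")
    address = top.add(Item("ADDRESS"))
    city = address.add(Item("CITY"))
    path = city.path_to()

The weak back-reference avoids a parent-child reference cycle, but it only works
while something else (here the ``top`` variable) keeps the parent alive.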
        
Work with Occurs Depending On. The :py:attr:`variably_located` question
may only apply to top-level (01, ``parent=None``) DDE's.

::

        @property
        def variably_located( self ):
            if self.occurs.static:
                return any( c.variably_located for c in self.children )
            return True # Not static
        
        def setSizeAndOffset( self, aRow ):
            setSizeAndOffset( self, aRow )
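
To illustrate why the question is best asked of the top-level item, here is a small
sketch with hypothetical classes (``FixedOccurs``, ``DependingOccurs`` and ``Entry``
are illustrations, not the module's definitions). It assumes only that an occurs
object exposes a ``static`` flag, as the property above does.

..  code-block:: python

    class FixedOccurs:
        static = True    # count is known without data

    class DependingOccurs:
        static = False   # count comes from each row

    class Entry:
        """Hypothetical stand-in for a DDE."""
        def __init__(self, name, occurs=None, children=()):
            self.name = name
            self.occurs = occurs if occurs is not None else FixedOccurs()
            self.children = list(children)

        @property
        def variably_located(self):
            # Variable if this item has an ODO, or if any descendant does.
            if self.occurs.static:
                return any(c.variably_located for c in self.children)
            return True

    static_rec = Entry("REC-A", children=[Entry("F1"), Entry("F2")])
    odo_rec = Entry("REC-B", children=[
        Entry("ITEM-COUNT"),
        Entry("ITEMS", occurs=DependingOccurs()),
    ])
    answers = (
        static_rec.variably_located,
        odo_rec.variably_located,
        odo_rec.children[0].variably_located,
    )

The record ``REC-B`` reports ``True``, while its ``ITEM-COUNT`` child reports ``False``
because the property only looks downward; the record-level answer is the one to use.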
            
DDE Preparation Processing
===========================

There are a number of classes (and functions) to support 
parsing. We need to transform the DDE's to 
a :py:class:`schema.Schema` which is a flattened 
list of :py:class:`schema.Attribute` objects for each element in the DDE.

There are a number of processing steps which are applied to the overall DDE.
These functions depend on the DDE's ability to do its own recursive pre-order 
traversal as an iterator.

-   :py:func:`source`. Dumps canonical source.

-   :py:func:`report`. Reports on the compiled results.

-   :py:func:`search`. Searches through the hierarchy.

-   :py:func:`resolver`. Resolves REDEFINES and DEPENDING ON.

-   :py:func:`setDimensionality`. Propagates dimensionality down from group
    to elementary items.

Additionally, we want to prepare for size and offset calculation.
Sometimes, there are no ODO's and we can compute these statically.
Other times, there's an ODO and we can't compute size or offset without
data.

..  py:function:: report( top )

    Report the structure of this DDE in COBOL-like notation enriched with
    offsets and sizes.

::

    def report( top ):
        """Report on copybook structure."""
        for aDDE in top:
            if aDDE.variably_located:
                pass # Nothing special (yet)
            if aDDE.sizeScalePrecision and not aDDE.sizeScalePrecision.alpha:
                final, alpha, length, scale, precision, signed, decimal = aDDE.sizeScalePrecision
                nSpec= '{:d}.{:d}'.format( length, precision )
            else:
                nSpec= ""
            print( "{:<65s} {:3d} {:3d} {:5s}".format(aDDE.indent*'  '+str(aDDE), 
                aDDE.offset, aDDE.size, nSpec) )

..  py:function:: source( top )

    Print a version of the source.

::

    def source( top ):
        """Display a canonical version of the source from copybook parsing."""
        for aDDE in top:
            print( aDDE.indent*'  '+str(aDDE) )

..  py:function:: search( top, aName )

    Search the structure for a given name.
    This returns the found value or ``None``.

::

    def search( top, aName ):
        """Search down through the copybook structure."""
        for aDDE in top:
            if aDDE.name == aName: 
                return aDDE
        
..  py:function:: resolver( top )

    Apply :py:meth:`Allocation.resolve` throughout the structure.
    For each ``REDEFINES`` or ``OCCURS DEPENDING ON`` clause, locate the DDE to which
    it refers, saving repeated searches.

::

    def resolver( top ):
        """For each DDE.allocation which is based on REDEFINES, locate the referenced
        name.  For each DDE.occurs which has OCCURS DEPENDING ON, locate the
        referenced name.
        """
        for aDDE in top:
            aDDE.allocation.resolve( aDDE )
            aDDE.occurs.resolve( aDDE )

For allocation, we have three relationships:
Successor, Group, and Redefines. 
The first two don't involve names. Only Redefines involves a name that we want
to resolve before proceeding.

We *could* rely on memoization and do name resolution lazily as needed.

For occurs, we also have three versions:
Default (effectively 1), Fixed, and Depends On.
The first two don't involve names. Only Depends On involves a name that we want
to resolve before proceeding. 
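
The three occurs variants can be sketched as follows. The class names here are
hypothetical and are not the module's own ``Occurs`` classes. The sketch assumes a row
can be treated as a plain dict keyed by field name, and it skips the name-resolution
step by looking the name up directly.

..  code-block:: python

    class DefaultCount:
        """No OCCURS clause: effectively one occurrence."""
        default = True
        static = True
        def number(self, row=None):
            return 1

    class FixedCount:
        """OCCURS n TIMES: the count is known from the source alone."""
        default = False
        static = True
        def __init__(self, count):
            self.count = count
        def number(self, row=None):
            return self.count

    class DependingCount:
        """OCCURS ... DEPENDING ON name: the count comes from the row's data."""
        default = False
        static = False
        def __init__(self, name):
            self.name = name
        def number(self, row):
            # Assumption: the row is a dict keyed by field name.
            return int(row[self.name])

    row = {"ITEM-COUNT": "3"}
    counts = [
        DefaultCount().number(),
        FixedCount(5).number(),
        DependingCount("ITEM-COUNT").number(row),
    ]

Only the last variant needs a row, which is why it is the only one with a name to resolve.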

       
..  py:function:: setDimensionality( top )

    Set Dimensionality for each DDE.
    This will be the sequence of the non-default ``OCCURS`` clauses that apply to each item. 
    This is the item's own clause, plus any belonging to the parents of the item.

    The order is from top down to the elementary item.

::

    def setDimensionality( top ):
        def dimensions( aDDE ):
            """:returns: A tuple of parental DDE's with non-1 occurs clauses."""
            if aDDE.occurs.default:
                this_level= tuple()
            else:
                this_level= tuple( (aDDE,) )
            if aDDE.parent:
                return dimensions( aDDE.parent() ) + this_level
            else:
                # Reached the top!
                return this_level
        for aDDE in top:
            aDDE.dimensionality = dimensions( aDDE )
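
A small worked example may help. The ``Entry`` class below is hypothetical: it stores
the occurrence count as a plain integer (``1`` meaning no ``OCCURS`` clause) and keeps a
direct parent reference instead of the ``weakref.ref()`` used by the module.

..  code-block:: python

    class Entry:
        """Hypothetical stand-in for a DDE."""
        def __init__(self, name, occurs=1, parent=None):
            self.name = name
            self.occurs = occurs    # 1 stands for "no OCCURS clause"
            self.parent = parent    # plain reference, for brevity

    def dimensions(entry):
        """Collect the entries with non-default occurs, from the top down."""
        this_level = () if entry.occurs == 1 else (entry,)
        if entry.parent is not None:
            return dimensions(entry.parent) + this_level
        return this_level

    rec = Entry("REC")
    months = Entry("MONTHS", occurs=12, parent=rec)
    regions = Entry("REGIONS", occurs=4, parent=months)
    amount = Entry("AMOUNT", parent=regions)

    dims = [(d.name, d.occurs) for d in dimensions(amount)]

The elementary item ``AMOUNT`` has no ``OCCURS`` of its own, but it inherits two
dimensions from its parents, outermost first.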

Set Size and Offset
====================

Sometimes, we can calculate field sizes and offsets statically.
If ``not top.variably_located`` then the structure is entirely static.
This function walks the structure, setting total size and offset.

Other times the locations are variable. If ``top.variably_located`` then
the structure has at least one ODO.

..  todo:: refactor setSizeAndOffset()

    Refactor :py:func:`setSizeAndOffset` into the :py:class:`Allocation` class methods 
    to remove isinstance() nonsense.

..  py:function:: setSizeAndOffset( aDDE, aRow=None, base=0 )

    This can be used with or without a row. The common case of all statically
    located items does not require a row. 
    It must be used with a row for the case of OCCURS Depending On.


..  todo:: Fix performance.

    This is called once per row: it needs to be simpler and 
    faster. Some refactoring can eliminate the if statements.

::  

    def setSizeAndOffset( aDDE, aRow=None, base=0 ):
        """Given a top-level DDE, a Row of data (or None), 
        assign offset, size, totalSize.
        Also, in the case of an 88-level item, copy the parent USAGE to the child.
        
        size is the instance size. For non-group items, it comes from the
        PIC and OCCURS. For group items it's the sum of the children's sizes.
        
        totalSize = size * occurs.
        """
        # Pre-order: handle this item first.
        if isinstance(aDDE.allocation, Redefines):
            # REDEFINES simply copies details from the other item.
            # TODO: Push into Redefines
            # base= aDDE.offset= allocation.offset( base )
            base= aDDE.offset= aDDE.allocation.refers_to.offset
            aDDE.totalSize= aDDE.allocation.refers_to.totalSize
        else:
            # TODO: Push into Allocation as the generic rule.
            # base= aDDE.offset= allocation.offset( base )
            aDDE.offset= aDDE.allocation.offset(base)
            aDDE.totalSize= 0
        # Initialize the size -- it may get updated below for group-level items.
        aDDE.size= aDDE.usage.size()            
                
        ## logger.debug( "{0} Enter {1} offset={2}".format(">"*aDDE.indent, aDDE, aDDE.offset) )
        
        # For non-picture group-level, handle all of the children.
        for child in aDDE.children:
            setSizeAndOffset( child, aRow, base )
            base = child.offset + child.totalSize
            if isinstance(child.allocation, Redefines):
                # TODO: Push into Redefines: does nothing.
                # aDDE.size+= allocation.size()
                pass
            else:
                # TODO: Push into Allocation as the generic rule.
                # aDDE.size+= allocation.size()
                if child.level == '88':
                    pass
                else:
                    aDDE.size+= child.totalSize
        
        # Collect final results from handling the children.
        aDDE.totalSize = aDDE.size * aDDE.occurs.number(aRow)
        
        ## logger.debug( "{0} Exit  {1} size={2}*{3}={4}".format(
        ##     "<"*aDDE.indent, aDDE, aDDE.size, aDDE.occurs.number(aRow), aDDE.totalSize) )
       
This function can be used during DDE load time if there are no Occurs Depending On.
Otherwise, it must be used for each individual row which is read.
See :py:class:`sheet.LazyRow`.
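
The arithmetic can be followed in a reduced sketch. Everything here is hypothetical
and simplified: the ``Field`` class is not the module's ``DDE``, ``occurs`` is a plain
function from a row to a count, the row is a dict, and ``REDEFINES`` and 88-level
items are left out.

..  code-block:: python

    class Field:
        """Hypothetical stand-in for a DDE; ``occurs`` maps a row to a count."""
        def __init__(self, name, size=0, occurs=None, children=()):
            self.name = name
            self.elementary_size = size   # 0 for group items
            self.occurs = occurs if occurs is not None else (lambda row: 1)
            self.children = list(children)
            self.offset = self.size = self.total_size = 0

    def set_size_and_offset(field, row=None, base=0):
        # Simplification: every item directly follows its predecessor.
        field.offset = base
        field.size = field.elementary_size
        for child in field.children:
            set_size_and_offset(child, row, base)
            base = child.offset + child.total_size
            field.size += child.total_size
        field.total_size = field.size * field.occurs(row)

    record = Field("REC", children=[
        Field("ITEM-COUNT", size=2),
        Field("ITEM", occurs=lambda row: int(row["ITEM-COUNT"]), children=[
            Field("CODE", size=3),
            Field("QTY", size=5),
        ]),
        Field("TRAILER", size=4),
    ])

    set_size_and_offset(record, {"ITEM-COUNT": "3"})
    layout_3 = {f.name: (f.offset, f.total_size) for f in record.children}

    set_size_and_offset(record, {"ITEM-COUNT": "1"})
    layout_1 = {f.name: (f.offset, f.total_size) for f in record.children}

With three occurrences, ``ITEM`` occupies 24 bytes and ``TRAILER`` starts at offset 26.
With one occurrence, ``TRAILER`` moves to offset 10, so the calculation has to be
repeated for each row.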