8.4.4. XLS Workbook¶
import logging
import pprint
from stingray.workbook.base import Workbook
import stingray.sheet
import stingray.cell
import xlrd
-
class
workbook.xls.
XLS_Workbook
¶ Extract sheets, rows and cells from an XLS format file.
This definition of a workbook wraps
xlrd
so that it fits the Stingray framework. We’ll use propercell.Cell
subclass instances instead of the defaultxlrd.Cell
values thatxlrd
normally creates.In addition to the superclass attributes, some additional unique attributes are introduced here.
-
wb
¶ A xlrd workbook for this file.
-
datemode
¶ The XLS date mode for this workbook. This is required for converting floating-point values to dates and dates to floating-point values.
-
class XLS_Workbook( Workbook ):
"""Uses ``xlrd``."""
def __init__( self, name, file_object=None, **kw ):
"""Prepare the workbook for reading.
:param name: File name
:param file_object: Optional file-like object. If omitted, the named file is opened.
The kw arguments are passed to :py:func:`xlrd.open_workbook`.
"""
super().__init__( name, file_object )
if self.file_obj:
self.wb= xlrd.open_workbook( self.name, file_contents=self.file_obj.read(), **kw )
else:
self.wb= xlrd.open_workbook( self.name, **kw )
self.datemode= self.wb.datemode
-
XLS_Workbook.
sheets
()¶ Return the list of sheets for this workbook.
def sheets( self ):
"""List of sheet names."""
return self.wb.sheet_names()
We can build an eager sheet.Row
or a sheet.LazyRow
from the available data.
The eager Row includes the conversions. The LazyRow defers the conversions
until the callback to XLS_Workbook.row_get()
.
-
XLS_Workbook.
rows_of
(sheet)¶ Iterate through rows of the given sheet.
def rows_of( self, sheet ):
"""An iterator over all rows of the given sheet."""
self.sheet= self.wb.sheet_by_name(sheet.name)
for n in range(self.sheet.nrows):
data = self.sheet.row(n)
row = stingray.sheet.Row( sheet, *(self.cell(col) for col in data) )
yield row
-
XLS_Workbook.
row_get
(row, attribute)¶ Low-level get of a particular attribute from the given row.
def row_get( self, row, attribute ):
"""Create a Cell from the row's data."""
return row[attribute.position]
-
XLS_Workbook.
cell
(row, xlrd_cell)¶ In
XLS_Workbook.rows_of()
we built a row eagerly. That way, returning an individual Cell is easy.Convert a single
xlrd.Cell
to a proper subclass ofcell.Cell
def cell( self, xlrd_cell ):
if xlrd_cell.ctype == xlrd.XL_CELL_EMPTY:
return stingray.cell.EmptyCell('', self)
elif xlrd_cell.ctype == xlrd.XL_CELL_TEXT:
return stingray.cell.TextCell( xlrd_cell.value, self )
elif xlrd_cell.ctype == xlrd.XL_CELL_NUMBER:
return stingray.cell.NumberCell( xlrd_cell.value, self )
elif xlrd_cell.ctype == xlrd.XL_CELL_DATE:
return stingray.cell.FloatDateCell( xlrd_cell.value, self )
elif xlrd_cell.ctype == xlrd.XL_CELL_BOOLEAN:
return stingray.cell.BooleanCell( xlrd_cell.value, self )
elif xlrd_cell.ctype == xlrd.XL_CELL_ERROR:
return stingray.cell.ErrorCell(
xlrd.error_text_from_code[xlrd_cell.value], self )
elif xlrd_cell.ctype == xlrd.XL_CELL_BLANK:
return stingray.cell.EmptyCell('', self)
else:
raise ValueError( "Damaged Workbook" )
For proper date conversions, we have
two methods that leverage the datemode to properly convert dates
and times in .XLS
workbooks.
-
Workbook.float_to_date( value ):
def float_to_date( self, value ):
try:
dt= xlrd.xldate_as_tuple(value, self.datemode)
except xlrd.xldate.XLDateAmbiguous as e:
ex= ValueError( "Ambiguous Date: {0!r}".format(value) )
raise ex from e
return datetime.datetime(*dt)
-
Workbook.date_to_float( value ):
def date_to_float( value ):
timetuple= self.value.timetuple()[:6]
xl= xlrd.xldate.xldate_from_datetime_tuple(
timetuple,
self.datemode)
return xl