Showing posts from June, 2017

#processing @Microsoft #office #Excel files with @TheASF POI (part II)

     Apache POI's OPCPackage abstract class represents a container that can store multiple data objects. It is central to processing Excel (*.xlsx) files. To load such a file from an InputStream instance, we only need its static open method. We can then read the file through the XSSFWorkbook class, a high-level representation of a SpreadsheetML workbook. From an XSSFWorkbook, we can get any existing XSSFSheet instances within the workbook, subdivide each XSSFSheet into rows, and analyze the cell data within those rows. In general, given certain assumptions about the format of the Excel document, we can extract data as text from a cell and feed it into any number of business processes.
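
     To make the "container of multiple data objects" idea concrete, here is a minimal sketch that uses only the JDK's java.util.zip classes rather than POI itself. It relies on the fact that an *.xlsx file is a ZIP-based OPC package whose entries are the parts of the workbook. The class OpcPartLister and its methods listParts and samplePackage are hypothetical names introduced for illustration, and the in-memory sample is a stand-in, not a valid workbook.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class (not part of Apache POI).
public class OpcPartLister {

    // Returns the names of all non-directory entries (parts) in a ZIP-based package.
    public static List<String> listParts(InputStream in) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.isDirectory()) {
                    names.add(entry.getName());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    // Builds a tiny stand-in package in memory. The part names mirror those
    // commonly found in an .xlsx file, but the contents are placeholders,
    // so this is NOT a workbook that Excel or POI could open.
    public static byte[] samplePackage() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        String[] partNames = {
            "[Content_Types].xml",
            "xl/workbook.xml",
            "xl/worksheets/sheet1.xml"
        };
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (String name : partNames) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write("<placeholder/>".getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream(samplePackage());
        for (String part : listParts(in)) {
            System.out.println(part);
        }
    }
}
```

     In practice we would not walk the ZIP entries by hand: OPCPackage and XSSFWorkbook interpret these parts for us. The sketch only shows what kind of container sits underneath.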

     In the Java code excerpt below, we assume the Excel (*.xlsx) file is available as an InputStream.

    public Iterator<Row> apply(InputStream inputStream) {

        try(OPCPackage pkg =…