Dynamic Excel import loop with TARGIT InMemory

In this example, I loop through a collection of Excel files in a directory and load the data from each Excel file into one TARGIT InMemory table. 

I use two data drivers for this example. FileList to create a table with the Excel files and then the Excel data driver to load each Excel file. 

The Excel data driver is linked to a single Excel file. Therefore I have to create and close the Excel driver for each file in the loop.

The directory with the source Excel files is defines in the variable "filepath" in line 7, so you can change the path and run the script.

LOAD ASSEMBLY 'TARGIT.FileList.dll' //server
LOAD ASSEMBLY 'TARGIT.Excel.dll' //server

DECLARE @filepath AS STRING
DECLARE @fullpath AS STRING

SET @filepath = 'C:\Transactions\Excel'

DATASOURCE [Filelist] = DOTNET CONNECTION 'TARGIT.FileList.FileListConnection' 'loadasstring=false;'
IMPORT [Excelfiles] = [Filelist].{SELECT * FROM [@@filepath] WHERE filename='*.xlsx' AND directory='current'}

WHILE Excelfiles.EOF=FALSE /* Loop over all Excel files */
SET @fullpath = Excelfiles.fullpath
PRINT 'Loading the file ' + @fullpath
DATASOURCE [Excelfile] = DOTNET CONNECTION 'TARGIT.Excel.ExcelConnection' 'ignoreemptyvalues=false;datasource=' + @fullpath + ';detectionrowscount=100;skiplines=0;hasheaders=True;loadasstring=true'
IMPORT [STAGE_Excel_Person] = Excelfile.{SELECT * FROM []}
CLOSE Excelfile
MoveNext Excelfiles
LOOP

// Remove list of Excelfiles from final database
Drop Excelfiles

SAVE

 

Alternative script - Adding filename to the data:

In this example, I add an extra step "STAGE_Excel_Person_001" where I load the data from Excel.

In the loop, I've added another variable "filename" using TARGIT.Excel.filename property.

Then I combine the data from "STAGE_Excel_Person_001" table with the variable "filename" - and drops the "STAGE_Excel_Person_001" table.

LOAD ASSEMBLY 'TARGIT.FileList.dll' //server
LOAD ASSEMBLY 'TARGIT.Excel.dll' //server

DECLARE @filepath AS STRING
DECLARE @filename AS STRING
DECLARE @fullpath AS STRING

SET @filepath = 'C:\Transactions\Excel'

DATASOURCE [Filelist] = DOTNET CONNECTION 'TARGIT.FileList.FileListConnection' 'loadasstring=false;'
IMPORT [Excelfiles] = [Filelist].{SELECT * FROM [@@filepath] WHERE filename='*.xlsx' AND directory='current'}

WHILE Excelfiles.EOF=FALSE /* Loop over all Excel files */ 
  SET @fullpath = Excelfiles.fullpath
  SET @filename = Excelfiles.filename
  PRINT 'Loading the file ' + @filename
  DATASOURCE [Excelfile] = DOTNET CONNECTION 'TARGIT.Excel.ExcelConnection' 'ignoreemptyvalues=false;datasource=' + @fullpath + ';detectionrowscount=100;skiplines=0;hasheaders=True;loadasstring=true'
  IMPORT [STAGE_Excel_Person_001] = Excelfile.{
      SELECT * FROM []
      } 
  CLOSE Excelfile
  
  IMPORT [STAGE_Excel_Person] = ME.{
      SELECT '@@filename' AS [SourceFilename], * FROM STAGE_Excel_Person_001
  }
  
  DROP [STAGE_Excel_Person_001]
  
  MoveNext Excelfiles
LOOP

// Remove list of Excelfiles from final database
Drop Excelfiles

SAVE

Please note:

Each Excel file is expected to have the same data structure. You could consider to add try catch to the loop to avoid the script from stopping, if a file has a different data structure.

Data is loaded as strings, so we expect you to cast data with SAFECAST afterwards, but that is not the focus in this article.

I've added "SAVE" in the end of the script, so it can run on it's own. But you might want to leave SAVE out, if you copy this script into an existing ETL flow.

Demo data:

Below this article you will find "excel_example_data.zip" containing two small Excel files (file1.xlsx and file2.xlsx) that you can use for testing.

Was this article helpful?
1 out of 1 found this helpful

Comments

6 comments
  • Hi Jesper

    Thanks for the answer, it works perfect :-)

    Regards Claus

    0
  • Hi Claus,

    Great question. 
    I tested adding a variable directly to the Excel load script, but the Excel driver does not support this.

    So I added another step to the original script using a temp table ([STAGE_Excel_Person_001]). I've also adjusted the variables, now using the Excel.filename property.

    The script is also added to the main article above, but here you have the adjusted script:

    LOAD ASSEMBLY 'TARGIT.FileList.dll' //server
    LOAD ASSEMBLY 'TARGIT.Excel.dll' //server

    DECLARE @filepath AS STRING
    DECLARE @filename AS STRING
    DECLARE @fullpath AS STRING

    SET @filepath = 'C:\Transactions\Excel'

    DATASOURCE [Filelist] = DOTNET CONNECTION 'TARGIT.FileList.FileListConnection' 'loadasstring=false;'
    IMPORT [Excelfiles] = [Filelist].{SELECT * FROM [@@filepath] WHERE filename='*.xlsx' AND directory='current'}

    WHILE Excelfiles.EOF=FALSE /* Loop over all Excel files */ 
      SET @fullpath = Excelfiles.fullpath
      SET @filename = Excelfiles.filename
      PRINT 'Loading the file ' + @filename
      DATASOURCE [Excelfile] = DOTNET CONNECTION 'TARGIT.Excel.ExcelConnection' 'ignoreemptyvalues=false;datasource=' + @fullpath + ';detectionrowscount=100;skiplines=0;hasheaders=True;loadasstring=true'
      IMPORT [STAGE_Excel_Person_001] = Excelfile.{
          SELECT * FROM []
          } 
      CLOSE Excelfile
      
      IMPORT [STAGE_Excel_Person] = ME.{
          SELECT '@@filename' AS [SourceFilename], * FROM STAGE_Excel_Person_001
      }
      
      DROP [STAGE_Excel_Person_001]
      
      MoveNext Excelfiles
    LOOP

    // Remove list of Excelfiles from final database
    Drop Excelfiles

    SAVE

    0
  • Jesper da Silva Endelt - If u can show how to loop through a collection of Json-files, that would be much appreciated :)

    0
  • Great. Thank you for the update.

    0
  • Hi Jesper

    is there any way you can get the filename in the table as well. Very handy when you need to figure out where a line is originating from. Also if you have several spretsheets with same data but different month ie.

     

    0
  • Nice post Jesper.

    Just created a new In Memory Project, pasted in the script and changed the "filepath".

    Runs out-of-the-box and my 21 files are imported into one table. Thanks :-)

    0

Please sign in to leave a comment.