The Origin Forum
File Exchange
Try Origin for Free
The Origin Forum
Home | Profile | Register | Active Topics | Members | Search | FAQ | Send File to Tech support
Username:
Password:
Save Password
Forgot your Password? | Admin Options

 All Forums
 Origin Forum
 Origin Forum
 Import Multiple ASCII files in batch
 New Topic  Reply to Topic
 Printer Friendly
Author Previous Topic Topic Next Topic Lock Topic Edit Topic Delete Topic New Topic Reply to Topic

AKazak

Russia
1205 Posts

Posted - 10/19/2022 :  03:21:11 AM  Show Profile  Edit Topic  Reply with Quote  View user's IP address  Delete Topic
OriginPro 2022b (64-bit) SR1 9.9.5.167
Windows 7 Pro SP1 x64

Greetings!

I have few thousands of CSV files to import on a single worksheet. I picked 10 sample files, set impASC GUI up, saved the settings as a scheme.

Now, how do I import the whole file set using saved impASC scheme?
Can you share an example, please?

Thank you.

---
Andrey

snowli

USA
1426 Posts

Posted - 10/19/2022 :  09:18:20 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
The new Conenct Multiple Files can only import them into same book, not same sheet.

So I would still recommend using Import CSV.
Start a new book.
In ImpSCV dialog, u can set 1st File Import Mode to Replace Existing Data.
Set Multi-File (Except 1st) Import Mode to be Start New Columns.

U can import them into same sheet.

I am not sure why u need to 1st import 10 files and save theme first. Maybe u have other settings?

Thanks, Snow


Go to Top of Page

AKazak

Russia
1205 Posts

Posted - 10/21/2022 :  11:34:21 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
quote:
Originally posted by snowli

The new Conenct Multiple Files can only import them into same book, not same sheet.

So I would still recommend using Import CSV.
Start a new book.
In ImpSCV dialog, u can set 1st File Import Mode to Replace Existing Data.
Set Multi-File (Except 1st) Import Mode to be Start New Columns.

U can import them into same sheet.

I am not sure why u need to 1st import 10 files and save theme first. Maybe u have other settings?

Thanks, Snow




Dear Snow,

impASC does the same as impCSV.
However, if I select 2.5k CSV files it takes a huge time to pre-read all the files even before GUI window with settings appear.
Even if I wait for long enough, the push import button, then I see the following animation for even longer time:


Is there a way to call impASC or impCSV to ready thousands of file and use a user-saved scheme to parse the data from a LT script to maximizу the import speed?

Can you share a good example, please?

---
Andrey

Edited by - AKazak on 10/21/2022 11:48:27 AM
Go to Top of Page

snowli

USA
1426 Posts

Posted - 10/21/2022 :  3:37:19 PM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
Hi Andrey,

Try the following with each setting included so u don't have to save a theme file for it. It imports all T*.csv under specified folder.

findFiles path:="C:\Program Files\OriginLab\Origin2023\Samples\Batch Processing" ext:=t*.csv;
sec;
for(int ii=1; ii<=fname.GETNUMTOKENS(CRLF); ii++)
{
string file$ = fname.GetToken(ii, CRLF)$;
impASC fname:=file$ options.FirstMode:=1 options.Mode:=1 options.names.FNameToSht:=0 options.names.FNameToBk:=0 options.names.FNameToBkComm:=0 options.names.FNameToColComm:=1;
}
watch;

This page shows different labtalk examples
https://www.originlab.com/doc/LabTalk/guide/Importing-Data


Also do you mind share a few of your files and also which columns you want to import so we can check if python, etc. will be faster, etc.


Thanks, Snow
Go to Top of Page

AKazak

Russia
1205 Posts

Posted - 10/24/2022 :  01:04:09 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
quote:
Originally posted by snowli

Hi Andrey,

Try the following with each setting included so u don't have to save a theme file for it. It imports all T*.csv under specified folder.

findFiles path:="C:\Program Files\OriginLab\Origin2023\Samples\Batch Processing" ext:=t*.csv;
sec;
for(int ii=1; ii<=fname.GETNUMTOKENS(CRLF); ii++)
{
string file$ = fname.GetToken(ii, CRLF)$;
impASC fname:=file$ options.FirstMode:=1 options.Mode:=1 options.names.FNameToSht:=0 options.names.FNameToBk:=0 options.names.FNameToBkComm:=0 options.names.FNameToColComm:=1;
}
watch;

This page shows different labtalk examples
https://www.originlab.com/doc/LabTalk/guide/Importing-Data


Also do you mind share a few of your files and also which columns you want to import so we can check if python, etc. will be faster, etc.


Thanks, Snow





Dear Snow,

Thank you for providing the suggestion.
I will following these steps and provide feedback in case of issues.

Here is a sample set of the target CSV files to import: https://my.originlab.com/ftp/forum_and_kbase/Images/CSV%20Dataset%202022-10-24.zip
If you gain the import performance using Python, then can you share a snippet, please?

---
Andrey
Go to Top of Page

AKazak

Russia
1205 Posts

Posted - 10/24/2022 :  02:57:01 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
I Generated the following code using ► command in GUI:

findFiles path:="D:\Sandbox\Test" ext:=*.csv;

sec;

impASC fname:=fname$ options.Sparklines:=0 options.Mode:=1 options.Cols.ColDesignations:=XX(Y) options.names.FNameToSht:=0 options.names.FNameToBk:=0 options.names.FNameToBkComm:=0 options.names.FNameToColComm:=1 options.partials.Partial:=2 options.partials.partial_cols.FirstCol:=3 options.partials.partial_cols.LastCol:=3 options.partials.partial_cols.ColRanges:=<optional> options.scripts.ScriptAfterAllImport:=<[<del col(1);
col(1)[C]$ = "";>]>;

watch;


and found the following after running the code.

1) Basically the script does emulate GUI and have relatively long execution time for just 6 CSV files:
quote:
Elapsed time = 25.546 sec.


2) options.scripts.ScriptAfterAllImport option in the above form does not work at all. How do I properly send a multi-line script as an input to options.scripts.ScriptAfterAllImport?

Thank you.

---
Andrey
Go to Top of Page

YimingChen

1664 Posts

Posted - 10/24/2022 :  09:39:00 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
Hi Andrey,

Can you try the Python script below to import your files? You need to change the folder name.

Btw, you can set the system variable @PBN to 1 to prohibit progress bar from showing.


import originpro as op
import pandas as pd
import os

wks = op.new_sheet()
fd = r'E:\CSV Dataset 2022-10-24'
li = []
for file in os.listdir(fd):
    df = pd.read_table(os.path.join(fd, file), sep=',') 
    df.rename(columns = {list(df)[0]:file}, inplace=True)
    li.append(df)

frame = pd.concat(li, axis=1)
wks.from_df(frame)


James

Edited by - YimingChen on 10/24/2022 09:42:10 AM
Go to Top of Page

AKazak

Russia
1205 Posts

Posted - 10/24/2022 :  09:44:49 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
quote:
Originally posted by YimingChen

Hi Andrey,

Can you try the Python script below to import your files? You need to change the folder name.

Btw, you can set the system variable @PBN to 1 to prohibit progress bar from showing.


import originpro as op
import pandas as pd
import os

wks = op.new_sheet()
fd = r'E:\CSV Dataset 2022-10-24'
li = []
for file in os.listdir(fd):
    df = pd.read_table(os.path.join(fd, file), sep=',') 
    df.rename(columns = {list(df)[0]:file}, inplace=True)
    li.append(df)

frame = pd.concat(li, axis=1)
wks.from_df(frame)


James



Dear James,

I will try using the Python and provide feedback.

In the meantime, can you answer two questions on LT above, please?

---
Andrey
Go to Top of Page

snowli

USA
1426 Posts

Posted - 10/24/2022 :  10:10:53 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
Hi Andrey,

Regarding to options.scripts.ScriptAfterAllImport, we will check.

But since it's script that will run after all files imported, for now you can take it out and put it after the ASCII import part.

Thanks, Snow
Go to Top of Page

snowli

USA
1426 Posts

Posted - 10/24/2022 :  10:31:08 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
Hi Andrey,

I added ORG-25864 for developer.

I found that if u put the script into " ", it will work. Somehow our Generate Script menu adds <[<, >]> around it if there is [] already in script.

E.g. change the corresponding part to
options.scripts.ScriptAfterAllImport:="del col(1);
col(1)[C]$ = "";";

It will work.

Thanks, Snow
Go to Top of Page

AKazak

Russia
1205 Posts

Posted - 10/24/2022 :  1:34:24 PM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
quote:
Originally posted by snowli

Hi Andrey,

I added ORG-25864 for developer.

I found that if u put the script into " ", it will work. Somehow our Generate Script menu adds <[<, >]> around it if there is [] already in script.

E.g. change the corresponding part to
options.scripts.ScriptAfterAllImport:="del col(1);
col(1)[C]$ = "";";

It will work.

Thanks, Snow



Dear Snow,

Got it!

Dear James,

How do I modify the Python code to simulate the following options?
options.Mode:=1
options.Cols.ColDesignations:=XX(Y)
options.names.FNameToColComm:=1
options.partials.Partial:=2
options.partials.partial_cols.FirstCol:=3
options.partials.partial_cols.LastCol:=3

---
Andrey

Edited by - AKazak on 10/24/2022 1:34:46 PM
Go to Top of Page

YimingChen

1664 Posts

Posted - 10/24/2022 :  4:53:56 PM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply

import originpro as op
import pandas as pd
import os
fd = r'C:\Users\chris\Desktop\CSV Datasets 1000 files'
wks = op.new_sheet()
col = 0
for i, file in enumerate(os.listdir(fd)):
    if i == 0:
        df = pd.read_table(os.path.join(fd, file), sep=',')    
        df.rename(columns = {list(df)[2]:file}, inplace=True)
    else:
        df = pd.read_table(os.path.join(fd, file), sep=',', usecols=['Count'])    
        df.rename(columns = {list(df)[0]:file}, inplace=True)        
    wks.from_df(df, col)
    col = wks.cols
wks.cols_axis('x',1,1)
Go to Top of Page

AKazak

Russia
1205 Posts

Posted - 10/24/2022 :  10:19:29 PM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
quote:
Originally posted by YimingChen


import originpro as op
import pandas as pd
import os
fd = r'C:\Users\chris\Desktop\CSV Datasets 1000 files'
wks = op.new_sheet()
col = 0
for i, file in enumerate(os.listdir(fd)):
    if i == 0:
        df = pd.read_table(os.path.join(fd, file), sep=',')    
        df.rename(columns = {list(df)[2]:file}, inplace=True)
    else:
        df = pd.read_table(os.path.join(fd, file), sep=',', usecols=['Count'])    
        df.rename(columns = {list(df)[0]:file}, inplace=True)        
    wks.from_df(df, col)
    col = wks.cols
wks.cols_axis('x',1,1)




I will try using the suggested code.
How do I set column designations from Python?

---
Andrey
Go to Top of Page

AKazak

Russia
1205 Posts

Posted - 10/25/2022 :  02:53:31 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
findFiles path:="%(InputDataDirPath$)" ext:=*.csv;
impASC fname:=fname$
options.Sparklines:=0
options.Mode:=1
options.Cols.ColDesignations:=XX(Y)
options.names.FNameToSht:=0
options.names.FNameToBk:=0
options.names.FNameToBkComm:=0
options.names.FNameToColComm:=1
options.partials.Partial:=2
options.partials.partial_cols.FirstCol:=3
options.partials.partial_cols.LastCol:=3
options.partials.partial_cols.ColRanges:=<optional>
options.scripts.ScriptAfterAllImport:="del col(1); 
col(1)[C]$ = "";";

It seems that on practice if InputDataDirPath$ points to a directory with few thousands of CSV files, then the script execution hangs Origin.

How do I monitor script execution in time, that is get some feed back from a script during execution of a LT/Python script, for example output percentage or display a progress bar window?

---
Andrey

Edited by - AKazak on 10/25/2022 03:19:23 AM
Go to Top of Page

YimingChen

1664 Posts

Posted - 10/25/2022 :  08:29:47 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
You can use wks.cols_axis() to set column designation.
https://docs.originlab.com/originpro/classoriginpro_1_1worksheet_1_1WSheet.html#ae9a1df2c0ac08ea40a8826af4e2207c6




quote:
Originally posted by AKazak

quote:
Originally posted by YimingChen


import originpro as op
import pandas as pd
import os
fd = r'C:\Users\chris\Desktop\CSV Datasets 1000 files'
wks = op.new_sheet()
col = 0
for i, file in enumerate(os.listdir(fd)):
    if i == 0:
        df = pd.read_table(os.path.join(fd, file), sep=',')    
        df.rename(columns = {list(df)[2]:file}, inplace=True)
    else:
        df = pd.read_table(os.path.join(fd, file), sep=',', usecols=['Count'])    
        df.rename(columns = {list(df)[0]:file}, inplace=True)        
    wks.from_df(df, col)
    col = wks.cols
wks.cols_axis('x',1,1)




I will try using the suggested code.
How do I set column designations from Python?

---
Andrey

Go to Top of Page

AKazak

Russia
1205 Posts

Posted - 10/25/2022 :  08:43:04 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
quote:
Originally posted by YimingChen

You can use wks.cols_axis() to set column designation.
https://docs.originlab.com/originpro/classoriginpro_1_1worksheet_1_1WSheet.html#ae9a1df2c0ac08ea40a8826af4e2207c6



Great!

How do I add a new sheet to the currently active workbook?

---
Andrey
Go to Top of Page

YimingChen

1664 Posts

Posted - 10/25/2022 :  08:52:40 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
wb = op.find_book()
wb.add_sheet()


quote:
Originally posted by AKazak

quote:
Originally posted by YimingChen

You can use wks.cols_axis() to set column designation.
https://docs.originlab.com/originpro/classoriginpro_1_1worksheet_1_1WSheet.html#ae9a1df2c0ac08ea40a8826af4e2207c6



Great!

How do I add a new sheet to the currently active workbook?

---
Andrey

Go to Top of Page

AKazak

Russia
1205 Posts

Posted - 10/25/2022 :  08:57:13 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
quote:
Originally posted by AKazak

quote:
Originally posted by YimingChen


import originpro as op
import pandas as pd
import os
fd = r'C:\Users\chris\Desktop\CSV Datasets 1000 files'
wks = op.new_sheet()
col = 0
for i, file in enumerate(os.listdir(fd)):
    if i == 0:
        df = pd.read_table(os.path.join(fd, file), sep=',')    
        df.rename(columns = {list(df)[2]:file}, inplace=True)
    else:
        df = pd.read_table(os.path.join(fd, file), sep=',', usecols=['Count'])    
        df.rename(columns = {list(df)[0]:file}, inplace=True)        
    wks.from_df(df, col)
    col = wks.cols
wks.cols_axis('x',1,1)




I will try using the suggested code.
How do I set column designations from Python?

---
Andrey



The script executes much much faster than its LT counterpart.
Great improvement in performance!

However after running the script I experience the following GUI deformation:



---
Andrey
Go to Top of Page

AKazak

Russia
1205 Posts

Posted - 10/25/2022 :  08:59:58 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
quote:
Originally posted by YimingChen

wb = op.find_book()
wb.add_sheet()

---
Andrey



Got it.

How do I output directly to the currently active sheet?

---
Andrey
Go to Top of Page

YimingChen

1664 Posts

Posted - 10/25/2022 :  09:04:49 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
Can you attach your code here? I couldn't reproduce the issue.

Btw, can you start a new forum item as it's a different issue?

Thank you

James
Go to Top of Page

AKazak

Russia
1205 Posts

Posted - 10/25/2022 :  09:10:50 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
quote:
Originally posted by YimingChen

Can you attach your code here? I couldn't reproduce the issue.

Btw, can you start a new forum item as it's a different issue?

Thank you

James



Dear James,

The code is the same:
import originpro as op
import pandas as pd
import os
fd = r'C:\Users\chris\Desktop\CSV Datasets 1000 files'
wks = op.new_sheet()
col = 0
for i, file in enumerate(os.listdir(fd)):
    if i == 0:
        df = pd.read_table(os.path.join(fd, file), sep=',')    
        df.rename(columns = {list(df)[2]:file}, inplace=True)
    else:
        df = pd.read_table(os.path.join(fd, file), sep=',', usecols=['Count'])    
        df.rename(columns = {list(df)[0]:file}, inplace=True)        
    wks.from_df(df, col)
    col = wks.cols
wks.cols_axis('x',1,1)


---
Andrey
Go to Top of Page

YimingChen

1664 Posts

Posted - 10/25/2022 :  09:15:01 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
To choose the active worksheet, use:
wks = op.find_sheet()

To choose worksheet by index, use:
wks = wb[0]




quote:
Originally posted by AKazak

quote:
Originally posted by YimingChen

wb = op.find_book()
wb.add_sheet()

---
Andrey



Got it.

How do I output directly to the currently active sheet?

---
Andrey


Edited by - YimingChen on 10/25/2022 09:15:13 AM
Go to Top of Page

AKazak

Russia
1205 Posts

Posted - 10/27/2022 :  05:56:20 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
quote:
Originally posted by YimingChen

To choose the active worksheet, use:
wks = op.find_sheet()

To choose worksheet by index, use:
wks = wb[0]




Got it!
Surprisingly, Python code works much much faster than the LT analog.
When do you suggest to use LT with advantage over Python?

---
Andrey
Go to Top of Page

YimingChen

1664 Posts

Posted - 10/27/2022 :  10:03:50 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
We introduced Python several versions ago and it lacks some functionality of Labtalk. But if it meets your own needs, I would suggest using Python.

James
Go to Top of Page

AKazak

Russia
1205 Posts

Posted - 10/31/2022 :  05:09:07 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
quote:
Originally posted by YimingChen

We introduced Python several versions ago and it lacks some functionality of Labtalk. But if it meets your own needs, I would suggest using Python.

James



Dear James,

Got it.
Can you share an insight about performance comparison between LT an Python for the corresponding functions, please?

---
Andrey
Go to Top of Page

minimax

357 Posts

Posted - 11/08/2022 :  10:49:25 PM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
Hi Andrey,

Origin's script does some additional manipulations like
resize columns width,
write import file info into workbook (mainly for re-import purpose)
etc.
thus results in extra time.

You can toggle some system variables to skip those actions like below to get a faster speed.



@pbn=1; // turn off progress bar
@npi=1; // limit import info as 1 only
@isc=0; // turn off column width autoresize
@ias=0; // turn on multi-threads for all file size

findFiles path:="E:\CSV Datasets 1000 files" ext:=*.csv; 
newbook;
sec;  

impASC options.Sparklines:=0 options.Mode:=1 
options.names.FNameToSht:=0 options.names.FNameToBk:=0 
options.names.FNameToBkComm:=0 options.names.FNameToColComm:=1 
options.partials.Partial:=2 options.partials.partial_cols.FirstCol:=3 
options.partials.partial_cols.LastCol:=3
options.Miscs.SaveFileInfo:=0;  

del col(1);
wks.col1.type=4;
watch;

//reset default
@pbn=0;
@npi=0;
@isc=40;
@ias=10;


In my PC it is 50 sec(LT) vs. 30 sec (py) for 1000 files in Origin 2023.

PS, we did find some other bottlenecks in 2023 and will improve it in next version.
Go to Top of Page

snowli

USA
1426 Posts

Posted - 11/09/2022 :  09:08:41 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
Hi Andrey,

Just FYI, this generating script issue has been fixed in our official Origin 2023.

Thanks, Snow

quote:
Originally posted by snowli

Hi Andrey,

I added ORG-25864 for developer.

I found that if u put the script into " ", it will work. Somehow our Generate Script menu adds <[<, >]> around it if there is [] already in script.

E.g. change the corresponding part to
options.scripts.ScriptAfterAllImport:="del col(1);
col(1)[C]$ = "";";

It will work.

Thanks, Snow

Go to Top of Page

AKazak

Russia
1205 Posts

Posted - 11/09/2022 :  1:39:34 PM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
quote:
Originally posted by snowli

Hi Andrey,

Just FYI, this generating script issue has been fixed in our official Origin 2023.

Thanks, Snow

quote:
Originally posted by snowli

Hi Andrey,

I added ORG-25864 for developer.

I found that if u put the script into " ", it will work. Somehow our Generate Script menu adds <[<, >]> around it if there is [] already in script.

E.g. change the corresponding part to
options.scripts.ScriptAfterAllImport:="del col(1);
col(1)[C]$ = "";";

It will work.

Thanks, Snow





Dear Snow,

Got it!
Thank you for notifying.

---
Andrey
Go to Top of Page

Shirley_GZ

China
Posts

Posted - 05/05/2023 :  02:57:09 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
Dear AKazak,

Now Origin 2023b is available!
And the issue about Toolbar Spacer Off by default (ORG-25993) has been fixed/supported in this new version.

You can get the Origin 2023b in the following webpage, and then try to install and use it.
https://www.originlab.com/demodownload.aspx

Thanks,
Shirley
OriginLab

Originlab Technical Service Team
Go to Top of Page

AKazak

Russia
1205 Posts

Posted - 05/08/2023 :  10:18:16 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
quote:
Originally posted by Shirley_GZ

Dear AKazak,

Now Origin 2023b is available!
And the issue about �Toolbar Spacer Off by default� (ORG-25993) has been fixed/supported in this new version.

You can get the Origin 2023b in the following webpage, and then try to install and use it.
https://www.originlab.com/demodownload.aspx

Thanks,
Shirley
OriginLab

Originlab Technical Service Team



Dear Shirley,

Great news!

Thank you.

---
Andrey
Go to Top of Page
  Previous Topic Topic Next Topic Lock Topic Edit Topic Delete Topic New Topic Reply to Topic
 New Topic  Reply to Topic
 Printer Friendly
Jump To:
The Origin Forum © 2020 Originlab Corporation Go To Top Of Page
Snitz Forums 2000