The Origin Forum
 averaging data in multiple columns
NorthwestLee

USA
Posts

Posted - 04/16/2005 :  2:28:02 PM
Origin Version (Select Help-->About Origin): 7.0
Operating System: XP


I am new to Origin. I have been reading through the automating-analysis demonstrations and the Automation file included in the samples folder, but I am rather lost in the code.

I need to process a series of about 100 project files (with the ability to add more), each containing a worksheet with a standard format. I need to extract the data from column E and average it row by row across all files: column E is 6 rows long, so I want an average of row 1 of E over all the projects, an average of row 2, and so on. I need to do the same for column G as well. In some projects the column will not be 6 rows long, and a row that is not present in a given project should simply not go into that row's average. The X values in column D, which correspond to the Y values in E, will not change. Finally, I want to graph the averaged Y values against their respective X values.

Any help or suggestions on where to read and start would be appreciated.

Thanks

easwar

USA
1965 Posts

Posted - 04/16/2005 :  3:01:32 PM
Hi,

Here is an Origin C code segment that could be a starting point for you. It assumes that the column is col(E) in worksheet Data1 in every project.

Easwar
OriginLab


void average_rows()
{
    // Bring up file dialog and get user to select all OPJs to be processed
    int iNumFiles;
    StringArray saFilePaths;
    iNumFiles = GetMultiOpenBox( saFilePaths, "*.OPJ");

    // If user selected less than 2 files, quit
    if( iNumFiles < 2 )
    {
        out_str("Less than 2 OPJs selected!");
        return;
    }

    // Declare vector to hold data and also counts
    // Initialize size to be large enough and set to zero
    vector vecEData(100);
    vector<int> vecECount(100);
    vecEData = 0;
    vecECount = 0;

    // Loop over all files
    for(int ii = 0; ii < iNumFiles; ii++)
    {
        // If successful in opening project...
        if( Project.Open(saFilePaths[ii]) )
        {
            printf("Processing: %s\n", saFilePaths[ii]);
            // Declare dataset for data1 wks col E
            // Change this as desired to point to your specific wks and col name
            Dataset dsE("data1_e");
            // If dataset is valid...
            if( dsE.IsValid() )
            {
                // Loop over all elements and add to data vector and count vector
                // (stop at the vector size so a long column cannot overrun it)
                for(int jj = 0; jj < dsE.GetSize() && jj < vecEData.GetSize(); jj++)
                {
                    vecEData[jj] += dsE[jj];
                    vecECount[jj] += 1;
                }
            }
            else
                out_str("Data1_E column not found in current project!");
        }
    }

    // Done with all files - output the results to script window
    // One could instead create a new OPJ and place results in a worksheet
    printf("Row #\tCount\tAverage\n");
    for(ii = 0; ii < 100; ii++)
    {
        // Rows past the longest column have a zero count
        if( 0 == vecECount[ii]) break;
        printf("%d\t%d\t%f\n", ii, vecECount[ii], vecEData[ii] / vecECount[ii]);
    }
}
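The accumulate-and-count idea in the function above can be checked outside Origin. The following is a minimal sketch in plain C (not Origin C), assuming each file's column has already been read into a double array; the names accumulate_column and finish_averages are hypothetical and are not part of any Origin API.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ROWS 100

/* Add one file's column to the running sums and counts.
   Rows the column does not have are simply not counted. */
void accumulate_column(const double *col, size_t n,
                       double sums[MAX_ROWS], int counts[MAX_ROWS])
{
    size_t i;

    assert(col != NULL || n == 0);

    /* Stop at MAX_ROWS so a long column cannot overrun the arrays. */
    for (i = 0; i < n && i < MAX_ROWS; i++) {
        sums[i] += col[i];
        counts[i] += 1;
    }
}

/* Turn sums into per-row averages.
   Stops at the first row with no data and returns the number of rows filled. */
size_t finish_averages(const double sums[MAX_ROWS], const int counts[MAX_ROWS],
                       double avg[MAX_ROWS])
{
    size_t i;

    for (i = 0; i < MAX_ROWS && counts[i] > 0; i++)
        avg[i] = sums[i] / counts[i];
    return i;
}
```

For example, starting from zeroed sums and counts, accumulating the columns {1, 2, 3} and {3, 4} yields averages 2, 3 and 3: the third row is averaged over one file only, because the second column has no third row.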




Edited by - easwar on 04/16/2005 3:04:20 PM

NorthwestLee

USA
Posts

Posted - 04/17/2005 :  10:27:45 PM
Thank you so much for the help.

On building the code I get the error "function or variable GetMultiOpenBox not found". Do I need to add that function to my workspace, or is there something else I should do?

How would I go about saving the information to its own OPJ?

Thanks again.

easwar

USA
1965 Posts

Posted - 04/18/2005 :  5:10:40 PM
Hi,

My apologies... that function is only available in version 7.5 (which is what I was using), not in version 7.0.

I have a couple of questions so we could post an alternate solution:
1> Is the data in the OPJs just the result of importing some files? If so, it may be better to read the data files directly instead of the OPJs.

2> If you do need to read OPJs rather than data files, would it be okay for the user to pick a subfolder and have all the OPJs inside that subfolder read?

Easwar
OriginLab


NorthwestLee

USA
Posts

Posted - 04/18/2005 :  7:31:49 PM
Yes, they are from importing data files, but I am not the one who did that and so I do not know what kind of filter was used. Also each project has some other information and graphs made.

Yes, it would work to just have the user pick a subfolder.

Thank you so very much. I am trying to get familiar with the Origin C language; is there a site or other source you would recommend for the basics, to supplement the manual? I have some Java and Matlab experience, but I am very rusty with Java and have no C experience.

I was thinking it might also work if I could get Origin to just write the columns I want to an ASCII file, since from there I could attack it with Matlab. The problem with the current data is that it combines text and numbers in a way that made it infeasible to import into Matlab. But I think an Origin solution would probably be better.

Anyway, thanks again for all your help.

easwar

USA
1965 Posts

Posted - 04/18/2005 :  10:18:10 PM
Hi,

You should not have to save things as ASCII and then use MATLAB. Such things can be coded in Origin either with LabTalk script or Origin C code.

The following Origin C code should work in ver 7. Once compiled, go to the script window and type for example:
average_rows "c:\temp"
and it will process all OPJs found in the specified subfolder, c:\temp in the above example command.

As for Origin C examples, there are various sources:
1> Programming guide in the help files
2> Code snippets under each class method etc, found by browsing the Origin C language reference help files
3> The forum - you can do a search on all forums with your key words to find code segments. You can include a { in your search string and that will mostly ensure that you get threads that have code posted in them
4> The following web page on our site:
http://www.originlab.com/index.aspx?s=9&lm=71&pid=268
Some of the examples in the above site may only work in 7.5

You can always post here for further help, or directly contact tech support.

Easwar
OriginLab

P.S. Please double-check the computation by processing, say, 2 or 3 files. There may also be ways to make this more efficient.


#define MAX_ROWS 100

void average_rows(string strFolderPath)
{
    // Add a \ to the path if necessary
    if( 0 != strFolderPath.Right(1).Compare("\\") )
        strFolderPath += "\\";

    // Get list of all OPJ files in specified folder
    StringArray saOPJFiles;
    bool bRet = FindFiles(saOPJFiles, strFolderPath, "OPJ");
    if( !bRet )
    {
        out_str("Failed to find OPJ files in specified path!");
        return;
    }

    // Get file count
    int nFileCount = saOPJFiles.GetSize();
    if( nFileCount < 2 )
    {
        out_str("Less than 2 OPJs found!");
        return;
    }

    // Declare vector to hold X values
    vector vecX;
    // Declare vector to hold summed E column data and counts
    // Initialize size to be large enough and set to zero
    vector vecEData(MAX_ROWS);
    vector<int> vecECount(MAX_ROWS);
    vecEData = 0;
    vecECount = 0;

    // Loop over all files
    for(int ii = 0; ii < nFileCount; ii++)
    {
        string strOPJFileName = strFolderPath + saOPJFiles[ii];
        // If successful in opening project...
        if( Project.Open(strOPJFileName) )
        {
            printf("Processing: %s\n", strOPJFileName);
            // Assume X data is in col A
            // and there is a col E
            Dataset dsA("data1_a");
            Dataset dsE("data1_e");
            if( dsA && dsE )
            {
                // If dsA is larger than current X vector, then update X vector
                if( dsA.GetSize() > vecX.GetSize() )
                    vecX = dsA;
                // Loop over all elements of dsE and add to E data vector and count vector
                // (stop at MAX_ROWS so a long column cannot overrun the vectors)
                for(int jj = 0; jj < dsE.GetSize() && jj < MAX_ROWS; jj++)
                {
                    vecEData[jj] += dsE[jj];
                    vecECount[jj] += 1;
                }
            }
            else
                out_str(" Dataset Data1_A or Data1_E not found in this project!");
        }
        else
            printf("Failed to open %s\n", strOPJFileName);
    }

    // Done with all files

    // Create a new OPJ in same folder for storing result
    Project.Open();
    // Create a new worksheet and add cols for output
    WorksheetPage wpg;
    wpg.Create("Origin");
    Worksheet wks = wpg.Layers(0);
    while( wks.DeleteCol(0) );
    wks.AddCol("XValue");
    wks.Columns(0).SetType(OKDATAOBJ_DESIGNATION_X);
    wks.AddCol("AverageE");
    Dataset dsXValue(wks, 0);
    Dataset dsEAverage(wks, 1);
    for(ii = 0; ii < MAX_ROWS; ii++)
    {
        // Stop at the first row with no data, or if there is no X value for this row
        if( ii >= vecX.GetSize() || 0 == vecECount[ii] ) break;
        dsXValue.Add(vecX[ii]);
        dsEAverage.Add( vecEData[ii] / vecECount[ii] );
    }
    // Save this result OPJ
    string strResultOPJ = strFolderPath + "AverageResult.OPJ";
    if( !Project.Save(strResultOPJ) )
        out_str("Failed to save result OPJ!");
}




NorthwestLee

USA
Posts

Posted - 04/25/2005 :  11:27:15 PM
Thanks. It didn't seem to work in Origin 7 (FindFiles had problems), but I downloaded the Origin 7.5 evaluation, and I believe our lab will upgrade.

I need to get the columns from a worksheet whose name is the same as the OPJ's name. I tried building a string from the OPJ name, but I couldn't figure out how to create a dataset that looks up the column under that varying worksheet name. I also need to remove the .OPJ extension from the name.
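The string handling involved here (drop the extension, then append the column name) can be sketched in plain C. This is not Origin C: the Origin C string class has its own methods, and the function name dataset_name_from_file and the file name "Sample01.OPJ" are hypothetical. It assumes the worksheet name equals the file name without its extension, and that the resulting string would be passed to Dataset in place of the literal "data1_e" used in the code earlier in the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "<base>_<col>" from a file name such as "Sample01.OPJ".
   Expects a bare file name, not a full path.
   Returns 0 on success, -1 if the result does not fit in out. */
int dataset_name_from_file(const char *file_name, const char *col,
                           char *out, size_t out_size)
{
    const char *dot;
    size_t base_len;
    int written;

    assert(file_name != NULL && col != NULL && out != NULL);

    /* Everything before the last '.' is the base name;
       a name without '.' is used whole. */
    dot = strrchr(file_name, '.');
    base_len = dot ? (size_t)(dot - file_name) : strlen(file_name);

    written = snprintf(out, out_size, "%.*s_%s",
                       (int)base_len, file_name, col);
    return (written >= 0 && (size_t)written < out_size) ? 0 : -1;
}
```

With a 64-byte buffer, the inputs "Sample01.OPJ" and "e" produce "Sample01_e". Any restrictions Origin places on worksheet names are not handled in this sketch.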

Lee

NorthwestLee

USA
Posts

Posted - 05/01/2005 :  6:13:26 PM
I am also trying to plot the data before the project is saved. I tried incorporating other code I found on this site for plotting a simple dataset:

// Create a new OPJ in same folder for storing result
Project.Open();
// Create a new worksheet and add cols for output
WorksheetPage wpg;
wpg.Create("Origin");
Worksheet wks = wpg.Layers(0);
while( wks.DeleteCol(0) );
wks.AddCol("XValue");
wks.Columns(0).SetType(OKDATAOBJ_DESIGNATION_X);
wks.AddCol("AverageE");
Dataset dsXValue(wks, 0);
Dataset dsEAverage(wks, 1);
for(ii = 0; ii < MAX_ROWS; ii++)
{
if( 0 == vecECount[ii]) break;
dsXValue.Add(vecX[ii]);
dsEAverage.Add( vecEData[ii] / vecECount[ii] );
}

GraphPage pg;
if ( !pg.Create("VectXY"))
{
out_str("Cannot create page");
return;
}
GraphLayer gl = Project.ActiveLayer();
Curve cuvFit1("Data2_XValue[X]", "Data2_AverageE[Y}");
int nPlot = gl.AddPlot(cuvFit1, IDM_PLOT_FLOWVECTOR);

// Save this result OPJ
string strResultOPJ = strFolderPath + "/average/AverageResult.OPJ";
if( !Project.Save(strResultOPJ) )
out_str("Failed to save result OPJ!");


The graph opens but doesn't plot anything. I think it is not finding the worksheet or something similar. The project then also fails to save.

Thanks for the help

easwar

USA
1965 Posts

Posted - 05/01/2005 :  11:47:32 PM
quote:

Curve cuvFit1("Data2_XValue[X]", "Data2_AverageE[Y}");
int nPlot = gl.AddPlot(cuvFit1, IDM_PLOT_FLOWVECTOR);



The first line here does not look right. If your column name is "Data2_XValue", just use that and drop the additional "[X]".

Also, instead of hard coded names you should change to a different constructor for curve that uses wks and the column indices:

//....
Curve cuvFit(wks, 0, 1);

Also note that when you add a curve to a graph layer, you can check the returned int value to see whether it succeeded. On success you get a plot index of 0 or larger; on failure you get -1.

quote:

Also then the project fails to save.



Did you check that your path is valid: that the folder exists, that you have write privilege, etc.?

Please contact tech support directly with your entire code segment if you need further help.

Easwar
OriginLab


NorthwestLee

USA
Posts

Posted - 05/03/2005 :  01:01:31 AM
The save feature works now that the graph is fixed, although it now graphs 3 lines instead of making one plot. I can't quite figure it out.

mynameisok

China
Posts

Posted - 04/11/2007 :  04:19:56 AM
quote:


vecEData[jj] += dsE[jj];
vecECount[jj] += 1;

printf("%d\t%d\t%f\n", ii, vecECount[ii], vecEData[ii] / vecECount[ii]);





I have a question about this.

Suppose one cell of a column holds a missing value. When vecEData[ii] / vecECount[ii] is computed, vecEData[ii] is the sum of only vecECount[ii]-1 columns (or vecEData[ii] itself becomes a missing value). That is not the average I want.


How can I avoid this?




Thank you!

Edited by - mynameisok on 04/11/2007 04:22:35 AM

Mike Buess

USA
3037 Posts

Posted - 04/11/2007 :  09:00:22 AM
You can skip the missing values in your computations like this...

if( vecEData[ii]!=NANUM && vecECount[ii]!=NANUM )
{
// do math if values are not missing
}
else
{
// set result to some default value
}

Mike Buess
Origin WebRing Member

mynameisok

China
Posts

Posted - 04/11/2007 :  10:26:09 AM
Sorry, I do not understand.

Take this for example:


data1

A B C
1 1
2 -
3 3
4 4

data1_c is the average of A and B, and I want C to look like this:

C
1
2
3
4

not

C
1
-
3
4



Edited by - mynameisok on 04/11/2007 10:44:22 AM

Mike Buess

USA
3037 Posts

Posted - 04/11/2007 :  10:48:26 AM
Simple case: only col B contains missing values...

Worksheet wks = Project.ActiveLayer();
Dataset ds1(wks,0);
Dataset ds2(wks,1);
Dataset ds3(wks,2);
ds3.SetSize(ds1.GetSize());
for(int i=0;i<ds1.GetSize();i++)
{
if( ds2[i]==NANUM )
{
ds3[i] = ds1[i];
}
else
{
ds3[i] = (ds1[i] + ds2[i]) / 2.0;
}
}
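The same idea extends to any number of columns with missing values anywhere: sum only the cells that are present and divide by how many were present. Below is a minimal sketch in plain C (not Origin C), assuming one worksheet row is available as a double array. NAN stands in for Origin's NANUM here, which is an assumption of the sketch, and the function name row_mean_skip_missing is hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Mean of the non-missing cells in one row.
   NAN marks a missing cell; returns NAN if every cell is missing. */
double row_mean_skip_missing(const double *row, size_t ncols)
{
    double sum = 0.0;
    size_t count = 0;
    size_t j;

    assert(row != NULL || ncols == 0);

    for (j = 0; j < ncols; j++) {
        if (!isnan(row[j])) {   /* skip missing cells */
            sum += row[j];
            count++;
        }
    }
    return count > 0 ? sum / (double)count : NAN;
}
```

For a row holding 2 and a missing cell the result is 2, not a missing value, because the divisor counts only the cells that were present.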

Mike Buess
Origin WebRing Member

Edited by - Mike Buess on 04/11/2007 10:52:30 AM

mynameisok

China
Posts

Posted - 04/11/2007 :  11:18:07 AM
Got it!

But the example is too simple.

The number of columns is not fixed, and each column has lots of missing data.

This would be much easier in an Excel sheet using the AVERAGE function, but that is restricted to manual work.


Mike Buess

USA
3037 Posts

Posted - 04/11/2007 :  11:50:38 AM
Actually, Origin does have a manual method to average like you want. Just select the columns you want to average and select Statistics > Descriptive Statistics > Statistics on Rows. This will create a new worksheet with the desired averages in the second column.

To find out the LabTalk script that is run by Statistics on Rows just press the Ctrl and Shift keys while selecting the menu item. The following simple LabTalk script averages all columns in the active worksheet and puts the results in a new column.

_sum=sum(%H,1,wks.ncols); // stats to temporary datasets
wo -a 1; // add column
wcol(wks.ncols)=_MEAN; // save means to new column
del -a; // del temporary datasets

Mike Buess
Origin WebRing Member

Edited by - Mike Buess on 04/11/2007 12:26:44 PM

mynameisok

China
Posts

Posted - 04/11/2007 :  12:10:13 PM
quote:

Statistics > Descriptive Statistics > Statistics on Rows.


Nice.

Thanks!