Pascal_S
Germany
4 Posts |
Posted - 10/12/2023 : 4:10:31 PM
|
Hello,
I am currently trying to automate a spectral data evaluation. For this purpose I am writing a small Origin Python script. Here, two columns of 14000 and 20000 entries are padded with NaN. The numerical values of the first column represent recorded spectral lines. I would like to compare these with the values of another column, which contains spectral lines from a database. My approach is the following: I use two nested for-loops, one for the recorded values and one for the database values. The outer loop goes through the recorded values. The inner loop, in the body of the outer one, goes through the database values for the first, second, etc. recorded value and calculates the difference between the recorded value and each database value. If the difference is less than, for example, 0.2 nm, the inner loop is interrupted and the database value is assigned to the recorded value.
I determine the number of rows with shape[0]. However, this is based on the first column, which unfortunately contains over 20000 entries. The spectral lines are limited to about 100 entries and the database values to about 6000 entries. The remaining rows are filled with NaN up to row 20000, as mentioned above. Now I have the following questions: Is it possible to determine the number of non-NaN entries for each individual column? shape[0] looks at the whole sheet and sets the row count to the sheet's maximum. Is it possible to stop the for-loop when two consecutive NaN values are detected? The condition "if result2.iat[row_Peaks, column_Peaks] == NaN: break" does not work even for a single NaN value. If I don't include it, the program tries to calculate a difference for all 20000 entries instead of the 100 given. Maybe there is a statement other than break that only skips the NaN calculations?
Some Code:
. . .
column_Peaks = 4
column_NIST = 3
row = 0
max_row = result2.shape[0]
for row_Peaks in range(0, max_row):
    if result2.iat[row_Peaks, column_Peaks] == nan:
        break
    for row_NIST in range(0, max_row):
        dif = result2.iat[row_Peaks, column_Peaks] - result2.iat[row_NIST, column_NIST]
        #print(dif)
        #if dif < 0.2:
            #Code to create three new columns. One with the recorded value, one with the matched database value and one for the difference.
            #break
---
max_row should be split in two: max_row_Peaks and max_row_NIST. But I have no clue how to access an individual column to determine its length.
Maybe there is a simpler solution for comparing two columns even if the values do not match 100%. It has to treat something like 0.98 == 1 as a match. Do you have any ideas for a simpler way?
Thank you for your help
Best regards, Pascal
Edit: Sorry, I don't know if it's possible to include properly formatted code in the forum.
Edited by - Pascal_S on 10/12/2023 4:11:09 PM |
|
Castiel
343 Posts |
Posted - 10/15/2023 : 12:46:58 PM
|
You should have described the problem in a simple way.
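Don't compare a double value with nan using == as you did: nan never compares equal to anything, not even to itself, so that condition is always false. If x is nan, x != x is true. Search for 'python isnan' for more details.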
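To make the point about nan concrete, here is a minimal sketch in plain Python. It is not the script from the question: the two lists are made-up stand-ins for the worksheet columns, and `valid_values`, `TOLERANCE_NM` and `matches` are hypothetical names chosen for illustration. It also takes the absolute difference and picks the nearest database line rather than the first one under the threshold, which is a design choice of this sketch and not something stated in the thread.

```python
import math

# Hypothetical stand-ins for the two worksheet columns, padded with NaN.
nan = float("nan")
peaks_column = [509.9, 530.15, 700.0, nan, nan, nan]
nist_column = [500.0, 510.0, 520.0, 530.0, nan, nan]

# NaN never compares equal to anything, itself included.
print(nan == nan)        # False
print(nan != nan)        # True
print(math.isnan(nan))   # True


def valid_values(column):
    """Return the entries of a column that are not NaN."""
    return [value for value in column if not math.isnan(value)]


# Each column gets its own length once the NaN padding is dropped.
peaks = valid_values(peaks_column)
nist = valid_values(nist_column)
print(len(peaks), len(nist))  # 3 4

# Assumed tolerance, taken from the 0.2 nm mentioned in the question.
TOLERANCE_NM = 0.2

matches = []
for peak in peaks:
    # Nearest database line by absolute difference.
    nearest = min(nist, key=lambda line: abs(line - peak))
    if abs(peak - nearest) < TOLERANCE_NM:
        matches.append((peak, nearest, peak - nearest))

for peak, line, dif in matches:
    print(f"{peak} -> {line} (difference {dif:+.3f} nm)")
```

If result2 is a pandas DataFrame, as the use of .iat and shape suggests, result2.iloc[:, column_Peaks].dropna() should give the non-NaN entries of a single column, and len() of that result is the per-column row count. Inside a loop, continue skips only the current iteration, whereas break leaves the loop entirely.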