The Origin Forum
 Cancel all nans in pandas
Pascal_S

Germany
4 Posts

Posted - 10/12/2023 : 4:10:31 PM
Hello,

I am currently trying to automate a spectral data evaluation, and for this purpose I am writing a small Origin Python script. The worksheet has two columns of 14000 and 20000 rows that are padded with NaN. The numerical values of the first column are recorded spectral lines. I would like to compare them with the values of another column, which contains spectral lines from a database. My plan is to use two nested for-loops. The outer loop goes through the recorded values. The inner loop goes through the database values for each recorded value in turn and calculates the difference between the recorded value and each database value. If the difference is less than, for example, 0.2 nm, the inner loop is interrupted and the database value is assigned to the recorded value.

I determine the number of rows with shape[0]. However, this is based on the first column, which unfortunately contains over 20000 entries. The spectral lines are limited to about 100 entries and the database values to 6000 entries. The remaining rows are filled with NaN up to row 20000, as mentioned above. Now I have the following questions:
Is it possible to determine the number of entries without NaN for each individual column? shape[0] looks at the whole sheet and sets the row count to the sheet's maximum.
Is it possible to stop the for-loop when two consecutive NaN values are detected? The condition "if result2.iat[row_Peaks, column_Peaks] == NaN: break" does not work even for a single NaN value. If I don't include it, the program tries to calculate a difference for all 20000 entries instead of the 100 given. Maybe there is a statement other than break that would just skip the NaN calculations?

Some Code:
.
.
.
column_Peaks = 4
column_NIST = 3
row = 0
max_row = result2.shape[0]

for row_Peaks in range(0, max_row):
    if result2.iat[row_Peaks, column_Peaks] == nan: break

    for row_NIST in range(0, max_row):
        dif = result2.iat[row_Peaks, column_Peaks] - result2.iat[row_NIST, column_NIST]
        #print(dif)
        #if dif < 0.2:
            #Code to create three new columns. One with the recorded value, one with the matched database value and one for the difference.
            #break

---

max_row should be split in two: max_row_Peaks and max_row_NIST. But I have no clue how to access the individual columns to determine their lengths.

Maybe there is a simpler solution to compare two columns even if the values do not match 100%. It has to treat something like 0.98 == 1 as a match. Do you have any ideas for a simpler way?

Thank you for your help

Best regards, Pascal


Edit: Sorry, I don't know if it's possible to include proper code in the forum.

Edited by - Pascal_S on 10/12/2023 4:11:09 PM

Castiel

343 Posts

Posted - 10/15/2023 : 12:46:58 PM

You should have described the problem in a simple way.

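Don't compare a double value with NaN using == as you did: NaN never compares equal to anything, not even to itself, so that condition is always false. If x is NaN, x != x is true. Use a dedicated check such as math.isnan(x) instead; search for 'python isnan' for more details.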
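
As a minimal sketch of this point, and not code from the original poster, the example below shows why the == test never fires, how to count the non-NaN entries of each column separately, and one possible way to do the tolerance match. The helper names valid_values and match_lines and the sample numbers are made up for illustration. The sketch uses plain Python lists so it runs on its own, and it differs from the loop in the question in two ways: it compares the absolute difference against the tolerance, and it picks the closest database value rather than the first one inside the tolerance. With a pandas DataFrame such as result2, result2.iloc[:, column_Peaks].count() gives the per-column count of non-NaN entries, and Series.dropna() removes the padding.

```python
import math
from bisect import bisect_left

nan = float("nan")

# NaN never compares equal to anything, itself included,
# so an equality test can never detect it.
assert (nan == nan) is False
assert (nan != nan) is True
assert math.isnan(nan)


def valid_values(column):
    """Return the entries of `column` that are not NaN, in order."""
    return [x for x in column if not math.isnan(x)]


def match_lines(peaks, database, tol=0.2):
    """Pair each recorded peak with the closest database value within `tol`.

    Returns a list of (peak, database_value, difference) tuples; peaks
    without a database value inside the tolerance are left out.
    """
    peaks = valid_values(peaks)
    database = sorted(valid_values(database))
    matches = []
    for peak in peaks:
        i = bisect_left(database, peak)
        # Only the neighbours on either side of the insertion point
        # can be the closest database value.
        candidates = database[max(i - 1, 0):i + 1]
        if not candidates:
            continue
        best = min(candidates, key=lambda v: abs(v - peak))
        if abs(best - peak) < tol:
            matches.append((peak, best, peak - best))
    return matches


# Hypothetical sample data: short columns padded with NaN.
peaks_col = [500.05, 656.30, 777.00, nan, nan, nan]
nist_col = [486.13, 500.00, 656.28, 700.00, nan, nan]

print(len(valid_values(peaks_col)))  # non-NaN entries in the peaks column
print(len(valid_values(nist_col)))   # non-NaN entries in the database column
for peak, ref, dif in match_lines(peaks_col, nist_col):
    print(f"{peak:.2f} -> {ref:.2f} (dif {dif:+.2f})")
```

Sorting the database values once and using a binary search avoids the full inner loop over every database row for each peak. To keep the nested-loop structure from the question instead, replacing the == test with "if math.isnan(value): continue" would skip the NaN rows rather than stop at the first one.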


The Origin Forum © 2020 Originlab Corporation