The Origin Forum
 Automatic Binning Algorithm (equation) ?

vasiukov

Italy
3 Posts

Posted - 12/02/2022 : 06:56:51 AM
Origin Ver. and Service Release (Select Help-->About Origin): OriginPro 2018 (64-bit) SR1
Operating System: Windows 10 (64-bit)

Dear Colleagues,

I need to understand how the program determines the maximum and minimum value of the x-axis and the bin size when drawing a histogram in automatic binning mode.
Does anyone know the algorithm?

Why does automatic binning produce a wider range than the minimum and maximum values in the dataset?
1st example: there is a dataset [-1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] (all points are unique).
The minimum value is -1, the maximum value is 10.
However, automatic binning builds a histogram from -3 to 12 with a bin size of 3.

2nd example: there is a dataset [-10, -5, 0, 0, 11, 5, 0, 2, 1, -1, 3, -2, 0, 10].
The minimum value is -10, the maximum value is 11.
However, automatic binning builds a histogram from -15 to 15 with a bin size of 5.

Why? What determines the range and bin size?

I really appreciate any help you can provide.

snowli

USA
1426 Posts

Posted - 12/09/2022 : 10:44:12 AM
Hello,
See this page for how the auto bin size is decided.

https://www.originlab.com/doc/origin-help/create-histogram
Programming Notes:
Bin size and number are controllable via these system variables:
To set bin size, @HBS = value; value = -1 if not set.
To set bin number, @HBN = value; rounding is used and value = 0 if not set.
To force bin number, @HBM = value; value may be a non-integer (rounding is not used).
Priority sequence: @HBS > @HBN > @HBM
If neither @HBS nor @HBN is specified, @HBF = value will determine the bin number automatically as per the following expression:
number of bins = 1 + nint(value* log10(npts))

I suppose you didn't set any of these system variables, so you can run @hbf= in the Script window to find that its value is 4.
For your dataset, -1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, the number of points is 11. Run the following in the Script window (log here is the base-10 logarithm, matching log10 in the expression above).
1+nint(4*log(11))= //press enter

This gives a bin number of 5.
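
For readers who want to check this arithmetic outside Origin, here is a minimal Python sketch of the bin-number expression quoted above. The helper names `nint` and `auto_bin_count` are made up for illustration, and the rounding rule (nearest integer, halves away from zero) is an assumption about how LabTalk's nint behaves; it does not matter for this dataset.

```python
import math

def nint(x):
    # Assumed behaviour: round to the nearest integer, halves away from zero.
    sign = 1 if x >= 0 else -1
    return sign * int(math.floor(abs(x) + 0.5))

def auto_bin_count(npts, hbf=4):
    # number of bins = 1 + nint(value * log10(npts)), with value = @HBF (4 here)
    return 1 + nint(hbf * math.log10(npts))

print(auto_bin_count(11))  # 5, since 4 * log10(11) is about 4.17
```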

Usually, when plotting graphs in Origin, we leave some margin %8 so data will not show at the axis frame. We also try to find a good bin start and end that fit all these requirements, which is how we get -3 and 12.

You can uncheck the auto bin option to customize your bins.

Or, if you have uneven bins etc., you can use Statistics: Descriptive Statistics: Frequency Count and then plot a column/bar plot from the result.

Thanks, Snow

vasiukov

Italy
3 Posts

Posted - 12/16/2022 : 04:48:32 AM
Dear Snow,

Thanks for your reply.
Indeed, the formula for calculating the bin number clears things up a bit. However, I still do not understand how the minimum and maximum values of the scale are determined.
Please clarify this part of your comment: "... we leave some margin %8 so data will not show...". What exactly does "%8" mean? Is it 8 percent of (max - min)?

Sergii


snowli

USA
1426 Posts

Posted - 12/16/2022 : 09:09:26 AM
Hi Sergii,

Here is some documentation I found:
https://www.originlab.com/doc/Origin-Help/AxesRef-Scale?f=dl#Rescale_Margin.28.25.29

It's hard to explain how we calculate it since we also need to consider tick increment, etc. to get a good value to show for ticks.

For line/scatter plots, it looks like this:
By default @RRT=4, while our default Rescale Margin is 8%, so the axis begin and end are calculated automatically.
If you set the rescale margin to e.g. 3% (smaller than @RRT), it will add 3% * (max - min) as padding.
If you set the margin to 0%, the axis should run from the min to the max of the data.

For histograms, column/bar charts, and box charts, we need to add some padding anyway so that the full bar or box can show.

E.g. in your case, do you want the bars centered at -1, 0, 1, ... or do you want the bars to sit between -1, 0, 1, ...? If you want a bar centered at -1, the axis will need to start at maybe -2 or -1.5.

I will check with the developers how we decide the bin begin, end, etc., since in your case I am not sure why it picks -3 as the start instead of -2.

I may be wrong about histograms, since when I play with example 1, the axis begins and ends at the auto-calculated bin begin and end instead of adding a margin.

But if I have a lot of data from 0 to 100 and set the bin begin and end to 0 and 110, the axis still uses 0 to 120 when I rescale it, so again it is applying some automatic margin.



Thanks, Snow

Echo_Chu

China
Posts

Posted - 12/19/2022 : 06:09:20 AM
Hi, Sergii

We first find min (the minimum value), max (the maximum value) and count (the data size) from the dataset. Then we calculate the number of bins, the bin size, and the new axis minimum (min1) and maximum (max1) from the equations below:

bins = int(4*log(count))+1;
size = int((max - min)/bins)+1;
min1 = size * int(min/size) - size;
max1 = size * int(max/size) + size;

Echo
OriginLab Technical Service
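
The four equations above can be checked against both datasets from the original question with a short Python sketch. It assumes that log is the base-10 logarithm and that int truncates toward zero (Python's int() does the same). The function name `auto_bin_range` is made up for illustration and is not an Origin API.

```python
import math

def auto_bin_range(data, factor=4):
    # Sketch of the posted equations; int() is assumed to truncate toward zero.
    count = len(data)
    lo, hi = min(data), max(data)
    bins = int(factor * math.log10(count)) + 1   # bins = int(4*log(count))+1
    size = int((hi - lo) / bins) + 1             # size = int((max - min)/bins)+1
    begin = size * int(lo / size) - size         # min1 = size*int(min/size) - size
    end = size * int(hi / size) + size           # max1 = size*int(max/size) + size
    return bins, size, begin, end

example1 = [-1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
example2 = [-10, -5, 0, 0, 11, 5, 0, 2, 1, -1, 3, -2, 0, 10]

print(auto_bin_range(example1))  # (5, 3, -3, 12)
print(auto_bin_range(example2))  # (5, 5, -15, 15)
```

Both results match the ranges and bin sizes reported in the question (-3 to 12 with size 3, and -15 to 15 with size 5). Note that these equations use int (truncation), whereas the documentation expression quoted earlier in the thread uses nint (rounding). For the 14-point dataset, rounding would give 6 bins rather than 5, so the truncating version is the one that reproduces the observed histogram.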

vasiukov

Italy
3 Posts

Posted - 12/19/2022 : 7:11:51 PM
Dear Snow and Echo,

Thanks a lot! Looks like you gave me a complete explanation. So, I can continue my work now!

Merry Christmas and Happy New Year!
Sergii

snowli

USA
1426 Posts

Posted - 12/22/2022 : 10:12:04 AM
You are very welcome, Sergii.

I also suggested that our developers improve this, since I can see the confusion.

I guess that for users with a lot of data, it doesn't matter much where the axis begin, end, and bin centers are. But for users with less data, it is important.

In 2023, when a user clicks on a histogram, there are mini toolbar buttons to increase/decrease the number of bins. I am suggesting that they add a button that opens a dialog to set bin begin, end, increment, etc. ORG-26197

I hope it will be implemented in a coming version.

Thanks, Snow

snowli

USA
1426 Posts

Posted - 05/09/2023 : 10:08:02 AM
Hi Sergii,
Just FYI: in Origin 2023b, which we just released, clicking on a histogram shows a new Bin Settings button on the mini toolbar, so users no longer need to go to Plot Details, find the tab, and edit there.

Thanks, Snow
The Origin Forum © 2020 Originlab Corporation