Curve fitting when data has a sharp initial slope and then tapers offCurve fitting and max slope calculationBayesian inference and curve fittingFitting Inspection Time Data to CurveFitting a parameterized 2D distributionDifference between regression analysis and curve fittingLinear fitting using re-expressing the data or directly curve fitting?Better way to fit/model data with high & low density areas (and with a geometric fit?)Update model parameter with new data, discarding old dataFitting a left-skewed curve to dataFitting curve to non-decreasing data
What's the minimum number of sensors for a hobby GPS waypoint-following UAV?
CentOS 7 -> find: missing Argument for "-exec"
ESTA: "Is your travel to the US occurring in transit to another country?" when going on a cruise
Can fluent English speakers distinguish “steel”, “still” and “steal”?
Shortest distance around a pyramid?
What does it mean that the mass of a propellant tank is at premium?
How to hide what's behind an object in a non destructive way / give it an "invisibility cloak"
Matchmaker, Matchmaker, make me a match
Who has taken "my" Managed package namespace? Can we find out?
Is it ethical to inform clients the Store is closing down soon?
Why are Hobbits so fond of mushrooms?
Is there a word for a message that is intended to be intercepted by an adversary?
Cops: The Hidden OEIS Substring
Was I subtly told to resign?
Is mono-devel needed?
Referring to different instances of the same character in time travel
What would be the ideal melee weapon made of "Phase Metal"?
How can I get a player to accept that they should stop trying to pull stunts without thinking them through first?
Do you know your 'KVZ's?
Why isn't pressure filtration popular compared to vacuum filtration?
When casting Eldritch Blast with the Agonizing Blast eldritch invocation, what do I add to my damage roll?
What species of wasp is this? And how to get rid of them?
How does a Potion of Poison work?
Get ids only where one id is null and other isn't
Curve fitting when data has a sharp initial slope and then tapers off
Curve fitting and max slope calculationBayesian inference and curve fittingFitting Inspection Time Data to CurveFitting a parameterized 2D distributionDifference between regression analysis and curve fittingLinear fitting using re-expressing the data or directly curve fitting?Better way to fit/model data with high & low density areas (and with a geometric fit?)Update model parameter with new data, discarding old dataFitting a left-skewed curve to dataFitting curve to non-decreasing data
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty margin-bottom:0;
$begingroup$
I have a dataset that I'm currently smoothing using a 6th order least-squares polynomial which models the initial sharp curve well but as the data tapers out some noise at the end causes a downwards curve at the end. It's probably easiest to show a plot of my current data followed by the results of fitting extrapolated by a few extra points:
I'm expecting the data to taper out as it moves along so my initial thoughts were to maybe to break the data into thirds and apply lower order polynomials as it goes along. But I don't have a good background in statistics and I'm guessing this problem has some elegant solution I'm not aware of. Because the dataset is fairly short I thought I'd include a copy in case anyone was inspired to demonstrate some techniques:
0 0.632696209
1 0.954847196
2 1.211956632
3 1.282479631
4 1.348572899
5 1.36480867
6 1.411463353
7 1.413356854
8 1.443252208
9 1.5086791
10 1.519056572
11 1.515427082
12 1.545547955
13 1.567970692
14 1.554558155
15 1.539432376
16 1.662380739
17 1.630363821
18 1.627295705
19 1.574494008
20 1.651035552
21 1.59834609
22 1.612328401
23 1.634990079
24 1.668922299
25 1.622188079
26 1.759410212
27 1.849993968
28 1.78653232
29 1.7872419
30 1.639521297
31 1.70151144
32 1.868248296
33 1.707963085
34 1.703480801
35 1.811524105
36 1.790136202
37 1.79970863
38 1.944272907
39 1.818103305
40 1.750744767
43 1.77700684
44 1.80647612
regression curve-fitting
$endgroup$
|
show 1 more comment
$begingroup$
I have a dataset that I'm currently smoothing using a 6th order least-squares polynomial which models the initial sharp curve well but as the data tapers out some noise at the end causes a downwards curve at the end. It's probably easiest to show a plot of my current data followed by the results of fitting extrapolated by a few extra points:
I'm expecting the data to taper out as it moves along so my initial thoughts were to maybe to break the data into thirds and apply lower order polynomials as it goes along. But I don't have a good background in statistics and I'm guessing this problem has some elegant solution I'm not aware of. Because the dataset is fairly short I thought I'd include a copy in case anyone was inspired to demonstrate some techniques:
0 0.632696209
1 0.954847196
2 1.211956632
3 1.282479631
4 1.348572899
5 1.36480867
6 1.411463353
7 1.413356854
8 1.443252208
9 1.5086791
10 1.519056572
11 1.515427082
12 1.545547955
13 1.567970692
14 1.554558155
15 1.539432376
16 1.662380739
17 1.630363821
18 1.627295705
19 1.574494008
20 1.651035552
21 1.59834609
22 1.612328401
23 1.634990079
24 1.668922299
25 1.622188079
26 1.759410212
27 1.849993968
28 1.78653232
29 1.7872419
30 1.639521297
31 1.70151144
32 1.868248296
33 1.707963085
34 1.703480801
35 1.811524105
36 1.790136202
37 1.79970863
38 1.944272907
39 1.818103305
40 1.750744767
43 1.77700684
44 1.80647612
regression curve-fitting
$endgroup$
$begingroup$
Is this time series data with equal intervals between measurements ?
$endgroup$
– IrishStat
Jul 3 at 12:50
$begingroup$
@IrishStat It's not a time series as such, the X is how many times an event has occurred over a period that is the same for all samples
$endgroup$
– PeterJ
Jul 3 at 13:02
$begingroup$
can you please detail precisely what .632696209 is at X=0
$endgroup$
– IrishStat
Jul 3 at 13:05
$begingroup$
@IrishStat it's a probability, I don't want to be vague but someone asked me not to disclose full details but it means based on the X value the event has a 0.63 chance of occurring.
$endgroup$
– PeterJ
Jul 3 at 13:10
$begingroup$
Do you have an idea of when the break in slope occurs, e.g., at what horizontal coordinate?
$endgroup$
– Stephan Kolassa
Jul 3 at 13:12
|
show 1 more comment
$begingroup$
I have a dataset that I'm currently smoothing using a 6th order least-squares polynomial which models the initial sharp curve well but as the data tapers out some noise at the end causes a downwards curve at the end. It's probably easiest to show a plot of my current data followed by the results of fitting extrapolated by a few extra points:
I'm expecting the data to taper out as it moves along so my initial thoughts were to maybe to break the data into thirds and apply lower order polynomials as it goes along. But I don't have a good background in statistics and I'm guessing this problem has some elegant solution I'm not aware of. Because the dataset is fairly short I thought I'd include a copy in case anyone was inspired to demonstrate some techniques:
0 0.632696209
1 0.954847196
2 1.211956632
3 1.282479631
4 1.348572899
5 1.36480867
6 1.411463353
7 1.413356854
8 1.443252208
9 1.5086791
10 1.519056572
11 1.515427082
12 1.545547955
13 1.567970692
14 1.554558155
15 1.539432376
16 1.662380739
17 1.630363821
18 1.627295705
19 1.574494008
20 1.651035552
21 1.59834609
22 1.612328401
23 1.634990079
24 1.668922299
25 1.622188079
26 1.759410212
27 1.849993968
28 1.78653232
29 1.7872419
30 1.639521297
31 1.70151144
32 1.868248296
33 1.707963085
34 1.703480801
35 1.811524105
36 1.790136202
37 1.79970863
38 1.944272907
39 1.818103305
40 1.750744767
43 1.77700684
44 1.80647612
regression curve-fitting
$endgroup$
I have a dataset that I'm currently smoothing using a 6th order least-squares polynomial which models the initial sharp curve well but as the data tapers out some noise at the end causes a downwards curve at the end. It's probably easiest to show a plot of my current data followed by the results of fitting extrapolated by a few extra points:
I'm expecting the data to taper out as it moves along so my initial thoughts were to maybe to break the data into thirds and apply lower order polynomials as it goes along. But I don't have a good background in statistics and I'm guessing this problem has some elegant solution I'm not aware of. Because the dataset is fairly short I thought I'd include a copy in case anyone was inspired to demonstrate some techniques:
0 0.632696209
1 0.954847196
2 1.211956632
3 1.282479631
4 1.348572899
5 1.36480867
6 1.411463353
7 1.413356854
8 1.443252208
9 1.5086791
10 1.519056572
11 1.515427082
12 1.545547955
13 1.567970692
14 1.554558155
15 1.539432376
16 1.662380739
17 1.630363821
18 1.627295705
19 1.574494008
20 1.651035552
21 1.59834609
22 1.612328401
23 1.634990079
24 1.668922299
25 1.622188079
26 1.759410212
27 1.849993968
28 1.78653232
29 1.7872419
30 1.639521297
31 1.70151144
32 1.868248296
33 1.707963085
34 1.703480801
35 1.811524105
36 1.790136202
37 1.79970863
38 1.944272907
39 1.818103305
40 1.750744767
43 1.77700684
44 1.80647612
regression curve-fitting
regression curve-fitting
edited Jul 3 at 12:55
Nick Cox
40.2k5 gold badges89 silver badges133 bronze badges
40.2k5 gold badges89 silver badges133 bronze badges
asked Jul 3 at 12:38
PeterJPeterJ
1182 gold badges3 silver badges8 bronze badges
1182 gold badges3 silver badges8 bronze badges
$begingroup$
Is this time series data with equal intervals between measurements ?
$endgroup$
– IrishStat
Jul 3 at 12:50
$begingroup$
@IrishStat It's not a time series as such, the X is how many times an event has occurred over a period that is the same for all samples
$endgroup$
– PeterJ
Jul 3 at 13:02
$begingroup$
can you please detail precisely what .632696209 is at X=0
$endgroup$
– IrishStat
Jul 3 at 13:05
$begingroup$
@IrishStat it's a probability, I don't want to be vague but someone asked me not to disclose full details but it means based on the X value the event has a 0.63 chance of occurring.
$endgroup$
– PeterJ
Jul 3 at 13:10
$begingroup$
Do you have an idea of when the break in slope occurs, e.g., at what horizontal coordinate?
$endgroup$
– Stephan Kolassa
Jul 3 at 13:12
|
show 1 more comment
$begingroup$
Is this time series data with equal intervals between measurements ?
$endgroup$
– IrishStat
Jul 3 at 12:50
$begingroup$
@IrishStat It's not a time series as such, the X is how many times an event has occurred over a period that is the same for all samples
$endgroup$
– PeterJ
Jul 3 at 13:02
$begingroup$
can you please detail precisely what .632696209 is at X=0
$endgroup$
– IrishStat
Jul 3 at 13:05
$begingroup$
@IrishStat it's a probability, I don't want to be vague but someone asked me not to disclose full details but it means based on the X value the event has a 0.63 chance of occurring.
$endgroup$
– PeterJ
Jul 3 at 13:10
$begingroup$
Do you have an idea of when the break in slope occurs, e.g., at what horizontal coordinate?
$endgroup$
– Stephan Kolassa
Jul 3 at 13:12
$begingroup$
Is this time series data with equal intervals between measurements ?
$endgroup$
– IrishStat
Jul 3 at 12:50
$begingroup$
Is this time series data with equal intervals between measurements ?
$endgroup$
– IrishStat
Jul 3 at 12:50
$begingroup$
@IrishStat It's not a time series as such, the X is how many times an event has occurred over a period that is the same for all samples
$endgroup$
– PeterJ
Jul 3 at 13:02
$begingroup$
@IrishStat It's not a time series as such, the X is how many times an event has occurred over a period that is the same for all samples
$endgroup$
– PeterJ
Jul 3 at 13:02
$begingroup$
can you please detail precisely what .632696209 is at X=0
$endgroup$
– IrishStat
Jul 3 at 13:05
$begingroup$
can you please detail precisely what .632696209 is at X=0
$endgroup$
– IrishStat
Jul 3 at 13:05
$begingroup$
@IrishStat it's a probability, I don't want to be vague but someone asked me not to disclose full details but it means based on the X value the event has a 0.63 chance of occurring.
$endgroup$
– PeterJ
Jul 3 at 13:10
$begingroup$
@IrishStat it's a probability, I don't want to be vague but someone asked me not to disclose full details but it means based on the X value the event has a 0.63 chance of occurring.
$endgroup$
– PeterJ
Jul 3 at 13:10
$begingroup$
Do you have an idea of when the break in slope occurs, e.g., at what horizontal coordinate?
$endgroup$
– Stephan Kolassa
Jul 3 at 13:12
$begingroup$
Do you have an idea of when the break in slope occurs, e.g., at what horizontal coordinate?
$endgroup$
– Stephan Kolassa
Jul 3 at 13:12
|
show 1 more comment
2 Answers
2
active
oldest
votes
$begingroup$
My equation search on your data turned up a simple two-parameter power equation, "y = a * pow(x, b)", with parameters a = 1.0769014059561925E+00 and b = 1.4153886539395866E-01 yielding R-squared = 0.789 and RMSE = 0.112
$endgroup$
$begingroup$
Note that parameter "a" is almost equal to 1.0, and I tried the one-parameter equation "y = pow(x, b)" and this two-parameter version is slightly better. If you have a large number of data sets to be fitted, it is possible that the one-parameter version is all that you need.
$endgroup$
– James Phillips
Jul 3 at 17:52
add a comment |
$begingroup$
You have a specific kind of structural change in your data generating model. Specifically, your straight regression line seems to change in slope at a particular point. In order that the two line segments meet up, the intercept needs to change in a specific way at that same point.
You should look at "piecewise linear regression". Our tags change-point and structural-change may also be helpful. For instance, here is a tutorial in R.
If you already know when (at what predictor value) the change occurs, then you can set up a specific design matrix with constraints to model this. However, if you also need to detect the change point, you need specific tools. For instance, in R, the segmented
package will fit a piecewise linear regression where you specify the number of breaks (and optionally guesses for where they occur). Here is how segmented
deals with your data, where we specify either one or two change points:
If you need to decide whether you have one, two or more break points, I would recommend something like cross validation.
xx <- c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 43, 44)
yy <- c(0.632696209, 0.954847196, 1.211956632, 1.282479631, 1.348572899,
1.36480867, 1.411463353, 1.413356854, 1.443252208, 1.5086791,
1.519056572, 1.515427082, 1.545547955, 1.567970692, 1.554558155,
1.539432376, 1.662380739, 1.630363821, 1.627295705, 1.574494008,
1.651035552, 1.59834609, 1.612328401, 1.634990079, 1.668922299,
1.622188079, 1.759410212, 1.849993968, 1.78653232, 1.7872419,
1.639521297, 1.70151144, 1.868248296, 1.707963085, 1.703480801,
1.811524105, 1.790136202, 1.79970863, 1.944272907, 1.818103305,
1.750744767, 1.77700684, 1.80647612)
library(segmented)
model_1 <- segmented(lm(yy~xx),npsi=1)
model_2 <- segmented(lm(yy~xx),npsi=2)
xx_fit <- seq(min(xx),max(xx),by=.01)
plot(xx,yy)
lines(xx_fit,predict(model_1,newdata=data.frame(xx=xx_fit)))
lines(xx_fit,predict(model_2,newdata=data.frame(xx=xx_fit)),col="red")
$endgroup$
$begingroup$
Thanks you've brought up a few useful techniques / keywords I wasn't aware of which I'll check out further but it seems like a good way to go.
$endgroup$
– PeterJ
Jul 3 at 13:39
4
$begingroup$
This is a good application for a regression spline, e.g., piecewise cubic polynomial. The join points for the various polynomials are called knot. Put a couple of knots near where you think rapid changes may be, from prior knowledge.
$endgroup$
– Frank Harrell
Jul 3 at 13:48
$begingroup$
A simple power equation fits this data well, see my answer to this question.
$endgroup$
– James Phillips
Jul 3 at 15:36
add a comment |
Your Answer
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "65"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f415859%2fcurve-fitting-when-data-has-a-sharp-initial-slope-and-then-tapers-off%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
My equation search on your data turned up a simple two-parameter power equation, "y = a * pow(x, b)", with parameters a = 1.0769014059561925E+00 and b = 1.4153886539395866E-01 yielding R-squared = 0.789 and RMSE = 0.112
$endgroup$
$begingroup$
Note that parameter "a" is almost equal to 1.0, and I tried the one-parameter equation "y = pow(x, b)" and this two-parameter version is slightly better. If you have a large number of data sets to be fitted, it is possible that the one-parameter version is all that you need.
$endgroup$
– James Phillips
Jul 3 at 17:52
add a comment |
$begingroup$
My equation search on your data turned up a simple two-parameter power equation, "y = a * pow(x, b)", with parameters a = 1.0769014059561925E+00 and b = 1.4153886539395866E-01 yielding R-squared = 0.789 and RMSE = 0.112
$endgroup$
$begingroup$
Note that parameter "a" is almost equal to 1.0, and I tried the one-parameter equation "y = pow(x, b)" and this two-parameter version is slightly better. If you have a large number of data sets to be fitted, it is possible that the one-parameter version is all that you need.
$endgroup$
– James Phillips
Jul 3 at 17:52
add a comment |
$begingroup$
My equation search on your data turned up a simple two-parameter power equation, "y = a * pow(x, b)", with parameters a = 1.0769014059561925E+00 and b = 1.4153886539395866E-01 yielding R-squared = 0.789 and RMSE = 0.112
$endgroup$
My equation search on your data turned up a simple two-parameter power equation, "y = a * pow(x, b)", with parameters a = 1.0769014059561925E+00 and b = 1.4153886539395866E-01 yielding R-squared = 0.789 and RMSE = 0.112
answered Jul 3 at 15:31
James PhillipsJames Phillips
7002 gold badges6 silver badges7 bronze badges
7002 gold badges6 silver badges7 bronze badges
$begingroup$
Note that parameter "a" is almost equal to 1.0, and I tried the one-parameter equation "y = pow(x, b)" and this two-parameter version is slightly better. If you have a large number of data sets to be fitted, it is possible that the one-parameter version is all that you need.
$endgroup$
– James Phillips
Jul 3 at 17:52
add a comment |
$begingroup$
Note that parameter "a" is almost equal to 1.0, and I tried the one-parameter equation "y = pow(x, b)" and this two-parameter version is slightly better. If you have a large number of data sets to be fitted, it is possible that the one-parameter version is all that you need.
$endgroup$
– James Phillips
Jul 3 at 17:52
$begingroup$
Note that parameter "a" is almost equal to 1.0, and I tried the one-parameter equation "y = pow(x, b)" and this two-parameter version is slightly better. If you have a large number of data sets to be fitted, it is possible that the one-parameter version is all that you need.
$endgroup$
– James Phillips
Jul 3 at 17:52
$begingroup$
Note that parameter "a" is almost equal to 1.0, and I tried the one-parameter equation "y = pow(x, b)" and this two-parameter version is slightly better. If you have a large number of data sets to be fitted, it is possible that the one-parameter version is all that you need.
$endgroup$
– James Phillips
Jul 3 at 17:52
add a comment |
$begingroup$
You have a specific kind of structural change in your data generating model. Specifically, your straight regression line seems to change in slope at a particular point. In order that the two line segments meet up, the intercept needs to change in a specific way at that same point.
You should look at "piecewise linear regression". Our tags change-point and structural-change may also be helpful. For instance, here is a tutorial in R.
If you already know when (at what predictor value) the change occurs, then you can set up a specific design matrix with constraints to model this. However, if you also need to detect the change point, you need specific tools. For instance, in R, the segmented
package will fit a piecewise linear regression where you specify the number of breaks (and optionally guesses for where they occur). Here is how segmented
deals with your data, where we specify either one or two change points:
If you need to decide whether you have one, two or more break points, I would recommend something like cross validation.
xx <- c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 43, 44)
yy <- c(0.632696209, 0.954847196, 1.211956632, 1.282479631, 1.348572899,
1.36480867, 1.411463353, 1.413356854, 1.443252208, 1.5086791,
1.519056572, 1.515427082, 1.545547955, 1.567970692, 1.554558155,
1.539432376, 1.662380739, 1.630363821, 1.627295705, 1.574494008,
1.651035552, 1.59834609, 1.612328401, 1.634990079, 1.668922299,
1.622188079, 1.759410212, 1.849993968, 1.78653232, 1.7872419,
1.639521297, 1.70151144, 1.868248296, 1.707963085, 1.703480801,
1.811524105, 1.790136202, 1.79970863, 1.944272907, 1.818103305,
1.750744767, 1.77700684, 1.80647612)
library(segmented)
model_1 <- segmented(lm(yy~xx),npsi=1)
model_2 <- segmented(lm(yy~xx),npsi=2)
xx_fit <- seq(min(xx),max(xx),by=.01)
plot(xx,yy)
lines(xx_fit,predict(model_1,newdata=data.frame(xx=xx_fit)))
lines(xx_fit,predict(model_2,newdata=data.frame(xx=xx_fit)),col="red")
$endgroup$
$begingroup$
Thanks you've brought up a few useful techniques / keywords I wasn't aware of which I'll check out further but it seems like a good way to go.
$endgroup$
– PeterJ
Jul 3 at 13:39
4
$begingroup$
This is a good application for a regression spline, e.g., piecewise cubic polynomial. The join points for the various polynomials are called knot. Put a couple of knots near where you think rapid changes may be, from prior knowledge.
$endgroup$
– Frank Harrell
Jul 3 at 13:48
$begingroup$
A simple power equation fits this data well, see my answer to this question.
$endgroup$
– James Phillips
Jul 3 at 15:36
add a comment |
$begingroup$
You have a specific kind of structural change in your data generating model. Specifically, your straight regression line seems to change in slope at a particular point. In order that the two line segments meet up, the intercept needs to change in a specific way at that same point.
You should look at "piecewise linear regression". Our tags change-point and structural-change may also be helpful. For instance, here is a tutorial in R.
If you already know when (at what predictor value) the change occurs, then you can set up a specific design matrix with constraints to model this. However, if you also need to detect the change point, you need specific tools. For instance, in R, the segmented
package will fit a piecewise linear regression where you specify the number of breaks (and optionally guesses for where they occur). Here is how segmented
deals with your data, where we specify either one or two change points:
If you need to decide whether you have one, two or more break points, I would recommend something like cross validation.
xx <- c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 43, 44)
yy <- c(0.632696209, 0.954847196, 1.211956632, 1.282479631, 1.348572899,
1.36480867, 1.411463353, 1.413356854, 1.443252208, 1.5086791,
1.519056572, 1.515427082, 1.545547955, 1.567970692, 1.554558155,
1.539432376, 1.662380739, 1.630363821, 1.627295705, 1.574494008,
1.651035552, 1.59834609, 1.612328401, 1.634990079, 1.668922299,
1.622188079, 1.759410212, 1.849993968, 1.78653232, 1.7872419,
1.639521297, 1.70151144, 1.868248296, 1.707963085, 1.703480801,
1.811524105, 1.790136202, 1.79970863, 1.944272907, 1.818103305,
1.750744767, 1.77700684, 1.80647612)
library(segmented)
model_1 <- segmented(lm(yy~xx),npsi=1)
model_2 <- segmented(lm(yy~xx),npsi=2)
xx_fit <- seq(min(xx),max(xx),by=.01)
plot(xx,yy)
lines(xx_fit,predict(model_1,newdata=data.frame(xx=xx_fit)))
lines(xx_fit,predict(model_2,newdata=data.frame(xx=xx_fit)),col="red")
$endgroup$
$begingroup$
Thanks you've brought up a few useful techniques / keywords I wasn't aware of which I'll check out further but it seems like a good way to go.
$endgroup$
– PeterJ
Jul 3 at 13:39
4
$begingroup$
This is a good application for a regression spline, e.g., piecewise cubic polynomial. The join points for the various polynomials are called knot. Put a couple of knots near where you think rapid changes may be, from prior knowledge.
$endgroup$
– Frank Harrell
Jul 3 at 13:48
$begingroup$
A simple power equation fits this data well, see my answer to this question.
$endgroup$
– James Phillips
Jul 3 at 15:36
add a comment |
$begingroup$
You have a specific kind of structural change in your data generating model. Specifically, your straight regression line seems to change in slope at a particular point. In order that the two line segments meet up, the intercept needs to change in a specific way at that same point.
You should look at "piecewise linear regression". Our tags change-point and structural-change may also be helpful. For instance, here is a tutorial in R.
If you already know when (at what predictor value) the change occurs, then you can set up a specific design matrix with constraints to model this. However, if you also need to detect the change point, you need specific tools. For instance, in R, the segmented
package will fit a piecewise linear regression where you specify the number of breaks (and optionally guesses for where they occur). Here is how segmented
deals with your data, where we specify either one or two change points:
If you need to decide whether you have one, two or more break points, I would recommend something like cross validation.
xx <- c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 43, 44)
yy <- c(0.632696209, 0.954847196, 1.211956632, 1.282479631, 1.348572899,
1.36480867, 1.411463353, 1.413356854, 1.443252208, 1.5086791,
1.519056572, 1.515427082, 1.545547955, 1.567970692, 1.554558155,
1.539432376, 1.662380739, 1.630363821, 1.627295705, 1.574494008,
1.651035552, 1.59834609, 1.612328401, 1.634990079, 1.668922299,
1.622188079, 1.759410212, 1.849993968, 1.78653232, 1.7872419,
1.639521297, 1.70151144, 1.868248296, 1.707963085, 1.703480801,
1.811524105, 1.790136202, 1.79970863, 1.944272907, 1.818103305,
1.750744767, 1.77700684, 1.80647612)
library(segmented)
model_1 <- segmented(lm(yy~xx),npsi=1)
model_2 <- segmented(lm(yy~xx),npsi=2)
xx_fit <- seq(min(xx),max(xx),by=.01)
plot(xx,yy)
lines(xx_fit,predict(model_1,newdata=data.frame(xx=xx_fit)))
lines(xx_fit,predict(model_2,newdata=data.frame(xx=xx_fit)),col="red")
$endgroup$
You have a specific kind of structural change in your data generating model. Specifically, your straight regression line seems to change in slope at a particular point. In order that the two line segments meet up, the intercept needs to change in a specific way at that same point.
You should look at "piecewise linear regression". Our tags change-point and structural-change may also be helpful. For instance, here is a tutorial in R.
If you already know when (at what predictor value) the change occurs, then you can set up a specific design matrix with constraints to model this. However, if you also need to detect the change point, you need specific tools. For instance, in R, the segmented
package will fit a piecewise linear regression where you specify the number of breaks (and optionally guesses for where they occur). Here is how segmented
deals with your data, where we specify either one or two change points:
If you need to decide whether you have one, two or more break points, I would recommend something like cross validation.
xx <- c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 43, 44)
yy <- c(0.632696209, 0.954847196, 1.211956632, 1.282479631, 1.348572899,
1.36480867, 1.411463353, 1.413356854, 1.443252208, 1.5086791,
1.519056572, 1.515427082, 1.545547955, 1.567970692, 1.554558155,
1.539432376, 1.662380739, 1.630363821, 1.627295705, 1.574494008,
1.651035552, 1.59834609, 1.612328401, 1.634990079, 1.668922299,
1.622188079, 1.759410212, 1.849993968, 1.78653232, 1.7872419,
1.639521297, 1.70151144, 1.868248296, 1.707963085, 1.703480801,
1.811524105, 1.790136202, 1.79970863, 1.944272907, 1.818103305,
1.750744767, 1.77700684, 1.80647612)
library(segmented)
model_1 <- segmented(lm(yy~xx),npsi=1)
model_2 <- segmented(lm(yy~xx),npsi=2)
xx_fit <- seq(min(xx),max(xx),by=.01)
plot(xx,yy)
lines(xx_fit,predict(model_1,newdata=data.frame(xx=xx_fit)))
lines(xx_fit,predict(model_2,newdata=data.frame(xx=xx_fit)),col="red")
edited Jul 3 at 13:37
answered Jul 3 at 13:28
Stephan KolassaStephan Kolassa
52.4k9 gold badges105 silver badges197 bronze badges
52.4k9 gold badges105 silver badges197 bronze badges
$begingroup$
Thanks you've brought up a few useful techniques / keywords I wasn't aware of which I'll check out further but it seems like a good way to go.
$endgroup$
– PeterJ
Jul 3 at 13:39
4
$begingroup$
This is a good application for a regression spline, e.g., piecewise cubic polynomial. The join points for the various polynomials are called knot. Put a couple of knots near where you think rapid changes may be, from prior knowledge.
$endgroup$
– Frank Harrell
Jul 3 at 13:48
$begingroup$
A simple power equation fits this data well, see my answer to this question.
$endgroup$
– James Phillips
Jul 3 at 15:36
add a comment |
$begingroup$
Thanks you've brought up a few useful techniques / keywords I wasn't aware of which I'll check out further but it seems like a good way to go.
$endgroup$
– PeterJ
Jul 3 at 13:39
4
$begingroup$
This is a good application for a regression spline, e.g., piecewise cubic polynomial. The join points for the various polynomials are called knot. Put a couple of knots near where you think rapid changes may be, from prior knowledge.
$endgroup$
– Frank Harrell
Jul 3 at 13:48
$begingroup$
A simple power equation fits this data well, see my answer to this question.
$endgroup$
– James Phillips
Jul 3 at 15:36
$begingroup$
Thanks you've brought up a few useful techniques / keywords I wasn't aware of which I'll check out further but it seems like a good way to go.
$endgroup$
– PeterJ
Jul 3 at 13:39
$begingroup$
Thanks you've brought up a few useful techniques / keywords I wasn't aware of which I'll check out further but it seems like a good way to go.
$endgroup$
– PeterJ
Jul 3 at 13:39
4
4
$begingroup$
This is a good application for a regression spline, e.g., piecewise cubic polynomial. The join points for the various polynomials are called knot. Put a couple of knots near where you think rapid changes may be, from prior knowledge.
$endgroup$
– Frank Harrell
Jul 3 at 13:48
$begingroup$
This is a good application for a regression spline, e.g., piecewise cubic polynomial. The join points for the various polynomials are called knot. Put a couple of knots near where you think rapid changes may be, from prior knowledge.
$endgroup$
– Frank Harrell
Jul 3 at 13:48
$begingroup$
A simple power equation fits this data well, see my answer to this question.
$endgroup$
– James Phillips
Jul 3 at 15:36
$begingroup$
A simple power equation fits this data well, see my answer to this question.
$endgroup$
– James Phillips
Jul 3 at 15:36
add a comment |
Thanks for contributing an answer to Cross Validated!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f415859%2fcurve-fitting-when-data-has-a-sharp-initial-slope-and-then-tapers-off%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
$begingroup$
Is this time series data with equal intervals between measurements ?
$endgroup$
– IrishStat
Jul 3 at 12:50
$begingroup$
@IrishStat It's not a time series as such, the X is how many times an event has occurred over a period that is the same for all samples
$endgroup$
– PeterJ
Jul 3 at 13:02
$begingroup$
can you please detail precisely what .632696209 is at X=0
$endgroup$
– IrishStat
Jul 3 at 13:05
$begingroup$
@IrishStat it's a probability, I don't want to be vague but someone asked me not to disclose full details but it means based on the X value the event has a 0.63 chance of occurring.
$endgroup$
– PeterJ
Jul 3 at 13:10
$begingroup$
Do you have an idea of when the break in slope occurs, e.g., at what horizontal coordinate?
$endgroup$
– Stephan Kolassa
Jul 3 at 13:12