Curve fitting when data has a sharp initial slope and then tapers offCurve fitting and max slope calculationBayesian inference and curve fittingFitting Inspection Time Data to CurveFitting a parameterized 2D distributionDifference between regression analysis and curve fittingLinear fitting using re-expressing the data or directly curve fitting?Better way to fit/model data with high & low density areas (and with a geometric fit?)Update model parameter with new data, discarding old dataFitting a left-skewed curve to dataFitting curve to non-decreasing data

What's the minimum number of sensors for a hobby GPS waypoint-following UAV?

CentOS 7 -> find: missing Argument for "-exec"

ESTA: "Is your travel to the US occurring in transit to another country?" when going on a cruise

Can fluent English speakers distinguish “steel”, “still” and “steal”?

Shortest distance around a pyramid?

What does it mean that the mass of a propellant tank is at premium?

How to hide what's behind an object in a non destructive way / give it an "invisibility cloak"

Matchmaker, Matchmaker, make me a match

Who has taken "my" Managed package namespace? Can we find out?

Is it ethical to inform clients the Store is closing down soon?

Why are Hobbits so fond of mushrooms?

Is there a word for a message that is intended to be intercepted by an adversary?

Cops: The Hidden OEIS Substring

Was I subtly told to resign?

Is mono-devel needed?

Referring to different instances of the same character in time travel

What would be the ideal melee weapon made of "Phase Metal"?

How can I get a player to accept that they should stop trying to pull stunts without thinking them through first?

Do you know your 'KVZ's?

Why isn't pressure filtration popular compared to vacuum filtration?

When casting Eldritch Blast with the Agonizing Blast eldritch invocation, what do I add to my damage roll?

What species of wasp is this? And how to get rid of them?

How does a Potion of Poison work?

Get ids only where one id is null and other isn't



Curve fitting when data has a sharp initial slope and then tapers off


Curve fitting and max slope calculationBayesian inference and curve fittingFitting Inspection Time Data to CurveFitting a parameterized 2D distributionDifference between regression analysis and curve fittingLinear fitting using re-expressing the data or directly curve fitting?Better way to fit/model data with high & low density areas (and with a geometric fit?)Update model parameter with new data, discarding old dataFitting a left-skewed curve to dataFitting curve to non-decreasing data






.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty margin-bottom:0;








3












$begingroup$


I have a dataset that I'm currently smoothing using a 6th order least-squares polynomial which models the initial sharp curve well but as the data tapers out some noise at the end causes a downwards curve at the end. It's probably easiest to show a plot of my current data followed by the results of fitting extrapolated by a few extra points:



Polynomial fitting



I'm expecting the data to taper out as it moves along so my initial thoughts were to maybe to break the data into thirds and apply lower order polynomials as it goes along. But I don't have a good background in statistics and I'm guessing this problem has some elegant solution I'm not aware of. Because the dataset is fairly short I thought I'd include a copy in case anyone was inspired to demonstrate some techniques:



0 0.632696209
1 0.954847196
2 1.211956632
3 1.282479631
4 1.348572899
5 1.36480867
6 1.411463353
7 1.413356854
8 1.443252208
9 1.5086791
10 1.519056572
11 1.515427082
12 1.545547955
13 1.567970692
14 1.554558155
15 1.539432376
16 1.662380739
17 1.630363821
18 1.627295705
19 1.574494008
20 1.651035552
21 1.59834609
22 1.612328401
23 1.634990079
24 1.668922299
25 1.622188079
26 1.759410212
27 1.849993968
28 1.78653232
29 1.7872419
30 1.639521297
31 1.70151144
32 1.868248296
33 1.707963085
34 1.703480801
35 1.811524105
36 1.790136202
37 1.79970863
38 1.944272907
39 1.818103305
40 1.750744767
43 1.77700684
44 1.80647612









share|cite|improve this question











$endgroup$











  • $begingroup$
    Is this time series data with equal intervals between measurements ?
    $endgroup$
    – IrishStat
    Jul 3 at 12:50










  • $begingroup$
    @IrishStat It's not a time series as such, the X is how many times an event has occurred over a period that is the same for all samples
    $endgroup$
    – PeterJ
    Jul 3 at 13:02










  • $begingroup$
    can you please detail precisely what .632696209 is at X=0
    $endgroup$
    – IrishStat
    Jul 3 at 13:05










  • $begingroup$
    @IrishStat it's a probability, I don't want to be vague but someone asked me not to disclose full details but it means based on the X value the event has a 0.63 chance of occurring.
    $endgroup$
    – PeterJ
    Jul 3 at 13:10










  • $begingroup$
    Do you have an idea of when the break in slope occurs, e.g., at what horizontal coordinate?
    $endgroup$
    – Stephan Kolassa
    Jul 3 at 13:12

















3












$begingroup$


I have a dataset that I'm currently smoothing using a 6th order least-squares polynomial which models the initial sharp curve well but as the data tapers out some noise at the end causes a downwards curve at the end. It's probably easiest to show a plot of my current data followed by the results of fitting extrapolated by a few extra points:



Polynomial fitting



I'm expecting the data to taper out as it moves along so my initial thoughts were to maybe to break the data into thirds and apply lower order polynomials as it goes along. But I don't have a good background in statistics and I'm guessing this problem has some elegant solution I'm not aware of. Because the dataset is fairly short I thought I'd include a copy in case anyone was inspired to demonstrate some techniques:



0 0.632696209
1 0.954847196
2 1.211956632
3 1.282479631
4 1.348572899
5 1.36480867
6 1.411463353
7 1.413356854
8 1.443252208
9 1.5086791
10 1.519056572
11 1.515427082
12 1.545547955
13 1.567970692
14 1.554558155
15 1.539432376
16 1.662380739
17 1.630363821
18 1.627295705
19 1.574494008
20 1.651035552
21 1.59834609
22 1.612328401
23 1.634990079
24 1.668922299
25 1.622188079
26 1.759410212
27 1.849993968
28 1.78653232
29 1.7872419
30 1.639521297
31 1.70151144
32 1.868248296
33 1.707963085
34 1.703480801
35 1.811524105
36 1.790136202
37 1.79970863
38 1.944272907
39 1.818103305
40 1.750744767
43 1.77700684
44 1.80647612









share|cite|improve this question











$endgroup$











  • $begingroup$
    Is this time series data with equal intervals between measurements ?
    $endgroup$
    – IrishStat
    Jul 3 at 12:50










  • $begingroup$
    @IrishStat It's not a time series as such, the X is how many times an event has occurred over a period that is the same for all samples
    $endgroup$
    – PeterJ
    Jul 3 at 13:02










  • $begingroup$
    can you please detail precisely what .632696209 is at X=0
    $endgroup$
    – IrishStat
    Jul 3 at 13:05










  • $begingroup$
    @IrishStat it's a probability, I don't want to be vague but someone asked me not to disclose full details but it means based on the X value the event has a 0.63 chance of occurring.
    $endgroup$
    – PeterJ
    Jul 3 at 13:10










  • $begingroup$
    Do you have an idea of when the break in slope occurs, e.g., at what horizontal coordinate?
    $endgroup$
    – Stephan Kolassa
    Jul 3 at 13:12













3












3








3





$begingroup$


I have a dataset that I'm currently smoothing using a 6th order least-squares polynomial which models the initial sharp curve well but as the data tapers out some noise at the end causes a downwards curve at the end. It's probably easiest to show a plot of my current data followed by the results of fitting extrapolated by a few extra points:



Polynomial fitting



I'm expecting the data to taper out as it moves along so my initial thoughts were to maybe to break the data into thirds and apply lower order polynomials as it goes along. But I don't have a good background in statistics and I'm guessing this problem has some elegant solution I'm not aware of. Because the dataset is fairly short I thought I'd include a copy in case anyone was inspired to demonstrate some techniques:



0 0.632696209
1 0.954847196
2 1.211956632
3 1.282479631
4 1.348572899
5 1.36480867
6 1.411463353
7 1.413356854
8 1.443252208
9 1.5086791
10 1.519056572
11 1.515427082
12 1.545547955
13 1.567970692
14 1.554558155
15 1.539432376
16 1.662380739
17 1.630363821
18 1.627295705
19 1.574494008
20 1.651035552
21 1.59834609
22 1.612328401
23 1.634990079
24 1.668922299
25 1.622188079
26 1.759410212
27 1.849993968
28 1.78653232
29 1.7872419
30 1.639521297
31 1.70151144
32 1.868248296
33 1.707963085
34 1.703480801
35 1.811524105
36 1.790136202
37 1.79970863
38 1.944272907
39 1.818103305
40 1.750744767
43 1.77700684
44 1.80647612









share|cite|improve this question











$endgroup$




I have a dataset that I'm currently smoothing using a 6th order least-squares polynomial which models the initial sharp curve well but as the data tapers out some noise at the end causes a downwards curve at the end. It's probably easiest to show a plot of my current data followed by the results of fitting extrapolated by a few extra points:



Polynomial fitting



I'm expecting the data to taper out as it moves along so my initial thoughts were to maybe to break the data into thirds and apply lower order polynomials as it goes along. But I don't have a good background in statistics and I'm guessing this problem has some elegant solution I'm not aware of. Because the dataset is fairly short I thought I'd include a copy in case anyone was inspired to demonstrate some techniques:



0 0.632696209
1 0.954847196
2 1.211956632
3 1.282479631
4 1.348572899
5 1.36480867
6 1.411463353
7 1.413356854
8 1.443252208
9 1.5086791
10 1.519056572
11 1.515427082
12 1.545547955
13 1.567970692
14 1.554558155
15 1.539432376
16 1.662380739
17 1.630363821
18 1.627295705
19 1.574494008
20 1.651035552
21 1.59834609
22 1.612328401
23 1.634990079
24 1.668922299
25 1.622188079
26 1.759410212
27 1.849993968
28 1.78653232
29 1.7872419
30 1.639521297
31 1.70151144
32 1.868248296
33 1.707963085
34 1.703480801
35 1.811524105
36 1.790136202
37 1.79970863
38 1.944272907
39 1.818103305
40 1.750744767
43 1.77700684
44 1.80647612






regression curve-fitting






share|cite|improve this question















share|cite|improve this question













share|cite|improve this question




share|cite|improve this question








edited Jul 3 at 12:55









Nick Cox

40.2k5 gold badges89 silver badges133 bronze badges




40.2k5 gold badges89 silver badges133 bronze badges










asked Jul 3 at 12:38









PeterJPeterJ

1182 gold badges3 silver badges8 bronze badges




1182 gold badges3 silver badges8 bronze badges











  • $begingroup$
    Is this time series data with equal intervals between measurements ?
    $endgroup$
    – IrishStat
    Jul 3 at 12:50










  • $begingroup$
    @IrishStat It's not a time series as such, the X is how many times an event has occurred over a period that is the same for all samples
    $endgroup$
    – PeterJ
    Jul 3 at 13:02










  • $begingroup$
    can you please detail precisely what .632696209 is at X=0
    $endgroup$
    – IrishStat
    Jul 3 at 13:05










  • $begingroup$
    @IrishStat it's a probability, I don't want to be vague but someone asked me not to disclose full details but it means based on the X value the event has a 0.63 chance of occurring.
    $endgroup$
    – PeterJ
    Jul 3 at 13:10










  • $begingroup$
    Do you have an idea of when the break in slope occurs, e.g., at what horizontal coordinate?
    $endgroup$
    – Stephan Kolassa
    Jul 3 at 13:12
















  • $begingroup$
    Is this time series data with equal intervals between measurements ?
    $endgroup$
    – IrishStat
    Jul 3 at 12:50










  • $begingroup$
    @IrishStat It's not a time series as such, the X is how many times an event has occurred over a period that is the same for all samples
    $endgroup$
    – PeterJ
    Jul 3 at 13:02










  • $begingroup$
    can you please detail precisely what .632696209 is at X=0
    $endgroup$
    – IrishStat
    Jul 3 at 13:05










  • $begingroup$
    @IrishStat it's a probability, I don't want to be vague but someone asked me not to disclose full details but it means based on the X value the event has a 0.63 chance of occurring.
    $endgroup$
    – PeterJ
    Jul 3 at 13:10










  • $begingroup$
    Do you have an idea of when the break in slope occurs, e.g., at what horizontal coordinate?
    $endgroup$
    – Stephan Kolassa
    Jul 3 at 13:12















$begingroup$
Is this time series data with equal intervals between measurements ?
$endgroup$
– IrishStat
Jul 3 at 12:50




$begingroup$
Is this time series data with equal intervals between measurements ?
$endgroup$
– IrishStat
Jul 3 at 12:50












$begingroup$
@IrishStat It's not a time series as such, the X is how many times an event has occurred over a period that is the same for all samples
$endgroup$
– PeterJ
Jul 3 at 13:02




$begingroup$
@IrishStat It's not a time series as such, the X is how many times an event has occurred over a period that is the same for all samples
$endgroup$
– PeterJ
Jul 3 at 13:02












$begingroup$
can you please detail precisely what .632696209 is at X=0
$endgroup$
– IrishStat
Jul 3 at 13:05




$begingroup$
can you please detail precisely what .632696209 is at X=0
$endgroup$
– IrishStat
Jul 3 at 13:05












$begingroup$
@IrishStat it's a probability, I don't want to be vague but someone asked me not to disclose full details but it means based on the X value the event has a 0.63 chance of occurring.
$endgroup$
– PeterJ
Jul 3 at 13:10




$begingroup$
@IrishStat it's a probability, I don't want to be vague but someone asked me not to disclose full details but it means based on the X value the event has a 0.63 chance of occurring.
$endgroup$
– PeterJ
Jul 3 at 13:10












$begingroup$
Do you have an idea of when the break in slope occurs, e.g., at what horizontal coordinate?
$endgroup$
– Stephan Kolassa
Jul 3 at 13:12




$begingroup$
Do you have an idea of when the break in slope occurs, e.g., at what horizontal coordinate?
$endgroup$
– Stephan Kolassa
Jul 3 at 13:12










2 Answers
2






active

oldest

votes


















4












$begingroup$

My equation search on your data turned up a simple two-parameter power equation, "y = a * pow(x, b)", with parameters a = 1.0769014059561925E+00 and b = 1.4153886539395866E-01 yielding R-squared = 0.789 and RMSE = 0.112



plot






share|cite|improve this answer









$endgroup$












  • $begingroup$
    Note that parameter "a" is almost equal to 1.0, and I tried the one-parameter equation "y = pow(x, b)" and this two-parameter version is slightly better. If you have a large number of data sets to be fitted, it is possible that the one-parameter version is all that you need.
    $endgroup$
    – James Phillips
    Jul 3 at 17:52


















3












$begingroup$

You have a specific kind of structural change in your data generating model. Specifically, your straight regression line seems to change in slope at a particular point. In order that the two line segments meet up, the intercept needs to change in a specific way at that same point.



You should look at "piecewise linear regression". Our tags change-point and structural-change may also be helpful. For instance, here is a tutorial in R.



If you already know when (at what predictor value) the change occurs, then you can set up a specific design matrix with constraints to model this. However, if you also need to detect the change point, you need specific tools. For instance, in R, the segmented package will fit a piecewise linear regression where you specify the number of breaks (and optionally guesses for where they occur). Here is how segmented deals with your data, where we specify either one or two change points:



segmented



If you need to decide whether you have one, two or more break points, I would recommend something like cross validation.



xx <- c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 43, 44)
yy <- c(0.632696209, 0.954847196, 1.211956632, 1.282479631, 1.348572899,
1.36480867, 1.411463353, 1.413356854, 1.443252208, 1.5086791,
1.519056572, 1.515427082, 1.545547955, 1.567970692, 1.554558155,
1.539432376, 1.662380739, 1.630363821, 1.627295705, 1.574494008,
1.651035552, 1.59834609, 1.612328401, 1.634990079, 1.668922299,
1.622188079, 1.759410212, 1.849993968, 1.78653232, 1.7872419,
1.639521297, 1.70151144, 1.868248296, 1.707963085, 1.703480801,
1.811524105, 1.790136202, 1.79970863, 1.944272907, 1.818103305,
1.750744767, 1.77700684, 1.80647612)

library(segmented)
model_1 <- segmented(lm(yy~xx),npsi=1)
model_2 <- segmented(lm(yy~xx),npsi=2)

xx_fit <- seq(min(xx),max(xx),by=.01)

plot(xx,yy)
lines(xx_fit,predict(model_1,newdata=data.frame(xx=xx_fit)))
lines(xx_fit,predict(model_2,newdata=data.frame(xx=xx_fit)),col="red")





share|cite|improve this answer











$endgroup$












  • $begingroup$
    Thanks you've brought up a few useful techniques / keywords I wasn't aware of which I'll check out further but it seems like a good way to go.
    $endgroup$
    – PeterJ
    Jul 3 at 13:39






  • 4




    $begingroup$
    This is a good application for a regression spline, e.g., piecewise cubic polynomial. The join points for the various polynomials are called knot. Put a couple of knots near where you think rapid changes may be, from prior knowledge.
    $endgroup$
    – Frank Harrell
    Jul 3 at 13:48










  • $begingroup$
    A simple power equation fits this data well, see my answer to this question.
    $endgroup$
    – James Phillips
    Jul 3 at 15:36













Your Answer








StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "65"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);













draft saved

draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f415859%2fcurve-fitting-when-data-has-a-sharp-initial-slope-and-then-tapers-off%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown

























2 Answers
2






active

oldest

votes








2 Answers
2






active

oldest

votes









active

oldest

votes






active

oldest

votes









4












$begingroup$

My equation search on your data turned up a simple two-parameter power equation, "y = a * pow(x, b)", with parameters a = 1.0769014059561925E+00 and b = 1.4153886539395866E-01 yielding R-squared = 0.789 and RMSE = 0.112



plot






share|cite|improve this answer









$endgroup$












  • $begingroup$
    Note that parameter "a" is almost equal to 1.0, and I tried the one-parameter equation "y = pow(x, b)" and this two-parameter version is slightly better. If you have a large number of data sets to be fitted, it is possible that the one-parameter version is all that you need.
    $endgroup$
    – James Phillips
    Jul 3 at 17:52















4












$begingroup$

My equation search on your data turned up a simple two-parameter power equation, "y = a * pow(x, b)", with parameters a = 1.0769014059561925E+00 and b = 1.4153886539395866E-01 yielding R-squared = 0.789 and RMSE = 0.112



plot






share|cite|improve this answer









$endgroup$












  • $begingroup$
    Note that parameter "a" is almost equal to 1.0, and I tried the one-parameter equation "y = pow(x, b)" and this two-parameter version is slightly better. If you have a large number of data sets to be fitted, it is possible that the one-parameter version is all that you need.
    $endgroup$
    – James Phillips
    Jul 3 at 17:52













4












4








4





$begingroup$

My equation search on your data turned up a simple two-parameter power equation, "y = a * pow(x, b)", with parameters a = 1.0769014059561925E+00 and b = 1.4153886539395866E-01 yielding R-squared = 0.789 and RMSE = 0.112



plot






share|cite|improve this answer









$endgroup$



My equation search on your data turned up a simple two-parameter power equation, "y = a * pow(x, b)", with parameters a = 1.0769014059561925E+00 and b = 1.4153886539395866E-01 yielding R-squared = 0.789 and RMSE = 0.112



plot







share|cite|improve this answer












share|cite|improve this answer



share|cite|improve this answer










answered Jul 3 at 15:31









James PhillipsJames Phillips

7002 gold badges6 silver badges7 bronze badges




7002 gold badges6 silver badges7 bronze badges











  • $begingroup$
    Note that parameter "a" is almost equal to 1.0, and I tried the one-parameter equation "y = pow(x, b)" and this two-parameter version is slightly better. If you have a large number of data sets to be fitted, it is possible that the one-parameter version is all that you need.
    $endgroup$
    – James Phillips
    Jul 3 at 17:52
















  • $begingroup$
    Note that parameter "a" is almost equal to 1.0, and I tried the one-parameter equation "y = pow(x, b)" and this two-parameter version is slightly better. If you have a large number of data sets to be fitted, it is possible that the one-parameter version is all that you need.
    $endgroup$
    – James Phillips
    Jul 3 at 17:52















$begingroup$
Note that parameter "a" is almost equal to 1.0, and I tried the one-parameter equation "y = pow(x, b)" and this two-parameter version is slightly better. If you have a large number of data sets to be fitted, it is possible that the one-parameter version is all that you need.
$endgroup$
– James Phillips
Jul 3 at 17:52




$begingroup$
Note that parameter "a" is almost equal to 1.0, and I tried the one-parameter equation "y = pow(x, b)" and this two-parameter version is slightly better. If you have a large number of data sets to be fitted, it is possible that the one-parameter version is all that you need.
$endgroup$
– James Phillips
Jul 3 at 17:52













3












$begingroup$

You have a specific kind of structural change in your data generating model. Specifically, your straight regression line seems to change in slope at a particular point. In order that the two line segments meet up, the intercept needs to change in a specific way at that same point.



You should look at "piecewise linear regression". Our tags change-point and structural-change may also be helpful. For instance, here is a tutorial in R.



If you already know when (at what predictor value) the change occurs, then you can set up a specific design matrix with constraints to model this. However, if you also need to detect the change point, you need specific tools. For instance, in R, the segmented package will fit a piecewise linear regression where you specify the number of breaks (and optionally guesses for where they occur). Here is how segmented deals with your data, where we specify either one or two change points:



segmented



If you need to decide whether you have one, two or more break points, I would recommend something like cross validation.



xx <- c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 43, 44)
yy <- c(0.632696209, 0.954847196, 1.211956632, 1.282479631, 1.348572899,
1.36480867, 1.411463353, 1.413356854, 1.443252208, 1.5086791,
1.519056572, 1.515427082, 1.545547955, 1.567970692, 1.554558155,
1.539432376, 1.662380739, 1.630363821, 1.627295705, 1.574494008,
1.651035552, 1.59834609, 1.612328401, 1.634990079, 1.668922299,
1.622188079, 1.759410212, 1.849993968, 1.78653232, 1.7872419,
1.639521297, 1.70151144, 1.868248296, 1.707963085, 1.703480801,
1.811524105, 1.790136202, 1.79970863, 1.944272907, 1.818103305,
1.750744767, 1.77700684, 1.80647612)

library(segmented)
model_1 <- segmented(lm(yy~xx),npsi=1)
model_2 <- segmented(lm(yy~xx),npsi=2)

xx_fit <- seq(min(xx),max(xx),by=.01)

plot(xx,yy)
lines(xx_fit,predict(model_1,newdata=data.frame(xx=xx_fit)))
lines(xx_fit,predict(model_2,newdata=data.frame(xx=xx_fit)),col="red")





share|cite|improve this answer











$endgroup$












  • $begingroup$
    Thanks you've brought up a few useful techniques / keywords I wasn't aware of which I'll check out further but it seems like a good way to go.
    $endgroup$
    – PeterJ
    Jul 3 at 13:39






  • 4




    $begingroup$
    This is a good application for a regression spline, e.g., piecewise cubic polynomial. The join points for the various polynomials are called knot. Put a couple of knots near where you think rapid changes may be, from prior knowledge.
    $endgroup$
    – Frank Harrell
    Jul 3 at 13:48










  • $begingroup$
    A simple power equation fits this data well, see my answer to this question.
    $endgroup$
    – James Phillips
    Jul 3 at 15:36















3












$begingroup$

You have a specific kind of structural change in your data generating model. Specifically, your straight regression line seems to change in slope at a particular point. In order that the two line segments meet up, the intercept needs to change in a specific way at that same point.



You should look at "piecewise linear regression". Our tags change-point and structural-change may also be helpful. For instance, here is a tutorial in R.



If you already know when (at what predictor value) the change occurs, then you can set up a specific design matrix with constraints to model this. However, if you also need to detect the change point, you need specific tools. For instance, in R, the segmented package will fit a piecewise linear regression where you specify the number of breaks (and optionally guesses for where they occur). Here is how segmented deals with your data, where we specify either one or two change points:



segmented



If you need to decide whether you have one, two or more break points, I would recommend something like cross validation.



xx <- c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 43, 44)
yy <- c(0.632696209, 0.954847196, 1.211956632, 1.282479631, 1.348572899,
1.36480867, 1.411463353, 1.413356854, 1.443252208, 1.5086791,
1.519056572, 1.515427082, 1.545547955, 1.567970692, 1.554558155,
1.539432376, 1.662380739, 1.630363821, 1.627295705, 1.574494008,
1.651035552, 1.59834609, 1.612328401, 1.634990079, 1.668922299,
1.622188079, 1.759410212, 1.849993968, 1.78653232, 1.7872419,
1.639521297, 1.70151144, 1.868248296, 1.707963085, 1.703480801,
1.811524105, 1.790136202, 1.79970863, 1.944272907, 1.818103305,
1.750744767, 1.77700684, 1.80647612)

library(segmented)
model_1 <- segmented(lm(yy~xx),npsi=1)
model_2 <- segmented(lm(yy~xx),npsi=2)

xx_fit <- seq(min(xx),max(xx),by=.01)

plot(xx,yy)
lines(xx_fit,predict(model_1,newdata=data.frame(xx=xx_fit)))
lines(xx_fit,predict(model_2,newdata=data.frame(xx=xx_fit)),col="red")





share|cite|improve this answer











$endgroup$












  • $begingroup$
    Thanks you've brought up a few useful techniques / keywords I wasn't aware of which I'll check out further but it seems like a good way to go.
    $endgroup$
    – PeterJ
    Jul 3 at 13:39






  • 4




    $begingroup$
    This is a good application for a regression spline, e.g., piecewise cubic polynomial. The join points for the various polynomials are called knot. Put a couple of knots near where you think rapid changes may be, from prior knowledge.
    $endgroup$
    – Frank Harrell
    Jul 3 at 13:48










  • $begingroup$
    A simple power equation fits this data well, see my answer to this question.
    $endgroup$
    – James Phillips
    Jul 3 at 15:36













3












3








3





$begingroup$

You have a specific kind of structural change in your data generating model. Specifically, your straight regression line seems to change in slope at a particular point. In order that the two line segments meet up, the intercept needs to change in a specific way at that same point.



You should look at "piecewise linear regression". Our tags change-point and structural-change may also be helpful. For instance, here is a tutorial in R.



If you already know when (at what predictor value) the change occurs, then you can set up a specific design matrix with constraints to model this. However, if you also need to detect the change point, you need specific tools. For instance, in R, the segmented package will fit a piecewise linear regression where you specify the number of breaks (and optionally guesses for where they occur). Here is how segmented deals with your data, where we specify either one or two change points:



segmented



If you need to decide whether you have one, two or more break points, I would recommend something like cross validation.



xx <- c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 43, 44)
yy <- c(0.632696209, 0.954847196, 1.211956632, 1.282479631, 1.348572899,
1.36480867, 1.411463353, 1.413356854, 1.443252208, 1.5086791,
1.519056572, 1.515427082, 1.545547955, 1.567970692, 1.554558155,
1.539432376, 1.662380739, 1.630363821, 1.627295705, 1.574494008,
1.651035552, 1.59834609, 1.612328401, 1.634990079, 1.668922299,
1.622188079, 1.759410212, 1.849993968, 1.78653232, 1.7872419,
1.639521297, 1.70151144, 1.868248296, 1.707963085, 1.703480801,
1.811524105, 1.790136202, 1.79970863, 1.944272907, 1.818103305,
1.750744767, 1.77700684, 1.80647612)

library(segmented)
model_1 <- segmented(lm(yy~xx),npsi=1)
model_2 <- segmented(lm(yy~xx),npsi=2)

xx_fit <- seq(min(xx),max(xx),by=.01)

plot(xx,yy)
lines(xx_fit,predict(model_1,newdata=data.frame(xx=xx_fit)))
lines(xx_fit,predict(model_2,newdata=data.frame(xx=xx_fit)),col="red")





share|cite|improve this answer











$endgroup$



You have a specific kind of structural change in your data generating model. Specifically, your straight regression line seems to change in slope at a particular point. In order that the two line segments meet up, the intercept needs to change in a specific way at that same point.



You should look at "piecewise linear regression". Our tags change-point and structural-change may also be helpful. For instance, here is a tutorial in R.



If you already know when (at what predictor value) the change occurs, then you can set up a specific design matrix with constraints to model this. However, if you also need to detect the change point, you need specific tools. For instance, in R, the segmented package will fit a piecewise linear regression where you specify the number of breaks (and optionally guesses for where they occur). Here is how segmented deals with your data, where we specify either one or two change points:



segmented



If you need to decide whether you have one, two or more break points, I would recommend something like cross validation.



xx <- c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 43, 44)
yy <- c(0.632696209, 0.954847196, 1.211956632, 1.282479631, 1.348572899,
1.36480867, 1.411463353, 1.413356854, 1.443252208, 1.5086791,
1.519056572, 1.515427082, 1.545547955, 1.567970692, 1.554558155,
1.539432376, 1.662380739, 1.630363821, 1.627295705, 1.574494008,
1.651035552, 1.59834609, 1.612328401, 1.634990079, 1.668922299,
1.622188079, 1.759410212, 1.849993968, 1.78653232, 1.7872419,
1.639521297, 1.70151144, 1.868248296, 1.707963085, 1.703480801,
1.811524105, 1.790136202, 1.79970863, 1.944272907, 1.818103305,
1.750744767, 1.77700684, 1.80647612)

library(segmented)
model_1 <- segmented(lm(yy~xx),npsi=1)
model_2 <- segmented(lm(yy~xx),npsi=2)

xx_fit <- seq(min(xx),max(xx),by=.01)

plot(xx,yy)
lines(xx_fit,predict(model_1,newdata=data.frame(xx=xx_fit)))
lines(xx_fit,predict(model_2,newdata=data.frame(xx=xx_fit)),col="red")






share|cite|improve this answer














share|cite|improve this answer



share|cite|improve this answer








edited Jul 3 at 13:37

























answered Jul 3 at 13:28









Stephan KolassaStephan Kolassa

52.4k9 gold badges105 silver badges197 bronze badges




52.4k9 gold badges105 silver badges197 bronze badges











  • $begingroup$
    Thanks you've brought up a few useful techniques / keywords I wasn't aware of which I'll check out further but it seems like a good way to go.
    $endgroup$
    – PeterJ
    Jul 3 at 13:39






  • 4




    $begingroup$
    This is a good application for a regression spline, e.g., piecewise cubic polynomial. The join points for the various polynomials are called knot. Put a couple of knots near where you think rapid changes may be, from prior knowledge.
    $endgroup$
    – Frank Harrell
    Jul 3 at 13:48










  • $begingroup$
    A simple power equation fits this data well, see my answer to this question.
    $endgroup$
    – James Phillips
    Jul 3 at 15:36
















  • $begingroup$
    Thanks you've brought up a few useful techniques / keywords I wasn't aware of which I'll check out further but it seems like a good way to go.
    $endgroup$
    – PeterJ
    Jul 3 at 13:39






  • 4




    $begingroup$
    This is a good application for a regression spline, e.g., piecewise cubic polynomial. The join points for the various polynomials are called knot. Put a couple of knots near where you think rapid changes may be, from prior knowledge.
    $endgroup$
    – Frank Harrell
    Jul 3 at 13:48










  • $begingroup$
    A simple power equation fits this data well, see my answer to this question.
    $endgroup$
    – James Phillips
    Jul 3 at 15:36















$begingroup$
Thanks you've brought up a few useful techniques / keywords I wasn't aware of which I'll check out further but it seems like a good way to go.
$endgroup$
– PeterJ
Jul 3 at 13:39




$begingroup$
Thanks you've brought up a few useful techniques / keywords I wasn't aware of which I'll check out further but it seems like a good way to go.
$endgroup$
– PeterJ
Jul 3 at 13:39




4




4




$begingroup$
This is a good application for a regression spline, e.g., piecewise cubic polynomial. The join points for the various polynomials are called knot. Put a couple of knots near where you think rapid changes may be, from prior knowledge.
$endgroup$
– Frank Harrell
Jul 3 at 13:48




$begingroup$
This is a good application for a regression spline, e.g., piecewise cubic polynomial. The join points for the various polynomials are called knot. Put a couple of knots near where you think rapid changes may be, from prior knowledge.
$endgroup$
– Frank Harrell
Jul 3 at 13:48












$begingroup$
A simple power equation fits this data well, see my answer to this question.
$endgroup$
– James Phillips
Jul 3 at 15:36




$begingroup$
A simple power equation fits this data well, see my answer to this question.
$endgroup$
– James Phillips
Jul 3 at 15:36

















draft saved

draft discarded
















































Thanks for contributing an answer to Cross Validated!


  • Please be sure to answer the question. Provide details and share your research!

But avoid


  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.

Use MathJax to format equations. MathJax reference.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f415859%2fcurve-fitting-when-data-has-a-sharp-initial-slope-and-then-tapers-off%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Get product attribute by attribute group code in magento 2get product attribute by product attribute group in magento 2Magento 2 Log Bundle Product Data in List Page?How to get all product attribute of a attribute group of Default attribute set?Magento 2.1 Create a filter in the product grid by new attributeMagento 2 : Get Product Attribute values By GroupMagento 2 How to get all existing values for one attributeMagento 2 get custom attribute of a single product inside a pluginMagento 2.3 How to get all the Multi Source Inventory (MSI) locations collection in custom module?Magento2: how to develop rest API to get new productsGet product attribute by attribute group code ( [attribute_group_code] ) in magento 2

Category:9 (number) SubcategoriesMedia in category "9 (number)"Navigation menuUpload mediaGND ID: 4485639-8Library of Congress authority ID: sh85091979ReasonatorScholiaStatistics

Magento 2.3: How do i solve this, Not registered handle, on custom form?How can i rewrite TierPrice Block in Magento2magento 2 captcha not rendering if I override layout xmlmain.CRITICAL: Plugin class doesn't existMagento 2 : Problem while adding custom button order view page?Magento 2.2.5: Overriding Admin Controller sales/orderMagento 2.2.5: Add, Update and Delete existing products Custom OptionsMagento 2.3 : File Upload issue in UI Component FormMagento2 Not registered handleHow to configured Form Builder Js in my custom magento 2.3.0 module?Magento 2.3. How to create image upload field in an admin form