Thesis on avalanche prediction using One Class SVM
I'm doing my thesis on avalanche prediction using machine learning.
My input data are avalanche accidents with features such as slope, altitude, and facing direction of the slope, combined with the corresponding weather data from the day the avalanche occurred.
I want to predict an avalanche when certain variables combine to create a deadly avalanche situation: 1 means the avalanche occurs, 0 means it does not. The only data in my database are avalanches that occurred; I have around 200 samples. So I don't have any data on non-deadly situations, even though those are the most common case.
My question is whether a One-Class SVM is a good approach for this classification.
machine-learning python
asked Apr 26 at 9:20
Pieter De Malsche
3 Answers
Your problem seems to belong to novelty detection, within the general area of one-class classification (OCC) problems.
So, the short answer is: yes. You can apply the SVDD (Support Vector Data Description) method to find the smallest hypersphere containing the samples in the dataset, and then assess whether a new observation is an outlier or not.
Of course, the less representative your dataset is, the less accurate your classifier will be.
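To make the hypersphere idea concrete, here is a minimal sketch using only the Python standard library. It is not a full SVDD: it approximates the smallest enclosing ball with the Bădoiu-Clarkson iteration, which corresponds to the hard-margin, linear-kernel special case of SVDD (no slack, no kernel). The feature values are hypothetical and assumed to be already scaled to comparable ranges. In practice one would more likely reach for a kernelized implementation such as scikit-learn's `OneClassSVM`.

```python
import math

def enclosing_ball(points, iterations=1000):
    """Approximate the smallest ball containing all points (Badoiu-Clarkson)."""
    center = list(points[0])
    for i in range(1, iterations + 1):
        # Find the training point farthest from the current centre...
        farthest = max(points, key=lambda p: math.dist(p, center))
        # ...and move the centre a shrinking step towards it.
        center = [c + (f - c) / (i + 1) for c, f in zip(center, farthest)]
    radius = max(math.dist(p, center) for p in points)
    return center, radius

def is_inlier(x, center, radius):
    """True if x falls inside the ball, i.e. it resembles the training data."""
    return math.dist(x, center) <= radius

# Hypothetical, already-scaled features: (slope, altitude, new snow).
avalanche_days = [(0.60, 0.70, 0.80), (0.70, 0.60, 0.90),
                  (0.65, 0.80, 0.70), (0.55, 0.75, 0.85)]
center, radius = enclosing_ball(avalanche_days)

print(is_inlier((0.62, 0.72, 0.80), center, radius))  # close to the training data
print(is_inlier((0.10, 0.20, 0.00), center, radius))  # far from the training data
```

Because the ball must contain every training sample, a single unusual record inflates the radius. Soft-margin SVDD and one-class SVMs address this by allowing a fraction of the training points to fall outside the boundary.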
answered Apr 26 at 11:44
sentence
You can use data mining methods to predict avalanches. However, there are some pitfalls I can point out based on my basic avalanche knowledge from mountaineering.
- What do you want to predict? Spontaneous avalanches (mainly threatening villages and roads) or human-triggered avalanches (mainly affecting skiers)? The factors for these are completely different.
- Getting data has already been mentioned. There are some data sets on avalanche incidents, for example at the Swiss avalanche research institute: https://www.slf.ch/de/lawinen/unfaelle-und-schadenlawinen/alle-gemeldeten-lawinenunfaelle-aktuell.html
  However, there is naturally little data about cases where no avalanche was triggered, or where an avalanche was triggered but nobody was harmed and it was therefore not reported. There have been some attempts to estimate the number of people on tour based on touring reports on the internet.
- Getting precise data is even more of a problem. Consider figure 2 in this week's report: https://www.slf.ch/de/lawinenbulletin-und-schneesituation/wochen-und-winterberichte/201819/wob-18-25-april.html
  It compares the same slope at a time difference of 45 minutes, and it looks completely different.
- Feature selection is another big issue. You mention that you want to use weather data from the day of the incident. I think this can lead to wrong conclusions, as most skiing avalanches happen at weekends and probably in slightly better weather. Also, most people will be sensible and not go on risky tours on risky days. This has a big potential to skew your data and your model.
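The reporting bias described in the last point can be illustrated with a toy simulation. Everything here is an assumption made up for illustration: the daily hazard probability, the chance that skiers are present, and the rule that an accident is only recorded when both coincide. None of the numbers come from real avalanche data.

```python
import random

def simulate_reports(days=7000, hazard=0.05, seed=0):
    """Count reported accidents when hazard is constant but exposure is not."""
    rng = random.Random(seed)
    reported = {"weekday": 0, "weekend": 0}
    for day in range(days):
        is_weekend = day % 7 in (5, 6)
        # Assumption: the snowpack is equally likely to be unstable every day...
        unstable = rng.random() < hazard
        # ...but skiers are far more likely to be on the slope at weekends.
        skiers_present = rng.random() < (0.9 if is_weekend else 0.3)
        if unstable and skiers_present:
            reported["weekend" if is_weekend else "weekday"] += 1
    return reported

reports = simulate_reports()
weekend_share = reports["weekend"] / sum(reports.values())
print(f"weekend share of calendar days:      {2 / 7:.0%}")
print(f"weekend share of reported accidents: {weekend_share:.0%}")
```

Although the simulated hazard is identical on every day, weekends are over-represented among the reported accidents. A model trained only on such reports may learn when people go touring rather than when the snowpack is dangerous.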
answered Apr 26 at 16:16
Manziel
Could you look for any possible way to get non-avalanche data?
1) Avalanches happen within a mountain chain. Could you add data from neighboring peaks on the same day the avalanche happened?
2) You may get good insights from data exploration. For instance, what is the minimal slope a mountain must have to be able to produce avalanches? What is the range of temperatures?
3) Could you look for other data sets (with non-avalanche entries) that you could combine with your data?
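The exploration suggested in point 2 can start with per-feature minima and maxima over the recorded avalanches. The sketch below uses hypothetical field names (`slope_deg`, `altitude_m`, `temp_c`) and made-up records. The resulting envelope is only a rough sanity check, especially with around 200 samples.

```python
# Hypothetical avalanche records; field names and values are illustrative only.
records = [
    {"slope_deg": 35, "altitude_m": 2400, "temp_c": -6.0},
    {"slope_deg": 38, "altitude_m": 2100, "temp_c": -2.5},
    {"slope_deg": 31, "altitude_m": 2750, "temp_c": -9.0},
    {"slope_deg": 42, "altitude_m": 1900, "temp_c": -4.0},
]

def feature_ranges(rows):
    """Return {feature: (min, max)} over all recorded avalanches."""
    return {key: (min(r[key] for r in rows), max(r[key] for r in rows))
            for key in rows[0]}

def within_observed_ranges(candidate, ranges):
    """True if every feature of the candidate lies inside the observed envelope."""
    return all(lo <= candidate[key] <= hi for key, (lo, hi) in ranges.items())

ranges = feature_ranges(records)
for name, (lo, hi) in ranges.items():
    print(f"{name}: {lo} to {hi}")

# A nearly flat slope falls outside the envelope of recorded avalanches.
print(within_observed_ranges({"slope_deg": 12, "altitude_m": 2200, "temp_c": -5.0}, ranges))
```

Situations that fall outside these ranges are candidates for the non-avalanche entries mentioned in points 1 and 3, provided the ranges are also checked against domain knowledge.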
edited Apr 26 at 21:02
answered Apr 26 at 11:07
Tatyana
It is incorrect that there is no way to learn a classification model if the training data does not contain at least two classes. The field which addresses this is known as "one-class classification" (or anomaly detection, or outlier detection; it probably also has other names). In this field the broad approach is to build a model of the data and define a distance measure and a threshold. New data points which are closer than the threshold are labelled as belonging to the training class, and those which are further away are labelled as the other class. Hope this helps.
– cfogelberg
Apr 26 at 19:36
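The recipe in the comment above (a model of the data, a distance measure, and a threshold) can be sketched in a few lines of standard-library Python. The choices here are assumptions for illustration: the "model" is the per-feature mean and standard deviation, the distance is a standardized Euclidean distance, and the threshold is set so that roughly `keep_fraction` of the training samples fall inside it. The sample values are hypothetical.

```python
import math
import statistics

def fit_one_class(samples, keep_fraction=0.95):
    """Model the training class by its mean, spread, and a distance threshold."""
    dims = list(zip(*samples))
    mean = [statistics.fmean(d) for d in dims]
    # Fall back to 1.0 for constant features so the division stays defined.
    scale = [statistics.pstdev(d) or 1.0 for d in dims]

    def distance(x):
        return math.sqrt(sum(((v - m) / s) ** 2
                             for v, m, s in zip(x, mean, scale)))

    # Threshold: the distance within which about keep_fraction of samples lie.
    ranked = sorted(distance(s) for s in samples)
    index = min(len(ranked) - 1, int(keep_fraction * len(ranked)))
    return distance, ranked[index]

def predict(x, distance, threshold):
    """1 if x resembles the training (avalanche) data, 0 otherwise."""
    return 1 if distance(x) <= threshold else 0

# Hypothetical training data: (slope in degrees, new snow in cm).
avalanches = [(35, 40), (38, 55), (40, 30), (33, 45),
              (37, 60), (42, 50), (36, 35), (39, 45)]
distance, threshold = fit_one_class(avalanches)

print(predict((37, 45), distance, threshold))  # near the centre of the data
print(predict((10, 0), distance, threshold))   # flat slope, no new snow
```

Lowering `keep_fraction` tightens the boundary, at the cost of labelling more of the known avalanche situations as the other class.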
Thanks for the comment and explanation. I didn't know that outlier detection can also be treated as classification. It is a very interesting point of view. I have edited my reply so as not to mislead others.
– Tatyana
Apr 26 at 21:06