Ordering German special characters and those from other languages when sorting










16















I want to sort strings (text) in a software project of mine, and I want to do this in the lexically best way.
My set of possible characters consists of the full alphabet (a–z and A–Z), the typical Latin-1 umlauts and special characters such as Ä, ö, and ß, and characters from other Latin-1 languages such as à, á, â, and ã.
(Technically, it is impossible for me to order the data by expanding characters like Ä to Ae.)



How would one sort those characters so that humans can also look them up quickly?



  • Would one look for Ä after A (I guess), and for é after e?

  • In which order would à, á, â, ã, and ä be sorted between a and b?

  • Is there some kind of ISO standard defining such things? How would those characters be arranged?























  • 12





    In most programming languages a collation function is available which compares strings according to a locale. In C, this function is strcoll(). Java has a Collator class.

    – RHa
    Jun 16 at 19:12












  • The right way to do this is to use strxfrm() to convert your user-visible strings into encoded strings that can be compared using strcmp(). If you're sorting user records, say, and each has a "name" field, you'd have to then create a "nameSortable" field and use strxfrm to populate it. That resulting field may or may not look much like the ASCII you put in but is guaranteed to sort in the right order for the current locale. And since you do the transformation once, it should then sort quickly.

    – Swiss Frank
    Jun 17 at 13:45












  • An easier but typically lower-performance method is to use strcoll() to compare. This may be slower because, in effect, it internally creates buffers, calls strxfrm() to get comparable strings, compares them, and throws away the converted strings, keeping only the comparison result. strxfrm() requires you to manage the cached keys yourself, but pays you back by allowing far faster compares.

    – Swiss Frank
    Jun 17 at 13:45







  • 3





    Humans? Unfortunately, it depends which humans. Swedish humans looking for these characters will look in a different place from German humans. There are many standards for collating sequences, some country-specific, some industry-specific. Whatever you do, don't go inventing another one.

    – Michael Kay
    Jun 17 at 16:02
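
The trade-off described in the comments above (running a collation function on every comparison versus transforming each string once into a sortable key) can be sketched in Python. This is only an illustration under stated assumptions: `sort_key` is a hypothetical stand-in for what strxfrm() does, using one simple rule (ignore case and diacritics, ß as ss) instead of a real locale, and `compare` plays the role of strcoll().

```python
import unicodedata
from functools import cmp_to_key


def sort_key(text):
    """Hypothetical stand-in for strxfrm(): build a string that sorts correctly
    under plain comparison. Rule assumed here: ignore case and diacritics,
    treat ß as ss."""
    decomposed = unicodedata.normalize("NFD", text.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def compare(a, b):
    """Stand-in for strcoll(): transforms both strings on every call."""
    key_a, key_b = sort_key(a), sort_key(b)
    return (key_a > key_b) - (key_a < key_b)


names = ["Zebra", "Äpfel", "apfel", "Straße", "Strasse", "Österreich", "Ofen"]

# Variant 1: compare pairwise; the transformation runs again for every comparison.
sorted_by_compare = sorted(names, key=cmp_to_key(compare))

# Variant 2: compute each key once (the "nameSortable" field), then sort on it.
records = [{"name": name, "nameSortable": sort_key(name)} for name in names]
records.sort(key=lambda record: record["nameSortable"])
sorted_by_key = [record["name"] for record in records]

print(sorted_by_key)
```

Both variants give the same order; the second one pays for the transformation once per record rather than once per comparison, which is the point made about strxfrm() above.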

















dictionary umlaut eszett














asked Jun 16 at 15:24









Matthias














3 Answers


















23














Short answer:



Take a look at MySQL and its different character collations, choose one, and follow its rules. Or, as @RHa and @cbeleites suggest, find a library that provides locale-dependent sorting.



Long answer:



There are 3 different solutions to your problem (actually there are 4, but believe me, you don't want to implement the 4th ;) )




  1. Rewrite every umlaut to its base letter (German dictionary rules, DIN 5007-1 var. 1)

    Every umlaut and every other letter with a diacritic is replaced by its base letter, e.g.

    áàâäã = a

    ß = ss

    and so on.

    Then sort the results.




  2. Rewrite every umlaut by appending an e and remove all other diacritics (German phone book rules, DIN 5007-1 var. 2)

    ä = ae

    áàâã = a

    ü = ue

    ß = ss

    Then sort the results.




  3. Umlauts are new characters added to the alphabet

    1. After z (Swedish/Finnish collation rules)

      å, ä, and ö are treated as separate letters that follow z in the alphabet. Other characters with diacritics are converted to their base letter.

      So sort like

      aáàâãbcd [...] xyzåäö

    2. After their base letter (Austrian phone book order, kudos to @rexkogitans from the comments)

      This works like 3.1, but umlauts and characters with diacritics are placed directly after their base characters.

      aäáà [...] bcdeèé [...] uü ...

      But watch out: according to Wikipedia, this applies to the Austrian White Pages, whereas the Austrian Yellow Pages are sorted according to DIN 5007-1 var. 1.
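
The first options above can be expressed as sort-key functions. The following is a minimal Python sketch, not a complete implementation of DIN 5007 or of the Swedish rules: the function names are made up for illustration, punctuation and digits are not treated specially, and the Austrian order is left out. It relies on Unicode decomposition to strip diacritics and on `casefold()` to turn ß into ss.

```python
import unicodedata


def strip_diacritics(text):
    """Drop combining marks, so that e.g. á, à, â and ã all become a."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def din_5007_var1_key(text):
    """Option 1: every umlaut or accented letter counts as its base letter."""
    return strip_diacritics(text.casefold())  # casefold() also turns ß into ss


UMLAUT_TO_E_FORM = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue"})


def din_5007_var2_key(text):
    """Option 2: ä, ö, ü become ae, oe, ue; other diacritics are removed."""
    folded = unicodedata.normalize("NFC", text).casefold()
    return strip_diacritics(folded.translate(UMLAUT_TO_E_FORM))


SWEDISH_EXTRA_LETTERS = {"å": 1, "ä": 2, "ö": 3}


def swedish_key(text):
    """Option 3.1: å, ä, ö are letters after z; other diacritics are removed."""
    folded = unicodedata.normalize("NFC", text).casefold()
    return [
        (1, SWEDISH_EXTRA_LETTERS[ch]) if ch in SWEDISH_EXTRA_LETTERS
        else (0, strip_diacritics(ch))
        for ch in folded
    ]


names = ["Muller", "Müller", "Mueller", "Muffe"]
print(sorted(names, key=din_5007_var1_key))
print(sorted(names, key=din_5007_var2_key))
print(sorted(["Öl", "Zebra", "Ost", "Ål"], key=swedish_key))
```

With the dictionary variant, Müller sorts together with Muller; with the phone book variant, it sorts together with Mueller. Strings that map to the same key keep their input order here, so a real implementation would still need a tie-breaking rule.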





In addition, according to EN 13710, the order of diacritics is



  1. Acute accent (á)

  2. Grave accent (à)

  3. Breve (ă)

  4. Circumflex (â)

  5. Hacek (háček) (š)

  6. Ring (å)

  7. Trema (ä)

  8. Double acute accent (ő)

  9. Tilde (ã)

  10. Dot (ż)

  11. Cedilla (ş)

  12. Ogonek (ą)

  13. Macron (ā)

  14. With stroke through (ø)

  15. Modified letter(s) (æ)
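
One way to use such a diacritic order is as a tie-breaker: compare the base letters first, and only look at the diacritics when the base letters are identical. The sketch below assumes exactly that two-level scheme and takes the ranking from the list above as given; it is not a full EN 13710 implementation. `DIACRITIC_RANK`, `UNKNOWN_MARK_RANK`, and `two_level_key` are names invented for this example, and when a letter carries several marks only the last one is considered.

```python
import unicodedata

# Combining marks ranked in the order of the list above (1 = acute ... 13 = macron).
# Stroke (ø) and modified letters (æ) have no Unicode decomposition and are
# not handled by this sketch.
DIACRITIC_RANK = {
    "\u0301": 1,   # acute
    "\u0300": 2,   # grave
    "\u0306": 3,   # breve
    "\u0302": 4,   # circumflex
    "\u030C": 5,   # hacek (caron)
    "\u030A": 6,   # ring
    "\u0308": 7,   # trema (diaeresis)
    "\u030B": 8,   # double acute
    "\u0303": 9,   # tilde
    "\u0307": 10,  # dot above
    "\u0327": 11,  # cedilla
    "\u0328": 12,  # ogonek
    "\u0304": 13,  # macron
}
UNKNOWN_MARK_RANK = 99  # assumption: marks not in the list sort last


def two_level_key(text):
    """Return (base letters, diacritic ranks) so that accents only break ties."""
    base_letters = []
    mark_ranks = []
    for ch in unicodedata.normalize("NFD", text.casefold()):
        if unicodedata.combining(ch):
            if mark_ranks:  # attach the mark to the preceding base letter
                mark_ranks[-1] = DIACRITIC_RANK.get(ch, UNKNOWN_MARK_RANK)
        else:
            base_letters.append(ch)
            mark_ranks.append(0)  # 0 = no diacritic, sorts before any mark
    return ("".join(base_letters), tuple(mark_ranks))


print(sorted(["ä", "b", "ã", "a", "à", "â", "á"], key=two_level_key))
```

Under these assumptions, the plain a comes first, followed by á, à, â, ä, and ã, and all of them precede b.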


For further information take a look at



  • European ordering rules (EOR / EN 13710)


  • Unicode collation algorithm


  • ISO 14651 (download)



One very last comment:



There are a lot of countries and there are a lot of different languages and a lot of different characters. Some are frequently used in one country, but unknown in another.



Therefore, there are a lot of different standards for how to compare strings alphabetically. Even though there are international standards, you have to choose which one to follow.



How? Well, there's a saying: When in Rome, do as the Romans do.



Take a look at your target group and their expectations. Then choose the collation most of them will accept.



  • Spanish people? Well, ñ is an independent character sorted in
    between n and o.


  • Germans? Take a look above.


  • Russians and Western Europeans? Oh boy, you are in trouble sorting
    all these Cyrillic and Latin characters ;)


  • Or choose the one you are most comfortable with.


Also ... one of my professors once said: "The first thing I do before coding is searching the internet if there is already a solution." And as @RHa and @cbeleites said in the comments there are solutions (C, Java, PHP, etc.), so unless you insist ... use one of them and you only have to worry about choosing the right locale (and looking up which sorting rules they follow).



Last and least: The 4th sorting method



As I mentioned earlier, there is a 4th (horrible) solution. German Wikipedia describes it as




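Gleichordnung von Grundbuchstaben, Doppelbuchstaben und Umlaut, wenn Doppelbuchstabe wie Umlaut gesprochen wird. Mull wird wie Muell oder Müll sortiert. Duell dagegen zwischen Duden und Dugast. (In English: base letters, double letters, and umlauts are ranked equally when the double letter is pronounced like an umlaut. Mull is sorted like Muell or Müll. Duell, by contrast, goes between Duden and Dugast.)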




To explain it in English I would like to quote @FabioTurati from the comments




It says that an approach is treating plain letters, double letters and Umlauts the same way, but only when the double letters are pronounced as an Umlaut. For example, "Mull", "Muell" and "Müll" are treated the same way (note that Mull and Müll are different words!), and the "ue" in Muell here is a way to avoid typing "ü". In other cases, the letters "ue" are pronounced separately: a "u" followed by an "e", for example in "Duell", which is why Duell is placed between Duden and Dugast.



Why is this a problem?



Consider it from the other point of view: if you find a word containing "ue", how do you treat it? For Muell you should pretend the "e" is not there, and treat it as "Mull". For "Duell", doing it would lead to "Dull", which would be sorted after "Dugast", which would be wrong. So, to know how to sort words you need to have some more info (that is, how they are pronounced), which a sorting algorithm normally doesn't have. That's why this approach is troublesome!
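
The Duell example can be reproduced in a few lines of Python. This is a sketch under assumptions, not a recommended implementation: the key functions are invented for illustration, and `UE_MEANS_UMLAUT` is a hypothetical pronunciation lexicon with a single entry, standing in for the extra information a sorting algorithm normally does not have.

```python
def plain_key(word):
    """Treat ü like u and leave any written "ue" untouched."""
    return word.casefold().replace("ü", "u")


def naive_fourth_method_key(word):
    """Blindly treat every "ue" as a spelled-out ü. Wrong for words like Duell."""
    return plain_key(word).replace("ue", "u")


# Hypothetical pronunciation lexicon: words in which "ue" stands for ü.
UE_MEANS_UMLAUT = {"muell"}


def lexicon_fourth_method_key(word):
    """Collapse "ue" only where the lexicon says it is pronounced as ü."""
    folded = plain_key(word)
    return folded.replace("ue", "u") if folded in UE_MEANS_UMLAUT else folded


words = ["Dugast", "Duell", "Duden"]
print(sorted(words, key=naive_fourth_method_key))    # Duell ends up after Dugast
print(sorted(words, key=lexicon_fourth_method_key))  # Duell between Duden and Dugast
```

The lexicon-based key gives Mull, Muell, and Müll the same key while leaving Duell alone, but it only works for words that someone has entered into the lexicon, which is what makes this method impractical.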



























  • 1





    Another sorting rule: Treat ä, ö, ü as additional characters, but not appended after z, but instead after a, o, u respectively: a, ä, b, c, ..., m, n, o, ö, p, ... This is called Austrian Order.

    – rexkogitans
    Jun 17 at 6:11







  • 12





    So... what's the fourth?

    – sgf
    Jun 17 at 7:47






  • 2





    @sgf from wikipedia: "Gleichordnung von Grundbuchstaben, Doppelbuchstaben und Umlaut, wenn Doppelbuchstabe wie Umlaut gesprochen wird. Mull wird wie Muell oder Müll sortiert. Duell dagegen zwischen Duden und Dugast." which is horrifying to code and gets nasty with names like "Schröder / Schroeder", etc

    – mtwde
    Jun 17 at 9:50







  • 2





    Your description of the Swedish collation order is incorrect. You are right that å, ä and ö are treated as letters in their own right and sorted after z, but all other diacritics are ignored. E.g. á is sorted as a and not after a. Until some 10 years ago, v and w were also considered equal when sorting, but w is now usually considered a 'proper' letter in itself and sorted after v. AFAIK, the same collation rules are used in Finnish.

    – jarnbjo
    Jun 17 at 13:06






  • 1





    @mtwde My German isn't good enough to understand that Wikipedia excerpt. Could you translate it, please?

    – Nzall
    Jun 18 at 7:08



















6














If it's not names you are dealing with, it would be best to ignore all diacritics when sorting (and count ß as ss).



The only reason to deviate from this simple system lies in the unfortunate fact that German names show unpredictable variation between ä, ö, ü and ae, oe, ue. This has led to phone books and library catalogues sorting e.g. Räder as Raeder, Örtel as Oertel, Hüber as Hueber.



Wikipedia has a good write-up.




































    0














    I can only answer regarding the German characters. "Ä" is considered equivalent to "Ae", "Ö" to "Oe", "Ü" to "Ue" and "ß" to "ss". This is how those characters are sorted in a phone book.






    answered by ziganotschka



















    • Thank you for your answer. Sadly I cannot implement this behaviour. I'm sorry. I removed the phone book reference.

      – Matthias
      Jun 16 at 16:06






    • 1





      This means that it is sorted like this, not that it is written like this.

      – Kami Kaze
      Jun 17 at 6:43











    • This is how German users would expect it.

      – Simon Richter
      Jun 17 at 8:56






    • 3





      @SimonRichter: There are (at least) two different widely-used collation orders in Germany, and even more if you also consider other German-speaking countries such as Austria and Switzerland. So, whether or not "German users would expect it" this way depends very much on context and on whether those German users are actually from Germany or German-speaking from Austria, for example.

      – Jörg W Mittag
      Jun 17 at 12:47













    Your Answer








    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "253"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader:
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    ,
    noCode: true, onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );






    Matthias is a new contributor. Be nice, and check out our Code of Conduct.









    draft saved

    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fgerman.stackexchange.com%2fquestions%2f52765%2fordering-german-special-characters-and-those-from-other-languages-when-sorting%23new-answer', 'question_page');

    );

    Post as a guest















    Required, but never shown

























    3 Answers
    3






    active

    oldest

    votes








    3 Answers
    3






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    23














    Short answer:



    Take a look at MySQL and different character-collations. Choose one and follow its rules. Or as @RHa and @cbeleites suggest find a library that provides locale-dependent sorting.



    Long answer:



    There are 3 different solutions for your problem (actually there are 4, but believe me, you don't want to realize the 4th ;) )




    1. Rewrite every Umlaut to its base (German dictionary rules - DIN 5007-1 var. 1)



      Every Umlaut and Diacritic results in the same char.
      e.g.




      áàâäã = a



      ß = ss




      and so on.



      Sort them.




    2. Rewrite every Umlaut by adding an e, diacritics are removed (German phone book rules - DIN 5007-1 var. 2)




      ä = ae



      áàâã = a



      ü = ue



      ß = ss




      Sort them.




      1. Umlaute are new chars added to the alphabet (Swedish/Finnish collation rules)



        Every Umlaut and å are treated like new chars, which are added after the z of the alphabet. Other chars with diacretics are converted to their base.



        So sort like




        aáàâãbcd [...] xyzåäü





      2. Umlaute are new chars added to the alphabet (Austrian phone book order - kudos to @rexkogitans from the comments)



        It's as 3.1., but Umlaute and chars with Diacretics are appended to their base characters.




        aäáà [...] bcdeèé [...] uü ...




        But watch out, wiki says this is true for the Austrian White Pages, but the Austrian Yellow Pages are sorted like DIN 5007-1 var. 1.





    In addition. According to EN 13710 the order of diacritics is



    1. Acute accent (á)

    2. Grave accent (à)

    3. Breve (ă)

    4. Circumflex (â)

    5. Hacek (háček) (š)

    6. Ring (å)

    7. Trema (ä)

    8. Double acute accent (ő)

    9. Tilde (ã)

    10. Dot (ż)

    11. Cedilla (ş)

    12. Ogonek (ą)

    13. Macron (ā)

    14. With stroke through (ø)

    15. Modified letter(s) (æ)


    For further information take a look at



    • European ordering rules (EOR / EN 13710)


    • Unicode collation algorithm


    • ISO 14651 (download)



    One very last comment:



    There are a lot of countries and there are a lot of different languages and a lot of different characters. Some are frequently used in one country, but unknown in another.



    Therefore there are a lot of different standards how to compare strings alphabetically. Even though there are international standards, you have to choose which one to follow.



    How? Well, there's a saying: When in Rome, do as the Romans do.



    Take a look at your target group and their expectations. Then choose the collation most of them will accept.



    • Spanish people? Well, ñ is an independent character sorted in
      between n and o.


    • Germans? Take a look above.


    • Russians and western europeans? Oh boy, you are in trouble sorting
      all these cyrillic and latin characters ;)


    • Or choose the one you are most comfortable with.


    Also ... one of my professors once said: "The first thing I do before coding is searching the internet if there is already a solution." And as @RHa and @cbeleites said in the comments there are solutions (C, Java, PHP, etc.), so unless you insist ... use one of them and you only have to worry about choosing the right locale (and looking up which sorting rules they follow).



    Last and least: The 4th sorting method



    As I mentioned ealier there is a 4th (horrible) solution. German Wikipedia describes it as




    Gleichordnung von Grundbuchstaben, Doppelbuchstaben und Umlaut, wenn Doppelbuchstabe wie Umlaut gesprochen wird. Mull wird wie Muell oder Müll sortiert. Duell dagegen zwischen Duden und Dugast.




    To explain it in English I would like to quote @FabioTurati from the comments




    It says that an approach is treating plain letters, double letters and Umlauts the same way, but only when the double letters are pronounced as an Umlaut. For example, "Mull", "Muell" and "Müll" are treated the same way (note that Mull and Müll are different words!), and the "ue" in Muell here is a way to avoid typing "ü". In other cases, the letters "ue" are pronounced separately: a "u" followed by an "e", for example in "Duell", which is why Duell is placed between Duden and Dugast.



    Why is this a problem?



    Consider it from the other point of view: if you find a word containing "ue", how do you treat it? For Muell you should pretend the "e" is not there, and treat it as "Mull". For "Duell", doing it would lead to "Dull", which would be sorted after "Dugast", which would be wrong. So, to know how to sort words you need to have some more info (that is, how they are pronounced), which a sorting algorithm normally doesn't have. That's why this approach is troublesome!



























    • 1





      Another sorting rule: Treat ä, ö, ü as additional characters, but not appended after z, but instead after a, o, u respectively: a, ä, b, c, ..., m, n, o, ö, p, ... This is called Austrian Order.

      – rexkogitans
      Jun 17 at 6:11







    • 12





      So... what's the fourth?

      – sgf
      Jun 17 at 7:47






    • 2





      @sgf from Wikipedia: "Gleichordnung von Grundbuchstaben, Doppelbuchstaben und Umlaut, wenn Doppelbuchstabe wie Umlaut gesprochen wird. Mull wird wie Muell oder Müll sortiert. Duell dagegen zwischen Duden und Dugast." which is horrifying to code and gets nasty with names like "Schröder / Schroeder", etc.

      – mtwde
      Jun 17 at 9:50







    • 2





      Your description of the Swedish collation order is incorrect. You are right that å, ä and ö are treated as letters in their own right and sorted after z, but all other diacritics are ignored. E.g. á is sorted as a and not after a. Until some 10 years ago, v and w were also considered equal when sorting, but w is now usually considered a 'proper' letter in itself and sorted after v. AFAIK, the same collation rules are used in Finnish.

      – jarnbjo
      Jun 17 at 13:06






    • 1





      @mtwde My German isn't good enough to understand that Wikipedia excerpt. Could you translate it, please?

      – Nzall
      Jun 18 at 7:08
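The two orders described in the comments above (ä, ö, ü directly after a, o, u, and the Swedish å, ä, ö after z with other diacritics ignored) can both be sketched as a sort key built from an explicit alphabet. This is only an illustration: `make_alphabet_key`, `AUSTRIAN` and `SWEDISH` are made-up names, ß and non-letters are not handled specially, and real phone books have more rules than this.

```python
import unicodedata


def make_alphabet_key(alphabet):
    """Build a sort key from an explicit letter order."""
    rank = {letter: index for index, letter in enumerate(alphabet)}

    def key(word):
        ranks = []
        for ch in unicodedata.normalize("NFC", word).lower():
            if ch not in rank:
                # Accented letter outside the alphabet: use its base (á -> a).
                ch = unicodedata.normalize("NFD", ch)[0]
            # Anything still unknown sorts after all known letters.
            ranks.append(rank.get(ch, len(alphabet)))
        return ranks

    return key


AUSTRIAN = "aäbcdefghijklmnoöpqrstuüvwxyz"  # ä, ö, ü right after a, o, u
SWEDISH = "abcdefghijklmnopqrstuvwxyzåäö"   # å, ä, ö after z

words = ["Zebra", "Ärger", "Apfel", "Ast"]
print(sorted(words, key=make_alphabet_key(AUSTRIAN)))  # ['Apfel', 'Ast', 'Ärger', 'Zebra']
print(sorted(words, key=make_alphabet_key(SWEDISH)))   # ['Apfel', 'Ast', 'Zebra', 'Ärger']
```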






































    edited Jun 19 at 9:20

























    answered Jun 16 at 16:32









    mtwde





















    6














    If it's not names you are dealing with, it would be best to ignore all diacritics when sorting (and count ß as ss).



    The only reason to deviate from this simple system lies in the unfortunate fact that German names show unpredictable variation between ä, ö, ü and ae, oe, ue. This has led to phone books and library catalogues sorting e.g. Räder as Raeder, Örtel as Oertel, Hüber as Hueber.
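A minimal Python sketch of the simple system (ignore diacritics, count ß as ss), using only the standard `unicodedata` module. The function name `ignore_diacritics_key` is made up for this example, and letters that do not decompose into base letter plus accent (such as ø or æ) are left as they are.

```python
import unicodedata


def ignore_diacritics_key(word):
    """Sort key that drops diacritics and counts ß as ss."""
    folded = word.casefold()  # lower-cases and turns "ß" into "ss"
    decomposed = unicodedata.normalize("NFD", folded)
    # Drop combining marks such as the two dots of ä, ö, ü.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


words = ["Straße", "Strasse", "Räder", "Rad", "Radweg"]
print(sorted(words, key=ignore_diacritics_key))
# ['Rad', 'Räder', 'Radweg', 'Straße', 'Strasse']
```

Words that differ only in diacritics get identical keys, so they keep their input order unless you add a tie-breaker.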



    Wikipedia has a good write-up.
















        answered Jun 16 at 16:11









        David Vogt





















            0














            I can only answer regarding the German characters. "Ä" is considered equivalent to "Ae", "Ö" to "Oe", "Ü" to "Ue" and "ß" to "ss". This is how those characters are sorted in a phone book.
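In Python this equivalence could be sketched as a sort key like the one below. The names `EXPANSIONS` and `phone_book_key` are invented for the example, and other diacritics are not handled here.

```python
import unicodedata

# Phone-book style expansions named in the answer above.
EXPANSIONS = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def phone_book_key(name):
    """Sort key treating ä, ö, ü, ß as ae, oe, ue, ss."""
    # NFC makes sure "a" + combining dots is a single "ä" before translating.
    composed = unicodedata.normalize("NFC", name).lower()
    return composed.translate(EXPANSIONS)


names = ["Rader", "Räder", "Raeder", "Radler"]
print(sorted(names, key=phone_book_key))
# ['Rader', 'Radler', 'Räder', 'Raeder']
```

The expansion is applied only to the key used for comparison; the stored and displayed spelling of each name stays unchanged.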

























            • Thank you for your answer. Sadly I cannot implement this behaviour. I'm sorry. I removed the phone book reference.

              – Matthias
              Jun 16 at 16:06






            • 1





              This does mean that it is sorted like this, not that it is written like this.

              – Kami Kaze
              Jun 17 at 6:43











            • This is how German users would expect it.

              – Simon Richter
              Jun 17 at 8:56






            • 3





              @SimonRichter: There are (at least) two different widely-used collation orders in Germany, and even more if you also consider other German-speaking countries such as Austria and Switzerland. So, whether or not "German users would expect it" this way depends very much on context and on whether those German users are actually from Germany or German-speaking from Austria, for example.

              – Jörg W Mittag
              Jun 17 at 12:47























            answered Jun 16 at 15:40









            ziganotschka



























































