I have some strings with all kinds of different emojis/images/signs in them.
Not all the strings are in English; some are in other languages, including non-Latin scripts, for example:
✈ railway??
→ Cats and dogs
I'm on 🔥
Apples ⚛
✅ Vi sign
♛ I'm the king ♛
Corée ♦ du Nord ☁ (French)
gjør at både ◄╗ (Norwegian)
Star me ★
Star ⭐ once more
早上好 ♛ (Chinese)
Καλημέρα ✂ (Greek)
another ✓ sign ✓
добрай раніцы ✪ (Belarus)
◄ शुभ प्रभात ◄ (Hindi)
✪ ✰ ❈ ❧ Let's get together ★. We shall meet at 12/10/2018 10:00 AM at Tony's.❈
…and many more of these.
I would like to get rid of all these signs/images and to keep only the letters (and punctuation) in the different languages.
I tried to clean the signs using the EmojiParser library:
String withoutEmojis = EmojiParser.removeAllEmojis(input);
The problem is that EmojiParser is not able to remove the majority of the signs. The ♦ sign is the only one I have found so far that it removes.
Other signs, such as ✪ ❉ ❅ ✰ ❀ ❧ ❈ ❃ ♡ ♿ ☁ 🔥, are not removed.
Is there a way to remove all these signs from the input strings and keep only the letters and punctuation in the different languages?
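To make the requirement concrete, here is a minimal sketch of one possible direction, not a confirmed solution: a whitelist regex over Unicode general categories that keeps letters, combining marks, numbers, punctuation and whitespace, and drops everything else. The class name `SymbolStripper` and the method `strip` are hypothetical names chosen for illustration, and collapsing the leftover spaces is an extra assumption that goes beyond what is asked above.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class SymbolStripper {

    // Matches any code point that is NOT a letter (L), combining mark (M),
    // number (N), punctuation (P), separator (Z) or ASCII whitespace (\s).
    private static final Pattern NOT_TEXT =
            Pattern.compile("[^\\p{L}\\p{M}\\p{N}\\p{P}\\p{Z}\\s]");

    // Runs of two or more plain spaces left behind after removal.
    private static final Pattern SPACE_RUNS = Pattern.compile(" {2,}");

    static String strip(String input) {
        // Java regex matches by code point, so a surrogate pair such as
        // U+1F525 (fire) is removed as a single symbol.
        String withoutSymbols = NOT_TEXT.matcher(input).replaceAll("");
        // Assumption: collapsing leftover spaces and trimming is acceptable.
        return SPACE_RUNS.matcher(withoutSymbols).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        // "Corée ♦ du Nord ☁ (French)"
        System.out.println(strip("Cor\u00E9e \u2666 du Nord \u2601 (French)"));
        // "I'm on 🔥"
        System.out.println(strip("I'm on \uD83D\uDD25"));
    }
}
```

Combining marks are kept on purpose, because scripts such as Devanagari need them to render correctly. One caveat with this approach is that ordinary symbols such as `+`, `=`, `$` and `<` belong to the Unicode symbol categories as well, so they would be removed too unless they are added to the whitelist explicitly.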