I have some strings with all kinds of different emojis/images/signs in them.

Not all the strings are in English; some of them are in other languages, including non-Latin scripts, for example:

▓ railway??
→ Cats and dogs
I'm on 🔥
Apples ⚛
✅ Vi sign
♛ I'm the king ♛
Corée ♦ du Nord ☁ (French)
gjør at både ◄╗ (Norwegian)
Star me ★
Star ⭐ once more
早上好 ♛ (Chinese)
Καλημέρα ✂ (Greek)
another ✓ sign ✓
добрай раніцы ✪ (Belarus)
◄ शुभ प्रभात ◄ (Hindi)
✪ ✰ ❈ ❧ Let's get together ★. We shall meet at 12/10/2018 10:00 AM at Tony's.❉

…and many more of these.

I would like to get rid of all these signs/images and to keep only the letters (and punctuation) in the different languages.

I tried to clean the signs using the EmojiParser library:

String withoutEmojis = EmojiParser.removeAllEmojis(input);

The problem is that EmojiParser fails to remove the majority of the signs. The ♦ sign is the only one I have found so far that it removes.
Other signs such as ✪ ❉ ★ ✰ ❈ ❧ ✂ ❋ ⓡ ✿ ♛ 🔥 are not removed.

Is there a way to remove all these signs from the input strings and keep only the letters and punctuation in the different languages?
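
To make the goal concrete, here is a minimal sketch of one possible direction, assuming the signs can be told apart from text by Unicode general category. It uses only java.util.regex and is unrelated to EmojiParser. The class name SymbolStripper and the method strip are made up for illustration. It keeps letters, combining marks, numbers, punctuation and spaces, and drops everything else, so it also removes ordinary symbols such as + or $ and may leave invisible variation selectors behind:

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of EmojiParser.
final class SymbolStripper {

    // Matches any code point that is NOT a letter (L), combining mark (M),
    // number (N), punctuation (P), separator (Z) or ASCII whitespace.
    // Java regex works on code points, so emojis outside the BMP match as one unit.
    private static final Pattern NOT_TEXT =
            Pattern.compile("[^\\p{L}\\p{M}\\p{N}\\p{P}\\p{Z}\\s]");

    // Runs of spaces left behind after the signs are removed.
    private static final Pattern SPACES = Pattern.compile("[\\p{Z}\\s]+");

    static String strip(String input) {
        String textOnly = NOT_TEXT.matcher(input).replaceAll("");
        // Collapse leftover whitespace and trim both ends.
        return SPACES.matcher(textOnly).replaceAll(" ").trim();
    }
}
```

With this sketch, SymbolStripper.strip("I'm on 🔥") would return "I'm on", and the apostrophe, slashes and colon in the meeting example survive because they are punctuation. Whether dropping every symbol category is acceptable depends on the data, so the character class is an assumption to adjust.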


Leave a Reply

Your email address will not be published. Required fields are marked *