Remove ✅, 🔥, ✈, ♛ and other such emojis/images/signs from Java strings

I have some strings with all kinds of different emojis/images/signs in them.

Not all the strings are in English. Some are in other languages, including ones written in non-Latin scripts, for example:

▓ railway??
→ Cats and dogs
I'm on 🔥
Apples ⚛ 
✅ Vi sign
♛ I'm the king ♛ 
Corée ♦ du Nord ☁  (French)
 gjør at både ◄╗ (Norwegian)
Star me ★
Star ⭐ once more
早上好 ♛ (Chinese)
Καλημέρα ✂ (Greek)
another ✓ sign ✓
добрай раніцы ✪ (Belarusian)
◄ शुभ प्रभात ◄ (Hindi)
✪ ✰ ❈ ❧ Let's get together ★. We shall meet at 12/10/2018 10:00 AM at Tony's.❉

…and many more of these.

I would like to get rid of all these signs/images and to keep only the letters (and punctuation) in the different languages.

I tried to clean the signs using the EmojiParser library:

String withoutEmojis = EmojiParser.removeAllEmojis(input);

The problem is that EmojiParser fails to remove the majority of the signs. The ♦ sign is the only one I have found so far that it removes.
Other signs such as ✪ ❉ ★ ✰ ❈ ❧ ✂ ❋ ⓡ ✿ ♛ 🔥 are not removed.

Is there a way to remove all these signs from the input strings and keep only the letters and punctuation in the different languages?
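To make the requirement concrete, here is a minimal sketch of the behaviour I am after, using only `java.util.regex` and Unicode general categories. It assumes that "signs" means everything that is not a letter, combining mark, number, punctuation, or whitespace. That definition is my own guess, and it also drops currency and math symbols such as `$` and `+`. The class and method names (`SymbolStripper`, `stripSymbols`) are hypothetical, and the sample strings are written with Unicode escapes so the file compiles regardless of source encoding.

```java
import java.util.regex.Pattern;

final class SymbolStripper {
    // Assumption: anything that is NOT a letter (L), combining mark (M),
    // number (N), punctuation (P), or whitespace/separator (Z) is a "sign".
    // Marks must be kept, otherwise scripts such as Devanagari lose vowel signs.
    private static final Pattern NOT_TEXT =
            Pattern.compile("[^\\p{L}\\p{M}\\p{N}\\p{P}\\p{Z}\\s]");

    // Runs of whitespace left behind after a sign has been removed.
    private static final Pattern SPACE_RUNS = Pattern.compile("[\\p{Z}\\s]+");

    static String stripSymbols(String input) {
        String withoutSymbols = NOT_TEXT.matcher(input).replaceAll("");
        // Collapse the gaps left by removed signs and drop leading/trailing spaces.
        return SPACE_RUNS.matcher(withoutSymbols).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        // U+1F525 FIRE, written as a surrogate pair
        System.out.println(stripSymbols("I'm on \uD83D\uDD25"));
        // U+2666 BLACK DIAMOND SUIT and U+2601 CLOUD around French text
        System.out.println(stripSymbols("Cor\u00E9e \u2666 du Nord \u2601"));
    }
}
```

Invisible characters that often accompany emoji, such as variation selectors (category M), would survive this filter, so it is only a starting point rather than a complete solution.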

7 Answers