Writing an emoji-handling utility class in Java
Updated: 2024-03-15 14:09:58  Author: 大哈在此...
This article walks through writing an emoji-handling utility class in Java, with complete sample code.
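Emoji are Unicode characters like any other, and UTF-8 itself can encode them. The problem is that most emoji lie outside the Basic Multilingual Plane and take four bytes in UTF-8, while MySQL's legacy utf8 character set (an alias for utf8mb3) stores at most three bytes per character. There are generally two ways to store emoji in a database. Taking MySQL as the example, the first is to change the database character set from utf8 to utf8mb4; the second is to convert each emoji into some other representation before saving it. This article shows how to handle emoji in Java.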
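To make the four-byte point concrete, here is a minimal sketch using only the standard library. The class and method names (`EmojiByteLength`, `utf8Length`, `needsUtf8mb4`) are hypothetical and chosen for illustration; they are not part of the article's utility class.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class EmojiByteLength {

    /** Number of bytes the string occupies when encoded as UTF-8. */
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** True if any code point lies above U+FFFF and therefore needs 4 bytes in UTF-8. */
    static boolean needsUtf8mb4(String s) {
        return s.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        String emoji = "\uD83D\uDE01"; // U+1F601, written as a surrogate pair
        System.out.println(utf8Length(emoji));      // 4
        System.out.println(emoji.length());         // 2 (two Java chars)
        System.out.println(needsUtf8mb4(emoji));    // true
        System.out.println(needsUtf8mb4("abc"));    // false
    }
}
```

A check like `needsUtf8mb4` could be used to decide whether a string needs converting before it is written to a utf8 column.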
xml
<!-- https://mvnrepository.com/artifact/com.vdurmont/emoji-java -->
<dependency>
    <groupId>com.vdurmont</groupId>
    <artifactId>emoji-java</artifactId>
    <version>5.1.1</version>
</dependency>
Java code
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.apache.commons.lang.StringUtils; // commons-lang 2.x, a separate dependency

import com.vdurmont.emoji.EmojiParser;

/**
 * Emoji filtering and conversion helpers.
 *
 * @author madaha
 */
public class EmojiFilter {

    /**
     * Returns true if the char is ordinary text that should be kept.
     * Surrogate halves (0xD800-0xDFFF) fall outside every range below,
     * so any emoji stored as a surrogate pair is rejected.
     *
     * @param codePoint a single UTF-16 code unit
     * @return true if the char is not part of an emoji
     */
    private static boolean isNotEmojiCharacter(char codePoint) {
        return (codePoint == 0x0) || (codePoint == 0x9) || (codePoint == 0xA) || (codePoint == 0xD)
                || ((codePoint >= 0x20) && (codePoint <= 0xD7FF))
                || ((codePoint >= 0xE000) && (codePoint <= 0xFFFD));
    }

    /**
     * Removes emoji and other non-text characters.
     *
     * @param source the string to filter
     * @return the filtered string
     */
    public static String filterEmoji(String source) {
        if (StringUtils.isBlank(source)) {
            return source;
        }
        StringBuilder buf = new StringBuilder(source.length());
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (isNotEmojiCharacter(c)) {
                buf.append(c);
            }
        }
        // Return the original instance when nothing was removed.
        return buf.length() == source.length() ? source : buf.toString();
    }

    /**
     * Converts the emoji in a string to a form that can be stored in a
     * database using the utf8 character set (an emoji takes 4 bytes, which
     * would otherwise require utf8mb4).
     *
     * @param str the string to convert
     * @return the converted string
     * @throws UnsupportedEncodingException if UTF-8 is not supported
     */
    public static String emojiConvert1(String str) throws UnsupportedEncodingException {
        String patternString = "([\\x{10000}-\\x{10ffff}\ud800-\udfff])";
        Pattern pattern = Pattern.compile(patternString);
        Matcher matcher = pattern.matcher(str);
        StringBuffer sb = new StringBuffer();
        while (matcher.find()) {
            // Each match becomes [[%F0%9F...]], which is plain ASCII.
            matcher.appendReplacement(sb, "[[" + URLEncoder.encode(matcher.group(1), "UTF-8") + "]]");
        }
        matcher.appendTail(sb);
        return sb.toString();
    }

    /**
     * Restores the emoji in a string that was converted by emojiConvert1.
     *
     * @param str the string containing converted emoji
     * @return the string with the emoji restored
     * @throws UnsupportedEncodingException if UTF-8 is not supported
     */
    public static String emojiRecovery2(String str) throws UnsupportedEncodingException {
        // Match only percent-encoded content, so that ordinary text such as
        // "[[50%]]" is left alone instead of making URLDecoder throw.
        String patternString = "\\[\\[((?:%[0-9A-Fa-f]{2})+)\\]\\]";
        Pattern pattern = Pattern.compile(patternString);
        Matcher matcher = pattern.matcher(str);
        StringBuffer sb = new StringBuffer();
        while (matcher.find()) {
            String decoded = URLDecoder.decode(matcher.group(1), "UTF-8");
            // quoteReplacement keeps '$' and '\' in the decoded text literal.
            matcher.appendReplacement(sb, Matcher.quoteReplacement(decoded));
        }
        matcher.appendTail(sb);
        return sb.toString();
    }

    /**
     * Emoji handling test method.
     *
     * @param args unused
     */
    public static void main(String[] args) {
        String str = "emoji test 😁😁 input test 😁😂";
        System.out.println("Original string:\n" + str);
        System.out.println("After parseToAliases:\n" + EmojiParser.parseToAliases(str));
        str = EmojiParser.parseToAliases(str);
        System.out.println("Restored:\n" + EmojiParser.parseToUnicode(str));
    }
}
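The filter above examines one `char` at a time. As a point of comparison, the following sketch walks the string by Unicode code point and drops everything that a three-byte utf8 column cannot store. It uses only the standard library, and the names `SupplementaryFilter` and `stripSupplementary` are hypothetical. Note the assumption: it treats "cannot be stored" as "lies above U+FFFF", so emoji-like symbols inside the Basic Multilingual Plane, such as U+263A, are kept.

```java
// Hypothetical helper, for illustration only.
final class SupplementaryFilter {

    /**
     * Removes every supplementary code point (above U+FFFF) and every
     * unpaired surrogate, leaving only characters that fit in 3 UTF-8 bytes.
     */
    static String stripSupplementary(String source) {
        if (source == null || source.isEmpty()) {
            return source;
        }
        StringBuilder sb = new StringBuilder(source.length());
        source.codePoints()
              // A valid surrogate pair arrives here as one supplementary code point.
              .filter(cp -> !Character.isSupplementaryCodePoint(cp))
              // An unpaired surrogate arrives as its own value in 0xD800-0xDFFF.
              .filter(cp -> cp < Character.MIN_SURROGATE || cp > Character.MAX_SURROGATE)
              .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "hi \uD83D\uDE01 there"; // contains U+1F601
        System.out.println(stripSupplementary(input)); // prints "hi  there" (two spaces)
    }
}
```

Working on code points keeps the intent explicit: the decision is made per character rather than per UTF-16 half.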