Working with UTF-8 encoding in Python source [duplicate]

This question already has answers here: Correct way to define Python source code encoding (6 answers). Closed 6 years ago.

Consider:

    $ cat bla.py
    u = unicode('d…')
    s = u.encode('utf-8')
    print s
    $ python bla.py
      File "bla.py", line 1
    SyntaxError: Non-ASCII character '\xe2' in file bla.py on line 1, but no encoding declared; see http://www.python.org/peps/pep-0263.html

… Read more
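The error message above points to PEP 263, which describes declaring the source encoding in a comment on the first or second line of the file. The following is a minimal sketch of that fix, assuming the file is saved as UTF-8; the variable names are illustrative and not taken from the linked answers. Python 3 already treats source files as UTF-8 by default, so the declaration matters mainly on Python 2.

```python
# -*- coding: utf-8 -*-
# PEP 263 encoding declaration: it must appear on line 1 or 2 of the file
# for the interpreter to honour it.

# A Unicode string literal containing a non-ASCII character (U+2026, the ellipsis).
text = u"d…"

# Encode the text to UTF-8 bytes: 0x64 for "d", then 0xE2 0x80 0xA6 for the ellipsis.
encoded = text.encode("utf-8")

print(repr(encoded))
```

The leading byte 0xE2 of the encoded ellipsis is the same `\xe2` that the SyntaxError reports when no encoding is declared.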

How to convert an entire MySQL database characterset and collation to UTF-8?

How can I convert entire MySQL database character-set to UTF-8 and collation to UTF-8?

Use the ALTER DATABASE and ALTER TABLE commands.

    ALTER DATABASE databasename CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
    ALTER TABLE tablename CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

Or if you’re still on MySQL 5.5.2 or older which didn’t support … Read more

What is the difference between UTF-8 and Unicode?

I have heard conflicting opinions from people – according to the Wikipedia UTF-8 page. They are the same thing, aren’t they? Can someone clarify?

Let me use an example to illustrate this topic:

    A Chinese character:        汉
    its Unicode value:          U+6C49
    convert 6C49 to binary:     01101100 01001001

Nothing magical so far, it’s … Read more
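The excerpt is cut off before it reaches the encoding step, so the following is only a sketch of the distinction it sets up: Unicode assigns the character a number (the code point), while UTF-8 is one way of turning that number into bytes. The variable names are illustrative.

```python
# The character from the example, written as an escape so the file stays ASCII.
ch = "\u6c49"  # 汉

# Unicode: the abstract code point assigned to the character.
code_point = ord(ch)                 # 0x6C49
bits = format(code_point, "016b")    # the 16 binary digits of 0x6C49

# UTF-8: one concrete byte encoding of that code point (three bytes here).
utf8_bytes = ch.encode("utf-8")

# UTF-16 (big-endian): a different encoding of the same code point (two bytes).
utf16_bytes = ch.encode("utf-16-be")

print(hex(code_point), bits)
print(utf8_bytes.hex(), utf16_bytes.hex())
```

The same code point produces different byte sequences under different encodings, which is why "Unicode" and "UTF-8" are not interchangeable terms.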

How do I see what character set a MySQL database / table / column is?

What is the (default) charset for:

    MySQL database
    MySQL table
    MySQL column

Here’s how I’d do it.

For Schemas (or Databases; they are synonyms):

    SELECT default_character_set_name
    FROM information_schema.SCHEMATA
    WHERE schema_name = "schemaname";

For Tables:

    SELECT CCSA.character_set_name
    FROM information_schema.`TABLES` T,
         information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` CCSA
    WHERE CCSA.collation_name = T.table_collation
      AND T.table_schema = "schemaname"
      AND … Read more

What’s the difference between UTF-8 and UTF-8 without BOM?

What is the difference between UTF-8 and UTF-8 without a BOM? Which is better?

The UTF-8 BOM is a sequence of bytes at the start of a text stream (0xEF, 0xBB, 0xBF) that allows the reader to more reliably guess that a file is encoded in UTF-8. Normally, the BOM is used to signal … Read more
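As a rough illustration of the byte sequence described above, the sketch below uses Python's standard `utf-8-sig` codec, which writes the BOM on encoding and strips it on decoding. The sample text is an arbitrary placeholder.

```python
import codecs

text = "hello"  # placeholder sample text

# Plain UTF-8: no BOM is written.
without_bom = text.encode("utf-8")

# "utf-8-sig": the same bytes, prefixed with the three-byte BOM EF BB BF.
with_bom = text.encode("utf-8-sig")

# Decoding BOM-prefixed data as plain UTF-8 keeps the BOM as a U+FEFF character.
decoded_plain = with_bom.decode("utf-8")

# Decoding with "utf-8-sig" strips a leading BOM if one is present,
# and also accepts input that has no BOM.
decoded_sig = with_bom.decode("utf-8-sig")
decoded_sig_no_bom = without_bom.decode("utf-8-sig")

print(with_bom.hex(), repr(decoded_plain), repr(decoded_sig))
```

A reader that does not expect the BOM sees a stray U+FEFF at the start of the text, which is the usual source of trouble with BOM-prefixed files.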

How do I get a consistent byte representation of strings in C# without manually specifying an encoding?

How do I convert a string to a byte[] in .NET (C#) without manually specifying a specific encoding? I’m going to encrypt the string. I can encrypt it without converting, but I’d still like to know why encoding comes into play here. Also, why should encoding even be taken into consideration? Can’t I simply get … Read more