It's that simple! Extract Chinese and English text in seconds

In daily work, we often deal with mixed data containing both Chinese and English characters. This can be a challenge when you need to separate the two for better organization or processing. The traditional approach is manually copying and pasting text into different cells, which becomes inefficient when dealing with large datasets. For example, in a foreign trade company, employees may have names that are entered as a mix of Chinese and English, such as "方丽深(Alex)". In such cases, how can we efficiently extract the Chinese and English parts separately?

[Figure 1]

1. Quick Separation of Chinese and English

When the data contains a clear delimiter, such as a half-width “(”, you can use Excel’s "Text to Columns" feature to split the content. First, copy the original data from column A to column B. Then select column B, go to "Data → Text to Columns", choose "Delimited", tick "Other" as the delimiter, and enter “(” (Figure 2).

[Figure 2]

After splitting, the Chinese name stays in column B and the English name appears in column C, still carrying the closing parenthesis “)”. To remove it, use the "Find and Replace" tool: search for “)” and leave the replacement empty (Figure 3).

[Figure 3]
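For readers who prefer to see the logic outside Excel, the delimiter split can be sketched in Python. This is only an illustration of the same idea, not part of the Excel workflow; the function name split_on_paren is made up for this example.

```python
from typing import Tuple


def split_on_paren(cell: str) -> Tuple[str, str]:
    """Split 'Chinese(English)' at the first half-width '(' and drop any ')'."""
    # partition() cuts at the first "(", like Text to Columns with "(" as delimiter
    chinese, sep, rest = cell.partition("(")
    if not sep:
        # No delimiter found: leave the text in the first column
        return cell, ""
    # Equivalent of Find and Replace: remove the closing parenthesis
    english = rest.replace(")", "")
    return chinese, english


print(split_on_paren("方丽深(Alex)"))  # ('方丽深', 'Alex')
```

A cell without “(” comes back unchanged in the first position, which mirrors what Text to Columns does when the delimiter is absent.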

Tip: If there's no consistent delimiter, you can insert a space between Chinese and English characters and use that as a separator.

2. Using Built-in Functions for Automatic Extraction

If there is no common delimiter, you can use Excel's LENB and LEN functions to tell Chinese characters from English ones, provided the Chinese part comes first in each cell. In cell B2, enter =LEFT(A2, LENB(A2)-LEN(A2)), and in cell C2, enter =RIGHT(A2, 2*LEN(A2)-LENB(A2)). Drag both formulas down to apply them to the entire column (Figure 4).

[Figure 4]

Formula Explanation:

The LENB function returns the number of bytes in a cell, while LEN counts the number of characters. In a Chinese-language (double-byte) Excel environment, one Chinese character takes 2 bytes and one half-width English letter or symbol takes 1 byte, so LENB - LEN gives the number of Chinese characters and 2*LEN - LENB gives the number of single-byte characters. For example, if the cell contains "方丽深(Alex)", LENB returns 12 and LEN returns 9. Subtracting them gives 3, the number of Chinese characters, so LEFT takes the first 3 characters. Likewise, 2*9 - 12 = 6, so RIGHT extracts the last 6 characters, "(Alex)", which is the English name with its parentheses.
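The byte arithmetic can be checked with a short Python sketch. It assumes that GBK encoding is a fair stand-in for the double-byte code page Excel uses in a Chinese environment; the helper names lenb and split_by_byte_count are invented for this illustration.

```python
from typing import Tuple


def lenb(text: str) -> int:
    """Rough stand-in for Excel's LENB (assumption: GBK, 2 bytes per Chinese character)."""
    return len(text.encode("gbk"))


def split_by_byte_count(cell: str) -> Tuple[str, str]:
    """Mimic =LEFT(A2, LENB-LEN) and =RIGHT(A2, 2*LEN-LENB)."""
    n_chars = len(cell)                # LEN
    n_bytes = lenb(cell)               # LENB
    n_chinese = n_bytes - n_chars      # each Chinese character adds one extra byte
    n_english = 2 * n_chars - n_bytes  # single-byte characters
    left = cell[:n_chinese]                 # LEFT
    right = cell[len(cell) - n_english:]    # RIGHT
    return left, right


print(lenb("方丽深(Alex)"), len("方丽深(Alex)"))  # 12 9
print(split_by_byte_count("方丽深(Alex)"))        # ('方丽深', '(Alex)')
```

As with the Excel formulas, the result is only meaningful when all Chinese characters come before the English part, because LEFT and RIGHT count from the two ends of the text.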

To clean up the extracted English name in column C, you can use nested SUBSTITUTE functions in cell D2, like =SUBSTITUTE(SUBSTITUTE(C2,"(", ""),")",""), to remove both parentheses (Figure 5).

[Figure 5]
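The nested SUBSTITUTE call is simply two replacements applied one after the other. A minimal Python equivalent, with the hypothetical name strip_parens, looks like this:

```python
def strip_parens(text: str) -> str:
    """Remove half-width parentheses, like the nested SUBSTITUTE formula."""
    # Inner SUBSTITUTE removes "(", outer SUBSTITUTE removes ")"
    return text.replace("(", "").replace(")", "")


print(strip_parens("(Alex)"))  # Alex
```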

If the data is more complex and contains multiple mixes of Chinese and English, you can sort the data first and then apply the same extraction methods (Figure 6).

[Figure 6]

3. Custom Extraction Using VBA Scripts

If you frequently handle mixed data, consider using a VBA script for faster and more flexible extraction. You can download the required code from this link. Open the Visual Basic Editor by pressing Alt + F11, insert a new module, and paste the code (Figure 7).

[Figure 7]

Save the file as a macro-enabled workbook and enable macros. Then, in cell B2, enter =SplitStringChs(A2), and in cell C2, enter =SplitStringEng(A2). These custom functions will automatically separate Chinese and English characters regardless of their order or complexity (Figure 8).

[Figure 8]
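The VBA code itself is not reproduced here, so the following Python sketch only illustrates one way such order-independent extraction could work. It is not the downloaded script. The function names merely echo the custom functions above, and both the character range used for Chinese and the choice to join English fragments with a space are assumptions made for this example.

```python
import re

# Assumption: "Chinese" means the basic CJK Unified Ideographs block (U+4E00 to U+9FFF)
_CJK_RUN = re.compile(r"[\u4e00-\u9fff]+")
# Assumption: "English" means runs of ASCII letters
_LATIN_RUN = re.compile(r"[A-Za-z]+")


def split_string_chs(text: str) -> str:
    """Collect every Chinese run, wherever it appears in the text."""
    return "".join(_CJK_RUN.findall(text))


def split_string_eng(text: str) -> str:
    """Collect every English run, joined by single spaces."""
    return " ".join(_LATIN_RUN.findall(text))


sample = "Alex方丽深(Smith)"
print(split_string_chs(sample))  # 方丽深
print(split_string_eng(sample))  # Alex Smith
```

Because each character is classified on its own rather than counted from one end of the cell, this approach does not care whether the Chinese or the English part comes first.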
