Java - split string by new line character
In this article, we're going to have a look at how to split a string into separate lines in Java.
Quick solution:
- It works on Linux and Windows:
String text = "1\r\n2\r\n3";
String[] lines = text.split("\\r?\\n"); // \r\n or \n
- It works on all operating systems:
String text = "1\r\n2\r\n3";
String[] lines = text.split("\r\n|\n\r|\n|\r"); // \r\n , \n\r , \n or \r
or:
// import java.util.regex.Pattern;
Pattern PATTERN = Pattern.compile("\\r\\n|\\n\\r|\\n|\\r"); // \r\n , \n\r , \n or \r
String text = "1\r\n2\r\n3";
String[] lines = PATTERN.split(text);
// go to last section to see practical example
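As an alternative to the hand-written expressions above, Java 8 and later provide the `\R` linebreak matcher in regular expressions. The sketch below is only an illustration: the class and method names are hypothetical. Note that `\R` treats `\r\n` as one separator but sees `\n\r` as two separate line breaks, and it also matches a few other characters (vertical tab, form feed, next line, and the Unicode line and paragraph separators).

```java
import java.util.Arrays;

class LinebreakMatcherDemo {

    // Splits by the built-in \R linebreak matcher (assumes Java 8 or later).
    static String[] splitLines(String text) {
        return text.split("\\R");
    }

    public static void main(String[] args) {
        String text = "1\r\n2\n3\r4";
        System.out.println(Arrays.toString(splitLines(text)));
    }
}
```

If the text may contain `\n\r` sequences, the explicit expression from the quick solution is the safer choice, because `\R` would produce an extra empty line for each of them.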
Read the problem description and examples below to see how it works in practice.
1. Problem overview
Different operating systems use different newline symbols.
The most commonly used newline separators are:
\n | Multics, Unix and Unix-like systems (Linux, macOS, FreeBSD, AIX, Xenix, etc.), BeOS, Amiga, RISC OS, and others. |
\r\n | Atari TOS, Microsoft Windows, DOS (MS-DOS, PC DOS, etc.), DEC TOPS-10, RT-11, CP/M, MP/M, OS/2, Symbian OS, Palm OS, Amstrad CPC, and most other early non-Unix and non-IBM operating systems. |
\r | Commodore 8-bit machines (C64, C128), Acorn BBC, ZX Spectrum, TRS-80, Apple II series, Oberon, the classic Mac OS, MIT Lisp Machine and OS-9 |
\n\r | Acorn BBC and RISC OS spooled text output. |
Source: https://en.wikipedia.org/wiki/Newline
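To see why the newline symbol matters, here is a minimal sketch (the class and method names are hypothetical) that splits Windows-style text with the Unix-only `\n` expression. The carriage return is not removed, so it stays at the end of each line except the last one.

```java
class WrongSeparatorDemo {

    // Splits by "\n" only, which is the Unix convention.
    static String[] splitByLineFeed(String text) {
        return text.split("\\n");
    }

    public static void main(String[] args) {
        String windowsText = "line 1\r\nline 2\r\nline 3";
        for (String line : splitByLineFeed(windowsText)) {
            // Report the length and whether a stray \r is left at the end.
            System.out.println(line.length() + " characters, ends with \\r: " + line.endsWith("\r"));
        }
    }
}
```

The leftover `\r` characters are invisible in most consoles, which makes this kind of bug easy to miss when comparing or parsing the lines later.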
2. Universal new lines splitting example
This approach works with all operating systems. The example below splits text with mixed newline symbols into separate lines.
package com.example;

public class Program {

    public static void main(String[] args) {

        String text = "line 1\n" +
                "line 2\r" +
                "line 3\r\n" +
                "line 4\n\r" +
                "line 5";

        String[] lines = text.split("\\r\\n|\\n\\r|\\n|\\r"); // the order of alternatives is very important

        for (String line : lines) {
            System.out.println(line);
        }
    }
}
Output:
line 1
line 2
line 3
line 4
line 5
Note:
The order of alternatives in the above expression is very important. The expression should try to:
- split text by two-character newline symbols first (\r\n or \n\r),
- and only then by single-character newline symbols (\n or \r).
Let's suppose we have an HTTP protocol response:
HTTP/1.1 200 OK\r\nContent-Length: 25\r\nContent-Type: text/html\r\n\r\nHello world!\nSome text...
With \n or \r at the beginning of the expression, we could get a different number of lines after splitting.
After splitting we should get:
HTTP/1.1 200 OK
Content-Length: 25
Content-Type: text/html

Hello world!
Some text...
The second important thing is that newline symbols are unified per operating system, which makes it possible to use the simpler expressions below.
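The effect of the order of alternatives described in the note above can be checked with a small sketch. The class name and the second, deliberately misordered pattern are hypothetical and exist only for comparison.

```java
import java.util.regex.Pattern;

class AlternationOrderDemo {

    // Two-character newline symbols are tried first.
    static final Pattern TWO_CHARS_FIRST = Pattern.compile("\\r\\n|\\n\\r|\\n|\\r");

    // Misordered on purpose: single characters win, so \r\n is split twice.
    static final Pattern SINGLE_CHARS_FIRST = Pattern.compile("\\n|\\r|\\r\\n|\\n\\r");

    public static void main(String[] args) {
        String response = "HTTP/1.1 200 OK\r\nContent-Length: 25\r\n"
                + "Content-Type: text/html\r\n\r\nHello world!\nSome text...";

        // 6 lines: three headers, one empty line, two body lines
        System.out.println(TWO_CHARS_FIRST.split(response).length);

        // 10 lines: every \r\n pair produces an extra empty line
        System.out.println(SINGLE_CHARS_FIRST.split(response).length);
    }
}
```

Java tries alternatives from left to right at each position, so once `\r` alone is allowed to match first, the `\r\n` alternative is never reached.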
3. Microsoft Windows new lines splitting example
package com.example;

public class Program {

    public static void main(String[] args) {

        String text = "line 1\r\n" +
                "line 2\r\n" +
                "line 3\r\n" +
                "line 4\r\n" +
                "line 5";

        String[] lines = text.split("\\r\\n"); // Windows uses the two-character \r\n sequence

        for (String line : lines) {
            System.out.println(line);
        }
    }
}
Output:
line 1
line 2
line 3
line 4
line 5
Systems: Atari TOS, Microsoft Windows, DOS (MS-DOS, PC DOS, etc.), DEC TOPS-10, RT-11, CP/M, MP/M, OS/2, Symbian OS, Palm OS, Amstrad CPC, and most other early non-Unix and non-IBM operating systems.
4. Unix/Linux new lines splitting example
package com.example;

public class Program {

    public static void main(String[] args) {

        String text = "line 1\n" +
                "line 2\n" +
                "line 3\n" +
                "line 4\n" +
                "line 5";

        String[] lines = text.split("\\n");

        for (String line : lines) {
            System.out.println(line);
        }
    }
}
Output:
line 1
line 2
line 3
line 4
line 5
Systems: Multics, Unix and Unix-like systems (Linux, macOS, FreeBSD, AIX, Xenix, etc.), BeOS, Amiga, RISC OS, and others.
5. The classic Mac OS/OS-9 new lines splitting example
package com.example;

public class Program {

    public static void main(String[] args) {

        String text = "line 1\r" +
                "line 2\r" +
                "line 3\r" +
                "line 4\r" +
                "line 5";

        String[] lines = text.split("\\r");

        for (String line : lines) {
            System.out.println(line);
        }
    }
}
Output:
line 1
line 2
line 3
line 4
line 5
Systems: Commodore 8-bit machines (C64, C128), Acorn BBC, ZX Spectrum, TRS-80, Apple II series, Oberon, the classic Mac OS, MIT Lisp Machine and OS-9.
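When the text comes from mixed sources, one possible way to reuse the single-symbol expressions from the sections above is to unify the newline symbols first. The sketch below is an assumption about how that could be done, with hypothetical class and method names. It rewrites `\r\n`, `\n\r` and `\r` to `\n` and then splits by `\n` only.

```java
class NormalizeNewlinesDemo {

    // Replaces every other newline symbol with "\n"; two-character symbols go first.
    static String normalize(String text) {
        return text.replaceAll("\\r\\n|\\n\\r|\\r", "\n");
    }

    public static void main(String[] args) {
        String text = "line 1\n" +
                "line 2\r" +
                "line 3\r\n" +
                "line 4\n\r" +
                "line 5";

        // After normalization the simple Unix expression is enough.
        String[] lines = normalize(text).split("\\n");

        for (String line : lines) {
            System.out.println(line);
        }
    }
}
```

Normalizing first costs an extra pass over the text, but it can be convenient when the unified text is also stored or compared later.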
6. The optimal way to split text by newline
Some split operations are executed many times in the source code. In such cases it makes sense not to compile the pattern inside the String split() method each time we call it (check the split() method body). The code can be improved by using the Pattern class and creating the pattern object only once.
Example:
package com.example;

import java.util.regex.Pattern;

public class Program {

    // compiled once and reused for every split operation
    private static final Pattern PATTERN = Pattern.compile("\\r\\n|\\n\\r|\\n|\\r");

    public static void main(String[] args) {

        String text = "HTTP/1.1 200 OK\r\n" +
                "Content-Length: 25\r\n" +
                "Content-Type: text/html\r\n" +
                "\r\n" +
                "Hello world!\n" +
                "Some text...";

        String[] lines = PATTERN.split(text);

        for (String line : lines) {
            System.out.println(line);
        }
    }
}
Output:
HTTP/1.1 200 OK
Content-Length: 25
Content-Type: text/html

Hello world!
Some text...
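One detail worth knowing when reusing a compiled pattern: `split(text)` drops trailing empty strings, so empty lines at the end of the text disappear. The sketch below, with hypothetical class and method names, compares it with `split(text, -1)`, which keeps them.

```java
import java.util.regex.Pattern;

class TrailingLinesDemo {

    private static final Pattern NEWLINE = Pattern.compile("\\r\\n|\\n\\r|\\n|\\r");

    // Default behavior: trailing empty strings are removed from the result.
    static String[] splitDroppingTrailing(String text) {
        return NEWLINE.split(text);
    }

    // A negative limit keeps trailing empty strings.
    static String[] splitKeepingTrailing(String text) {
        return NEWLINE.split(text, -1);
    }

    public static void main(String[] args) {
        String text = "line 1\nline 2\n\n";
        System.out.println(splitDroppingTrailing(text).length); // 2
        System.out.println(splitKeepingTrailing(text).length);  // 4
    }
}
```

Which variant fits depends on whether empty lines at the end of the text carry meaning for the caller.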