
Java - split string by new line character

Created by:
Explosssive

In this article, we will look at how to split a string into separate lines in Java.

Quick solution:

  • Works with Unix/Linux (\n) and Windows (\r\n) newlines:
String text = "1\r\n2\r\n3";

String[] lines = text.split("\\r?\\n");  //  \r\n  or  \n
  • Works with the newline conventions of all operating systems:
String text = "1\r\n2\r\n3";

String[] lines = text.split("\\r\\n|\\n\\r|\\n|\\r");  //  \r\n  ,  \n\r  ,  \n  or  \r

or:

// import java.util.regex.Pattern;

Pattern PATTERN = Pattern.compile("\\r\\n|\\n\\r|\\n|\\r");  //  \r\n  ,  \n\r  ,  \n  or  \r

String text = "1\r\n2\r\n3";
String[] lines = PATTERN.split(text);

// see the last section for a practical example

 

The problem description and examples below show how it works in practice.

1. Problem overview

Different operating systems have different newline symbols.

The most commonly used newline sequences are:

  • \n: Multics, Unix and Unix-like systems (Linux, macOS, FreeBSD, AIX, Xenix, etc.), BeOS, Amiga, RISC OS, and others.
  • \r\n: Atari TOS, Microsoft Windows, DOS (MS-DOS, PC DOS, etc.), DEC TOPS-10, RT-11, CP/M, MP/M, OS/2, Symbian OS, Palm OS, Amstrad CPC, and most other early non-Unix and non-IBM operating systems.
  • \r: Commodore 8-bit machines (C64, C128), Acorn BBC, ZX Spectrum, TRS-80, Apple II series, Oberon, the classic Mac OS, MIT Lisp Machine and OS-9.
  • \n\r: Acorn BBC and RISC OS spooled text output.

Source: https://en.wikipedia.org/wiki/Newline
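
To see which newline sequence a given text or platform actually uses, it can help to make the invisible characters visible. The sketch below is only an illustration: the class name LineSeparatorExample and the helper escape() are hypothetical names introduced here, while System.lineSeparator() is the standard Java method that returns the separator of the platform the JVM runs on.

```java
class LineSeparatorExample {

    // Hypothetical helper: makes \r and \n visible by replacing them with their escaped form
    static String escape(String text) {
        return text.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        // The result depends on the operating system the program runs on
        System.out.println("Platform separator: " + escape(System.lineSeparator()));

        // Prints: line 1\r\nline 2
        System.out.println(escape("line 1\r\nline 2"));
    }
}
```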

2. Universal new lines splitting example

This approach works with the newline conventions of all operating systems. The example below splits text with mixed newline sequences into separate lines.

package com.example;

public class Program {

    public static void main(String[] args) {

        String text = "line 1\n" +
                      "line 2\r" +
                      "line 3\r\n" +
                      "line 4\n\r" +
                      "line 5";

        String[] lines = text.split("\\r\\n|\\n\\r|\\n|\\r");  // the order of alternatives is very important

        for (String line : lines) {
            System.out.println(line);
        }
    }
}

Output:

line 1
line 2
line 3
line 4
line 5
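
As a possible alternative to the explicit alternation, Java 8 and later provide the \R linebreak matcher in java.util.regex.Pattern. It matches \r\n as one unit, single \n and \r, and a few Unicode line break characters. It does not treat \n\r as one sequence, so it is not a drop-in replacement for the expression used in this article. The class and method names below are hypothetical and serve only as a sketch.

```java
import java.util.Arrays;

class LinebreakMatcherExample {

    // \R (Java 8+) matches \r\n as one unit, as well as single \n and \r
    static String[] splitLines(String text) {
        return text.split("\\R");
    }

    public static void main(String[] args) {
        String text = "line 1\n" +
                      "line 2\r" +
                      "line 3\r\n" +
                      "line 4";

        // Prints: [line 1, line 2, line 3, line 4]
        System.out.println(Arrays.toString(splitLines(text)));

        // \R has no \n\r case, so that sequence counts as two line breaks
        // Prints: [a, , b]
        System.out.println(Arrays.toString(splitLines("a\n\rb")));
    }
}
```

If the input may contain \n\r (Acorn BBC and RISC OS spooled output), the explicit expression from the example above remains the safer choice.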

Note:

In the expression \\r\\n|\\n\\r|\\n|\\r the order of alternatives is very important. The regex engine tries the alternatives from left to right, so the expression should:

  • first try the two-character newline sequences (\r\n or \n\r),
  • and only then the single-character ones (\n or \r).

Let's suppose we have an HTTP protocol response:

HTTP/1.1 200 OK\r\nContent-Length: 25\r\nContent-Type: text/html\r\n\r\nHello world!\nSome text...

With \n or \r at the beginning of the expression, each \r\n would be matched as two separate newlines, so we would get a different number of lines after splitting.

After splitting we should get:

HTTP/1.1 200 OK
Content-Length: 25
Content-Type: text/html

Hello world!
Some text...
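
The effect of the order can be checked with a short experiment. The sketch below compares the expression used in this article with a reordered variant on the HTTP response from above. The class name and the constants PAIRS_FIRST and SINGLES_FIRST are hypothetical names used only for this illustration.

```java
import java.util.regex.Pattern;

class AlternationOrderExample {

    // Two-character sequences first (the order used in this article)
    static final Pattern PAIRS_FIRST = Pattern.compile("\\r\\n|\\n\\r|\\n|\\r");

    // Reordered variant for comparison: single characters first
    static final Pattern SINGLES_FIRST = Pattern.compile("\\n|\\r|\\r\\n|\\n\\r");

    public static void main(String[] args) {
        String text = "HTTP/1.1 200 OK\r\n" +
                      "Content-Length: 25\r\n" +
                      "Content-Type: text/html\r\n" +
                      "\r\n" +
                      "Hello world!\n" +
                      "Some text...";

        // Each \r\n is one separator: 5 separators, 6 lines
        System.out.println(PAIRS_FIRST.split(text).length);   // 6

        // Each \r and each \n is a separate separator: 9 separators, 10 lines
        System.out.println(SINGLES_FIRST.split(text).length); // 10
    }
}
```

With the single characters first, the two-character alternatives are never reached, and every \r\n produces an additional empty line.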

The second important thing is that each operating system uses one newline sequence consistently. When the text comes from a single, known system, the simpler expressions shown in the next sections are enough.

3. Microsoft Windows new lines splitting example

package com.example;

public class Program {

    public static void main(String[] args) {

        String text = "line 1\r\n" +
                      "line 2\r\n" +
                      "line 3\r\n" +
                      "line 4\r\n" +
                      "line 5";

        String[] lines = text.split("\\r\\n");  // matches only the \r\n sequence

        for (String line : lines) {
            System.out.println(line);
        }
    }
}

Output:

line 1
line 2
line 3
line 4
line 5

Systems: Atari TOS, Microsoft Windows, DOS (MS-DOS, PC DOS, etc.), DEC TOPS-10, RT-11, CP/M, MP/M, OS/2, Symbian OS, Palm OS, Amstrad CPC, and most other early non-Unix and non-IBM operating systems.

4. Unix/Linux new lines splitting example

package com.example;

public class Program {

    public static void main(String[] args) {

        String text = "line 1\n" +
                      "line 2\n" +
                      "line 3\n" +
                      "line 4\n" +
                      "line 5";

        String[] lines = text.split("\\n");

        for (String line : lines) {
            System.out.println(line);
        }
    }
}

Output:

line 1
line 2
line 3
line 4
line 5

Systems: Multics, Unix and Unix-like systems (Linux, macOS, FreeBSD, AIX, Xenix, etc.), BeOS, Amiga, RISC OS, and others.

5. The classic Mac OS/OS-9 new lines splitting example

package com.example;

public class Program {

    public static void main(String[] args) {

        String text = "line 1\r" +
                      "line 2\r" +
                      "line 3\r" +
                      "line 4\r" +
                      "line 5";

        String[] lines = text.split("\\r");

        for (String line : lines) {
            System.out.println(line);
        }
    }
}

Output:

line 1
line 2
line 3
line 4
line 5

Systems: Commodore 8-bit machines (C64, C128), Acorn BBC, ZX Spectrum, TRS-80, Apple II series, Oberon, the classic Mac OS, MIT Lisp Machine and OS-9. 

 

6. The optimal way to split text by newline

Some split operations are executed many times in the source code. In that case it makes sense not to compile the pattern inside the String split() method on each call (check the split() method body). The code can be improved by using the Pattern class and creating its object only once.

Example:

package com.example;

import java.util.regex.Pattern;

public class Program {

    private static final Pattern PATTERN = Pattern.compile("\\r\\n|\\n\\r|\\n|\\r");

    public static void main(String[] args) {

        String text = "HTTP/1.1 200 OK\r\n" +
                      "Content-Length: 25\r\n" +
                      "Content-Type: text/html\r\n" +
                      "\r\n" +
                      "Hello world!\n" +
                      "Some text...";

        String[] lines = PATTERN.split(text);

        for (String line : lines) {
            System.out.println(line);
        }
    }
}

Output:

HTTP/1.1 200 OK
Content-Length: 25
Content-Type: text/html

Hello world!
Some text...
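
One detail worth knowing when reusing a compiled pattern: both String split() and Pattern split() remove trailing empty strings by default, so empty lines at the end of the text are lost. The sketch below, with hypothetical class and method names, shows how a negative limit passed to Pattern.split(CharSequence, int) keeps them. Whether that is desired depends on the use case.

```java
import java.util.regex.Pattern;

class TrailingLinesExample {

    // Compiled once and reused, as in the example above
    private static final Pattern NEWLINE = Pattern.compile("\\r\\n|\\n\\r|\\n|\\r");

    // Default behavior (limit 0): trailing empty strings are removed
    static String[] splitDroppingTrailing(String text) {
        return NEWLINE.split(text);
    }

    // Negative limit: trailing empty strings are kept
    static String[] splitKeepingTrailing(String text) {
        return NEWLINE.split(text, -1);
    }

    public static void main(String[] args) {
        String text = "line 1\r\nline 2\r\n\r\n";

        System.out.println(splitDroppingTrailing(text).length); // 2
        System.out.println(splitKeepingTrailing(text).length);  // 4
    }
}
```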

 
