Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

delphi - String.Split behaves strangely when the last value is empty

I'd like to split my string into an array, but it behaves badly when the last "value" is empty. Please see my example. Is this a bug or a feature? Is there any way to use this function without workarounds?

var
  arr: TArray<string>;
begin
  arr := 'a;b;c'.Split([';']);          // Length(arr) = 3, as expected
  arr := 'a;b;c;'.Split([';']);         // Length(arr) = 3, but I expect 4
  arr := 'a;b;;c'.Split([';']);         // Length(arr) = 4, since the empty value is inside
  arr := ('a;b;c;' + ' ').Split([';']); // Length(arr) = 4 (primitive workaround with a space)
end;
Question from: https://stackoverflow.com/questions/65937523/string-helpers-split-function-doesnt-consider-empty-spaces-after-the-last-deli

1 Reply

This behaviour can't be changed. There's no way for you to customise how this split function works. I suspect that you'll need to provide your own split implementation. Michael Erikkson helpfully points out in a comment that System.StrUtils.SplitString behaves in the manner that you desire.
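
To see the difference between the two routines on the input from the question, a minimal sketch along these lines could be used. It assumes a Delphi version with string helpers (XE3 or later); the variable names are illustrative, and the counts in the comments are the ones reported in the question and in this answer rather than freshly measured output.

```delphi
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Types, System.StrUtils;

var
  S: string;
  HelperParts: TArray<string>;
  StrUtilsParts: TStringDynArray;

begin
  S := 'a;b;c;';

  // TStringHelper.Split: the question reports 3 items for this input,
  // because the empty item after the last ';' is dropped.
  HelperParts := S.Split([';']);
  Writeln('string.Split:         ', Length(HelperParts));

  // System.StrUtils.SplitString: the second argument is a string of
  // delimiter characters. According to this answer, the trailing empty
  // item is kept, so 4 items are expected.
  StrUtilsParts := SplitString(S, ';');
  Writeln('StrUtils.SplitString: ', Length(StrUtilsParts));
end.
```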

The design seems to me to be poor. For instance

Length('a;'.Split([';'])) = 1

and yet

Length(';a'.Split([';'])) = 2

This asymmetry is a clear indication of poor design. It's astonishing that testing did not identify this.

The fact that the design is so clearly suspect means that it may be worth submitting a bug report. I'd expect it to be denied since any change would impact existing code. But you never know.

My recommendations:

  1. Use your own split implementation that performs as you require.
  2. Submit a bug report.

Whilst System.StrUtils.SplitString does what you want, its performance is not great. That very likely does not matter, in which case you should use it. However, if performance matters, then I offer this:

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Diagnostics, System.StrUtils;

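// Splits s on a single separator character. Empty items are kept,
// including one after a trailing separator.
function MySplit(const s: string; Separator: char): TArray<string>;
var
  i, ItemIndex: Integer;
  len: Integer;
  SeparatorCount: Integer;
  Start: Integer;
begin
  len := Length(s);
  if len=0 then begin
    // An empty input yields an empty array, not a single empty item.
    Result := nil;
    exit;
  end;

  // First pass: count the separators so the result is allocated once.
  SeparatorCount := 0;
  for i := 1 to len do begin
    if s[i]=Separator then begin
      inc(SeparatorCount);
    end;
  end;

  // Second pass: copy out each item that ends at a separator.
  SetLength(Result, SeparatorCount+1);
  ItemIndex := 0;
  Start := 1;
  for i := 1 to len do begin
    if s[i]=Separator then begin
      Result[ItemIndex] := Copy(s, Start, i-Start);
      inc(ItemIndex);
      Start := i+1;
    end;
  end;
  // The last item runs to the end of the string and may be empty.
  Result[ItemIndex] := Copy(s, Start, len-Start+1);
end;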

const
  InputString = 'asdkjhasd,we1324,wqweqw,qweqlkjh,asdqwe,qweqwe,asdasdqw';

var
  i: Integer;
  Stopwatch: TStopwatch;

const
  Count = 3000000;

begin
  Stopwatch := TStopwatch.StartNew;
  for i := 1 to Count do begin
    InputString.Split([',']);
  end;
  Writeln('string.Split: ', Stopwatch.ElapsedMilliseconds);

  Stopwatch := TStopwatch.StartNew;
  for i := 1 to Count do begin
    System.StrUtils.SplitString(InputString, ',');
  end;
  Writeln('StrUtils.SplitString: ', Stopwatch.ElapsedMilliseconds);

  Stopwatch := TStopwatch.StartNew;
  for i := 1 to Count do begin
    MySplit(InputString, ',');
  end;
  Writeln('MySplit: ', Stopwatch.ElapsedMilliseconds);
end.

The output (elapsed milliseconds) of a 32-bit release build with XE7 on my E5530 is:

string.Split: 2798
StrUtils.SplitString: 7167
MySplit: 1428
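
The benchmark only measures speed. As a quick check that MySplit also handles the edge cases discussed above, a small sketch such as the following could be run. The function body is the same as in the benchmark program, reproduced so that the program compiles on its own; the Report procedure is a hypothetical helper added for illustration, and the expected counts in the comments come from tracing the code by hand.

```delphi
{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Same routine as in the benchmark program above.
function MySplit(const s: string; Separator: char): TArray<string>;
var
  i, ItemIndex, len, SeparatorCount, Start: Integer;
begin
  len := Length(s);
  if len=0 then begin
    Result := nil;
    exit;
  end;

  SeparatorCount := 0;
  for i := 1 to len do
    if s[i]=Separator then
      inc(SeparatorCount);

  SetLength(Result, SeparatorCount+1);
  ItemIndex := 0;
  Start := 1;
  for i := 1 to len do
    if s[i]=Separator then begin
      Result[ItemIndex] := Copy(s, Start, i-Start);
      inc(ItemIndex);
      Start := i+1;
    end;
  Result[ItemIndex] := Copy(s, Start, len-Start+1);
end;

// Hypothetical helper: prints the input and the number of items found.
procedure Report(const s: string);
begin
  Writeln('"', s, '" -> ', Length(MySplit(s, ';')));
end;

begin
  Report('a;b;c');   // expected 3
  Report('a;b;c;');  // expected 4: the trailing empty item is kept
  Report('a;');      // expected 2
  Report(';a');      // expected 2: symmetric with the previous case
  Report('');        // expected 0: empty input returns nil
end.
```

Note that an empty input gives an empty array rather than one empty item; whether that is the behaviour you need depends on your use case.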
