Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
896 views
in Technique[技术] by (71.8m points)

delphi - Finding common elements in two arrays

I’ve declared a type similar to the following.

type
  TLikes = record
    Name            : string[20];
    favColours  : array of string[20];

  faves     = array of TLikes;

Once the records are populated I save them to a binary file so the structure is like that shown below.

[John],     [Green]     [White]     [Blue]
[Paul],     [Blue]      [Red]       [White]     [Green]
[David],    [Red]       [Blue]      [Green]
[Bob],      [White]     [Blue]
[Peter],    [Blue]      [Green]     [Red]

It’s easy to find out what colours David, for example, likes. A small problem occurs when I want the to know who likes blue. So what I’ve done is build a second file, like so …

[Blue],     [John]      [Paul]      [David]     [Peter]         [Bob]
[Red],      [David]     [Paul]      [Peter]
[White],    [Bob]       [David]     [John]      [Paul]
[Green],    [John]      [David]     [Paul]      [Peter]

But something is telling me, I shouldn’t really need to create a second file / data structure, it just seems inefficient.

Here’s a bigger issue ….

What if I need to find who likes any combination of what David likes? My results would be …

Blue and red and green  =   Paul, David, Peter
Blue and red            =   Paul, David, Peter
Blue and green          =   John, Paul, David, Peter
Red and Green           =   Paul, David, Peter

My question is.

Is there a better way to structure the data / records so I can figure out what Bob and Paul have in common (Blue and White) or what red and white have in common (David and Paul) ?

I guess I need to point out that I have tried to simplify the example above. In reality the data for Tlikes.Name will be strings like …

‘decabbadc’
‘bacddbcad’
‘eebadeaac’

There are something in the order of 200k+ of these strings. And the Tlikes.FavColours data is a filename (there are around 2k of these files). The file name indicates a file that contains the Tlikes.Name string.

I want to be able to retrieve a list of file names given a Tlikes.Name string or a list of strings given a file name.

NB – Something is drawing me to ‘sets’ but from the little I understand, I’m limited in the number of elements in sets, am I on the right track ?

Thank you for taking the time to read the post.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Here is a generic set, TSet<T> which could be used as a tool to get relations between your data.

TSet<T> can hold data of simple types, not restricted to byte size as the normal Set type.

Supports:

  • Include (addition)
  • Exclude (subtraction)
  • Intersect (mutual inclusion, multiplication)
  • Symmetrical difference (mutual exclusion, xor)
  • Test for contains (in operator)
  • Test for equality (equal operator)
  • Test for superset of (>= operator)
  • Test for subset of ( <= operator)
  • Sorting
  • BinarySearch

Use TSet<T> to benchmark your application.


unit GenericSet;

interface

Uses
  System.Generics.Defaults,
  System.Generics.Collections;

Type
  TSet<T> = record
    // Include (union)
    class operator Add(const aSet: TSet<T>; aValue: T) : TSet<T>; overload;
    class operator Add(const aSet: TSet<T>; const aTArr: TArray<T>) : TSet<T>; overload;
    class operator Add(const aSet1: TSet<T>; const aSet2: TSet<T>) : TSet<T>; overload;
    // Exclude
    class operator Subtract(const aSet: TSet<T>; aValue: T): TSet<T>; overload;
    class operator Subtract(const aSet: TSet<T>; const aTArr: TArray<T>) : TSet<T>; overload;
    class operator Subtract(const aSet1: TSet<T>; const aSet2: TSet<T>) : TSet<T>; overload;
    // left in right, i.e right.Contains(left)
    class operator In(aValue: T; const aSet: TSet<T>): Boolean; overload;
    class operator In(const aTArr: TArray<T>; const aSet: TSet<T>): Boolean; overload;
    class operator In(const aSet1: TSet<T>; const aSet2: TSet<T>): Boolean; overload;
    // Intersect, mutually common, A and B
    class operator Multiply(const aSet: TSet<T>; aValue: T): TSet<T>; overload;
    class operator Multiply(const aSet: TSet<T>; const aTArr: TArray<T>): TSet<T>; overload;
    class operator Multiply(const aSet1,aSet2 : TSet<T>): TSet<T>; overload;
    // Symmetric difference, A xor B = (A+B) - A.Intersect(B)
    class operator LogicalXor(const aSet: TSet<T>; aValue: T): TSet<T>; overload;
    class operator LogicalXor(const aSet: TSet<T>; aTArr: TArray<T>): TSet<T>; overload;
    class operator LogicalXor(const aSet1,aSet2 : TSet<T>): TSet<T>; overload;
    //
    class operator Equal(const aSet: TSet<T>; aValue: T): Boolean; overload;
    class operator Equal(const aSet: TSet<T>; aTArr: TArray<T>): Boolean; overload;
    class operator Equal(const aSetLeft,aSetRight: TSet<T>): Boolean; overload;
    // SubsetOf (Left <= Right)
    class operator LessThanOrEqual(const aSet: TSet<T>; aValue: T): Boolean; overload;
    class operator LessThanOrEqual(const aSet: TSet<T>; aTArr: TArray<T>): Boolean; overload;
    class operator LessThanOrEqual(const aSetLeft,aSetRight: TSet<T>): Boolean; overload;
    // SupersetOf (Left >= Right)
    class operator GreaterThanOrEqual(const aSet: TSet<T>; aValue: T): Boolean; overload;
    class operator GreaterThanOrEqual(const aSet: TSet<T>; aTArr: TArray<T>): Boolean; overload;
    class operator GreaterThanOrEqual(const aSetLeft,aSetRight: TSet<T>): Boolean; overload;
    // Set creator
    class function Create(const aTArr: array of T; checkDuplicates: Boolean = False): TSet<T>; static;
  private
    FSetArray : array of T;
    FSorted : String; // !! Will be predefined as '' (=False) by compiler.
    function GetEmpty: Boolean; inline;
    function GetItem(index: Integer): T; inline;
    function GetItemCount: Integer; inline;
    function GetSorted: Boolean; inline;
    procedure SetSorted( sorted: Boolean); inline;
  public
    // Add
    procedure Include(aValue: T); overload;
    procedure Include(const aTArr: TArray<T>); overload;
    procedure Include(const aTArr: array of T); overload;
    procedure Include(const aSet: TSet<T>); overload;
    // Subtract; A=[1,2,3]; B=[2,3,4]; B.Exclude(A) = B-A = [4]
    procedure Exclude(aValue: T); overload;
    procedure Exclude(const aTArr: TArray<T>); overload;
    procedure Exclude(const aTArr: array of T); overload;
    procedure Exclude(const aSet: TSet<T>); overload;
    // Multiply (A and B) A=[1,2,3]; B=[2,3,4]; B.Intersect(A) = B*A = [2,3]
    function Intersect(aValue: T): TSet<T>; overload;
    function Intersect(const aTArr: TArray<T>): TSet<T>; overload;
    function Intersect(const aTArr: array of T): TSet<T>; overload;
    function Intersect(const aSet: TSet<T>): TSet<T>; overload;

    // A xor B; A=[1,2,3]; B=[2,3,4]; (A+B)-A.Intersect(B) = [1,4]
    function SymmetricDiff(aValue: T): TSet<T>; overload;
    function SymmetricDiff(const aTArr: TArray<T>): TSet<T>; overload;
    function SymmetricDiff(const aTArr: array of T): TSet<T>; overload;
    function SymmetricDiff(const aSet: TSet<T>): TSet<T>; overload;
    // Identical set
    function Equal(aValue: T): Boolean; overload;
    function Equal(const aTArr: array of T; checkDuplicates: Boolean = False): Boolean; overload;
    function Equal(const aTArr: TArray<T>; checkDuplicates: Boolean = False): Boolean; overload;
    function Equal(const aSet: TSet<T>): Boolean; overload;
    //  Self <= aSet
    function SubsetOf(aValue: T): Boolean; overload;
    function SubsetOf(const aTArr: array of T; checkDuplicates: Boolean = False): Boolean; overload;
    function SubsetOf(const aTArr: TArray<T>; checkDuplicates: Boolean = False): Boolean; overload;
    function SubsetOf(const aSet: TSet<T>): Boolean; overload;
    // Self >= aSet
    function SupersetOf(aValue: T): Boolean; overload;
    function SupersetOf(const aTArr: array of T; checkDuplicates: Boolean = False): Boolean; overload;
    function SupersetOf(const aTArr: TArray<T>; checkDuplicates: Boolean = False): Boolean; overload;
    function SupersetOf(const aSet: TSet<T>): Boolean; overload;
    // Is included
    function Contains(aValue: T): Boolean; overload;
    function Contains(const aTArr: array of T): Boolean; overload;
    function Contains(const aTArr: TArray<T>): Boolean; overload;
    function Contains(const aSet: TSet<T>): Boolean; overload;

    procedure Sort; // QuickSort
    function Search( aValue: T): Boolean; // BinarySearch (Set must be sorted)
    procedure Clear;
    property IsSorted: Boolean read GetSorted;
    property IsEmpty: Boolean read GetEmpty;
    property Items[index: Integer]: T read GetItem; default;
    property ItemCount: Integer read GetItemCount;
  end;

implementation

class function TSet<T>.Create(const aTArr: array of T; checkDuplicates: Boolean = False): TSet<T>;
var
  i,j,elements : Integer;
  duplicate : Boolean;
  c : IEqualityComparer<T>;
begin
  if checkDuplicates then
  begin
    c := TEqualityComparer<T>.Default;
    // This will remove duplicates
    SetLength(Result.FSetArray,Length(aTArr));
    elements := 0;
    for i := 0 to High(aTArr) do
    begin
      duplicate := False;
      for j := 0 to Pred(elements) do
      begin
        duplicate := c.Equals(Result.FSetArray[j],aTArr[i]);
        if duplicate then
          Break;
      end;
      if not duplicate then
      begin
        Result.FSetArray[elements] := aTArr[i];
        Inc(elements);
      end;
    end;
    SetLength(Result.FSetArray,elements);
  end
  else
  begin
    SetLength(Result.FSetArray, Length(aTArr));
    for i := 0 to High(aTArr) do
      Result.FSetArray[i] := aTArr[i];
  end;
end;

class operator TSet<T>.Add(const aSet: TSet<T>; aValue: T): TSet<T>;
begin
  Result := aSet;
  Result.Include(aValue);
end;

class operator TSet<T>.Add(const aSet: TSet<T>; const aTArr: TArray<T>): TSet<T>;
begin
  Result := aSet;
  Result.Include(aTArr);
end;

class operator TSet<T>.Add(const aSet1, aSet2: TSet<T>): TSet<T>;
begin
  Result := aSet1;
  Result.Include(aSet2);
end;

procedure TSet<T>.Include(aValue: T);
begin
  if not Contains(aValue) then begin
    SetLength(FSetArray,Length(FSetArray)+1);
    FSetArray[High(FSetArray)] := aValue;
    SetSorted(False);
  end;
end;

procedure TSet<T>.Include(const aSet: TSet<T>);
begin
  if Self.IsEmpty then
    Self := aSet
  else
    Include(aSet.FSetArray);
end;

procedure TSet<T>.Include(const aTArr: TArray<T>);
var
  i : Integer;
begin
  for i := 0 to High(aTArr) do
    Self.Include(aTArr[i]);
end;

procedure TSet<T>.Include(const aTArr: array of T);
var
  i : Integer;
begin
  for i := 0 to High(aTArr) do
    Self.Include(aTArr[i]);
end;

procedure TSet<T>.Exclude(const aTArr: TArray<T>);
var
  i : Integer;
begin
  for i := 0 to High(aTArr) do
    Exclude(aTArr[i]);
end;

procedure TSet<T>.Exclude(const aTArr: array of T);
var
  i : Integer;
begin
  for i := 0 to High(aTArr) do
    Exclude(aTArr[i]);
end;

procedure TSet<T>.Exclude(const aSet: TSet<T>);
begin
  Exclude(aSet.FSetArray);
end;

procedure TSet<T>.Exclude(aValue: T);
var
  i : Integer;
  c : IEqualityComparer<T>;
begin
  c := TEqualityComparer<T>.Default;
  for i := 0 to High(FSetArray) do
  begin
    if c.Equals(FSetArray[i],aValue) then
    begin
      SetLength(FSetArray,Length(FSetArray)); // Ensure unique dyn array
      if (i < High(FSetArray)) then
      begin
        FSetArray[i] := FSetArray[High(FSetArray)]; // Move last element
        Self.SetSorted(False);
      end;
      SetLength(FSetArray,Length(FSetArray)-1);
      Break;
    end;
  end;
end;

class operator TSet<T>.Subtract(const aSet1, aSet2: TSet<T>): TSet<T>;
begin
  Result := aSet1;
  Result.Exclude(aSet2.FSetArray);
end;

class operator TSet<T>.Subtract(const aSet: TSet<T>;
  const aTArr: TArray<T>): TSet<T>;
begin
  Result := aSet;
  Result.Exclude(aTArr);
end;

class operator TSet<T>.Subtract(const aSet: TSet<T>; aValue: T): TSet<T>;
begin
  Result := aSet;
  Result.Exclude(aValue);
end;

class operator TSet<T>.In(aValue: T; const aSet: TSet<T>): Boolean;
begin
  Result := aSet.Contains(aValue);
end;

class operator TSet<T>.In(const aTArr: TArray<T>; const aSet: TSet<T>): Boolean;
begin
  Result := aSet.Contains(aTArr);
end;

class operator TSet<T>.In(const aSet1: TSet<T>; const aSet2: TSet<T>): Boolean;
begin
  Result := aSet2.Contains(aSet1.FSetArray);
end;

function TSet<T>.Contains(aValue: T): Boolean;
var
  i : Integer;
  c : IEqualityComparer<T>;
begin
  if IsSorted then
  begin
    Result := Search(aValue);
  end
  else
  begin
    Result := false;
    c := TEqualityComparer<T>.Default;
    for i := 0 to High(FSetArray) do
      if c.Equals(FSetArray[i],aValue) then
        Exit(True);
  end;
end;

function TSet<T>.Contains(const aTArr: array of T): Boolean;
var
  i: Integer;
begin
  Result := High(aTArr) >= 0;
  for i := 0 to High(aTArr) do
  begin
    if IsSorted then
      Result := Search(aTArr[i])
    else
      Result := Contains(aTArr[i]);
    if not Result then
      Exit(false);
  end;
end;

function TSet<T>.Contains(const aTArr: TArray<T>): Boolean;
var
  i : Integer;
begin
  Result := High(aTArr) >= 0;
  for i := 0 to High(aTA

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...