Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Azure Data Lake - how to combine different schemas

I'm using a custom OUTPUTTER to generate XML from my "flat data" like so:

SELECT *..
OUTPUT @all_data
TO "/patient/{ID}.tsv"
USING new Microsoft.Analytics.Samples.Formats.Xml.XmlOutputter("Patient");

Which generates individual files that look like this:

<Patient>
    <ID>5283293478</ID>
    <ANESTHESIA_START>09/06/2019 11:52:00</ANESTHESIA_START>
    <ANESHTHESIA_END>09/06/2019 14:40:00</ANESHTHESIA_END>
    <SURGERY_START_TIME>9/6/2019 11:52:00 AM</SURGERY_START_TIME>
    <SURGERY_END_TIME>9/6/2019 2:34:00 PM</SURGERY_END_TIME>
    <INCISION_START>9/6/2019 12:45:00 PM</INCISION_START>
    <INCISION_END>9/6/2019 2:18:00 PM</INCISION_END>
</Patient>

A separate script is generating data like this:

SELECT *..
OUTPUT @other_data
TO "/charge/{ID}.tsv"
USING new Microsoft.Analytics.Samples.Formats.Xml.XmlOutputter("Charge");

Yielding files that look like this:

<Charge>
    <ID>5283293478</ID>
    <PROVIDER_TYPE>CRNA</PROVIDER_TYPE>
</Charge>
<Charge>
    <ID>5283293478</ID>
    <PROVIDER_TYPE>Student Nurse Anesthetist</PROVIDER_TYPE>
</Charge>

As you can see, the files that are being created are:

/patient/{ID}.tsv
/charge/{ID}.tsv

How do I concatenate the two sets of files based on ID?

The result I'd like is:

<Patient>
    <ID>5283293478</ID>
    <ANESTHESIA_START>09/06/2019 11:52:00</ANESTHESIA_START>
    <ANESHTHESIA_END>09/06/2019 14:40:00</ANESHTHESIA_END>
    <SURGERY_START_TIME>9/6/2019 11:52:00 AM</SURGERY_START_TIME>
    <SURGERY_END_TIME>9/6/2019 2:34:00 PM</SURGERY_END_TIME>
    <INCISION_START>9/6/2019 12:45:00 PM</INCISION_START>
    <INCISION_END>9/6/2019 2:18:00 PM</INCISION_END>
</Patient>
<Charge>
    <ID>5283293478</ID>
    <PROVIDER_TYPE>CRNA</PROVIDER_TYPE>
</Charge>
<Charge>
    <ID>5283293478</ID>
    <PROVIDER_TYPE>Student Nurse Anesthetist</PROVIDER_TYPE>
</Charge>

1 Reply

If you have the two sets of files, you can simply extract both, using the {Id} part of the file path as a column:

DECLARE @patient string = "/patient/{Id}.tsv";
DECLARE @charge string = "/charge/{Id}.tsv";

// {Id} in each path pattern is exposed as the Id column of the rowset.
@patients =
    EXTRACT Id string, content string
    FROM @patient
    USING Extractors.Text();

@charges =
    EXTRACT Id string, content string
    FROM @charge
    USING Extractors.Text();

Then you can simply join the two rowsets on Id, concatenate the patient and charge content, and output the result.
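
To make the join-and-concatenate step concrete, here is a minimal sketch of the same logic in Python, run against local copies of the files rather than in U-SQL. It is only an illustration under stated assumptions: the function name `combine_by_id`, the `combined` output folder, and the `.xml` extension are hypothetical and not part of the original scripts. It keeps every patient file and appends the charge file with the same ID when one exists.

```python
from pathlib import Path


def combine_by_id(patient_dir, charge_dir, out_dir):
    """Concatenate patient/<ID>.tsv and charge/<ID>.tsv into out_dir/<ID>.xml.

    Hypothetical helper: directory layout and output naming are assumptions.
    """
    patient_dir, charge_dir, out_dir = Path(patient_dir), Path(charge_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for patient_file in sorted(patient_dir.glob("*.tsv")):
        record_id = patient_file.stem  # file name without ".tsv" is the ID
        parts = [patient_file.read_text(encoding="utf-8").rstrip("\n")]
        charge_file = charge_dir / patient_file.name  # same ID, other folder
        if charge_file.exists():  # a patient may have no charge file
            parts.append(charge_file.read_text(encoding="utf-8").rstrip("\n"))
        target = out_dir / f"{record_id}.xml"
        # Patient block first, then any Charge blocks, one trailing newline.
        target.write_text("\n".join(parts) + "\n", encoding="utf-8")
        written.append(target)
    return written
```

Calling `combine_by_id("patient", "charge", "combined")` would write one combined file per patient ID. Charge files with no matching patient file are ignored in this sketch; adjust the loop if those should be kept as well.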

